Fast Workstation. Still Has Headroom.
It’s not every day that a workstation with dual Quadro RTX 8000 graphics crosses my desk, so it was not going to leave without running Resolve 16 tests. The bottom line: the workstation is fast, and Resolve 16 leverages the dual-GPU configuration well.
The workstation comes from Scan Computers in the UK. It is a dual Quadro RTX 8000 workstation with two Intel Xeon Silver 4216 CPUs. GPU memory is a combined 96 GB and it includes 196 GB of system memory. Storage consists of a 2 TB SSD and an 8 TB HDD. It is all powered by a 1200W power supply.
Before testing, I had two goals: to examine how the workload is distributed across the two GPUs, and to check the rendering performance against my long-time NLE tool, Premiere Pro. I liked the results for the former, and I’ll be spending more time on DaVinci Resolve due to the latter.
Dual Quadro RTX 8000 GPUs connected with NVLink provide computing power for DaVinci Resolve.
DaVinci Resolve and Multi-GPU Configurations
Adding a second GPU is an investment. It can pay off quickly if productivity increases, and that depends on the workstation’s responsiveness and its raw performance. This review's tests and analysis are designed to look at workload distribution. It is not a head-to-head, single-vs-dual GPU comparison. Normally I have no qualms about digging into the hardware, but video editing is not the target use for this workstation. It is configured, tuned, and tested for AI and data science.
Given that priority and the cost of the hardware involved, I decided not to perform GPU brain surgery on this workstation. Others have compared single-GPU vs dual-GPU performance and found that a second GPU can increase Resolve performance by 75%. That seems reasonable based on my workload observations.
For performance testing, I recreated a Premiere Pro 4K multi-stream project in Resolve. The project renders three 4K video streams simultaneously. Each video stream has different effects, including color correction, corner-pinning, Gaussian blur, scaling, and drop boxes.
Regardless of the overall workload, Resolve balances the GPU workloads evenly between the two GPUs. The spread in GPU utilization between the first and the second GPU remains less than 10% throughout editing and rendering.
Facial Refinement is one of many GPU-accelerated features in DaVinci Resolve 16
In this multi-stream, 4K rendering test, the first GPU runs consistently close to 50% capacity. The second GPU is loaded in the low 40s. The utilization remains stable throughout the rendering. It doesn't spike and then drop to zero, which is what I typically see when testing Premiere Pro.
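If you want to check the balance on your own system, the measurement can be sketched in a few lines of Python. This is a minimal sketch, not the tooling I used for this review: it assumes the NVIDIA driver's `nvidia-smi` utility is on the PATH, the function names are hypothetical, and the sample readings in the usage line are illustrative values chosen to resemble the numbers described above, not logged data.

```python
import subprocess

# nvidia-smi query that prints one "index, utilization" line per GPU.
QUERY = [
    "nvidia-smi",
    "--query-gpu=index,utilization.gpu",
    "--format=csv,noheader,nounits",
]


def parse_utilization(csv_text):
    """Parse 'index, utilization' lines into {gpu_index: percent}."""
    readings = {}
    for line in csv_text.strip().splitlines():
        index, util = (field.strip() for field in line.split(","))
        readings[int(index)] = float(util)
    return readings


def utilization_spread(readings):
    """Difference in percentage points between the busiest and idlest GPU."""
    return max(readings.values()) - min(readings.values())


def sample_once():
    """Take one live reading (requires an NVIDIA GPU and driver)."""
    out = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_utilization(out.stdout)


# Illustrative input only: two GPUs at 50% and 43%.
example = parse_utilization("0, 50\n1, 43\n")
print(example, utilization_spread(example))
```

Calling `sample_once()` once per second during a render and tracking `utilization_spread` over time would show whether the load stays balanced or spikes.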
Resolve uses the processor in a similarly orderly fashion. For the CPU workloads, Resolve spreads the work evenly across one processor. While it seems to ignore the second CPU in the system, it loads all 32 logical cores in the first CPU in a clean, organized fashion.
In contrast, GPU utilization under Premiere Pro is a series of spikes. On the CPU side, Premiere Pro scatters the work across many logical cores in a way that evokes randomness more than orderliness.
In the end, Resolve renders this one-minute test project in 76 seconds, while Premiere Pro requires 184 seconds. Resolve is almost two and a half times faster. That is impressive. With this workstation having two of the most powerful GPUs you can find, I suspect that this performance difference can be explained by Resolve utilizing the dual Quadro configuration better than Premiere.
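The arithmetic behind that comparison is simple enough to reproduce. The short Python sketch below uses only the three numbers from this test (76 seconds, 184 seconds, and the 60-second project length); the function names are my own for illustration.

```python
def speedup(slow_seconds, fast_seconds):
    """How many times faster the fast run is than the slow run."""
    return slow_seconds / fast_seconds


def realtime_factor(render_seconds, clip_seconds):
    """Render time per second of finished video (1.0 means real time)."""
    return render_seconds / clip_seconds


resolve_s, premiere_s, clip_s = 76, 184, 60

print(f"Speedup: {speedup(premiere_s, resolve_s):.2f}x")            # 2.42x
print(f"Resolve:  {realtime_factor(resolve_s, clip_s):.2f}x real time")   # 1.27x
print(f"Premiere: {realtime_factor(premiere_s, clip_s):.2f}x real time")  # 3.07x
print(f"Time saved: {premiere_s - resolve_s} s")                    # 108 s
```

Put another way, Resolve needs about 1.3 seconds of rendering per second of output for this project, while Premiere Pro needs about 3.1.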
But a word of caution: it is only a single data point. My conclusion is only speculation. This is not a head-to-head Resolve vs Premiere comparison. A real comparison would need to test several different projects, different features, and different hardware configurations. Additionally, software evolves constantly. Shortly after running these tests, Adobe released a version of Premiere Pro with GPU-specific rendering optimizations.
Nonetheless, with this workstation and its dual Quadro RTX 8000 GPUs rendering a non-trivial project, DaVinci Resolve crushed Premiere Pro.
DaVinci Resolve’s Neural Engine
Getting the most performance out of dual GPUs depends on the application being efficient and using Quadro RTX’s special features. The first RTX products launched in late summer of 2018. Today, applications like Blackmagic Design’s DaVinci Resolve 16 take specific advantage of the Quadro RTX architecture.
Resolve uses RTX-specific features to apply AI to video processing. Features including Speed Warp for clean slow motion, a clip analysis feature with facial recognition, an enhanced Auto Color function, and Object Removal use Blackmagic’s Neural Engine.
Since Resolve leverages the RTX architecture for AI, it is not surprising that these features put a greater load on the GPUs. Here, I use Speed Warp on a short 4K slow-motion example. I take a traffic clip and slow it down to 20% of its original 25 fps. Of course, the raw result is unusable: a jerky sequence with only five unique frames per second.
1) original clip at 20% speed 2) improved motion after applying Optical Flow 3) cleaned up better quality video image after using Speed Warp
So I use Resolve's Optical Flow, a cool feature that calculates intermediate video frames to restore the original fps and provide smooth slow motion. It's been available for a couple of years and does a good job smoothing the slow-motion effect. The motion is smooth, but there are still objects that are clearly distorted in the interpolated frames of my example.
Speed Warp uses Blackmagic Design's Neural Engine to clean up the distorted areas and improve the video quality. As you can see in the short example clip that compares the original clip at 20%, the result from Optical Flow, and the result from Speed Warp, the video looks much better after applying Speed Warp. Speed Warp makes an unusable clip usable.
As for the hardware, the GPU load goes up during the Speed Warp analysis. In this test with Speed Warp, the dual GPUs hit 60% to 70% utilization. Keep in mind, these are Quadro RTX 8000 GPUs!
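To put numbers on how much work the interpolation has to do, here is a small Python sketch of the frame arithmetic for my example. It only counts frames; it does not implement Optical Flow or Speed Warp, and the function name is hypothetical.

```python
from fractions import Fraction


def retime(source_fps, speed_percent):
    """Frame counts for a clip slowed to speed_percent of its original speed.

    Returns (unique source frames per second of output,
             output frames needed per source frame,
             frames to synthesize per source frame).
    """
    speed = Fraction(speed_percent, 100)
    effective_fps = source_fps * speed       # unique frames shown per second
    output_per_source = Fraction(1) / speed  # output frames per source frame
    synthesized = output_per_source - 1      # frames that must be interpolated
    return effective_fps, output_per_source, synthesized


effective, per_source, synthesized = retime(25, 20)
print(effective, per_source, synthesized)  # 5 5 4
```

At 20% speed, every original frame has to be followed by four calculated frames to get back to 25 fps, so 80% of the frames in the slowed clip are synthetic. That explains why any distortion in the interpolated frames is so visible.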
A Final Perspective
Blackmagic Design’s DaVinci Resolve has a history of good GPU utilization, and especially of delivering a performance benefit to users with multiple GPUs. My testing, with two high-performance Quadro RTX 8000 GPUs, shows that Resolve 16 can leverage even this extreme level of performance.
Additionally, Resolve brings Blackmagic Design’s Neural Engine to users with RTX-specific capabilities. The software loads the GPUs efficiently and evenly. The resulting benefits are useful, productive features with outstanding performance. Outstanding performance, I should add, when using powerful GPUs.
While the workstation configuration is not cheap, DaVinci Resolve 16’s ability to leverage this high level of computing power allows demanding users to increase productivity. There is no one-size-fits-all answer to GPU ROI, and you will need to do your own calculations. However, Resolve extracts extra performance from a second GPU, and if your projects are particularly demanding, the extra performance could pay off big for you.
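As a starting point for that calculation, here is a rough break-even sketch in Python. Everything in it is an assumption for illustration: the function name, the GPU cost, the weekly render hours, and the hourly rate are placeholders to replace with your own figures. The 1.75 speedup is the 75% single-vs-dual figure reported by others and mentioned earlier, not something I measured.

```python
def weeks_to_break_even(gpu_cost, weekly_render_hours, hourly_rate, speedup):
    """Weeks until render time saved by a second GPU pays for the GPU.

    Assumes render time scales as 1/speedup and that every hour saved
    is worth hourly_rate. Both are simplifications.
    """
    hours_after = weekly_render_hours / speedup
    hours_saved = weekly_render_hours - hours_after
    return gpu_cost / (hours_saved * hourly_rate)


# Placeholder inputs, not real prices or measurements.
weeks = weeks_to_break_even(
    gpu_cost=5000,           # hypothetical cost of the second GPU
    weekly_render_hours=10,  # hypothetical time spent waiting on renders
    hourly_rate=100,         # hypothetical value of an hour of your time
    speedup=1.75,            # reported single-vs-dual GPU gain (not measured here)
)
print(f"Break-even after about {weeks:.1f} weeks")
```

The model ignores responsiveness gains while editing, which are harder to price, so treat the result as a conservative first estimate.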
In short, the gain in performance makes the workstation more responsive, makes you more productive, and makes rendering times drop. If you have tough projects, tight timelines, and use DaVinci Resolve, investing in dual GPUs for your workstation could be a very smart move.