How much GPU power do you need for your Adobe video applications? After looking at the Quadro K1200 and Quadro M4000, PW came up with an answer that surprised us.
We just had 2 new Skylake-based workstations in the office. True to form, they were both "entry-level" workstations, and these days, "entry-level" means "pretty darn powerful" with 3.6 GHz Skylake CPUs, a Quadro K1200 & M4000, up to 64 GB of RAM and up to 27 TB of storage.
As it turned out, the test systems gave PW the opportunity to test new Maxwell-architecture NVIDIA Quadro GPUs - the K1200 and the M4000 - side by side.
One set of tests we run to benchmark hardware performance is the rendering tests for Adobe Premiere Pro CC and Adobe After Effects CC. For this article, we ran the latest 2015 versions.
When PW tests a system, it is certainly to measure performance, but it is also to investigate questions about performance. In this case, we wanted to investigate how the Mercury Playback Engine performs under a range of conditions. We set up our test video file with 4 tests. All video segments are HD and all video segments use color correction effects. The 4 tests render 1, 2, 3, and 4 concurrent video streams, respectively.
Any number of scenarios justify testing multiple video streams: titling, overlaying effects, merging multiple cameras, and inserting secondary images are a few examples. That said, many Premiere Pro users probably use the first test case for most of their work: a single video stream at a time with color correction or a similar effect. At PW, we shoot video with 2 or 3 cameras and routinely have 3 or 4 overlapping video streams to render out, in addition to titles and overlaid extra video footage.
To make the measurements relevant to you, the results in the table have been normalized to show the number of seconds required to render one second of video.
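The normalization described above is a simple ratio. As a minimal sketch (the function name and the sample numbers are illustrative assumptions, not PW's measurements), the value is the wall-clock render time divided by the duration of the rendered clip, where the duration follows from the frame count and the 25 fps frame rate of the test footage:

```python
def seconds_per_video_second(render_seconds: float, frame_count: int, fps: float = 25.0) -> float:
    """Return the render time needed per second of video (lower is better)."""
    if frame_count <= 0 or fps <= 0:
        raise ValueError("frame_count and fps must be positive")
    clip_seconds = frame_count / fps  # duration of the rendered clip
    return render_seconds / clip_seconds


# Hypothetical example: a 1500-frame clip (60 s at 25 fps) that renders in 30 s.
cost = seconds_per_video_second(30.0, 1500)
print(cost)        # 0.5
print(cost < 1.0)  # True: the render is faster than realtime
```

A value below 1.0 means the render finishes faster than the clip would play back; a value above 1.0 means it is slower than realtime.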
Several results stand out.
- The GPU-accelerated rendering is excellent when using one or two video streams. It renders faster than realtime, and it performs at twice the speed of the Mercury Playback Engine using software (CPU) rendering only.
- The MPE performance drops dramatically when a third video stream is rendered simultaneously.
- The CPU rendering is faster than the GPU rendering in the MPE when rendering 3 or 4 simultaneous video streams.
- The GPU-accelerated tests ran at the same speed on the Quadro K1200 and M4000.
|Premiere Pro CC 2015|
|Board/Test||1 video stream||2 video streams||3 video streams||4 video streams|
* Rendering time (sec) / sec of video (25 fps) HD video
These are interesting results. First, they suggest that the Quadro K1200 is the right GPU for your Adobe video-editing workstation. PW would have expected the much more capable Quadro M4000 to outpace the more modest Quadro K1200. The conclusion appears to be that the Mercury Playback Engine itself is the performance bottleneck. Whether the cause is video management, storage, or something else is not clear.
For the most common video-editing scenarios, a Quadro K1200 is the perfect graphics card for Adobe Premiere Pro
What is clear is this. The performance delta between 1 video stream and 2 concurrent video streams when using CPU rendering is what PW would expect: around twice the time for about twice the rendering work. It is impressive that, in the same circumstances, GPU rendering imposes no more than a 28% penalty. This is, frankly, a surprise. But it means that for the first and second test cases, which apply to most editing situations, the Quadro K1200 is an excellent upgrade for your system.
The extreme drop in performance between 2 and 3 concurrent video streams is inexplicable. Going from 2 HD video streams to 3 HD video streams increases the data by 50%, yet the rendering times for CPU rendering jumped by more than 200%, and the rendering times for GPU rendering increased by a remarkable 800% to 900%.
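To put these percentages on a common footing, the sketch below converts each quoted increase into a time multiplier and divides it by the growth in work. The helper names are illustrative assumptions, and the inputs are only the figures quoted in the text (a 28% penalty, more than 200%, and 800% to 900%), not the raw measurements.

```python
def time_multiplier(percent_increase: float) -> float:
    """Convert a percentage increase in render time into a multiplier."""
    return 1.0 + percent_increase / 100.0


def cost_per_unit_work(percent_increase: float, streams_before: int, streams_after: int) -> float:
    """Growth in render time relative to growth in work (1.0 means linear scaling)."""
    work_ratio = streams_after / streams_before
    return time_multiplier(percent_increase) / work_ratio


print(cost_per_unit_work(28, 1, 2))   # about 0.64: GPU, 1 -> 2 streams
print(cost_per_unit_work(200, 2, 3))  # 2.0: CPU, 2 -> 3 streams (lower bound)
print(cost_per_unit_work(800, 2, 3))  # 6.0: GPU, 2 -> 3 streams
print(cost_per_unit_work(900, 2, 3))  # about 6.67: GPU, 2 -> 3 streams
```

A result of 1.0 would mean render time grew in step with the number of streams. Values below 1.0 are better than linear, and values well above 1.0 show how far the third stream falls from linear scaling.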
PW cannot explain this performance degradation, but we can give you an idea of what it looks like. While testing, we monitored GPU and CPU utilization. The image to the right shows the CPU utilization (from left to right) for 2 video streams with GPU, 2 video streams with CPU, 4 video streams with GPU, and 4 video streams with CPU.
In the first column, the CPU is almost, but not fully, utilized. The CPU usage follows a pattern of small peaks. The same pattern is mirrored on the GPU, where utilization swings from 0% to 30%. The second column is the CPU renderer: each thread in each core goes to 100% and stays there. Compare this to the first column, where the addition of the GPU renders the same video 2.5 times faster. The third column shows the CPU utilization while rendering 4 video streams concurrently using the GPU renderer, and the fourth column shows the same video sequence using the CPU renderer. In the third column, both the CPU and the GPU had lower utilization, and this rendering combination was the slowest. The fourth column, rendering 4 video streams with the CPU renderer, still produced exceptionally slow results; unlike the second column, the CPU is not being fully utilized.
Unlike Premiere Pro CC, Adobe After Effects CC doesn't really benefit from GPU computing
PW also benchmarked Adobe After Effects. With two new NVIDIA Quadro GPUs, we planned to measure the After Effects 3D ray-tracing renderer. This feature is apparently considered obsolete and is not supported by Adobe. New GPUs like the Quadro M4000 are not recognized, and the ray-tracing renderer returns an error.
Render times in After Effects are notoriously inconsistent. Due to this, we ran our PW logo rendering test 4 times and provide an average result. As with the Premier Pro CC testing, the normalized results represent the number of seconds required to render one second of video.
The rendering time for a second of After Effects video can be more than 100 times longer than the rendering time for Premiere Pro, and the GPU basically doesn't help at all for After Effects. An After Effects workstation needs extreme computing power. In fact, a good After Effects CC workstation needs high-performance subsystems like memory and storage, too. One hint of this can be seen in the CPU utilization levels, which rise and fall in peaks that can easily range from 50% to 100% utilization. This indicates that the processor is waiting on the rest of the system a fair amount of the time.
|After Effects CC 2015 Render Test||Results in seconds: number of seconds to render one second of video|
|Test 1||Test 2||Test 3||Test 4||AVERAGE|
|Intel Xeon E3-1275 v5 @ 3.6 GHz||113.33||100.00||123.00||123.33||114.92|
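The AVERAGE column can be reproduced from the four run times in the table. This is a small sketch using Python's `decimal` module so that the half-up rounding to two places is exact. The variable names are illustrative, and the spread calculation is an added illustration of the run-to-run inconsistency, not a figure PW reported.

```python
from decimal import Decimal, ROUND_HALF_UP

# The four After Effects run times from the table (seconds per second of video).
runs = [Decimal("113.33"), Decimal("100.00"), Decimal("123.00"), Decimal("123.33")]

mean = sum(runs) / len(runs)
rounded = mean.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
spread = max(runs) - min(runs)  # gap between the slowest and fastest run

print(rounded)  # 114.92
print(spread)   # 23.33
```

The gap between the fastest and slowest run is roughly a fifth of the average, which is why a single After Effects measurement would not be a reliable benchmark.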
Peaks and valleys in CPU usage indicate possible bottlenecks in other parts of the workstation
The PW Perspective
Our original expectation was that the Quadro M4000 would provide the highest performance. If After Effects CC had still supported 3D ray-tracing and if the Mercury Playback Engine could have used the GPU more efficiently, this might have been true.
On the other hand, the Quadro K1200 not only offers excellent Premiere Pro CC performance, it also supports 4K resolutions and can even drive four 4K displays. And while it is not important to the results, all our testing was done on a 4K monitor (for more on 4K displays, see ThinkVision Pro2840M: professional 4K display). With these displays at affordable prices, a video editing workstation should have a minimum of two 4K displays.
This makes the NVIDIA Quadro K1200 the first choice for an Adobe video-editing workstation.
NVIDIA's Quadro K1200 is based on the Maxwell GPU architecture. It is a half-height graphics board which supports 4 mini-DisplayPort outputs and can drive four 4K-monitors.
The NVIDIA Quadro M4000 has 8 GB of GDDR5 memory and fast 3D performance. Unfortunately, it doesn't add any more performance for Premier Pro than the Quadro K1200.