Wowza Community

Nvidia drivers - NVENC performance

Hi,

We are trying to add transcoding capacity to our servers by adding a second Nvidia Quadro K4000 card and I have some questions.

– Physical Hardware –

OS: CentOS 6.6

MOBO: Supermicro X9DRE-TF+

Processor: Dual Xeon E5-2620 2.10GHz

Cores/Threads: 12/24

Memory: 32GB RAM

Java: Java 8 (64-bit)

GPU/Acceleration: NVIDIA Quadro K4000 (x2)

#1: Are certain Nvidia drivers known to have reduced NVENC encoding performance with Wowza Streaming Engine?

When trying different Nvidia drivers, there seems to be a wide range of NVENC encoding performance as reported by the ‘nvidia-smi’ utility.

The older 340.46 driver consistently gives better performance (with WMS 4.1.1 and WMS 4.3.0).

At first, I thought the ‘nvidia-smi’ utility was misreporting the encoder load with newer drivers, but then I noticed entries in the Wowza access log pertaining to skipped frames when the encoder load hit 99% (the utility does not report loads > 99%).
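For anyone wanting to reproduce these readings, here is a minimal sketch of how the encoder load could be sampled from a script rather than read by eye. It assumes the text report from `nvidia-smi -q -d UTILIZATION` contains one `Encoder : N %` line per GPU; the exact layout may vary between driver versions. The sample report and the helper names (`parse_encoder_loads`, `read_encoder_loads`, `is_saturated`) are made up for illustration.

```python
import re
import subprocess
from typing import List

# nvidia-smi was never seen reporting more than 99%, so treat 99 as "pegged".
SATURATED_PERCENT = 99

# Assumed line shape: "        Encoder                     : 37 %"
ENCODER_LINE = re.compile(r"^\s*Encoder\s*:\s*(\d+)\s*%", re.MULTILINE)


def parse_encoder_loads(report: str) -> List[int]:
    """Return one encoder utilization percentage per GPU found in the report."""
    return [int(value) for value in ENCODER_LINE.findall(report)]


def read_encoder_loads() -> List[int]:
    """Run nvidia-smi and parse its utilization report (needs the NVIDIA driver)."""
    result = subprocess.run(
        ["nvidia-smi", "-q", "-d", "UTILIZATION"],
        capture_output=True, text=True, check=True,
    )
    return parse_encoder_loads(result.stdout)


def is_saturated(load_percent: int) -> bool:
    """True when the reading is at the ceiling the utility can report."""
    return load_percent >= SATURATED_PERCENT


# Illustrative report only; not captured from a real machine.
SAMPLE_REPORT = """
GPU 0000:03:00.0
    Utilization
        Gpu                         : 12 %
        Memory                      : 5 %
        Encoder                     : 37 %
        Decoder                     : 0 %
GPU 0000:04:00.0
    Utilization
        Gpu                         : 30 %
        Memory                      : 9 %
        Encoder                     : 99 %
        Decoder                     : 0 %
"""

if __name__ == "__main__":
    for index, load in enumerate(parse_encoder_loads(SAMPLE_REPORT)):
        state = "saturated" if is_saturated(load) else "ok"
        print(f"GPU {index}: encoder {load}% ({state})")
```

Keeping the parser separate from the `subprocess` call means it can be checked against saved reports from each driver version without a GPU present.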


DRIVER TEST RESULTS


All tests were run with a single 7 Mbps source stream and a transcoder template producing six renditions at 500, 1000, 1500, 2000, 2500, and 3500 kbps.

Wowza Streaming Engine 4.1.1

Nvidia driver    GPU ‘Encoder’ load
340.46           35-40%
346.35           80-90%
346.59           50-60%
352.63           INCOMPATIBLE?

Wowza Streaming Engine 4.3.0

Nvidia driver    GPU ‘Encoder’ load
340.46           35-40%
352.63           50-60%

#2: Is there some added benefit to using the latest driver even though encoding performance is reduced?

#3: Could this be an issue only with certain cards?

Carl

Hello,

We recommend using the latest NVIDIA drivers. In our testing with 4.3.0.02, we have found that the latest drivers (352.63) allow higher GPU utilization percentages than older drivers and provide a substantial performance improvement.

Best regards,

Andrew

Hello,

In some instances, we have seen GPU utilization limited when Wowza Streaming Engine is used with the older NVIDIA 340 drivers. With some version combinations, GPU utilization could not reach 100% capacity, or anywhere near it, and was capped at a much lower percentage. This appears to be resolved by using the latest NVIDIA driver (352.63) with Wowza Streaming Engine 4.3.0.02.

Best regards,

Andrew

Hi Andrew,

Thanks for your response.

However, I’m unclear on what you mean by ‘allow higher GPU utilization percentages’.

Does that mean to say the GPU Encoder load can exceed 99% without performance issues, e.g. skipped frames?

That is not what I’m seeing.

I’ve repeated the above tests several more times with WMS 4.3.0 and the following Nvidia drivers: 340.46, 352.63, and 352.79.

This time I ran the tests using a Quadro K4200.

The newer drivers consistently show higher GPU Encoder load than 340.46.

In all cases, when the GPU Encoder load is consistently pegged at 99%, frame skip entries start to appear in the access log.

Or does ‘allow higher GPU utilization percentages…and a substantial performance improvement’ mean Wowza is aware the newer drivers do, in fact, cause increased GPU Encoder load, but there is a substantial qualitative improvement in the output video?

If so, can you specify what the performance improvements are?

I have not noticed any substantial qualitative improvements. The one difference I did see is that the 352.63 driver produced a subtle improvement in color rendering in areas of macroblocking when testing high-motion content, and that was only noticeable when the video was paused.

Quantitatively, using the newest drivers, I’m only seeing decreased performance (most importantly with respect to the number of simultaneous transcoding sessions that a single K4000 or K4200 can handle).
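To put rough numbers on that, here is a back-of-the-envelope sketch using the per-session encoder loads from my driver tests. It assumes encoder load scales linearly with the number of identical sessions and that 99% is the usable ceiling; neither assumption has been verified, and `max_sessions` is just an illustrative helper name.

```python
def max_sessions(per_session_load_percent: float, ceiling_percent: float = 99.0) -> int:
    """Estimate how many identical sessions fit under the encoder load ceiling.

    Assumes load adds up linearly per session, which is an unverified simplification.
    """
    if per_session_load_percent <= 0:
        raise ValueError("per-session load must be positive")
    return int(ceiling_percent // per_session_load_percent)


if __name__ == "__main__":
    # Upper end of each measured range, i.e. the pessimistic case.
    print("340.46:", max_sessions(40))  # measured range was 35-40% per session
    print("352.63:", max_sessions(60))  # measured range was 50-60% per session
```

Under those assumptions, the older driver leaves room for two sessions of this template per card and the newer one for only one, which is the capacity difference I am concerned about.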

Has Wowza recently tested with 340.46 to do a performance comparison with the newer Nvidia drivers? With a K4000 or K4200?

Is it possible the newer drivers provide performance improvements only for newer generation Nvidia cards?

Sorry to press for more details, but something doesn’t add up and this information is important because we are using GPU Encoder load as an indicator for load balancing.
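For context, the load-balancing decision looks roughly like the sketch below. It is a simplified illustration, not our production code: the host names, the `pick_transcoder` function, and the 85% headroom limit are all hypothetical. The idea is to stop assigning work before a card reaches the 99% reading where frame skips appear.

```python
from typing import Dict, Optional


def pick_transcoder(encoder_loads: Dict[str, int], headroom_limit: int = 85) -> Optional[str]:
    """Return the host with the lowest encoder load below the limit, or None.

    encoder_loads maps a host name to its latest GPU Encoder load in percent.
    headroom_limit is an assumed cutoff chosen to stay clear of the 99% ceiling.
    """
    candidates = {host: load for host, load in encoder_loads.items() if load < headroom_limit}
    if not candidates:
        return None  # every card is too busy; the caller must queue or reject
    return min(candidates, key=candidates.get)


if __name__ == "__main__":
    loads = {"transcoder-a": 40, "transcoder-b": 99, "transcoder-c": 60}
    print(pick_transcoder(loads))
```

Because the decision depends entirely on the reported encoder load, a driver that reports a higher load for the same work directly reduces how many streams each server is given.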