Hardware vs. Software Live Streaming Encoders
August 9, 2022
If you’re producing a live stream using a program like OBS, vMix, or Wirecast, the x264 codec is almost certainly an output option. However, many computers also offer other options like NVIDIA or Intel hardware-based H.264 encoding. According to our tests, using these hardware-based codecs not only can free up CPU cycles for your video mixer but also can improve the quality of the video delivered to Wowza Video, Wowza Streaming Engine, or other live streaming or transcoding service. If you’ve never experimented with these options, now is definitely the time.
Overview – Software vs. Hardware
Encoding a live stream into H.264 format is a challenging operation, one originally performed in dedicated hardware. As computers became more powerful, they became more capable of live encoding. Still, developers often had to limit the output quality from the codec to enable real-time operation.
Let’s take a step back. H.264 is a video standard, and x264 is an implementation of that standard available in FFmpeg, OBS, Wirecast, vMix, and most other live switchers. The developers of x264 use a system of presets to allow users to balance CPU usage and quality. These presets are shown in Figure 1, which is a screenshot from OBS.
As you can probably guess, presets like ultrafast and superfast use very little CPU, which enables real-time encoding, but deliver much lower quality than presets like medium (x264's own default) and veryslow, which most on-demand producers use to encode for distribution. In OBS, the default preset for the x264 codec is veryfast, which allows real-time encoding with some sacrifice in quality.
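The same preset trade-off applies outside OBS; FFmpeg, for instance, exposes x264 as libx264 and selects the preset with the -preset flag. Here's a minimal sketch of assembling such a command (the file names and bitrate are hypothetical placeholders, not values from the tests in this article):

```python
# Sketch: build an FFmpeg command that encodes with libx264 (x264)
# using a chosen preset. File names and bitrate are hypothetical.
def x264_command(source, output, preset="veryfast", bitrate="4M"):
    return [
        "ffmpeg",
        "-i", source,        # input file (or capture device)
        "-c:v", "libx264",   # software H.264 encoder (x264)
        "-preset", preset,   # speed/quality trade-off
        "-b:v", bitrate,     # target video bitrate
        "-c:a", "aac",       # audio codec
        output,
    ]

cmd = x264_command("input.mp4", "output.mp4", preset="veryfast")
```

Swapping "veryfast" for "medium" or "veryslow" in the call above is all it takes to trade encoding speed for quality.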
Figure 2 shows how x264 presets vary in terms of encoding time, overall quality, and low-frame quality, the last being simply the lowest VMAF score for any frame in the video, a predictor of transient quality. The numbers shown are percentages relative to the placebo preset; so the veryfast preset delivers 95.6% of overall quality and 81.4% of low-frame quality in 6.86% of the encoding time of the placebo preset. The key takeaway is that there are meaningful quality differences associated with using lower-quality presets, but also significant encoding-time savings.
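To make the arithmetic behind those percentages concrete, here's a quick sketch of expressing one preset's results relative to the placebo preset. The scores and times below are hypothetical stand-ins, not the actual test data:

```python
# Sketch: express a preset's quality and encoding time as a percentage
# of the placebo preset's results, as in Figure 2. All values below
# are hypothetical placeholders.
def percent_of(value, baseline):
    return round(100 * value / baseline, 2)

placebo_vmaf, veryfast_vmaf = 95.0, 90.82   # hypothetical VMAF scores
placebo_secs, veryfast_secs = 1000, 68.6    # hypothetical encode times

quality_pct = percent_of(veryfast_vmaf, placebo_vmaf)  # quality vs. placebo
time_pct = percent_of(veryfast_secs, placebo_secs)     # time vs. placebo
```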
Hardware codecs like those created by Intel and NVIDIA are also codec implementations of the H.264 standard, but they use dedicated hardware in the CPU (Intel) or GPU (NVIDIA) to drive encoding. This dedicated hardware is sufficiently powerful to encode in real-time without the quality trade-offs made by x264.
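In FFmpeg terms, switching to one of these hardware codecs is simply a matter of naming a different encoder: h264_nvenc for NVIDIA and h264_qsv for Intel QuickSync are FFmpeg's standard encoder names, though availability depends on your hardware and how FFmpeg was built. A hedged sketch, with hypothetical file names:

```python
# Sketch: select a hardware (or software) H.264 encoder for FFmpeg.
# Encoder names are FFmpeg's standard ones; whether they work depends
# on your GPU/CPU and FFmpeg build. File names are hypothetical.
ENCODERS = {
    "software": "libx264",     # CPU-based x264
    "nvidia":   "h264_nvenc",  # NVIDIA GPU (NVENC)
    "intel":    "h264_qsv",    # Intel Quick Sync Video
}

def encode_command(source, output, vendor="nvidia", bitrate="4M"):
    return [
        "ffmpeg", "-i", source,
        "-c:v", ENCODERS[vendor],  # chosen encoder
        "-b:v", bitrate,
        output,
    ]

cmd = encode_command("input.mp4", "out.mp4", vendor="intel")
```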
When these hardware codecs were originally launched, their quality trailed x264 considerably, even when using x264 presets like veryfast. In recent years, however, the quality of these hardware implementations has more than caught up. Now they exceed that available with x264/veryfast, and significantly reduce encoding-related CPU utilization, which makes for a more stable and robust live event.
To show this, I tested OBS on two computers, one comparing x264 and the NVIDIA codec, the other comparing x264 and Intel's QuickSync H.264 codec. In both cases, I measured CPU utilization during encoding and the quality of the recorded video.
Test Bed 1: HP ZBook Studio G3
The first computer is an HP ZBook Studio running an Intel Xeon CPU E3-1505M v5 at 2.80GHz with 32 GB of RAM on Windows 10 Pro, with an embedded NVIDIA Quadro M1000M GPU. The basic test was the same for both computers. I loaded a test video file into OBS and recorded 90 seconds of the video to disk using x264/veryfast, and then 90 seconds running the hardware H.264 codec, in this case NVIDIA.
I recorded CPU utilization in Windows Performance Monitor, and then compared the quality of the two files in the Moscow State University (MSU) Video Quality Measurement Tool (VQMT). You see the CPU comparison for the ZBook in Figure 3, right around 50% for x264, dropping to around 30%, and then 20% for NVIDIA.
The 50% utilization number for x264 is significant on this older notebook. Had it been in the 10-15% range, it would have made sense to try a higher-quality preset like fast or medium, seeking higher quality with x264. However, my personal comfort zone for CPU usage during a live event is 50-60%; any higher and I get concerned about dropping frames or other failures. On this computer, at least from my perspective, veryfast is the optimal preset choice.
After checking that the bandwidth of the two captured files was similar (OBS does a great job making this so), I compared quality. Figure 4 shows the Results Plot from the MSU VQMT tool, which is currently displaying the VMAF score for each of the 800 frames measured. The NVIDIA file is in red, and x264 in green, and you see right away that the red is higher than green, which means the NVIDIA file has an overall higher VMAF score.
You also see several green downward spikes, indicating regions where x264's quality dropped significantly during the recording. Given these limited tests, it was impossible to tell whether this was incidental to the first few seconds of the recording or endemic.
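Those spikes are exactly what the low-frame quality measure introduced earlier is designed to catch. As a minimal sketch (with hypothetical per-frame scores, not the actual test data), here's how overall and low-frame quality summarize a set of per-frame VMAF scores like those plotted by VQMT:

```python
# Sketch: summarize per-frame VMAF scores the way the article does --
# overall quality (mean score) and low-frame quality (lowest score of
# any frame, a predictor of transient quality). The per-frame scores
# below are hypothetical; in practice they'd come from a tool like
# VQMT or FFmpeg's libvmaf filter.
def summarize(vmaf_scores):
    overall = round(sum(vmaf_scores) / len(vmaf_scores), 2)
    low_frame = min(vmaf_scores)
    return overall, low_frame

nvidia_scores = [94.0, 95.0, 94.0, 95.0]  # hypothetical, steady quality
x264_scores   = [92.0, 94.0, 72.0, 94.0]  # hypothetical transient dip

nvidia_summary = summarize(nvidia_scores)
x264_summary = summarize(x264_scores)
```

Note how a single dipped frame barely moves the mean but dominates the low-frame score, which is why both measures are worth tracking.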
Overall, as you can see in Table 1, the NVIDIA codec bested x264 in VMAF by 2.97% with a 2.58% advantage in PSNR.
Test Bed 2: HP Mini-Workstation
The second test computer was an HP mini-workstation driven by an Intel Core i7-3770 CPU running at 3.40GHz with 16 GB of RAM on Windows 10 Pro. Here, I compared the Intel QuickSync codec in OBS against x264/veryfast.
Figure 5 shows CPU utilization, again with software-encoding first, then QuickSync. Here, the difference was even more striking, with x264 requiring about 40% CPU utilization as compared to just over 10% for QuickSync. Clearly, using QuickSync will free up significant CPU resources for other production elements.
Figure 6 shows the same graph from the Moscow State tool, with Intel QuickSync in red (the higher-quality file) and x264 in green. Here we see one significant downward spike for Intel, though again it's impossible to tell whether this is a startup issue or a periodic problem.
Overall, however, as shown in Table 2, Intel outscored x264 by 1.85% in VMAF and by 2.35% in PSNR.
For perspective, both test computers are relatively old and slow. If I were working on a newer, zippier computer, I almost certainly could have deployed a higher-quality x264 preset, which might have leveled the playing field to some degree. On the other hand, the older hardware also means that more recent Intel and NVIDIA codec implementations are likely both more efficient and higher quality than those tested here.
There are several key takeaways from this article.
1. If you’re encoding on a new/fast computer, and using x264, you should experiment with higher quality presets until CPU utilization gets to a comfortable level, like 50% but not beyond. This will optimize the quality of your outgoing streams.
2. If you’ve avoided hardware encoding because of outdated concerns about comparative quality, it’s time to reconsider. Almost irrespective of the newness or speed of the live streaming platform, CPU utilization is a concern, and to the extent that you can free up resources by encoding in hardware, it’s almost certainly a good thing.
3. All that said, if your live streaming system is working perfectly, you may want to put off any experimentation like that described above. If it ain’t broke, don’t break it.
4. If you do decide to make a change, be sure to test it over a production-length test cycle before you actually take it live.
5. Many producers who use software programs like Wirecast and vMix (and TriCaster, for that matter) use an external hardware-based encoder to produce their live output streams, which removes the encoding load from the mixing station entirely. For very high-profile engagements, you should always consider this option as well.