H.264 Codec: Advanced Video Coding (AVC) Explained

June 14, 2021 by Jan Ozer

 

This article provides a quick overview of the H.264 codec: what it is, how it performs, what it costs, and what it’s good for. We’ll conclude with a section on what you need to know to effectively deploy the H.264 codec, particularly within the Wowza Streaming Engine software and the Wowza Streaming Cloud service.

 

What Is the H.264 Codec?

H.264 is a video coding standard or codec. It was created by two standards bodies, the ITU-T Video Coding Experts Group, which called the standard H.264, and the ISO/IEC JTC1 Moving Picture Experts Group (MPEG), which called the standard Advanced Video Coding (AVC). H.264 and AVC refer to the same technology and are used interchangeably. Technically, AVC is defined in Part 10 of the MPEG-4 standard, and is different from the MPEG-4 codec, which was defined in Part 2 and never really got traction in the marketplace. H.264 first launched in May 2003 as the successor to MPEG-2, primarily for use in broadcast applications as the internet was nascent.

Because H.264 is a standard rather than a single product, there are multiple implementations of the codec, including the x264 codec available in FFmpeg and the Beamr codec used in the Wowza Streaming Engine. Intel and NVIDIA both offer hardware-accelerated H.264 codecs that can be accessed from the Wowza Streaming Engine. Video encoded by any H.264 codec should play on any H.264 decoder or player, whether in a browser, on a smartphone, or on a smart TV.

 

Compatibility: H.264 vs. VP9, HEVC, and AV1

 
Table 1. Codec compatibility.
Codec | Browser | Mobile | Smart TV/OTT
H.264 | Virtually all | All | All
VP9 | Virtually all | Android, iOS | Most
HEVC | Very little | Android, iOS | All
AV1 | Edge, Firefox, Chrome, Opera | Android | Nascent
 

The streaming market was much simpler when H.264 first launched, with computers the dominant platform for streaming video and one plug-in, Adobe Flash, the dominant video player for most websites. Once Adobe added H.264 support to Flash, the computer browser market shifted almost immediately to H.264. Today, the website Can I Use reports that H.264 playback is available on 98.23% of global browsers.

Apple started supporting H.264 with the first iPod touch device and all subsequent iPod touch models, iPhones, and iPads. The Android platform added H.264 support with version 3, which shipped in 2011, wrapping up the two dominant mobile players. As a broadcast standard, H.264 was supported by many television sets and/or set-top boxes long before the video was delivered via IP.

Simply stated, the primary reason that H.264 continues to dominate the worldwide streaming markets is that it plays everywhere. That said, there are several submarkets where other codecs are making serious inroads.

 

Performance: H.264 vs. VP9, HEVC, and AV1

When gauging codec performance, you consider two aspects: (1) encoding complexity or speed, and (2) encoding quality. With most codecs, there’s an inverse relationship between the two. That is, the better the encoding quality, the longer it takes to encode. This translates to higher encoding and transcoding costs. Since no codec can completely replace H.264 at this point, these encoding or transcoding costs are additive to existing H.264 encoding-related costs.

 
Table 2. Codec performance.
Performance Compared to H.264 | Encoding Complexity (encoding time) | Encoding Quality (bitrate savings)
H.264 | Baseline | Baseline
VP9 | 2–15x | ~35%
HEVC | 2–15x | ~35%
AV1 | 15–30x | ~50%
 

In table 2, encoding complexity is the encoding time compared to H.264. So, for VP9, encoding should take between 2 and 15 times longer than H.264. Encoding quality describes the bitrate reduction the codec can achieve while delivering the same quality as H.264. So, streams encoded with VP9 or HEVC at 65% of the data rate of a stream encoded with H.264 should have about the same quality.
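The bitrate arithmetic behind table 2 can be sketched in a few lines of Python. The savings percentages come from the table; the 6,000 kbps H.264 starting point and the function name are illustrative assumptions, not measurements.

```python
# Approximate bitrate savings versus H.264, taken from table 2.
BITRATE_SAVINGS = {"H.264": 0.0, "VP9": 0.35, "HEVC": 0.35, "AV1": 0.50}


def equivalent_bitrate_kbps(h264_kbps: float, codec: str) -> float:
    """Estimate the bitrate at which `codec` roughly matches H.264 quality."""
    return h264_kbps * (1.0 - BITRATE_SAVINGS[codec])


if __name__ == "__main__":
    h264_kbps = 6000  # hypothetical 1080p H.264 stream, for illustration only
    for codec in BITRATE_SAVINGS:
        estimate = equivalent_bitrate_kbps(h264_kbps, codec)
        print(f"{codec}: ~{estimate:.0f} kbps")
```

With these numbers, the VP9 and HEVC streams land at about 65% of the H.264 data rate and the AV1 stream at about half. Real savings vary with content and encoder settings.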

Of course, better encoding quality offered by VP9, HEVC, and AV1 decreases bandwidth costs for delivery to compatible players, which can in turn offset the additional encoding and storage costs. Still, most companies (excluding the top tier of streaming broadcasters) tend to adopt new codecs only when they enable entry into new markets, not as a cost-saving measure.
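One rough way to reason about that trade-off is a break-even calculation. The sketch below is illustrative only: the function name is hypothetical, and every price and volume you feed it is your own assumption, not a figure from this article.

```python
def monthly_net_savings(
    delivered_gb: float,
    cdn_price_per_gb: float,
    compatible_share: float,
    bitrate_savings: float,
    extra_encoding_and_storage_cost: float,
) -> float:
    """Bandwidth cost saved on compatible players minus the added monthly cost.

    compatible_share: fraction of delivery going to players that support
        the newer codec (0.0 to 1.0).
    bitrate_savings: fraction of bitrate saved versus H.264 (e.g. 0.35).
    """
    saved_gb = delivered_gb * compatible_share * bitrate_savings
    return saved_gb * cdn_price_per_gb - extra_encoding_and_storage_cost


if __name__ == "__main__":
    # All inputs are made-up placeholders.
    for delivered_gb in (100_000, 1_000_000):
        net = monthly_net_savings(delivered_gb, 0.02, 0.5, 0.35, 500.0)
        print(f"{delivered_gb:>9} GB delivered: net {net:+.2f} per month")
```

With these placeholder inputs, the smaller publisher loses money on the second codec while the larger one comes out ahead, which is consistent with the observation that mostly top-tier broadcasters adopt new codecs for cost savings.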

 

Suitability: H.264 vs. VP9, HEVC, and AV1

We see in table 3 that codecs also vary in suitability for different functions. H.264 excels in live origination and transcode because of encoding speed; all other codecs need hardware support for both functions. Understandably, H.264 has received the lion’s share of attention in low-latency applications, whether HTTP-based or WebRTC.

 
Table 3. Suitability for different functions.
Codec | Live Origination | Live Transcode | Low Latency | 4K | HDR
H.264 | Excellent | Excellent | Excellent | Poor | Poor
VP9 | Poor | Poor | WebRTC | Excellent | Poor
HEVC | Good | Good | Nascent | Excellent | Excellent
AV1 | Nascent | Nascent | WebRTC | Excellent | Nascent
 

On the other hand, if you compare H.264 vs. HEVC (H.265) quality, you can see why H.264 is a poor choice for 4K video: the ~35% bandwidth savings delivered by HEVC is significant and can make or break delivery to some homes. For this reason, though H.264 is technically capable of HDR support and is included in one Dolby Vision bitstream profile, the vast majority of Dolby Vision and other HDR streams are encoded with HEVC.

 

Royalty Status: H.264 vs. VP9, HEVC, and AV1

MPEG LA runs the H.264 patent pool, which charges royalties on encoders and decoders in excess of 100,000 units per year. There are also royalties on subscription services and pay-per-view; see page 11 of the AVC Patent Portfolio License Briefing.

 
Table 4. Royalty status.
Codec | Encoder | Decoder | Paid Content | Free Internet Content
H.264 | Yes | Yes | Yes | No
VP9 | No | Consumer device | No | No
HEVC | Yes | Yes | Some | Unclear
AV1 | No | Consumer device | No | No
 

Though there have been several H.264 patent infringement suits by parties not included in the MPEG LA patent pool, these involved large hardware platform companies like Apple and Microsoft, not content publishers. At this point, other than the royalties charged by MPEG LA, there don’t appear to be any H.264 IP owners claiming royalties on content.

 

What You Need to Know to Produce H.264

Like all codecs, H.264 has dozens of configuration options that very few producers should ever touch. There are, however, three configuration options that all producers should understand.

 

Profiles

First are profiles, which are sets of encoding tools or algorithms that can be used to encode a file. You see this in table 5 from Wikipedia, which shows the tools and algorithms on the left and the profiles in the columns. The baseline profile (BP) uses relatively few encoding tools, making the bitstream lower in quality than the high profile (HiP) but easier to decode.

Why create profiles? To allow device manufacturers to deploy H.264 and still meet product cost and performance requirements. For example, in the first-generation iPod touch devices, Apple integrated silicon capable of playing the baseline profile to meet its cost, size, and power consumption targets. Publishers producing video for that generation of devices encoded using the baseline profile; otherwise, the video wouldn’t play.

 
Table 5. The most commonly used profiles are baseline (BP), main (MP), and high (HiP).
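To make the idea of profiles as tool sets concrete, here is a minimal Python sketch. It models only three well-known tools (B-frames, CABAC entropy coding, and the 8x8 transform), a deliberately simplified subset of what table 5 lists, and the function name is hypothetical.

```python
from __future__ import annotations

# Simplified subset of the tools in table 5; the full table lists many more,
# and the real profiles are not strictly nested for every tool.
PROFILE_TOOLS: dict[str, set[str]] = {
    "baseline": set(),
    "main": {"B-frames", "CABAC"},
    "high": {"B-frames", "CABAC", "8x8 transform"},
}


def profiles_supporting(tools_used: set[str]) -> list[str]:
    """Return the profiles whose tool set covers every tool the stream uses."""
    return [
        name for name, tools in PROFILE_TOOLS.items() if tools_used <= tools
    ]


if __name__ == "__main__":
    print(profiles_supporting(set()))              # stream using none of the three tools
    print(profiles_supporting({"CABAC"}))          # stream using CABAC only
    print(profiles_supporting({"8x8 transform"}))  # stream using the 8x8 transform
```

A stream that uses a tool outside a device's supported profile is the case described above, where the video simply won't play.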
 

Today, virtually all newer smartphones, tablets, computers, OTT dongles, smart TVs, game platforms, and other video playback devices can play the high profile, which is the highest profile used for streaming. Still, if you deliver streams to viewers watching on older devices, you may want to ensure that you have one or two lower-quality streams in your encoding ladder in the baseline or main (MP) profile. Because the profile is critical for playback compatibility, all H.264 encoders allow you to specify the profile in their user interface or scripting language.
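As one illustration of setting the profile in a script, the sketch below builds FFmpeg command lines for a small ladder using the x264 encoder's -profile:v option. The ladder itself (resolutions, bitrates, and which rung gets which profile) is a hypothetical placeholder, not a recommendation, and the file names are assumptions.

```python
import shlex

# Hypothetical ladder: heights, bitrates, and profile choices are placeholders.
LADDER = [
    {"height": 1080, "kbps": 5000, "profile": "high"},
    {"height": 720, "kbps": 3000, "profile": "high"},
    {"height": 360, "kbps": 800, "profile": "main"},
    {"height": 240, "kbps": 400, "profile": "baseline"},
]


def ffmpeg_command(source: str, rung: dict) -> list[str]:
    """Build an FFmpeg argument list for one rung of the ladder."""
    output = f"out_{rung['height']}p_{rung['profile']}.mp4"
    return [
        "ffmpeg", "-i", source,
        "-c:v", "libx264",                    # x264 H.264 encoder
        "-profile:v", rung["profile"],        # baseline, main, or high
        "-b:v", f"{rung['kbps']}k",           # target video bitrate
        "-vf", f"scale=-2:{rung['height']}",  # keep aspect ratio, even width
        "-c:a", "aac",
        output,
    ]


if __name__ == "__main__":
    for rung in LADDER:
        print(shlex.join(ffmpeg_command("input.mp4", rung)))
```

The script only prints the commands; running them assumes an FFmpeg build that includes libx264.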

If you’re delivering live or video on demand (VOD) streams to a service like Wowza Streaming Engine or Wowza Streaming Cloud, or transcoding streams via that service, check the service’s recommendations for profiles. For example, here’s what Wowza recommends in Set up and run Transcoder in Wowza Streaming Engine:

 
Figure 1. Profile recommendations for the Wowza Streaming Engine.
 

Levels

Levels apply additional encoding limitations on the profiles, like those shown in table 6, again from Wikipedia. To comply with Level 4.2, for example, the video can’t exceed 2048×1080@60fps or 50,000 kbps. In the early days of delivering video to mobile devices, levels mattered. Today, though, most current devices can play any level that you can deliver to them.

Table 6. Level constraints from Wikipedia.
 

Levels remain important where service providers like Wowza set requirements for incoming streams. For example, figure 2 is from Encoding best practices for Wowza Streaming Cloud. Here you see that your incoming stream shouldn’t exceed level 4.1 for 1080p30 or 4.2 for 1080p60.

 
Figure 2. Level recommendations for Wowza Streaming Cloud.
 

Unlike profiles, which typically have a configuration option in the encoder, you comply with level settings by configuring your video within the constraints of that level. You see this in the encoding preset from Wirecast shown in figure 3. While there is a specific configuration option for profile (set to high), there is no option for level. Instead, you have to set the resolution, frame rate, and data rate parameters within the restrictions set by the level. Since the parameters are 1080p30 @ 10000, the preset conforms to level 4.1 and will work with Wowza Streaming Engine.

 
Figure 3.  An encoding preset from Wirecast.
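Checking a preset against a level can be sketched as a small calculation over resolution, frame rate, and data rate. The limits below are the commonly published frame-size, macroblock-rate, and bitrate ceilings for levels 4, 4.1, and 4.2 in the baseline and main profiles; treat them as assumptions to verify against table 6 or the H.264 specification. The function name is hypothetical, and other level constraints (such as decoded picture buffer size) are ignored.

```python
import math

# level: (max macroblocks per frame, max macroblocks per second, max kbps)
# Bitrate ceilings are for the baseline/main profiles; the high profile
# allows more, so this check is on the conservative side.
LEVEL_LIMITS = {
    "4": (8192, 245_760, 20_000),
    "4.1": (8192, 245_760, 50_000),
    "4.2": (8704, 522_240, 50_000),
}


def conforms_to_level(level: str, width: int, height: int,
                      fps: float, kbps: float) -> bool:
    """Check frame size, macroblock rate, and bitrate against one level."""
    max_frame_mbs, max_mbs_per_sec, max_kbps = LEVEL_LIMITS[level]
    # H.264 codes video in 16x16 macroblocks, so round each dimension up.
    frame_mbs = math.ceil(width / 16) * math.ceil(height / 16)
    return (
        frame_mbs <= max_frame_mbs
        and frame_mbs * fps <= max_mbs_per_sec
        and kbps <= max_kbps
    )


if __name__ == "__main__":
    print(conforms_to_level("4.1", 1920, 1080, 30, 10_000))  # figure 3 preset
    print(conforms_to_level("4.1", 1920, 1080, 60, 10_000))  # 1080p60 at level 4.1
    print(conforms_to_level("4.2", 1920, 1080, 60, 10_000))  # 1080p60 at level 4.2
```

Under these limits, the 1080p30 preset from figure 3 fits level 4.1, while a 1080p60 stream needs level 4.2, which matches the recommendations in figure 2.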
 

Presets/Quality

The final H.264 encoding parameter that you should know about is the encoding preset, which is selected via the quality setting in figure 3. Presets change the configuration of certain codec options to allow streaming producers to choose their desired tradeoff between encoding quality and complexity. For VOD video, for example, you might choose a high-quality preset that lengthens encoding time but optimizes quality. For live video, you may have to sacrifice quality to achieve real-time encoding, and choose a lower-quality preset.

While profiles and levels are the same for all H.264 encoders, presets vary by the specific codec. For example, figure 3 shows encoding with the x264 codec, which uses presets that range from Ultrafast to Placebo. Other H.264 codecs, like those from MainConcept or Beamr, may use different presets, or may not offer presets at all.
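For x264 specifically, the preset trade-off can be sketched as follows. The preset names are x264's own, ordered from fastest to slowest; the choice of "veryfast" for live and "slow" for VOD is only a hypothetical starting point, and the helper names are assumptions.

```python
# x264 preset names, ordered from fastest encode to slowest encode.
# Slower presets generally deliver better quality at a given bitrate.
X264_PRESETS = [
    "ultrafast", "superfast", "veryfast", "faster", "fast",
    "medium", "slow", "slower", "veryslow", "placebo",
]


def pick_preset(live: bool) -> str:
    """Pick an illustrative preset: speed for live, quality for VOD."""
    return "veryfast" if live else "slow"


def preset_args(live: bool) -> list[str]:
    """Return FFmpeg arguments that select the x264 encoder and a preset."""
    return ["-c:v", "libx264", "-preset", pick_preset(live)]


if __name__ == "__main__":
    print("live:", " ".join(preset_args(live=True)))
    print("vod: ", " ".join(preset_args(live=False)))
```

In practice, you would step the live preset faster or slower until the encoder keeps up in real time on your hardware.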

Now that you know the basics, you should be able to configure your streams for optimum quality and compatibility. As always, when working with a service provider like Wowza, check the service’s recommendations to ensure that you deliver streams that comply with its requirements.

In the next article in this series, we’ll discuss the VP9 codec.

 

About Jan Ozer

Jan Ozer is a leading expert on H.264, H.265, and VP9 encoding for live and on-demand production. In his consulting practice, Ozer helps streaming publishers produce highly optimized and deliverable streams and to choose encoders, transcoders, and workflows that optimize…