Low-Latency CMAF for Live Streaming at Scale
March 6, 2019
We recently published a blog post titled What Is CMAF, detailing the history and specifications of this format. Today, we’re taking a closer look at the part we’re most excited about: achieving low-latency streaming with chunked-encoded CMAF delivered via chunked transfer encoding.
Update July 5, 2019: Apple’s recent announcement about Low-Latency HLS has impacted low-latency CMAF. To learn more about this, please read our post, Apple Low-Latency HLS: What It Is and How It Relates to CMAF.
CMAF File Containers
A growing number of viewers abandon traditional satellite and cable services each year. But one feature has maintained the relevance of these broadcasts over streaming: the speedier delivery of live productions.
Today, the majority of streaming traffic takes the form of HTTP-based adaptive bitrate video. While this technology delivers a great user experience to large audiences, it lacks the real-time playback that viewers demand.
What’s more, competing protocols make the streaming landscape complex. The two most prolific HTTP-based packaging protocols are Apple HLS (HTTP Live Streaming) and MPEG-DASH (Dynamic Adaptive Streaming over HTTP). While HLS traditionally used the .ts format, DASH often took the form of .mp4 containers based on ISOBMFF.
Until recently, any content distributor wanting to reach users on both Apple and Microsoft devices had to encode and store the same data twice. This made reaching viewers across iPhones, smart TVs, Xboxes, and PCs both costly and inefficient.
Apple and Microsoft recognized this inefficiency. So, the two organizations went to the Moving Picture Experts Group (MPEG) with a proposal. By establishing a new standard called the Common Media Application Format (CMAF), they would work together to reduce complexity when delivering video online.
HTTP Live Streaming With Fragmented MP4 (fMP4)
Apple and Microsoft agreed to move forward with the fragmented .mp4 format as their common media format. This can now be deployed using either HLS or DASH, which helps cut costs and improve CDN efficiency. And while the specification on its own does little to reduce latency, chunked-encoded and chunked-transferred CMAF does.
CMAF in itself is a media format. But by incorporating it into a larger system aimed at reducing latency, leading organizations are moving the industry forward.
This requires two key behaviors:
- Chunked encoding
- Chunked transfer encoding
Used with CMAF, these technologies allow data to quickly move through each step of the live video delivery workflow. But this requires participation from vendors across the streaming ecosystem. Content delivery networks (CDNs), players, and encoders must be optimized accordingly.
From top CDNs like Akamai and Fastly to leading media players like JW Player and THEOPlayer, the list of vendors adding CMAF support grows each day. And here at Wowza, we’re ready to do the same.
Chunked-Encoded CMAF Files
Latency creeps in at every step of the streaming workflow. Censorship delays, player buffering logic, and encoding inefficiencies can all be to blame.
When paired with low-latency behavior across the ecosystem, chunked-encoded CMAF helps speed things up. This involves breaking the video into smaller chunks of a set duration, which can then be immediately published upon encoding. That way, near-real-time delivery takes place while later chunks are still processing.
Let’s clarify some of the terminology before diving any deeper.
- A chunk is the smallest unit.
- A fragment is made up of one or more chunks.
- A segment is made up of one or more fragments.
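To make the hierarchy concrete, here’s a minimal Python sketch of how the three units nest. The class names, the `build_segment` helper, and the frame counts are our own illustrative assumptions, not structures defined by the CMAF specification.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Chunk:
    """Smallest unit: a short run of encoded frames (count is illustrative)."""
    frames: int


@dataclass
class Fragment:
    """A fragment is made up of one or more chunks."""
    chunks: List[Chunk] = field(default_factory=list)

    @property
    def frames(self) -> int:
        return sum(chunk.frames for chunk in self.chunks)


@dataclass
class Segment:
    """A segment is made up of one or more fragments."""
    fragments: List[Fragment] = field(default_factory=list)

    @property
    def frames(self) -> int:
        return sum(fragment.frames for fragment in self.fragments)

    def duration_seconds(self, fps: float) -> float:
        return self.frames / fps


def build_segment(fragments: int, chunks_per_fragment: int,
                  frames_per_chunk: int) -> Segment:
    """Hypothetical helper that builds a uniformly sized segment."""
    return Segment([
        Fragment([Chunk(frames_per_chunk) for _ in range(chunks_per_fragment)])
        for _ in range(fragments)
    ])
```

For example, `build_segment(2, 4, 15)` describes a segment of two fragments, each holding four 15-frame chunks. At 30 fps, that works out to a four-second segment.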
Traditionally, the encoder would wait to create a full segment before sending it to the CDN. Once the CDN had the full segment, it would give it to the player. With chunked-encoded CMAF, encoded data is transferred down the distribution chain immediately, with chunks sent and received independent of one another.
Breaking each segment into these shorter pieces means that the encoder can output each chunk for delivery as soon as it’s encoded. This helps decouple latency from segment duration. In other words, the same latency can be obtained from a ten-second segment as from a one-second segment.
Will Law, chief architect for media engineering at Akamai, explains:
“There is no fixed rule for how many frames are included in each chunk. Current encoder practice ranges from 1–15 frames. Taking the example of a four second segment at 30 fps with 1 frame per chunk, the media content within the chunk is released 3.967 seconds earlier than if the encoder had waited to produce the last chunk of the segment before uploading the first. This early release leads to a direct reduction in overall latency by the same amount.”
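The arithmetic behind that figure is easy to check. The short Python sketch below assumes a constant frame rate and evenly sized chunks; the function name is ours, chosen for illustration.

```python
def early_release_seconds(segment_seconds: float, fps: float,
                          frames_per_chunk: int) -> float:
    """How much sooner the first chunk can leave the encoder compared with
    waiting for the whole segment (assumes constant fps, equal-size chunks)."""
    chunk_seconds = frames_per_chunk / fps
    return segment_seconds - chunk_seconds


# Four-second segment, 30 fps, one frame per chunk
print(round(early_release_seconds(4.0, 30.0, 1), 3))   # 3.967
# Same segment with 15 frames (half a second) per chunk
print(early_release_seconds(4.0, 30.0, 15))            # 3.5
```

Smaller chunks release media sooner, but the gain is bounded by the segment duration itself.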
While chunked transfer encoding has been around for some time, pairing it with chunked-encoded CMAF represents an industry-wide initiative to lower latency. Which brings us to the next step: ingest and distribution.
Chunked Transfer Encoding With Low-Latency CDNs
To quickly move these chunks through, the distribution channel must support chunked transfer encoding from end to end. The CDN’s ability to transfer data as soon as it arrives makes for more consistent throughput across the network, rather than bursts of data followed by idle periods.
Because chunked transfer encoding does not require the server to declare the final size of the response up front, chunks can be sent as soon as they’re available. The transmission ends once a zero-length chunk is sent.
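For readers who want to see the wire format, here’s a minimal Python sketch of HTTP/1.1 chunked transfer framing: each piece is prefixed with its size in hexadecimal, and a zero-length chunk closes the body. The function names are our own, and the sketch leaves out trailers. Note that HTTP chunk boundaries don’t have to line up with CMAF chunk boundaries; the example payloads are placeholders.

```python
from typing import Iterable, Iterator, List


def encode_chunked(pieces: Iterable[bytes]) -> Iterator[bytes]:
    """Frame each piece as '<hex size>CRLF<data>CRLF', as soon as it arrives."""
    for piece in pieces:
        if not piece:
            continue  # an empty piece would look like the terminating chunk
        yield f"{len(piece):X}".encode("ascii") + b"\r\n" + piece + b"\r\n"
    yield b"0\r\n\r\n"  # the zero-length chunk ends the transmission


def decode_chunked(body: bytes) -> List[bytes]:
    """Split a complete chunked body back into its pieces (no trailer support)."""
    pieces: List[bytes] = []
    pos = 0
    while True:
        line_end = body.index(b"\r\n", pos)
        # Ignore any chunk extension after ';' and parse the hex size.
        size = int(body[pos:line_end].split(b";")[0], 16)
        pos = line_end + 2
        if size == 0:
            return pieces
        pieces.append(body[pos:pos + size])
        pos += size + 2  # skip the data and its trailing CRLF


wire = b"".join(encode_chunked([b"moof", b"mdat-data"]))
print(wire)  # b'4\r\nmoof\r\n9\r\nmdat-data\r\n0\r\n\r\n'
```

Because the sender never announces a total length, it can start transmitting before the encoder has finished the segment.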
Low-Latency CMAF Players
Finally, the stream is pulled to the media player. The player is able to request yet-to-be-completed segments rather than buffering to wait for a fully available one. That way, playback begins while the encoder is still producing that very segment.
The CDN simultaneously caches the chunks travelling through it to build a representation of the complete segment. This makes the stream compatible with legacy players that don’t support low-latency CMAF.
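Conceptually, the CDN edge forwards each chunk the moment it arrives while keeping a copy to assemble the full segment. The Python sketch below is a simplified illustration of that idea, not how any particular CDN implements it; `EdgeCache` and `relay` are hypothetical names.

```python
from typing import Dict, Iterable, Iterator


class EdgeCache:
    """Toy edge node: passes chunks through and caches the finished segment."""

    def __init__(self) -> None:
        self.complete: Dict[str, bytes] = {}

    def relay(self, name: str, chunks: Iterable[bytes]) -> Iterator[bytes]:
        buffered = bytearray()
        for chunk in chunks:
            buffered += chunk
            yield chunk  # forward immediately to low-latency players
        # Only once the origin has finished is the whole segment cached,
        # ready to be served in one piece to legacy players.
        self.complete[name] = bytes(buffered)


cache = EdgeCache()
forwarded = list(cache.relay("segment-1.m4s", [b"chunk-a", b"chunk-b"]))
print(forwarded)                        # [b'chunk-a', b'chunk-b']
print(cache.complete["segment-1.m4s"])  # b'chunk-achunk-b'
```

A low-latency player consumes the chunks as they are yielded, while a legacy player simply requests the cached segment once it’s complete.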
The final concern is player catch-up. Should the stream fall behind due to throughput fluctuation, the player must have the ability to move back to real time.
Many low-latency media players include catch-up functionality via either rate control or jumping forward. The former involves playing the content back at a faster rate than it’s encoded, whereas jumping forward is exactly what it sounds like.
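Here’s a rough Python sketch of how a player might choose between the two strategies. The thresholds, the 1.1x playback rate, and the function names are all assumptions picked for illustration; real players tune these values and expose their own APIs.

```python
from typing import Tuple


def catch_up_action(latency_s: float, target_s: float = 3.0,
                    tolerance_s: float = 0.5, jump_threshold_s: float = 6.0,
                    max_rate: float = 1.1) -> Tuple[str, float]:
    """Pick a catch-up strategy from the current latency (hypothetical policy)."""
    if latency_s >= jump_threshold_s:
        # Too far behind: skip content to land back on the target latency.
        return ("seek_forward", latency_s - target_s)
    if latency_s > target_s + tolerance_s:
        # Slightly behind: play a little faster than the content was encoded.
        return ("set_rate", max_rate)
    return ("set_rate", 1.0)  # close enough to live: normal speed


def seconds_to_catch_up(excess_latency_s: float, rate: float) -> float:
    """At rate r, playback gains (r - 1) seconds on the live edge per second."""
    return excess_latency_s / (rate - 1.0)


print(catch_up_action(2.8))  # ('set_rate', 1.0)
print(catch_up_action(4.0))  # ('set_rate', 1.1)
print(catch_up_action(8.0))  # ('seek_forward', 5.0)
```

Rate control is gentler on viewers but slow: at 1.1x, recovering one second of latency takes roughly ten seconds of playback. Jumping forward is instant but skips content.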
Low-Latency CMAF Use Cases
A variety of players and CDNs will soon support this non-proprietary solution. Unlike WebRTC — which was designed for real-time communications as opposed to video delivery at scale — low-latency CMAF lends itself to live sporting events, auctions, gambling, and more.
Chunked-encoded and chunked-transferred CMAF will allow OTT to compete directly with cable. And because the common media format offers cost savings, it should open up the possibility for additional live broadcasting opportunities.
Chunked-Encoded, Chunked-Transferred CMAF: Low Latency at Scale
When it comes to low-latency CMAF, it takes a village.
Akamai’s Will Law breaks it down.
“An end-to-end latency of three seconds might be taken up by a 500-millisecond encode time, a sub-500-millisecond CDN propagation time, and then a two-second player buffer. The goal is to extract all latency out of the CDN, leaving it in the encoder, where additional latency increases quality; and in the player, where latency protects against distribution perturbations.”
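Laid out as a simple budget, the example in the quote looks like this. The Python sketch uses the upper bounds from the quote, and the dictionary keys are our own labels.

```python
# Latency budget from the quote, in seconds (CDN figure is an upper bound)
budget = {
    "encoder": 0.5,
    "cdn_propagation": 0.5,
    "player_buffer": 2.0,
}

total = sum(budget.values())
print(total)  # 3.0

# Share of the end-to-end latency spent in each stage
shares = {stage: seconds / total for stage, seconds in budget.items()}
print(max(shares, key=shares.get))  # player_buffer
```

The player buffer takes up the largest share, which matches the goal described in the quote: spend latency where it buys quality or resilience, not in the CDN.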
The good news is that many commercial vendors are adding support for chunked-encoded, chunked-transferred CMAF.
- Encoders: Anevia, GPAC Licensing, Cisco, Harmonic, Mediaexcel
- CDNs: Akamai, Fastly, Amazon CloudFront
- Players: Bitmovin, CastLabs, JW Player, NexStreaming, THEOPlayer, Akamai, hls.js
What’s more, Wowza is expanding our supported formats to include CMAF and Apple Low-Latency HLS.