What Is CMAF? (Update)

The battle to reduce video latency while maintaining a quality viewer experience had been leading to widespread adoption of the aptly named Common Media Application Format (CMAF). This protocol is the product of a coordinated effort to promote efficiency and reduce latency industry-wide. Even with Apple’s recent contender threatening to undermine this effort, CMAF still holds a lot of potential for anyone looking to simplify and speed up the streaming process. So, what exactly is CMAF and how can it benefit your streaming efforts?  

 

What is CMAF?

Common Media Application Format or CMAF is a relatively new format for packaging and delivering various forms of HTTP-based media. This standard simplifies the delivery of media to playback devices by working with both the HLS and DASH protocols to package data under a uniform transport container file. It also employs chunked encoding and chunked transfer encoding to lower latency. This results in lower costs for most businesses thanks to reduced storage needs and reduced risk of lost business due to video latency. 

It’s worth noting that CMAF is itself not a protocol but rather a container and set of standards for single-approach video streaming that works with protocols like HLS and MPEG-DASH. In this way, the greatest accomplishment of CMAF streaming is more political than technical in that it represents an industry-wide effort to improve streaming across the board. 

What does CMAF mean?

CMAF (Common Media Application Format) is a streaming container standard jointly proposed by Apple and Microsoft in 2016 and published in 2018 to end the costly practice of packaging the same video twice – once in MPEG-TS for HLS and once in fragmented MP4 for DASH – by unifying both protocols under a single fMP4 container.

CMAF is not a protocol but a container format and set of standards that works alongside HLS and MPEG-DASH. By packaging data into one shared container instead of separate format-specific files, CMAF reduces the duplicate encoding and storage costs that previously forced content distributors to maintain two parallel streaming workflows.

CMAF Snapshot

Purpose:Goals:Timeline:
  • Use a common media format for video streams — thereby reducing costs, complexity, and latency.
  • Eliminate investments associated with encoding and storing multiple copies of the same content.
  • Simplify workflows and improve CDN efficiency.
  • Decrease video lag by with chunked-encoded and chunked-transfer CMAF.
  • February 2016: Apple and Microsoft proposed CMAF to the Moving Pictures Expert Group (MPEG).
  • June 2016: Apple announced support of fMP4 format.
  • July 2017: Specifications finalized.
  • January 2018: CMAF standard officially published.

Why Do We Need CMAF?

Competing codecsprotocols, media formats, and devices make the already complex world of live streaming infinitely more so. In addition to causing a headache, the use of different media formats increases streaming latency and costs. 

What could be achieved with a single format for each rendition of a stream instead requires content distributers to create multiple copies of the same content. In other words, the streaming industry makes video delivery unnecessarily expensive and slow. 

The number of different container files alone is exhaustive: .ts, .mp4, .wma, .mpeg, .mov… the list goes on. But what if competing technology providers could agree on a standard streaming format across all playback platforms? Enter the Common Media Application Format, or CMAF.

CMAF brings us closer to the ideal world of single-approach encoding, packaging, and storing. What’s more, it promises to drop end-to-end delivery time from 30-45 seconds to less than three seconds. 

How, why, and to what extent is this possible? Read on for the nitty-gritty.

Leading up to CMAF

Back in the day, RTMP (Real-Time-Messaging Protocol) was the go-to method for streaming video over the internet. This proprietary protocol enabled lickety-split video delivery but encountered issues getting through firewalls. 

As many browsers began to phase out support for Flash, the industry shifted to HTTP-based (Hypertext Transfer Protocol) technologies for adaptive bitrate streaming. For Apple, this took the form of Apple HLS (HTTP Live Streaming); whereas MPEG-DASH (Dynamic Adaptive Streaming over HTTP) became the international standard. 

But with different protocols came different file containers. HLS specifies the use of .ts format, while DASH almost uniformly uses .mp4 containers based on ISOBMFF.

That was a lot of acronyms, we know. The takeaway here is that technology providers support distinct containers for streaming – which impacts playback across the myriad of devices used today. 

The problem? Any content distributor wanting to reach users on both Apple and Microsoft devices must encode and store the same audio and video data twice. With users accessing streams across iPhones, smart TVs, Xboxes, and PCs, this obstacle didn’t go unnoticed.

Take Akamai’s word for it: “These same files, although representing the same content, cost twice as much to package, cost twice as much to store on origin, and compete with each other on Akamai edge caches for space, thereby reducing the efficiency with which they can be delivered.”*

The Advent of CMAF

CMAF came about from cross-company collaboration. In February of 2016, Apple and Microsoft came to the Moving Pictures Expert Group (MPEG) with a proposal. By establishing a new standard called the Common Media Application Format (CMAF), the two organizations would work together to reduce complexity when delivering video online.

Players moved fast. Apple announced that it would add fragmented MP4 support to HLS on June 15, 2016. This meant that Apple would use .mp4 delivery for video streams — the very container that Microsoft already uses.

By July of 2017, co-developers had finalized specifications for CMAF. And in January 2018, the standard was published.

The benefits of encoding, packaging, and caching a single container for video delivery go without saying. But CMAF seeks to do more than just reduce complexity. Even after taking the reins from RTMP years ago, HTTP-based video delivery still lacks the real-time delivery options that viewers demand. Could CMAF also improve latency? With chunked encoding and chunked transfer encoding, the new specification set out to do just that.

How CMAF Simplifies Streaming

How CMAF Works

Single Format: Before CMAF, Apple’s HLS protocol used the MPEG transport stream container format, or .ts (MPEG-TS). Other HTTP-based protocols such as DASH used the fragmented MP4 format, or .mp4 (fMP4).

Microsoft and Apple have now agreed to reach audiences across the HLS and DASH protocols using a standardized transport container — ISOBMFF in the form of fragmented MP4. Theoretically, this means that content distributors can deliver content using only the .mp4 container.

Chunked Encoding: CMAF streaming represents a coordinated industry-wide effort to lower latency with chunked encoding and chunked transfer encoding. This process involves breaking the video into smaller chunks of a set duration, which can then be immediately published upon encoding. That way, delivery can take place while later chunks are still processing. Think of it like getting your meal in courses instead of all at once. You could be enjoying a tasty appetizer while the kitchen prepares your next course instead of waiting hungrily while the meal “buffers.”

File Encryption and Digital Rights Management: Not unlike the issue of different media formats that CMAF sought to address, another incompatibility exists across the industry: digital rights management (DRM) and file encryption. Supporting multiple DRMs (FairPlay, PlayReady, and Widevine) means also supporting incompatible encryption modes.

Industry players have now agreed to reach a standard in this realm called Common Encryption (CENC), but standardization doesn’t happen overnight.

What is CMAF streaming?

CMAF streaming is the delivery of live or on-demand video using the Common Media Application Format, which breaks video into small, independently decodable chunks for low-latency playback over HLS and MPEG-DASH. CMAF streaming enables sub-three-second live latency through chunked encoding and chunked transfer encoding, while reducing storage and CDN costs by using one set of segments across both protocols.

CMAF streaming competes directly with low-latency approaches like Low-Latency HLS and WebRTC. CMAF streaming typically delivers latency in the 2-5 second range, slower than WebRTC’s sub-second delivery but with significantly better scalability and CDN compatibility. For live streaming use cases like sports, news, auctions, betting, and live commerce, CMAF streaming offers a balance of low latency, broad device support, and cost-efficient delivery.

What is the difference between MP4 and CMAF?

The difference between MP4 and CMAF is that MP4 (MPEG-4 Part 14) is a traditional container format designed for storing and downloading complete video files, while CMAF is a streaming-optimized standard built on fragmented MP4 that breaks content into small chunks for low-latency adaptive bitrate delivery across both HLS and DASH protocols.

MP4 was designed in the early 2000s as a general-purpose container for storing video, audio, and metadata in a single complete file, which makes MP4 ideal for progressive download and local playback but poorly suited to live streaming or adaptive bitrate delivery. CCMAF addresses those gaps by using fragmented MP4 (fMP4), a variation of the MP4 format that splits video into small, independently decodable segments instead of one continuous file.

What is the difference between DASH and CMAF?

The difference between CMAF and DASH is that DASH (Dynamic Adaptive Streaming over HTTP) is a streaming protocol that defines how a player requests video segments, while CMAF is a container format that defines how those segments are packaged. CMAF and DASH are not competing, CMAF was designed to work with DASH (and HLS) to eliminate protocol-specific packaging.

AttributeDASHCMAF
TypeStreaming protocolContainer format
FunctionDefines how segments are requested and playedDefines how segments are packaged and stored
ManifestXML-based playlist (.mpd)None: CMAF is the segment, not the manifest
File extension.mpd (manifest).m4s (segment)
Standardized2012 (MPEG-DASH)2018 (MPEG CMAF)
RelationshipDelivers CMAF segmentsPackaged inside DASH (and HLS) workflows

CMAF and DASH operate at different layers of the streaming stack. DASH controls the manifest and delivery logic through an XML-based playlist (.mpd) that tells the player which segments to fetch and when. CMAF controls the segment files themselves, packaging audio and video into standardized fragmented MP4 (.m4s) containers that any compatible protocol can deliver.

CMAF reduces the cost of running DASH by eliminating duplicate packaging. Before CMAF, content distributors had to package each video twice – once in fragmented MP4 for DASH and once in MPEG-TS for HLS – doubling storage costs and CDN cache pressure. With CMAF, the same .m4s segments can be referenced by both a DASH .mpd manifest and an HLS .m3u8 manifest, cutting packaging overhead in half while enabling sub-three-second live latency through chunked encoding.

CMAF and DASH work best together rather than as alternatives. CMAF is the packaging standard; DASH is one of the protocols that delivers it. Choosing DASH without CMAF means accepting legacy packaging inefficiencies; pairing CMAF with DASH (and HLS) is the current best practice for cost-efficient, low-latency adaptive bitrate streaming.

CMAF and HLS

Before we explore the distinction between CMAF and HLS streaming, we should first talk about HLS vs. Low-Latency HLS, the latter of which was developed by Apple after the advent of CMAF and seems to challenge the assumption that CMAF would become the industry-wide standard.

HLS (HTTP Live Streaming) is an Apple-based protocol that has historically favored stream reliability over latency concerns, clocking in at around a 5-20 second delay. It was Apple’s answer to problems with iPhone’s live streaming playback and successfully improved viewer experience with minimal buffering. CMAF can work with HLS for presentation profiles encrypted with ‘cbcs’ or ‘cenc’. The third presentation profile, unencrypted, is not supported.

Low-Latency HLS is a protocol extension for HLS that significantly reduces latency, answering many of the same needs as CMAF. Why Apple felt the need to extend the HLS streaming protocol when they already had the CMAF format working with it to reduce latency is a bit of a mystery and has some wondering how committed Apple is to the standardization effort. 

With all this in mind, your options when using HLS are threefold: 

  • Traditional HLS with both high reliability and latency 
  • HLS with the Low-Latency HLS protocol extension to improve latency
  • HLS with the CMAF format to improve latency and standardize container files

How fast is fast? The ranges for Low-Latency HLS and CMAF vary a bit. Low-Latency HLS typically runs about two to eight seconds behind. CMAF is said to run from roughly two to five seconds behind. In any case, the two are fairly neck-and-neck. However, CMAF has the added benefit of standardized container files (.fmp4), the widespread adoption of which will lead to fewer storage costs for all. 

CMAF and RTMP

RTMP (Real-Time Messaging Protocol) was originally developed by Macromedia to work with Adobe’s Flash player, putting it at the forefront of streaming technology. However, it has since been sidelined in favor of HTTP-based technology. 

Certainly, you can still use RTMP, but it works only as an ingest and is no longer really an option for playback. It also lacks the optimization of more recent technologies that make them scalable for higher quality viewing experiences. 

That said, at a latency of around five seconds, businesses have been using it to stream quickly and reliably for a long time. 

CMAF and WebRTC

Now let’s view the other end of the spectrum with the cutting-edge WebRTC (Web Real-Time Communications) protocol. It’s the fastest available live-streaming option and straight-forward to use. If CMAF streaming is low latency, then WebRTC is ultra-low latency (ULL) with sub-second latency. It’s also easy to adapt to different network requirements and free to use.

How does it accomplish all this? WebRTC adds real-time voice, text, and video communications between browsers and devices without the need for plugins. That said, it’s specifically designed for video conferencing and is therefore not scalable to larger audiences right out of the box. That’s where streaming platforms like Wowza can help. Our Real Time Streaming at Scale solution is functionally WebRTC with a workflow that helps it scale.

CMAF Summarized

One of the easiest ways to reduce screen-to-screen latency and the costs associated with storing multiple media file types is by simplifying video delivery. CMAF’s goals were threefold: 

  • Cut costs
  • Minimize workflow complexity 
  • Reduce latency

Have these been achieved? Yes-ish.

CMAF will streamline server efficiency for serving most endpoints. And in recent demos, this new alternative enables sub-three-second latency. 

That said, legacy devices and browsers that aren’t upgraded will still require unique container files for playback. Put another way, continuing to reach the broadcast audience possible requires additional accommodations (and costs) that CMAF can’t retrospectively address. 

That’s where Wowza comes in. Wowza Streaming Engine supports video delivery across a variety of formats, including MPEG-TS, Apple HLS, and Adobe FTMP. This ensures video scalability as the adoption of CMAF-compliant devices continues to grow. What’s more, Wowza Streaming Engine now supports CMAF streaming – with low-latency optimization on the roadmap.

And if you’re looking for a streamlined solution that includes access to an easy-to-use CMS, try out our new and improved Wowza Video streaming platform. 

About Sydney Roy (Whalen)

Sydney works for Wowza as a content writer and Marketing Communications Specialist, leveraging roughly a decade of experience in copywriting, technical writing, and content development. When observed in the wild, she can be found gaming, reading, hiking, parenting, overspending at the Renaissance Festival, and leaving coffee cups around the house.
View More

FREE TRIAL

Live stream and Video On Demand for the web, apps, and onto any device. Get started in minutes.

START STREAMING!
  • Stream with WebRTC, HLS and MPEG-DASH
  • Fully customizable with REST and Java APIs
  • Integrate and embed into your apps

Search Wowza Resources


Subscribe


Follow Us


Categories

Blog

Back to All Posts