About timed metadata in Wowza live streaming workflows

Timed metadata is the key to adding interactivity to live (and VOD) streams produced with Wowza™ streaming technologies. From source to playback, Wowza enables you to produce broadcasts that use timed metadata to offer dynamic, interactive viewing experiences that engage and delight audiences.

What is timed metadata?


Metadata is data that provides information about other data. Think of an iTunes library, where every song is a piece of data. Every song also has a name, an artist, an album, a genre, and many other descriptive properties, all of which constitute the song’s metadata.

Streams, too, contain metadata. The metadata travels with the stream as it's encoded and decoded, and it may be used along the way. Two examples of a stream's metadata are its codec and its resolution. The codec and resolution are specified at the source and used by a player in adaptive bitrate streaming to select a rendition to play for any given viewer's device and network conditions.

Timed metadata just means that a timecode accompanies the piece of data about the stream. When the stream is encoded, the timed metadata is synchronized to audio and video keyframes. The timed metadata flows through a server to the client where, during playback, the timecode serves as a cue point to invoke some action on the data. Perhaps a subtitle translation appears during a streamed opera, or statistics about a baseball team's winning record appear when it takes the lead in the ninth inning; maybe bidding for an auction item begins, or a game show host starts fielding answers to a question. Other common types of timed metadata include advertising and telemetry information.
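The cue-point idea can be illustrated with a minimal Python sketch. It assumes a player that reports its playback position periodically; the Cue class, the due_cues function, and the sample payloads are hypothetical names chosen for illustration and aren't part of any Wowza API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cue:
    """A piece of timed metadata: a timecode plus an arbitrary payload."""
    time_s: float   # position in the stream, in seconds
    payload: str    # data the player acts on (caption text, stats, and so on)


def due_cues(cues, previous_s, current_s):
    """Return cues whose timecode falls in (previous_s, current_s], in time order.

    A player could call this on every playback-position update and act on
    each returned cue, for example by rendering a subtitle.
    """
    ordered = sorted(cues, key=lambda cue: cue.time_s)
    return [cue for cue in ordered if previous_s < cue.time_s <= current_s]


# Hypothetical cues; they don't need to arrive in time order.
cues = [
    Cue(12.0, "subtitle: first aria begins"),
    Cue(4.5, "stats: home team leads 3-2"),
    Cue(30.0, "auction: bidding opens"),
]

# Playback advanced from 0 s to 15 s, so the first two cues are due.
fired = due_cues(cues, previous_s=0.0, current_s=15.0)
for cue in fired:
    print(f"{cue.time_s:>5.1f}s  {cue.payload}")
```

Using a half-open interval means that a cue fires exactly once as the playback position moves forward, even if position updates arrive at irregular intervals.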

Timed metadata formats


Many formats and types of timed metadata are used in streaming workflows. For example, SCTE-35 is used to designate where to place ads in MPEG-TS streams, and onTextData is used to display closed captions for RTMP streams. These formats, however, have highly structured implementations and specific use cases. For the broadest and most flexible streaming scenario—a source stream ingested and delivered for playback over HLS, MPEG-DASH, and WOWZ over WebSockets with custom, user-defined metadata—three formats are used to carry timed metadata with the stream: AMF, ID3, and Event Message (emsg).

About AMF

AMF stands for Action Message Format. It's a binary format developed by Adobe Systems for exchanging messages between clients and servers. In a Wowza streaming workflow, AMF metadata can be received with an ingested source stream over RTMP or WOWZ. AMF metadata can also be injected directly into a Wowza Streaming Engine™ server application. AMF encapsulates the metadata in a packet with a header that defines the message length and type and a body that defines the custom metadata and an associated timecode.
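To make the body of an AMF message more concrete, the following Python sketch serializes a few value types using the AMF0 wire layout (a one-byte type marker followed by the value). It covers only the message body, not the header that carries the message length and type, and it doesn't use the Wowza Streaming Engine Java API. The amf0_value function and the text and language property names are assumptions made for illustration.

```python
import struct


def amf0_value(value):
    """Serialize a Python value as AMF0 (booleans, numbers, strings, objects)."""
    if isinstance(value, bool):          # check bool first: bool is a subclass of int
        return b"\x01" + (b"\x01" if value else b"\x00")
    if isinstance(value, (int, float)):  # marker 0x00, 64-bit big-endian double
        return b"\x00" + struct.pack(">d", float(value))
    if isinstance(value, str):           # marker 0x02, 16-bit length, UTF-8 bytes
        data = value.encode("utf-8")
        return b"\x02" + struct.pack(">H", len(data)) + data
    if isinstance(value, dict):          # marker 0x03, then key/value pairs
        out = bytearray(b"\x03")
        for key, item in value.items():
            name = key.encode("utf-8")
            # Property names carry a length prefix but no type marker.
            out += struct.pack(">H", len(name)) + name + amf0_value(item)
        out += b"\x00\x00\x09"           # empty name plus object-end marker
        return bytes(out)
    raise TypeError(f"unsupported type for this sketch: {type(value).__name__}")


# A data message body: a handler name followed by an object of properties.
# The property names below are illustrative only.
body = amf0_value("onTextData") + amf0_value(
    {"text": "Bottom of the ninth", "language": "en"}
)
print(len(body), body[:13])
```

Strings longer than 65,535 bytes need the separate AMF0 long-string type, which this sketch leaves out; struct.pack raises an error in that case rather than writing a corrupt length.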

Streaming over HLS or MPEG-DASH, however, requires different timed metadata formats: ID3 or Event Message (emsg).

About ID3

ID3, originating from the phrase "identify an MP3," is the tagging specification that has become the standard for defining metadata in MP3 audio files. Today, it's also used to carry timed metadata in HLS media streams.

An ID3 tag is a container that consists of one or more frames; each frame includes one or more pieces of metadata information. The tag is inserted into every HLS media segment at a given time offset. Once the metadata is inserted into the segment, it stays there, so if a live stream is converted to VOD, the timed metadata becomes part of the VOD asset.
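The tag-and-frame structure can be sketched in Python as follows. The example builds an ID3v2.4 tag that holds a single TXXX (user-defined text) frame. Which frame types a given Wowza product writes isn't stated here, so the use of TXXX, the helper names, and the sample description and value are assumptions. The sketch produces only the tag bytes; it doesn't insert them into a media segment.

```python
def syncsafe(n):
    """Encode n as a 4-byte ID3v2 syncsafe integer (7 usable bits per byte)."""
    if not 0 <= n < 1 << 28:
        raise ValueError("value does not fit in 28 bits")
    return bytes((n >> shift) & 0x7F for shift in (21, 14, 7, 0))


def txxx_frame(description, value):
    """Build an ID3v2.4 TXXX frame with UTF-8 text."""
    # Frame body: encoding byte (0x03 = UTF-8), description, NUL, value.
    body = b"\x03" + description.encode("utf-8") + b"\x00" + value.encode("utf-8")
    # Frame header: 4-character ID, syncsafe size, two flag bytes.
    return b"TXXX" + syncsafe(len(body)) + b"\x00\x00" + body


def id3_tag(frames):
    """Wrap frames in an ID3v2.4 tag header: 'ID3', version 4.0, flags, size."""
    payload = b"".join(frames)
    return b"ID3" + b"\x04\x00" + b"\x00" + syncsafe(len(payload)) + payload


# Hypothetical metadata: a score update carried as user-defined text.
tag = id3_tag([txxx_frame("score", "home 3, away 2")])
print(len(tag), tag[:10].hex())
```

Sizes use the syncsafe encoding so that no byte in the size field has its high bit set, which keeps the tag from being mistaken for an MPEG audio sync pattern.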

About Event Message (emsg)

An Event Message (emsg) is a box format in ISO BMFF files that contains timed metadata and is inserted into MPEG-DASH audio or video segments. This is the most common way to carry in-band metadata in MPEG-DASH streams.

An emsg contains timing information, a string payload, and an event scheme identifier that differentiates between types of metadata events (for example, captions or slide transitions). The MPEG-DASH manifest identifies event schemes with the InbandEventStream element and its schemeIdUri attribute, so the player knows which metadata to retrieve from the stream.
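As an illustration of those fields, this Python sketch writes and reads a version 0 emsg box using the field order defined for that version (scheme identifier, value, timescale, presentation time delta, event duration, ID, and message data). The function names, the urn:example scheme URI, and the JSON payload are hypothetical; version 1 boxes, which order the fields differently, aren't handled.

```python
import struct


def build_emsg_v0(scheme_id_uri, value, timescale, delta, duration, event_id, data):
    """Serialize a version 0 'emsg' box."""
    body = (
        b"\x00\x00\x00\x00"                        # version 0, flags 0
        + scheme_id_uri.encode("utf-8") + b"\x00"  # null-terminated scheme URI
        + value.encode("utf-8") + b"\x00"          # null-terminated value
        + struct.pack(">IIII", timescale, delta, duration, event_id)
        + data                                     # opaque message payload
    )
    # Box header: 32-bit total size (header included) and the 4-character type.
    return struct.pack(">I", 8 + len(body)) + b"emsg" + body


def parse_emsg_v0(box):
    """Parse a complete version 0 'emsg' box into a dictionary."""
    size, box_type = struct.unpack(">I4s", box[:8])
    if box_type != b"emsg" or size != len(box):
        raise ValueError("not a complete emsg box")
    if box[8] != 0:
        raise ValueError("only version 0 is handled in this sketch")
    scheme_end = box.index(b"\x00", 12)            # strings start after version/flags
    value_end = box.index(b"\x00", scheme_end + 1)
    timescale, delta, duration, event_id = struct.unpack_from(
        ">IIII", box, value_end + 1
    )
    return {
        "scheme_id_uri": box[12:scheme_end].decode("utf-8"),
        "value": box[scheme_end + 1:value_end].decode("utf-8"),
        "timescale": timescale,
        "presentation_time_delta": delta,
        "event_duration": duration,
        "id": event_id,
        "message_data": box[value_end + 17:],      # 16 bytes of integers precede it
    }


# Hypothetical event: a slide transition 2.5 seconds into the segment.
box = build_emsg_v0("urn:example:slides:2024", "1", 1000, 2500, 0, 7, b'{"slide": 7}')
event = parse_emsg_v0(box)
print(event["scheme_id_uri"], event["presentation_time_delta"] / event["timescale"])
```

A player would compare scheme_id_uri against the schemes announced in the manifest and ignore boxes whose scheme it doesn't recognize.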

Using timed metadata in Wowza workflows


Wowza Streaming Engine, Wowza Video, and Wowza Flowplayer all support timed metadata. Most third-party players also support timed metadata. The specific ways each Wowza product handles AMF, emsg, or ID3 vary, offering wide-ranging and flexible options for using timed metadata across Wowza products.

Using timed metadata in Wowza Streaming Engine

Wowza Streaming Engine can receive AMF metadata from cameras and encoders that support it. Alternatively, you can inject AMF metadata directly into a Wowza Streaming Engine live application by creating a custom HTTP provider and using the Wowza Streaming Engine Java API. The com.wowza.wms.amf package of classes provides methods for working with AMF metadata, and the IMediaStream interface provides access to the stream objects.

Similarly, you can use a custom module and the Wowza Streaming Engine Java API to convert AMF metadata to ID3 for HLS streaming or to emsg for MPEG-DASH streaming. For HLS, you can send the stream with converted metadata to a stream target with Wowza Streaming Engine or to a Wowza CDN or custom HLS target in Wowza Video. For MPEG-DASH, to send streams with timed metadata to a CDN, the CDN must pull the stream from a Wowza Streaming Engine Live HTTP Origin application; sending streams with emsgs to CDN destinations using Stream Targets (push publishing) is not supported.

Note: Wowza Streaming Engine currently supports converting AMF metadata to emsg metadata only with live streams, not with VOD or nDVR streams.

If you're using a third-party camera or encoder, check with the manufacturer to see if it supports AMF metadata and for instructions on how to include AMF metadata in live streams. For Wowza Streaming Engine to ingest it, the metadata must be wrapped in a top-level AMF data object (AMFDataObj).

For instructions on injecting or converting metadata in Wowza Streaming Engine, and on sending streams that can include metadata from Wowza Streaming Engine to Wowza Video, see the Wowza Streaming Engine documentation.

Using timed metadata in Wowza Video

Wowza Video can receive AMF metadata from a source stream that originates from:

  • A Wowza Streaming Engine live application 
  • A third-party encoder that supports AMF

Unlike with Wowza Streaming Engine, AMF metadata can't be directly injected into a Wowza Video transcoder.

For Wowza Video to convert AMF metadata to ID3 for HLS playback, it must be included in a top-level AMF data object (AMFDataObj). Within that AMF data object, you can add key/value pairs that include AMF data of additional types, such as string, Boolean, list, or even a nested object. The AMF data object must include two properties: (1) a key called payload with a value that is a string of data to be converted and (2) a key called wowzaConverter with a value of basic_string.
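The two required properties can be checked before a stream is sent. The following Python sketch models the AMF data object as a plain dictionary, which is an assumption made for illustration; the conversion_problems function and the sample keys other than payload and wowzaConverter are hypothetical and aren't part of a Wowza API.

```python
def conversion_problems(amf_data_obj):
    """Return a list of reasons the object would not meet the stated requirements.

    An empty list means the object has a string 'payload' and a
    'wowzaConverter' value of 'basic_string'.
    """
    if not isinstance(amf_data_obj, dict):
        return ["the top-level value must be an AMF data object"]
    problems = []
    if not isinstance(amf_data_obj.get("payload"), str):
        problems.append("'payload' must be present and hold a string")
    if amf_data_obj.get("wowzaConverter") != "basic_string":
        problems.append("'wowzaConverter' must be 'basic_string'")
    return problems


# Required properties plus illustrative extras of other AMF types.
metadata = {
    "payload": "home 3, away 2",          # the string to be converted
    "wowzaConverter": "basic_string",     # required converter name
    "inning": "9",                        # string
    "final": False,                       # Boolean
    "scorers": ["Ruiz", "Okafor"],        # list
    "venue": {"city": "Denver"},          # nested object
}

print(conversion_problems(metadata))                  # no problems found
print(conversion_problems({"payload": 42}))           # both requirements fail
```

Returning every problem at once, rather than stopping at the first, makes it easier to correct an encoder configuration in a single pass.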

You can use the Wowza Video REST API to convert AMF metadata to ID3 tags. When you enable the convertAMFData property in the REST API, Wowza Video listens for AMF data events in the source stream, parses the data events, maps the events to ID3 tags, and sends the ID3 tags with output renditions to either Wowza CDN or custom HLS stream targets.

For instructions on using timed metadata in Wowza Video, see the Wowza Video documentation.

If you're using a third-party camera or encoder, check with the manufacturer to see if it supports AMF metadata and for instructions on how to include AMF metadata in live streams.

Using timed metadata with Wowza Flowplayer

Wowza Flowplayer can listen for and process ID3 metadata tags. See the ID3 plugin for more information.

Note: To determine whether a third-party player such as JW Player supports ID3 tags, see that player's documentation.

Workflow considerations


One of the primary decisions to make when working with timed metadata is where to inject it into the workflow. Should you inject it with the source stream at the encoder or camera? Or should you inject it directly into Wowza Streaming Engine? Either method works, but you'll want to ensure that the metadata and video are synchronized as closely as possible.

We recommend injecting the metadata as close to the source as possible, ideally in the source encoder or camera. When the encoder injects metadata into the source stream, the metadata is associated with the specific video frame being processed at that moment. The source encoder then takes a few seconds to compress and send the stream and the synchronized metadata to Wowza Streaming Engine or Wowza Video. Despite any latency that may occur as the stream travels to the player, the metadata remains as closely aligned as possible with the video frame. That's why it's best to inject the metadata with the source at the encoder whenever possible.

When you inject AMF metadata directly into a Wowza Streaming Engine stream, the AMF metadata is also associated with the specific video frame being processed at that moment. However, the stream has typically incurred latency between the encoder and Wowza Streaming Engine, so the frame being processed is older than the live action. As a result, the metadata may be injected before Wowza Streaming Engine processes the video frame you want to sync to, which may cause the metadata to appear during playback before the desired video frame.

To prevent directly injected metadata from getting ahead of the video, either inject the metadata at the source or delay the direct injection into Wowza Streaming Engine: watch RTMP or WOWZ playback of the stream from Wowza Streaming Engine, and initiate the API call to inject the metadata as soon as you see the frame you want to sync to. (There's less latency with RTMP and WOWZ playback, so you can see the desired frame before it appears for HLS playback.)
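The timing effect can be worked through with simple arithmetic. This Python sketch assumes a constant encoder-to-server latency and measures all times on one shared clock; the function name and the 3-second latency are illustrative values, not measurements of any Wowza product.

```python
def metadata_offset_s(event_capture_s, inject_s, ingest_latency_s):
    """Seconds between the desired frame and the frame the metadata attaches to.

    Negative: the metadata plays before the desired frame.
    Zero: the metadata is aligned with the desired frame.
    """
    # At injection time, the server is processing a frame that was captured
    # ingest_latency_s seconds earlier.
    frame_being_processed_s = inject_s - ingest_latency_s
    return frame_being_processed_s - event_capture_s


EVENT_S = 100.0      # when the camera captured the frame to sync to
LATENCY_S = 3.0      # assumed encoder-to-server latency

# Injecting at the moment of the live event attaches to a frame 3 s too early.
print(metadata_offset_s(EVENT_S, inject_s=100.0, ingest_latency_s=LATENCY_S))

# Delaying the injection by the ingest latency aligns metadata and frame.
print(metadata_offset_s(EVENT_S, inject_s=103.0, ingest_latency_s=LATENCY_S))
```

Watching low-latency playback and injecting when the desired frame appears is a practical way to approximate that delay without measuring the latency directly.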

Workflow limitations


Finally, when implementing timed metadata with live streams using Wowza technologies and services, keep in mind these limitations:

  • Wowza doesn't provide back-channel communications to data servers. Communications between Wowza and servers that store and process the metadata must be developed separately.
  • An AMF-to-ID3 workflow doesn't work with MPEG-TS.