Wowza Community

ID3 PRIV tag in AAC chunks for HLS streaming - is it necessary?

Greetings,

When doing on-demand audio-only streaming of M4A (AAC) files over HLS, Wowza references AAC chunks in the m3u8 playlist and inserts an ID3 PRIV tag “com.apple.streaming.transportStreamTimestamp” at the beginning of each chunk. This tag is described at

http://tools.ietf.org/html/draft-pantos-http-live-streaming#section-6.2.4

in the section “6.2.4. Providing variant streams”.

The presence of this ID3 tag apparently causes a delay during playback on some non-iOS devices (Panasonic ViERA TVs, 2012 models). When the ID3 tag is removed, playback appears to be almost perfectly gapless; with the tag in place, there is an audible delay between chunks.
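For anyone who wants to reproduce the "tag removed" test, here is a minimal Python sketch that splits one AAC chunk into the timestamp carried by the PRIV frame and the remaining audio bytes. The function names (`syncsafe`, `split_id3`) are made up for illustration and are not part of Wowza. The sketch assumes the ID3v2 tag sits at the very start of the chunk and uses neither an extended header nor unsynchronisation.

```python
OWNER = b"com.apple.streaming.transportStreamTimestamp"


def syncsafe(b: bytes) -> int:
    """Decode a 4-byte ID3v2 syncsafe integer (7 useful bits per byte)."""
    return (b[0] << 21) | (b[1] << 14) | (b[2] << 7) | b[3]


def split_id3(segment: bytes):
    """Return (timestamp_or_None, audio_bytes) for one AAC chunk.

    Hypothetical helper: assumes no extended header and no
    unsynchronisation in the ID3v2 tag.
    """
    if len(segment) < 10 or segment[:3] != b"ID3":
        return None, segment  # no leading ID3v2 tag, nothing to strip

    major, flags = segment[3], segment[5]
    end = 10 + syncsafe(segment[6:10])  # end of the frame area
    # An ID3v2.4 footer (flag 0x10) adds 10 more bytes after the frames.
    audio_start = end + 10 if flags & 0x10 else end

    timestamp = None
    pos = 10
    while pos + 10 <= end:
        frame_id = segment[pos:pos + 4]
        if frame_id == b"\x00\x00\x00\x00":
            break  # reached padding
        raw = segment[pos + 4:pos + 8]
        # Frame sizes are syncsafe in ID3v2.4, plain big-endian in v2.3.
        size = syncsafe(raw) if major == 4 else int.from_bytes(raw, "big")
        body = segment[pos + 10:pos + 10 + size]
        if frame_id == b"PRIV":
            owner, _, data = body.partition(b"\x00")
            if owner == OWNER and len(data) >= 8:
                # 8-byte big-endian value; only the low 33 bits are used.
                timestamp = int.from_bytes(data[:8], "big") & ((1 << 33) - 1)
        pos += 10 + size

    return timestamp, segment[audio_start:]
```

The timestamp is an MPEG-2 PES timestamp on a 90 kHz clock, so dividing it by 90000 gives seconds. The second return value is the chunk with the tag stripped, which can be written back out and served to the device under test.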

The section that mandates this tag is titled “Providing variant streams”. Does this mean the tag is not needed when HLS is used without variant streams? Or is it used to carry meta-information about the exact padding needed for gapless playback?

I don’t know the exact purpose of this tag, or whether it is related to gapless playback at all. See the article http://en.wikipedia.org/wiki/Gapless_playback#Compression_artifacts, which says:

Lossy audio compression schemes that are based on overlapping time/frequency transforms add a small amount of padding silence to the beginning and end of each track. These silences increase the playtime of the compressed audio data. If not trimmed off upon playback, the two silences played consecutively over a track boundary will appear as a pause in the original audio content. Lossless formats are not prone to this problem.

For some audio formats (e.g. Ogg Vorbis), where the start and end are precisely defined, the padding is implicitly trimmed off in the decoding process. Other formats may require extra metadata for the player to achieve the same. The popular MP3 format defines no way to record the amount of delay or padding for later removal. Also, the encoder delay may vary from encoder to encoder, making automatic removal difficult. Even if two tracks are decompressed and merged into a single track, a pause will usually remain between them.

The wiki page also mentions that for MP3 files, “LAME-encoded MP3 can be gapless with players that support the LAME Mp3 info tag.” (As I understand it, when Wowza splits MP3 into chunks, it uses this or a similar mechanism to embed the delay/padding amounts inside the chunks.)

I assumed that this ID3 tag was used to embed the delay/padding amount inside the file to allow gapless playback. However, as I’ve said, I’m not sure that this is needed for AAC files (the wiki page says that “AAC in MP4 encoded with iTunes (and Nero)” is gapless in certain players).

To summarize, here are the questions I’d like to get answers to:

  1. Is this ID3 tag used in AAC chunks to embed the exact padding amounts for gapless playback, or only for variant stream functionality? Could you please clarify which is the case?

  2. If this tag is not needed for gapless playback, but only for variant streams, is it possible to switch off generation of this ID3 tag in Wowza via some config parameter or API call?

Best wishes,

Vladimir

Vladimir,

I’ll see what I can find out and get back to you.

Richard

Vladimir,

One option is to force Wowza to generate TS chunks rather than AAC chunks. This should avoid this problem:

https://www.wowza.com/docs/how-to-configure-wowza-server-to-stream-audio-only-apple-hls-using-transport-stream

Richard

We will add this to the list of things to watch for now. There are no plans to make any changes, at least until there is more indication that they are needed.

Richard

Thanks, I’ll wait for your reply. Is there any hope that, if this ID3 PRIV tag is used only for variant streams, it will be possible to switch off its generation? This would allow the Panasonic 2012 models to play the HLS streams without delays between AAC chunks.

Hi Richard,

Thanks for the reply. But the HLS player in Panasonic TVs apparently requires an ADTS container (for the HLS audio chunks referenced in the m3u8 playlist), not MPEG-TS.
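To check which container a given chunk actually uses, a rough Python sketch like the following may help. The function name `sniff_container` is hypothetical. It relies only on the ADTS sync word (12 set bits followed by a zero layer field) and the MPEG-TS sync byte 0x47 repeating every 188 bytes, and it skips a leading ID3v2 tag if one is present.

```python
def sniff_container(segment: bytes) -> str:
    """Guess whether a chunk is ADTS AAC or MPEG-TS (illustrative only)."""
    pos = 0
    # Skip a leading ID3v2 tag: 10-byte header plus a syncsafe size.
    if len(segment) >= 10 and segment[:3] == b"ID3":
        size = (segment[6] << 21) | (segment[7] << 14) | (segment[8] << 7) | segment[9]
        # An ID3v2.4 footer (flag 0x10) adds another 10 bytes.
        pos = 10 + size + (10 if segment[5] & 0x10 else 0)
    data = segment[pos:]

    # ADTS: 12-bit sync word 0xFFF, then the 2-bit layer field must be 00.
    if len(data) >= 2 and data[0] == 0xFF and (data[1] & 0xF6) == 0xF0:
        return "adts"
    # MPEG-TS: 188-byte packets, each starting with sync byte 0x47.
    if len(data) >= 189 and data[0] == 0x47 and data[188] == 0x47:
        return "mpegts"
    return "unknown"
```

Running this over the chunks listed in the m3u8 playlist shows whether the server is sending ADTS or TS chunks. It is a heuristic, not a full parser.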