Wowza Community

Video begins on first keyframe while audio begins at start of file

I have noticed that Wowza video-on-demand begins video playback at the first keyframe, rather than the actual start of the file, but audio begins at the start regardless of keyframe. This creates sync issues. For example, in our workflow we might:

  1. live stream and record for two hours at our desired streaming settings (frame rate 30, keyframe every 150 frames)

  2. use QuickTime Pro to delete the first 30 minutes from the resulting file (maybe this was just padding before our event, or maybe there were consecutive events recorded to one file, and we want to stream the second event on-demand)

  3. use QuickTime Pro to quickly export the trimmed video to a MP4 container (passing through the existing video/audio rather than re-exporting)

  4. copy this MP4 to a streaming server for on-demand streaming

When streamed on-demand in Wowza (by any method, either RTMP, RTSP, or HTTP), the audio and video begin simultaneously when you hit play, but audio starts precisely from the trim point, while the video starts from the first keyframe (which might be up to 5 seconds later). So the audio and video are out of sync.

This does not happen when streaming the same file through Darwin streaming server (RTSP) – the first few seconds might show “green” video until the first keyframe comes, but the video and audio are in sync.

(Sorry for the lack of demo files/streams to illustrate this – I can provide some later if necessary.)

I have tried to work around this issue by trimming exactly to a keyframe point instead of an arbitrary time (using MPEG Streamclip, which is the only free software I know which locates keyframe times), but this doesn’t help us when we record two video sources for parallel streaming playback in a SMIL file (for example, a camera feed next to a laptop feed). Even though both source streams/files get the same specified frame and keyframe rates in the encoder (Wirecast or QuickTime Broadcaster), no encoder seems able to maintain a precise enough frame rate to keep the keyframes aligned across the two files, even just for a few minutes. (Again, Darwin Streaming Server has no problem streaming two files in parallel in SMIL with different keyframe intervals – green video is simply shown whenever starting or seeking at a non-keyframe point.)

Is there any way to correct this behavior in Wowza, just for RTSP? I know I could re-encode the trimmed video to create new keyframes, but that can take a lot of time and will degrade the video quality. I haven’t had a chance to look yet, but I’m guessing this throws off parallel SMIL sync in Wowza even when I don’t trim the beginning of the files.

With all that post-processing, I’m not sure this will help, but you can try adding this Property setting to Application.xml under /Streams/Properties:

<Property>
 <Name>StartOnPreviousKeyFrame</Name>
 <Value>true</Value>
 <Type>Boolean</Type>
</Property>

Richard

Correction, this property

<Property>
<Name>recordWaitForVideoKeyFrame</Name>
<Value>true</Value>
<Type>Boolean</Type>
</Property>

Yes, I think audio before the first keyframe can be the problem. I’m not sure how to fix this on the encoding side with your workflow. Try recording with a Wowza server that is on the same machine or LAN as the encoder (to avoid artifacts from data loss), with that Property setting.

Richard

Richard – thanks. I assume that only affects when the Wowza server starts recording the file? I had been recording locally using Wirecast (or QuickTime Broadcaster), then editing and uploading to the Wowza server, so I don’t think this would affect me.

I am recording with Wowza too, but mainly as a backup, although I could try streaming that file instead (back in Wowza 2 I thought those recorded files showed a lot of artifacts etc. but maybe that has improved – I am on a high speed reliable LAN). Is there an easy way to send a reset/restart command to Wowza, through JMX, so I can keep the stream running but start/restart the recorded file at the desired point?

Back to the original issue: can this starting to play at keyframe issue indeed affect audio sync, if there isn’t a keyframe at the very beginning of the file?

Thanks for the links Randall.

I had never heard of the “seekTarget” variable before, but changing it to “enhanced” did not fix the issue. I definitely see the “enhanced” seek behavior now, but it seems the audio still begins and stays out of sync unless there is a keyframe at the beginning of the file in RTMP. Same for RTSP too. (Is there any difference between “enhanced” or “audio” as seekTarget values?)

Here is an example:

Source file: http://160.94.17.23/demo/test2.mp4

Darwin: rtsp://128.101.240.131/test2.mp4

Wowza RTSP: rtsp://160.94.17.17/vod/test2.mp4

Wowza RTMP: http://160.94.17.23/demo/test2.html

Not the best content, sorry, but after a few seconds, it should be evident whether the audio and video are in sync.

QuickTime, Darwin, VLC, streamed or played locally: duration of 20 seconds, everything in sync

Wowza: duration of 24 seconds, includes 4 seconds of “extra” video at beginning, but audio is still only 20 seconds long and not in sync

FFMPEG reports “Duration: 00:00:20.48, start: -4.156667” which seems to roughly match what Wowza is thinking too

The original source was a Wirecast m4v file. Then I used QuickTime Pro to trim a random small segment from the middle and export to an MP4 container without re-encoding. (I tried trimming with FFMPEG too but I got similar poor results.) If I instead trim a segment beginning at a keyframe, everything works fine.

My point about keyframe alignment is that if I have two synchronized movies (live recordings started and stopped at the same time) with different or drifting keyframe intervals, Wowza can’t maintain sync when RTSP streaming them in parallel (side-by-side) in a SMIL file. Even if they both start on a keyframe, they drift out of sync when seeking, presumably as their keyframes drift.
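To put a rough number on the keyframe drift described above, here is a minimal sketch. It assumes two encoders that both target 30 fps with a keyframe every 150 frames, where one actually runs 0.1% fast (30.03 fps). That deviation is a made-up figure for illustration, not a measurement from this thread, and `keyframe_drift` is a hypothetical helper name.

```python
def keyframe_drift(rate_a, rate_b, keyframe_every, elapsed_seconds):
    """Gap in seconds between matching keyframes of two recordings.

    rate_a and rate_b are the *actual* frame rates the encoders achieved.
    The n-th keyframe of a recording lands at n * keyframe_every / rate.
    """
    # Pick the keyframe of recording A closest to the elapsed time.
    n = round(elapsed_seconds * rate_a / keyframe_every)
    time_a = n * keyframe_every / rate_a
    time_b = n * keyframe_every / rate_b
    return abs(time_a - time_b)


# Hypothetical case: 30.00 fps vs 30.03 fps, five minutes into the recording.
drift = keyframe_drift(30.0, 30.03, 150, 300)
print(f"keyframes are about {drift:.3f} s apart after five minutes")
```

Under these assumptions the matching keyframes are already roughly 0.3 seconds apart after five minutes, so a server that always snaps each file to its own nearest keyframe would land at visibly different points in the two files.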

So I’m stuck using a separate Darwin server to do this RTSP stuff, in addition to Wowza. Would love to finally consolidate if I can get this sorted out.

Thanks Randall. I did that ffmpeg command and copied it to my server – the video now has the correct 20 second duration in Wowza, but the audio still streams out of sync:

Stream: http://160.94.17.23/demo/out.html

Download: http://160.94.17.23/demo/out.mp4

It plays fine locally, but it played fine locally before doing this too.

If I do a re-encode (ffmpeg -i test2.mp4 -vcodec libx264 -acodec aac -strict -2 test2reencode.mp4), I get a 24 second file where the first ~4 seconds are silent when played locally so the audio stays in sync. Still out of sync in Wowza, however:

Stream: http://160.94.17.23/demo/test2reencode.html

Download: http://160.94.17.23/demo/test2reencode.mp4

I have to do a re-encode with the -ss option (ffmpeg -i test2.mp4 -ss 4.156667 -vcodec libx264 -acodec aac -strict -2 test2reencodess.mp4) to finally get streaming audio sync in Wowza:

Stream: http://160.94.17.23/demo/test2reencodess.html

Download: http://160.94.17.23/demo/test2reencodess.mp4

Remember, that test2.mp4 file was just a random segment trimmed with QuickTime Pro from a larger file recorded with keyframes every 5 seconds. So it sounds like the start point I randomly picked was about four seconds after a keyframe, and roughly one second before the next (which is why FFMPEG reports a -4 start time, and your examination found the next keyframe around +1). VLC, QuickTime, Darwin, etc. all play/stream test2.mp4 beginning at my non-keyframe start point just fine. FFMPEG starts the video at the -4 keyframe and waits until 0 to start the audio. But Wowza wants to simultaneously start the video from the -4 keyframe and the audio from 0, so the two won’t be in sync.
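The numbers in this explanation can be checked with a little arithmetic. This is a minimal sketch that assumes the keyframe interval is exactly 150 frames at 30 fps and that the trim point sits at time 0 with the previous keyframe at the start offset FFMPEG reported. The function names are hypothetical.

```python
FRAME_RATE = 30        # frames per second, from the encoder settings
KEYFRAME_EVERY = 150   # frames between keyframes, from the encoder settings


def keyframe_interval_seconds(frame_rate=FRAME_RATE, keyframe_every=KEYFRAME_EVERY):
    """Seconds between consecutive keyframes."""
    return keyframe_every / frame_rate


def next_keyframe_time(previous_keyframe_time, interval):
    """Time of the keyframe that follows the given one (trim point is 0)."""
    return previous_keyframe_time + interval


def sync_error_if_started_together(video_first_time, audio_first_time):
    """Seconds the audio runs ahead when both tracks start at the same instant."""
    return audio_first_time - video_first_time


interval = keyframe_interval_seconds()
next_keyframe = next_keyframe_time(-4.156667, interval)
sync_error = sync_error_if_started_together(-4.156667, 0.0)

print(f"keyframe interval: {interval} s")
print(f"next keyframe after the trim point: {next_keyframe:.6f} s")
print(f"audio lead if both start together: {sync_error:.6f} s")
```

The computed next keyframe (about 0.843 s) lines up with the I-frame at 0.843333 in the frame dump posted in this thread, and the audio lead is close to the 4 seconds of “extra” video seen in Wowza.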

I was hoping to do edits within files and stream them in Wowza without re-encoding, but I’m finding that’s just not possible unless my edit start point lands on a keyframe. I’ve tried QuickTime Pro, MPEG Streamclip, and FFMPEG – is there some other tool I should be using?

Lisa, thanks for the reply about edit lists. Sure enough, I see an edit list atom in my file using Apple’s Atom Inspector:

http://wiki.multimedia.cx/index.php?title=QuickTime_container#elst

https://developer.apple.com/downloads/index.action?q=atom%20inspector

I’ve mostly got my workflow set so I can avoid this type of editing now. It would be nice to have this option in the future, though.

susta,

“the audio and video begin simultaneously when you hit play, but audio starts precisely from the trim point, while the video starts from the first keyframe”

Charlie answered this question here: Turn on enhanced seek by editing conf/Streams.xml and change seekTarget to enhanced or audio.

“Is there an easy way to send a reset/restart command to Wowza, through JMX, so I can keep the stream running but start/restart the recorded file at the desired point?”

The LiveStreamRecord module does this. A search revealed this user submitted code.

You said, “but this doesn’t help us when we record two video sources for parallel streaming playback in a SMIL file (for example, a camera feed next to a laptop feed). no encoder seems able to maintain a precise enough frame rate to keep the keyframes aligned across the two files, even just for a few minutes”

I don’t understand why you need keyframe alignment here. Keyframe alignment applies to multibitrate switching which doesn’t apply when you have different video sources. Maybe you’re just switching between videos? Then enhanced seek should work for you.

Hey susta,

Your test2.mp4 actually does start on a video keyframe (I-Frame):

frame,video,1,-12470,-4.156667,-12570,-4.190000,25418,1280,720,yuv420p,1:1,I,0,0,0,0,0,3

frame,video,0,-12370,-4.123333,-12470,-4.156667,73775,1280,720,yuv420p,1:1,B,3,0,0,0,0,0

frame,video,0,-12270,-4.090000,-12370,-4.123333,73066,1280,720,yuv420p,1:1,B,2,0,0,0,0,3

But there is no keyframe at dts 0:

frame,video,0,-170,-0.056667,-270,-0.090000,342611,1280,720,yuv420p,1:1,B,124,0,0,0,0,0

frame,video,0,-70,-0.023333,-170,-0.056667,333574,1280,720,yuv420p,1:1,P,121,0,0,0,0,3

frame,video,0,30,0.010000,-70,-0.023333,350759,1280,720,yuv420p,1:1,B,127,0,0,0,0,0

frame,video,0,130,0.043333,30,0.010000,349210,1280,720,yuv420p,1:1,B,126,0,0,0,0,3

The audio starts around dts 0:

frame,video,0,1130,0.376667,1030,0.343333,362100,1280,720,yuv420p,1:1,P,133,0,0,0,0,3

frame,audio,1,-154,-0.003492,-154,-0.003492,387816,s16,1024

frame,audio,1,870,0.019728,870,0.019728,388187,s16,1024

frame,audio,1,1894,0.042948,1894,0.042948,388558,s16,1024

The next I frame is about a second later:

frame,video,0,2430,0.810000,2330,0.776667,472179,1280,720,yuv420p,1:1,P,149,0,0,0,0,0

frame,video,1,2530,0.843333,2430,0.810000,475796,1280,720,yuv420p,1:1,I,150,0,0,0,0,3

frame,audio,1,22374,0.507347,22374,0.507347,527394,s16,1024
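The same check can be scripted instead of read by eye. The sketch below parses frame lines in the CSV layout shown above and reports the earliest video timestamp, the earliest audio timestamp, and the video keyframe times. The column positions (media type in column 2, keyframe flag in column 3, presentation time in seconds in column 5) are an assumption inferred from the sample lines, and `summarize_frames` is a hypothetical helper name.

```python
import csv
from io import StringIO

# A few of the frame lines quoted in this thread.
SAMPLE = """\
frame,video,1,-12470,-4.156667,-12570,-4.190000,25418,1280,720,yuv420p,1:1,I,0,0,0,0,0,3
frame,video,0,-12370,-4.123333,-12470,-4.156667,73775,1280,720,yuv420p,1:1,B,3,0,0,0,0,0
frame,audio,1,-154,-0.003492,-154,-0.003492,387816,s16,1024
frame,video,1,2530,0.843333,2430,0.810000,475796,1280,720,yuv420p,1:1,I,150,0,0,0,0,3
"""


def summarize_frames(text):
    """Return earliest video/audio times and the video keyframe times."""
    video_times, audio_times, keyframes = [], [], []
    for row in csv.reader(StringIO(text)):
        # Skip blank lines and anything that is not a frame record.
        if len(row) < 5 or row[0] != "frame":
            continue
        media_type = row[1]
        is_keyframe = row[2] == "1"
        time_seconds = float(row[4])  # assumed: presentation time in seconds
        if media_type == "video":
            video_times.append(time_seconds)
            if is_keyframe:
                keyframes.append(time_seconds)
        elif media_type == "audio":
            audio_times.append(time_seconds)
    first_video = min(video_times) if video_times else None
    first_audio = min(audio_times) if audio_times else None
    gap = None
    if first_video is not None and first_audio is not None:
        gap = first_audio - first_video
    return {
        "first_video": first_video,
        "first_audio": first_audio,
        "video_keyframes": sorted(keyframes),
        "audio_minus_video": gap,
    }


summary = summarize_frames(SAMPLE)
print(summary)
```

A large positive `audio_minus_video` value means the file carries video well before the first audio sample, which is the situation described in this thread.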

Take a look at sample.mp4. It starts exactly at 0 and the packets are interleaved, making for nice seek/playback:

frame,audio,1,0,0.000000,0,0.000000,36,s16,1024
frame,audio,1,1024,0.021333,1024,0.021333,1113,s16,1024
frame,audio,1,2048,0.042667,2048,0.042667,1203,s16,1024
frame,audio,1,3072,0.064000,3072,0.064000,1361,s16,1024
frame,video,1,0,0.000000,0,0.000000,132,424,240,yuv420p,N/A,I,0,0,0,0,0,0
frame,audio,1,4096,0.085333,4096,0.085333,1566,s16,1024
frame,audio,1,5120,0.106667,5120,0.106667,2251,s16,1024
frame,video,0,1,0.041667,1,0.041667,1190,424,240,yuv420p,N/A,P,1,0,0,0,0,0
frame,audio,1,6144,0.128000,6144,0.128000,2583,s16,1024
frame,audio,1,7168,0.149333,7168,0.149333,2884,s16,1024
frame,video,0,2,0.083333,2,0.083333,1553,424,240,yuv420p,N/A,P,2,0,0,0,0,0

This fixes it, and interleaves the packets nicely:

ffmpeg -i test2.mp4 -ss 4.156667 -vcodec copy -acodec copy out.mp4
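If the offset has to be worked out for many files, the command above can be assembled from the line FFMPEG prints. This is a sketch only: it assumes the "Duration: ..., start: ..." line format quoted in this thread, and the helper names are hypothetical. It only builds the argument list and does not run ffmpeg. Note that elsewhere in this thread the stream-copied output was reported to have the correct duration in Wowza but still play out of sync, and only a re-encode with -ss restored sync.

```python
import re


def parse_start_offset(ffmpeg_line):
    """Extract the 'start:' value (in seconds) from an FFMPEG info line."""
    match = re.search(r"start:\s*(-?\d+(?:\.\d+)?)", ffmpeg_line)
    if match is None:
        raise ValueError("no start offset found in line")
    return float(match.group(1))


def build_trim_command(source, destination, start_offset):
    """Build the stream-copy command that skips the negative lead-in."""
    # A negative start means video exists before time 0; skip that much.
    skip = f"{abs(start_offset):.6f}" if start_offset < 0 else "0"
    return [
        "ffmpeg", "-i", source,
        "-ss", skip,
        "-vcodec", "copy",
        "-acodec", "copy",
        destination,
    ]


offset = parse_start_offset("Duration: 00:00:20.48, start: -4.156667")
command = build_trim_command("test2.mp4", "out.mp4", offset)
print(" ".join(command))
```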

Regarding seek, you can either re-encode with ffmpeg, setting a strict keyframe interval, or use enhanced seek to seek to specific timecodes using Flash. Definitely look into recording the two streams with Wowza, using the LiveStreamRecord module to set specific record times.

We do not currently support edit lists, which are used to synchronize the audio and video when they are written unsynchronized. This means such files will play back unsynchronized. We may address this in the future, but I don’t know when. I’ll pass along your feedback to our product management team.

-Lisa
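For anyone who wants to check whether a file relies on an edit list before uploading it, here is a minimal sketch that walks the MP4 box structure and prints any edit list entries it finds. It assumes the usual moov/trak/edts/elst nesting and the standard entry layout (segment duration, media time, media rate). It reads the whole file into memory, so it is meant for small test clips, and the function names are hypothetical.

```python
import struct

# Boxes that only contain other boxes on the path to an edit list.
CONTAINERS = {b"moov", b"trak", b"edts"}


def iter_boxes(data, start=0, end=None):
    """Yield (type, body_start, box_end) for each box in data[start:end]."""
    end = len(data) if end is None else end
    pos = start
    while pos + 8 <= end:
        size, kind = struct.unpack_from(">I4s", data, pos)
        header = 8
        if size == 1:  # a 64-bit size follows the type field
            (size,) = struct.unpack_from(">Q", data, pos + 8)
            header = 16
        elif size == 0:  # box runs to the end of the enclosing span
            size = end - pos
        if size < header or pos + size > end:
            break  # malformed or truncated box, stop scanning
        yield kind, pos + header, pos + size
        pos += size


def parse_elst(data, body):
    """Return a list of (segment_duration, media_time, media_rate)."""
    version = data[body]  # 1 byte version, then 3 bytes of flags
    (count,) = struct.unpack_from(">I", data, body + 4)
    entry_format = ">QqhH" if version == 1 else ">IihH"
    step = struct.calcsize(entry_format)
    entries = []
    pos = body + 8
    for _ in range(count):
        duration, media_time, rate_int, rate_frac = struct.unpack_from(
            entry_format, data, pos)
        # media_time == -1 marks an empty edit (nothing is played).
        entries.append((duration, media_time, rate_int + rate_frac / 65536))
        pos += step
    return entries


def find_edit_lists(data, start=0, end=None):
    """Collect the entries of every elst box, one list per track."""
    found = []
    for kind, body, stop in iter_boxes(data, start, end):
        if kind in CONTAINERS:
            found.extend(find_edit_lists(data, body, stop))
        elif kind == b"elst":
            found.append(parse_elst(data, body))
    return found


def edit_lists_in_file(path):
    """Convenience wrapper: read a file and return its edit lists."""
    with open(path, "rb") as handle:
        return find_edit_lists(handle.read())
```

Calling `edit_lists_in_file("test2.mp4")` on a local copy would return one list of entries per track that has an edit list. A non-zero media time in a track's first entry suggests that a player which ignores edit lists will start that track at a different point than QuickTime or VLC does.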

Hi,

We can take a closer look at this, please send a sample file so we can troubleshoot.

You can send the file to support@wowza.com if it is small enough to email; otherwise, share the file and link to it in the email.

Please reference this post in your email.

Daren

I’m having a very similar issue.

We currently use ffmpeg to create video clips from existing files, using the -vcodec copy option, which as previously stated does not necessarily start the clip on an I-frame (in fact it almost never will when using the accurate seek method described here: http://ffmpeg.org/trac/ffmpeg/wiki/Seeking%20with%20FFmpeg).

The clips play fine locally, and also when using the standard HTML5 video element.

With Flash and Wowza’s RTMP stream, however, the video starts instantly at the nearest keyframe and the audio starts at the beginning of the actual stream, so they are out of sync and the audio ends before the video.

I was just wondering if there have been any updates or added features/variables to Wowza in the last year that may be able to alter this behavior? This is the only thread I was able to find that accurately explains the issue.

We are currently running version 3.6.2.

Thanks