My journey into the world of multi-bit rate adaptive/dynamic streaming for Flash Video on Demand (Is that a mouthful or what?):
Three months ago I opened a new internet-based media production company in Japan, and while I have had seven years of video experience, I realized I was clueless when it came to streaming. If this were Japanese martial arts, I would have been below a 1st Dan, maybe a “Level 0: YouTube uploader apprentice.” Now, while I still do not understand everything, I can hold a conversation with a streaming professional, and as a business owner, this is very important.
My starting point for this research came from a simple and very innocent question: “Where can I find the multiple bit-rate (multi bitrate, multibitrate) presets I need to process our video content?” I never imagined the several weeks of research that would ensue.
First, if you are new to dynamic streaming, let me save you some time by answering my first question:
You are not going to find the presets you are looking for, because everyone has unique variables to factor in. The presets you will find in the encoding programs and on the internet are merely a starting point. The good news is that once you master creating your very own presets, you will have something specific to your project or company, and that will help you define future success; it will be something that can set you apart from the next guy. It’s like building a custom race car or a boat. Think of it as a fun challenge.
Variables you may consider:
-The “quality” of the video you want to stream:
-type of content (slow talking heads versus fast moving action)
-Who you will stream to:
-high end HD broadband users, with content embedded into a web page
-slow internet speed users
-mobile “on the goers”
-Time, overhead, and costs to prepare your material for streaming:
-how many different multi bit-rate versions to make?
-how much will server storage cost?
-what is my audience size?
So now you know there are no magical presets, and knowing is half the battle.
But! Don’t just take my word for it.
I am posting the following information with links to the original sources. Do yourself a favor. Follow the links, read the sources, and then use my notes as a quick reference. I used these notes as a checklist when we finally moved to the encoder to build our presets. My encoder of choice is Sorenson Squeeze, but that is a topic for another day.
Hopefully this will save you tons of time, or should I say kbps of time.
Start by reading this h.264 Primer. Warning though, after reading, you will want to take a vacation.
Source: http://wwwimages.adobe.com/www.adobe.com/content/dam/Adobe/en/devnet/video/articles/h264_primer/h264_primer.pdf (November 17th, 2009)
Kush Amerasinghe did a great job putting this together, and his illustrations are beautiful.
Next check out Abhinav Kapoor’s article and see how easily you can follow:
Source: http://www.adobe.com/devnet/flashmediaserver/articles/dynstream_on_demand.html (January 12th, 2009)
Here are my notes:
-Adobe recommends using the bit rates given in Table 1 (see below) for dynamic streaming on demand. If the intended target users are towards the higher end of the bandwidth spectrum with at least a DSL connection, then the first couple of bit rates could be skipped. The frame rate for videos below a bit rate of 100 Kbps could be set to lower values such as 15 fps, but at bit rates higher than 300 Kbps, a frame rate of at least 25 fps and ideally 30 fps is recommended.
-Keep the audio bit rates and sampling rates the same.
-For lower bit rate streams, encode the channel as mono and maintain the same bit rate (that is, half the bit rate of their stereo equivalents).
-The video frame size should not exceed the maximum video frame size.
-Keep a constant video keyframe interval.
-Keep the timeline of the multiple encoded files identical.
-FMS looks for the keyframes in the new stream in chunks equal to the client’s buffer size (NetStream.bufferTime), so having a client buffer larger than the keyframe interval of the stream would help with a fast switch response time from the server.
Optimal settings:
• Keyframe interval: 5 sec.
• Client-side buffer: 6–10 sec
-Keep the server’s outgoing buffer small by keeping the ServerToClient bandwidth limit smaller. Keeping the value too low, however, would limit the server’s ability to push more data when the client’s bandwidth increases again.
-Ideally, this bandwidth limit should be set to a value slightly above the maximum bit rate of the streams being sent. For example, if a piece of content has streams of 500 Kbps, 800 Kbps, 1.2 Mbps, 1.8 Mbps, and 2.4 Mbps, then setting the ServerToClient bandwidth limit to 2.5 Mbps, or 327,680 bytes per second, would provide the maximum stability. This can either be set in Application.xml at the server or, better yet, set on the NetConnection object from the client-side application.
-If the playback CPU requirements are not very low, use of B-frames with an interval value of 2 or 3 is recommended. There is a 5–10% increase in CPU usage, and B-frames are not supported with the Baseline profile.
-Using CABAC (context-adaptive binary arithmetic coding) provides better quality, but at a cost of approximately 3–5% of CPU usage.
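To make the numbers in these notes concrete, here is a minimal Python sketch of my own (it is not from Adobe's article). The function names are hypothetical, and it assumes binary prefixes (1 Mbps = 1024 x 1024 bits per second), because that is the convention under which 2.5 Mbps works out to the 327,680 figure in the bandwidth limit note. It also checks the rule that the client buffer should be larger than the keyframe interval.

```python
def bandwidth_limit_bytes_per_second(limit_mbps):
    """Convert a bandwidth limit given in Mbps to bytes per second.

    Assumption: 1 Mbps = 1024 * 1024 bits per second (binary prefixes).
    """
    bits_per_second = limit_mbps * 1024 * 1024
    return int(bits_per_second / 8)


def buffer_covers_keyframes(client_buffer_s, keyframe_interval_s):
    """Return True when the client buffer is larger than the keyframe interval."""
    return client_buffer_s > keyframe_interval_s


# Top stream is 2.4 Mbps, so the limit sits slightly above it at 2.5 Mbps.
print(bandwidth_limit_bytes_per_second(2.5))  # 327680

# A 6 second client buffer against a 5 second keyframe interval.
print(buffer_covers_keyframes(6, 5))  # True
```

If your server or encoder counts 1 Mbps as 1,000,000 bits per second instead, swap the multiplier; the point is only to keep the units straight when you type the value into a config.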
Adobe has provided a batch of presets to be used as a starting point for Adobe Media Encoder (see chart below). I used their presets as one factor in deciding my own. You can read Jan Ozer’s article about it here:
Source: http://www.streaminglearningcenter.com/articles/adobe-provides-adaptive-streaming-presets-for-adobe-media-encoder.html (December 27th, 2010)
You can download the presets here:
Source: http://blogs.adobe.com/ktowes/2010/10/new-encoding-settings-for-multiple-screens-using-adobe-media-encoder-cs5.html (October 25th, 2010)
Don’t worry if you are lost at this point. Eventually it will all start making sense.
I read this next article three times, from top to bottom, over the course of three weeks (not counting the times I just peeked at it). It was a good feeling, realizing I was understanding more each time. This article is very technical, but it is a good way to judge what level you are on.
Source: http://download.macromedia.com/flashmediaserver/http_encoding_recommendations.pdf (October 19th, 2010)
Here are my notes:
-GOP stands for “Group of Pictures.”
-Keyframe distance fixed across all streamed segments and bit rates.
-Keyframe distance no less than 3 seconds for content over 1 minute long, and no less than 2 seconds for content under 1 minute long. A longer keyframe interval gives the compression more time to improve quality.
-Scene cut detection and keyframes at fixed intervals from start-of-content CAN be okay.
-Reference frames: 2-4, closer to 2.
-Frame rate: constant across all bit rates.
-2-pass CBR is best. 2-pass or multi-pass VBR can work too, but going more than 10% above the minimum set bit rate is not recommended.
***** My theory is this percentage only needs to be lower than the difference in bit rate between files, not sure.
-initial buffer fullness: 70%
-final buffer fullness: 100%
-Use same encoder for multi bit rate versions of content.
-Fixed frame size across all switching bit rates. (Probably not doing this.)
-Bit rate as variable component across all switching bit rates.
-Same video duration for all switching bit rates.
-Avoid scaling down from larger screen size to smaller frame size. (1280x720 scaled down to 768x432)
-Audio must maintain fixed bit rate across all encoded files.
-HE-AAC v2 is the recommended codec.
-MP4 and the moov atom (movie atom): move it to the front of the file. “Progressive download,” “fast start,” and “streaming mode” are names of settings used to move the moov atom.
-mp4creator and mp4 faststart (http://datagoround.com/lab/) can move the moov atom, but best practice is to get it in the right place during encoding.
-moov atom location is crucial if using a CDN (content delivery network).
-for HTTP Dynamic streaming, having a broken edts atom (inside the trak atom of a moov atom inside an mp4 file) can interfere with smooth and stable switching of HTTP streams. FLVcheck, MP4creator, and AtomicParsley can help locate and remove these.
-This article then lists seven variants for encoding possibilities, in both a quick-view chart and a detailed chart.
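If you want to check where the moov atom ended up after encoding, here is a minimal Python sketch. It is my own illustration, not something from the Adobe document. It assumes the standard MP4 box layout: each top-level box starts with a 4-byte big-endian size and a 4-byte type, a size of 1 means a 64-bit size follows, and a size of 0 means the box runs to the end of the file. The function names are hypothetical. It only reports the order of the atoms; it does not move anything and does not look inside the moov atom for edts.

```python
import io
import struct


def top_level_box_types(f):
    """Return the types of the top-level boxes (atoms) in file order."""
    types = []
    f.seek(0, io.SEEK_END)
    end = f.tell()
    offset = 0
    while offset + 8 <= end:
        f.seek(offset)
        size, box_type = struct.unpack(">I4s", f.read(8))
        header_len = 8
        if size == 1:
            # A 64-bit size is stored right after the type field.
            if offset + 16 > end:
                break
            size = struct.unpack(">Q", f.read(8))[0]
            header_len = 16
        elif size == 0:
            # The box extends to the end of the file.
            size = end - offset
        if size < header_len:
            break  # malformed box; stop instead of looping forever
        types.append(box_type.decode("latin-1"))
        offset += size
    return types


def is_fast_start(f):
    """True if moov comes before mdat, False if after, None if either is missing."""
    types = top_level_box_types(f)
    if "moov" not in types or "mdat" not in types:
        return None
    return types.index("moov") < types.index("mdat")


# Usage with a hypothetical file name:
# with open("my_video.mp4", "rb") as f:
#     print(is_fast_start(f))
```

The script seeks from box to box instead of reading the whole file, so it stays quick even on large video files.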
Jan Ozer also wrote a nice article on streaming in the field, but some points seemed to contradict what I read over at Adobe. Maybe it’s just “real world” versus “perfect settings world.”
Source: http://www.streamingmedia.com/Articles/ReadArticle.aspx?ArticleID=73017&PageNum=4 (December 2010/January 2011)
Here are my notes:
-Main/High profiles won’t play on pre-4G iPhones/iPads. (In my testing, they will. Strange.)
-Keyframes are NOT okay at cuts. (Adobe says keyframes ARE okay with certain combinations of settings.)
-40kbps audio is okay.
-Everyone but Apple is doing a keyframe distance of 2 seconds. (Adobe states that a minimum keyframe distance of 3 seconds should be observed for content longer than 1 minute.)
One point that caught my interest:
"Which Configurations Were Most Watched The next question relates to the streams actually watched by each site’s viewers. Goldstein at MTV had the most detailed statistics to share, including that 45% of viewers start at the 768x432 stream (at 1.7Mbps) and remain there for the duration of the stream.*
An additional 19% click into full-screen mode and view at [...]Mbps, while 8.5% adaptively drop down to [...]Mbps from the 720p stream due to bandwidth or CPU limitations. This means that more than 70% of MTV’s viewers are watching at 1.7Mbps or above. Beyond this, about 7% watch at 640x360 with the rest of the viewers evenly scattered in the bottom three stream categories."
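The percentages in that quote add up as follows. This is just my own arithmetic on the quoted numbers in a short Python sketch; the variable names are mine, and the per-stream figure at the end assumes the remaining viewers really are split evenly across the bottom three streams.

```python
share_768x432 = 45.0           # start at 768x432 and stay there
share_full_screen = 19.0       # click into full-screen mode
share_dropped_from_720p = 8.5  # adaptively drop down from the 720p stream
share_640x360 = 7.0            # "about 7%" in the quote

# Viewers watching at 1.7Mbps or above.
at_or_above_1_7_mbps = share_768x432 + share_full_screen + share_dropped_from_720p

# Whatever is left is spread over the bottom three streams.
remaining = 100.0 - at_or_above_1_7_mbps - share_640x360
per_bottom_stream = remaining / 3

print(at_or_above_1_7_mbps)         # 72.5
print(remaining)                    # 20.5
print(round(per_bottom_stream, 1))  # 6.8
```

So roughly 7% of viewers land on each of the lowest three streams, which is worth keeping in mind when deciding how many low bit rate versions are worth the encoding time and storage.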
Check out Alex Zambelli’s Smooth Streaming Multi-Bitrate calculator here. (Jan’s link seems to be broken):
Source: http://alexzambelli.com/WMV/MBRCalc.html (no date)
I believe YouTube uses progressive download. Again, Jan Ozer has done the research:
Source: http://www.onlinevideo.net/2011/05/streaming-vs-progressive-download-vs-adaptive-streaming/ (May 13th, 2011)
As a comparison to dynamic streaming, you can see YouTube’s settings on Wikipedia:
Source: http://en.wikipedia.org/wiki/YouTube (August 12th, 2011)
I also checked out JW Player’s recommendations for RTMP bit rate switching of widescreen MP4 videos.
Once you get your presets figured out, check out some other interesting information related to dynamic/adaptive Flash Video on Demand (VOD) streaming:
Source: http://www.wowza.com/demos/flash_http.html (no date)
Source: https://www.wowza.com/docs/how-to-encode-video-on-demand-content (no date)
How to play a video on demand file:
Source: https://www.wowza.com/docs/how-to-set-up-video-on-demand-streaming (October 10th, 2010)
How to control access to an HTTP stream (Cupertino, Smooth, and San Jose)
Source: http://www.wowza.com/ide.html (no date)
One last helpful hint: If securing your content is a top priority, go with RTMPe. Wowza’s (San Jose) HTTP Dynamic Streaming does not yet have the same level of security that Flash has with Flash Access. You can read my post about that here:
Source: http://www.wowza.com/forums/showthread.php?14432-San-Jose-Flash-HTTP-Security&p=73311&highlight=#post73311 (August 8th, 2011)
Information on Wowza’s MediaSecurity AddOn Package (SecureToken, RTMP & RTSP Authentication and more) is here:
Source: https://www.wowza.com/docs/media-security-overview (October 2nd, 2010)
Feel free to correct any mistakes I have made, or to give your own personal advice. Talk about your presets too if you’d like, or the factors that went into deciding on them. I built my racing boat. Build yours, and let’s race!