• Wowza Transcoder AddOn Performance Benchmark

    Wowza Transcoder AddOn offers both accelerated and unaccelerated video encoding, depending on your hardware configuration. This article presents performance benchmark numbers captured for software (default) encoding, Ivy Bridge (Quick Sync) accelerated encoding, and CUDA accelerated encoding. These numbers are for guidance only; your results may vary depending on network traffic, source file composition, configuration, overall operating system overhead, and so on.

    All tests were conducted by incrementally adding incoming video streams to transcode until the server reached approximately 65 percent CPU utilization. In a production environment, we recommend that transcoding take no more than 50-55 percent of the machine's total CPU resources. This leaves sufficient CPU resources available for streaming the transcoded streams. Before the tests were run, Wowza Media Server (version 3) was tuned using the Performance Tuning Guidelines.
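
    The 50-55 percent guideline can be applied to any measured series of CPU readings. The following is a minimal Python sketch, not part of the original benchmark tooling: the function name max_streams_within_budget and the 55 percent default are illustrative assumptions, and the sample readings are the Server 1 default-encoding Transrate figures reported later in this article.

```python
def max_streams_within_budget(cpu_by_streams, budget_percent=55):
    """Return the largest measured stream count whose CPU use fits the budget.

    cpu_by_streams maps a number of concurrent input streams to the measured
    CPU utilization (percent). Returns 0 if no measured count fits.
    """
    fitting = [n for n, cpu in cpu_by_streams.items() if cpu <= budget_percent]
    return max(fitting, default=0)


# Server 1, default (software) encoding, Transrate figures from this article.
server1_default = {1: 15, 2: 22, 3: 33, 4: 48, 5: 65}

print(max_streams_within_budget(server1_default))      # 4 (48% CPU)
print(max_streams_within_budget(server1_default, 65))  # 5 (65% CPU)
```

    The function only looks at counts that were actually measured, so it does not extrapolate beyond the benchmark data.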

    Note: For use with Wowza Media Server 3.

    Test Servers


    • Server1
      • Processor: Single Intel® Xeon® CPU E3 V2 1275 @ 3.50GHz
      • Cores/Threads: 4/8
      • Memory: 8 GB
      • Motherboard: ASUS P8C WS
      • OS: Windows 7 Home Premium - 64-bit
      • Java: Java 7 64-bit
      • GPU/Acceleration: Built-in HD4000 with Intel Quick Sync

    • Server2
      • Processor: Single Intel® Xeon® CPU E3 1275 @ 3.40GHz
      • Cores/Threads: 4/8
      • Memory: 8 GB
      • Motherboard: ASUS P8B WS R
      • OS: Windows 7 Home Premium - 64-bit
      • Java: Java 7 64-bit
      • GPU/Acceleration: CUDA test with GTX 580 1.5 GB RAM

    • Server3
      • Processor: Dual Intel Xeon CPU X5650 @ 2.66GHz
      • Cores/Threads: 12/24
      • Memory: 12 GB
      • Motherboard: SuperServer 7046GT-TRF 4U Xeon DP 4xGPU Ready
      • OS: Windows 7 Enterprise - 64-bit (Some Windows 7 SKUs are not compatible with multi CPU hardware)
      • Java: Java 7 64-bit
      • GPU/Acceleration: Single NVIDIA Tesla C2070 with 6 GB RAM

    • Server4
      • EC2 Instance: Extra Large Instance - m1.xlarge
      • Memory: 15 GB
      • 8 EC2 Compute Units: 4 virtual cores with 2 EC2 Compute Units each
      • Java: Java 7 64-bit
      • OS: Amazon Linux
      • EC2 AMI: m1.xlarge
      • GPU/Acceleration: None

    • Server5
      • EC2 Instance: High-CPU Extra Large Instance - c1.xlarge
      • Memory: 7 GB
      • 20 EC2 Compute Units: 8 virtual cores with 2.5 EC2 Compute Units each
      • Java: Java 7 64-bit
      • OS: Amazon Linux
      • EC2 AMI: c1.xlarge
      • GPU/Acceleration: None

    • Server6
      • Processor: Dual Intel Xeon CPU X5650 @ 2.66GHz
      • Cores/Threads: 12/24
      • Memory: 12 GB
      • Motherboard: SuperServer 7046GT-TRF 4U Xeon DP 4xGPU Ready
      • OS: Fedora 15 64-bit
      • Java: Java 7 64-bit
      • GPU/Acceleration: None


    Input Test Stream


    • Transrate
      • Video Codec: H.264
      • Video Profile: Main
      • Video Level: 3.1
      • Video Frame Size: 1280x720
      • Video Frame Rate: 23.98 fps
      • Video Bitrate: 1.8 Mbps

      • Audio Codec: AAC
      • Audio Sample Rate: 48 kHz
      • Audio Channels: Stereo
      • Audio Bitrate: 97 kbps

    • Transcode
      • Video Codec: MPEG-2
      • Video Frame Size: 1280x720
      • Video Frame Rate: 23.98 fps
      • Video Bitrate: 3.0 Mbps

      • Audio Codec: MPEG-1 Layer 2
      • Audio Sample Rate: 48 kHz
      • Audio Channels: Stereo
      • Audio Bitrate: 128 kbps
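
    Because results depend partly on network traffic, it can help to know the aggregate ingest bitrate for a given number of test streams. The sketch below works through that arithmetic using the video and audio bitrates listed above. The names INPUT_STREAMS_KBPS and ingest_kbps are illustrative, and container and protocol overhead are ignored, which is a simplifying assumption.

```python
# Per-stream bitrates in kbps, taken from the input test stream specs above.
INPUT_STREAMS_KBPS = {
    "transrate": {"video": 1800, "audio": 97},
    "transcode": {"video": 3000, "audio": 128},
}


def ingest_kbps(test_name, stream_count):
    """Total incoming bitrate in kbps, ignoring container and protocol overhead."""
    per_stream = sum(INPUT_STREAMS_KBPS[test_name].values())
    return per_stream * stream_count


for name in INPUT_STREAMS_KBPS:
    print(name, ingest_kbps(name, 10) / 1000, "Mbps for 10 streams")
```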



    Transrate


    CPU utilization for each number of concurrent input streams. "__" marks a cell with no reported figure.

    | Input | Output | Server 1: default | Server 1: QuickSync | Server 2: CUDA | Server 3: default | Server 3: CUDA | Server 4: default | Server 5: default | Server 6: default |
    |---|---|---|---|---|---|---|---|---|---|
    | 1 x 720p @ 1.8 Mbps | 1 x each of 720p (passthru), 360p, 240p, 160p | 15% | 8% | 11% | 4% | 3% | 52% | 18% | 8% |
    | 2 x 720p @ 1.8 Mbps | 2 x each of 720p (passthru), 360p, 240p, 160p | 22% | 14% | 15% | 8% | 5% | 72% | 38% | 16% |
    | 3 x 720p @ 1.8 Mbps | 3 x each of 720p (passthru), 360p, 240p, 160p | 33% | 17% | 21% | 12% | 8% | __ | 58% | 23% |
    | 4 x 720p @ 1.8 Mbps | 4 x each of 720p (passthru), 360p, 240p, 160p | 48% | 21% | 29% | 18% | 11% | __ | 77% | 32% |
    | 5 x 720p @ 1.8 Mbps | 5 x each of 720p (passthru), 360p, 240p, 160p | 65% | 27% | 39% | 25% | 15% | __ | __ | 41% |
    | 6 x 720p @ 1.8 Mbps | 6 x each of 720p (passthru), 360p, 240p, 160p | __ | 33% | 54% | 32% | 19% | __ | __ | 46% |
    | 7 x 720p @ 1.8 Mbps | 7 x each of 720p (passthru), 360p, 240p, 160p | __ | 44% | 61% | 38% | 23% | __ | __ | 52% |
    | 8 x 720p @ 1.8 Mbps | 8 x each of 720p (passthru), 360p, 240p, 160p | __ | 48% | 68% | 44% | 27% | __ | __ | 58% |
    | 9 x 720p @ 1.8 Mbps | 9 x each of 720p (passthru), 360p, 240p, 160p | __ | 54% | __ | 51% | 30% | __ | __ | 66% |
    | 10 x 720p @ 1.8 Mbps | 10 x each of 720p (passthru), 360p, 240p, 160p | __ | __ | __ | 65% | 34% | __ | __ | __ |
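
    One way to read the Transrate figures is to compare accelerated and software encoding on the same hardware. The sketch below does this for Server 1 (default versus Quick Sync) using the Transrate values reported above. The helper name cpu_saved is illustrative, and the comparison is limited to stream counts that were measured in both runs.

```python
# CPU utilization (percent) by input stream count, Transrate test, Server 1.
default_cpu = {1: 15, 2: 22, 3: 33, 4: 48, 5: 65}
quicksync_cpu = {1: 8, 2: 14, 3: 17, 4: 21, 5: 27, 6: 33, 7: 44, 8: 48, 9: 54}


def cpu_saved(baseline, accelerated):
    """Percentage points of CPU saved at each stream count measured in both runs."""
    shared = sorted(baseline.keys() & accelerated.keys())
    return {n: baseline[n] - accelerated[n] for n in shared}


for streams, points in cpu_saved(default_cpu, quicksync_cpu).items():
    print(f"{streams} stream(s): {points} points lower with Quick Sync")
```

    On these figures the gap widens as streams are added, from 7 points at one stream to 38 points at five.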


    Transcode


    CPU utilization for each number of concurrent input streams. "__" marks a cell with no reported figure.

    | Input | Output | Server 1: default | Server 1: QuickSync | Server 2: CUDA | Server 3: default | Server 3: CUDA | Server 4: default | Server 5: default | Server 6: default |
    |---|---|---|---|---|---|---|---|---|---|
    | 1 x 720p @ 3.0 Mbps | 1 x each of 720p, 360p, 240p, 160p | 17% | 12% | 12% | 6% | 3% | 60% | 22% | 13% |
    | 2 x 720p @ 3.0 Mbps | 2 x each of 720p, 360p, 240p, 160p | 32% | 16% | 18% | 12% | 7% | __ | 51% | 25% |
    | 3 x 720p @ 3.0 Mbps | 3 x each of 720p, 360p, 240p, 160p | 56% | 23% | 28% | 24% | 11% | __ | 73% | 35% |
    | 4 x 720p @ 3.0 Mbps | 4 x each of 720p, 360p, 240p, 160p | 79% | 29% | 39% | 31% | 15% | __ | __ | 46% |
    | 5 x 720p @ 3.0 Mbps | 5 x each of 720p, 360p, 240p, 160p | __ | 36% | 51% | 44% | 19% | __ | __ | 53% |
    | 6 x 720p @ 3.0 Mbps | 6 x each of 720p, 360p, 240p, 160p | __ | 45% | 63% | 52% | 22% | __ | __ | 63% |
    | 7 x 720p @ 3.0 Mbps | 7 x each of 720p, 360p, 240p, 160p | __ | 54% | __ | 59% | 26% | __ | __ | 69% |
    | 8 x 720p @ 3.0 Mbps | 8 x each of 720p, 360p, 240p, 160p | __ | 62% | __ | 67% | 31% | __ | __ | __ |
    | 9 x 720p @ 3.0 Mbps | 9 x each of 720p, 360p, 240p, 160p | __ | __ | __ | __ | 37% | __ | __ | __ |
    | 10 x 720p @ 3.0 Mbps | 10 x each of 720p, 360p, 240p, 160p | __ | __ | __ | __ | 44% | __ | __ | __ |
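
    The Transcode figures can also be read as the CPU cost of each additional stream. The sketch below computes the difference between consecutive readings for two of the configurations reported above. The function name marginal_cost is illustrative, and the lists assume readings for 1, 2, 3, ... streams with no gaps.

```python
def marginal_cost(cpu_readings):
    """CPU percentage points added by each extra stream.

    cpu_readings is a list of CPU percentages for 1, 2, 3, ... streams.
    """
    return [later - earlier for earlier, later in zip(cpu_readings, cpu_readings[1:])]


# Transcode test figures from this article, for 1..N concurrent streams.
server1_default = [17, 32, 56, 79]
server3_cuda = [3, 7, 11, 15, 19, 22, 26, 31, 37, 44]

print(marginal_cost(server1_default))
print(marginal_cost(server3_cuda))
```

    Where the per-stream cost rises with load, as it does at the higher stream counts on Server 3 with CUDA, a straight-line extrapolation beyond the measured counts is likely to underestimate CPU use.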




    14 Comments
    1. larc25 -
      Hi,

      is there any chance to have the CUDA testing result soon?

      Thanks.
    1. rrlanham -
      Performance results are published as soon as available

      Richard
    1. smokee -
      Could you please make some tests with 1080p input stream at 6Mbps, and it would be nice to have benchmarks for output profiles for 576p and 480p for IPTV and web.
      Can you give us encoding parameters for encodings above (360p, 240p, 160p)?
    1. withkbsi -
      Please check this.
      DELL R710: Intel Xeon dual X5660, 64 GB RAM, 300 SAS HDD (mirror), Win 2008 R2 64-bit
      In this system all video sources are 1080i.
      10 1080i video sources will be served concurrently.
      MAX: 20 users connected concurrently to Wowza over the RTSP protocol.
      The 10 video sources (1080i) will also be transcoded to a resolution of 720p or less.
      So approximately 20 video streams are served at the same time (1080i and transcoded 720p).
      That means this system handles 20 video streams and 10 transcodes.

      I have a Wowza 3 perpetual license (upgraded from Wowza 2) without a Transcoder license.
      If I install 10 Transcoder licenses, what will happen on this system?

      Thanks.
    1. rrlanham -
      You can use the numbers we post for comparison to get an idea, but for good numbers on a particular system, you have to test that system.

      Capacity, limitations, and results will have much to do with the bitrate of the 10 source streams and the number and bitrate of the output streams.

      Richard
    1. withkbsi -
      Thanks, Richard.

      For your convenience, let me describe my situation.

      The 10 1080i video streams are probably 6 Mbps each.
      These streams come from a hardware encoder to Wowza.
      From these 10 streams, Wowza transcodes 10 720p streams.
      The 10 transcoded streams then also go to Wowza.
      The 10 transcoded 720p video streams are 2 Mbps each.
      So the total is 20 streams (10 1080i, 10 720p),
      plus the 10 transcoding processes (1080i -> 720p).
    1. smokee -
      Could you please answer a couple of questions:
      • Is it possible to combine CUDA/QuickSync/default encoding on one box?
      • Are there any limitations on the number of transcoding sessions per server?
      • Could you post CUDA GPU load alongside CPU load in CUDA accelerated sessions?
      • Is there any specific reason why some CUDA tests have only 6 simultaneous sessions?
      • Is it possible to change the frame rate in the transrate/transcode process?
      • What about 1080p input, can you make some performance tests?
    1. rrlanham -
      You have to use one or the other of CUDA or QuickSync per Template /Encode /Video setting. But the next /Encode /Video in the same template can use the other, as far as what seems possible in the template.

      There are benchmarking results for Transcoder here:
      http://www.wowza.com/forums/content....ance-Benchmark

      Richard
    1. Haris1 -
      What are the respective output bitrates and other H.264 parameters for 720p, 360p, 240p, and 160p in the transrate and transcode cases?
      Could you please post the Wowza transcoder configuration files used for this benchmark test?
    1. daren_j -
      The settings used for each of those encodes were the default settings provided in the transrate.xml and transcode.xml files.

      You'll find those files in [install-dir]/transcoder/templates.

      Daren
    1. broaska -
      Richard
      Not sure where I should put this question. I have recently updated our Wowza to 3.5.2. Since this update our SD streams are very jerky. Audio seems fine, but the picture freezes a lot. We were not seeing this in the previous release. Any ideas on what might be causing this? I have tried JDK 7 u13 through u17 and am getting the same results.
    1. rrlanham -
      Did you overwrite any previous configuration changes with the patch?

      Richard
    1. broaska -
      Yes, we copied all the folder contents over and re-created our VHost.xml etc. I have just rolled it back to 3.5.0 and it's working as expected now. Not sure if our encoder and the new transcoder setup on 3.5.2 get along. Also, I just tried setting up a clean 3.5.2 with the newest Java on the same hardware and am getting the same result. I am guessing this is likely an issue between our encoder and the transcoder change in 3.5.2. For now I will roll back to 3.5.0.
    1. spiderman2013 -
      Hi,

      I noticed from the list above that the minimum memory is 7 GB for a Linux server without GPU acceleration.
      Is there a recommended minimum memory requirement for transcoding streams with multi-rate stream output?

      We are currently using a virtual Linux server with 4 GB RAM and a dual-core CPU for Wowza transcoding.
      Is this likely to cause any issues if we transcode more than one incoming H.264 stream?