A quick preface: I submitted a lengthy bug report ticket about this a week before Christmas, and had a back-and-forth with a few support staff. The support staff then very unhelpfully closed the ticket due to me not responding within 72 hours during the Christmas break (Merry Christmas, guys!). So, I’m posting this here now (Happy New Year).
Ever since we updated from 4.7.5 to 4.7.7, we occasionally see extremely high CPU usage for no apparent reason. E.g. with fewer than 4 connections, CPU usage shoots up from 2% to over 50%.
I was peering inside the VM, and found one thread taking up an entire core:
As you can see above, one particular thread is suddenly doing a whole lot of work for no apparent reason.
- This was my 5th or 6th time seeing this behaviour since we updated.
- It happens at low loads (1-4ish connections), it happens at high loads (200-400 connections). I can’t find a pattern.
- It happens with the Java 1.8.77 that ships with the AMI, and also with the 1.8.191 that I upgraded to.
- There seems to be no way to fix it except restarting the WSE; something we’d rather not do when we’re live.
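For anyone trying to pin down a thread like this: assuming a Linux host where something like `top -H -p <pid>` shows per-thread CPU, the decimal thread ID it reports can be matched against the hexadecimal `nid=` field in a `jstack` dump. The little helper below is only a sketch of that conversion; the class name `ThreadIdHex` and the sample ID 12345 are made up for illustration.

```java
// Hypothetical helper, not part of Wowza or the JDK tooling.
final class ThreadIdHex {
    /**
     * Converts a decimal OS thread id (as a per-thread CPU view would show it)
     * to the hex form used in the nid= field of a jstack thread dump.
     */
    static String toNid(long osThreadId) {
        return "0x" + Long.toHexString(osThreadId);
    }

    public static void main(String[] args) {
        // 12345 is a made-up example id; pass the real one as the first argument.
        long tid = args.length > 0 ? Long.parseLong(args[0]) : 12345L;
        System.out.println("nid=" + toNid(tid));
    }
}
```

Searching the dump for that `nid=` value should show the name and stack of the thread that is burning the core.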
We’re currently using an Amazon m5.large instance (2 cores, 8 GB RAM), which is covering our needs quite nicely. We’re seeing something like 40% CPU usage during peak times.
We’re ingesting RTSP and outputting an old-fashioned RTMP stream plus a WebRTC stream for modern browsers. We’re transcoding audio from AAC to Opus for WebRTC purposes.
We’ve been testing the WebRTC implementation right from the beginning of the beta, and this is a new issue we’ve been seeing.
The symptoms are a lot like this guy’s from September: https://www.wowza.com/community/questions/48969/is-it-normal-to-have-high-cpu-usage-with-no-connec.html
I don’t have a thread dump, and I can’t reproduce the issue with any kind of consistency.
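To have something to attach next time, here is a minimal sketch of how the busiest thread and its stack could be captured with the standard `java.lang.management` API. It is an assumption-laden example, not anything Wowza provides: `HotThreadSampler` and `busiestThread` are hypothetical names, and as written it only inspects the JVM it runs in, so it would have to run inside the affected process (or be adapted to a remote JMX connection) to be useful.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sampler; names are illustrative and not part of any Wowza API.
final class HotThreadSampler {
    /** Returns the id of the thread that used the most CPU time over the interval, or -1 if none did. */
    static long busiestThread(ThreadMXBean mx, long intervalMillis) {
        Map<Long, Long> before = new HashMap<>();
        for (long id : mx.getAllThreadIds()) {
            before.put(id, mx.getThreadCpuTime(id)); // nanoseconds, or -1 if unavailable
        }
        try {
            Thread.sleep(intervalMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // keep the interrupt flag and use what we have
        }
        long bestId = -1;
        long bestDelta = 0;
        for (long id : mx.getAllThreadIds()) {
            Long start = before.get(id);
            long end = mx.getThreadCpuTime(id);
            if (start == null || start < 0 || end < 0) {
                continue; // thread is new, already dead, or CPU timing is disabled
            }
            long delta = end - start;
            if (delta > bestDelta) {
                bestDelta = delta;
                bestId = id;
            }
        }
        return bestId;
    }

    public static void main(String[] args) {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        long id = busiestThread(mx, 1000);
        ThreadInfo info = id < 0 ? null : mx.getThreadInfo(id, Integer.MAX_VALUE);
        if (info == null) {
            System.out.println("No busy thread found");
            return;
        }
        // Print the thread name, state and full stack so it can go into a bug report.
        System.out.println(info.getThreadName() + " (" + info.getThreadState() + ")");
        for (StackTraceElement frame : info.getStackTrace()) {
            System.out.println("    at " + frame);
        }
    }
}
```

Sampling over an interval rather than reading total CPU time is deliberate: a thread that has been alive for days can have a large total while being idle right now, whereas the delta shows who is spinning at the moment.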