Dan Siemon & Scot Loach (Co-founders) from Preseem had the opportunity to go live on ISP radio’s latest show with hosts Steven Grabiel and Dennis Burgess on 4th April, 2018. This post is a written summary of the segment where Scot discusses Netflix and how it actually works (hint: it’s not a stream) for the benefit of network owners and managers.
Why talk about Netflix and why is it important for network managers to understand how Netflix works?
Netflix is by far the most popular application on the internet today. About a third of the traffic on a typical Internet Service Provider (ISP) network is Netflix: on average, 1 out of every 3 bytes flowing through an ISP’s network. As a networking solutions provider, we have rare expertise in understanding Netflix given our product history with NightShift, a unique add-on service that allows satellite internet users to preload Netflix content using off-peak data and watch it in full HD without buffering. Unless network managers and operators understand how Netflix traffic behaves on a typical ISP network, it’ll be hard for them to manage that network, given the immense pressure Netflix can put on bandwidth as well as its importance from a subscriber point of view. You can’t optimize something you don’t understand. So let’s get started.
What happens when Netflix plays a video?
There are two parts to this: what the app does when it plays, and what it does on the network.
When it plays a video, the app starts from an understanding that it’s at second X of show Y. It then looks at its internal buffer, where it stores video and audio segments, takes the next segment out, decodes it and puts it on the screen. What fills that buffer is the interesting part, and the part focused on below.
On the network, when someone hits play inside the streaming app, the app authenticates with Netflix’s servers and then requests the manifest file for the video. The manifest file is basically a big list of files, encoded at different bit rates and resolutions, that the device being used can play. It has three HTTPS links for each file, and those point to three different locations on the internet, that is, three different CDN (Content Delivery Network) nodes. Those CDN nodes are selected by the app based on factors like BGP distance, throughput and latency. The app will establish several TCP connections to different nodes and then retrieve the file chunks using regular HTTP over TLS. Here’s an example of what one of these requests looks like.
Connections are encrypted on the network, but each request is basically a big obfuscated string asking for part of a file. The app doesn’t download one big complete file; it makes chunk-by-chunk requests. Basically, it tries to keep its internal buffer full. If there’s room in the buffer, the app will download the next chunk. If there’s no room left, it won’t.
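The keep-the-buffer-full behavior described above can be sketched as a small simulation. This is only an illustration, not Netflix’s implementation: the buffer capacity, the chunk duration, the `PlaybackBuffer` class and the `simulate` function are all hypothetical, and each loop tick stands for one chunk’s worth of playback time.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class PlaybackBuffer:
    """Toy player buffer, measured in seconds of video (all values hypothetical)."""
    capacity_s: float   # how many seconds of video the buffer can hold
    chunk_s: float      # duration of one chunk, in seconds
    chunks: deque = field(default_factory=deque)

    def buffered_s(self) -> float:
        return len(self.chunks) * self.chunk_s

    def has_room(self) -> bool:
        # Only fetch if a whole extra chunk still fits.
        return self.buffered_s() + self.chunk_s <= self.capacity_s


def simulate(buffer, total_chunks, chunks_per_tick, ticks):
    """Each tick: play one chunk (if any), then fetch chunks while there is room.

    chunks_per_tick is how many chunks the network can deliver in the time
    it takes to play one chunk. Returns the number of chunks fetched per tick.
    """
    next_chunk = 0
    fetched_per_tick = []
    for _ in range(ticks):
        if buffer.chunks:
            buffer.chunks.popleft()           # playback consumes the oldest chunk
        fetched = 0
        while (fetched < chunks_per_tick
               and next_chunk < total_chunks
               and buffer.has_room()):
            buffer.chunks.append(next_chunk)  # stands in for one chunk request
            next_chunk += 1
            fetched += 1
        fetched_per_tick.append(fetched)
    return fetched_per_tick


if __name__ == "__main__":
    buf = PlaybackBuffer(capacity_s=20, chunk_s=4)
    print(simulate(buf, total_chunks=100, chunks_per_tick=3, ticks=6))
```

With a 20-second buffer, 4-second chunks and a network that can deliver three chunks per tick, the fetch counts come out as [3, 3, 1, 1, 1, 1]: a burst while the buffer fills, then one chunk fetched for each chunk played.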
The picture above shows some of Netflix’s internal diagnostics. If you press Ctrl+Alt+Shift+D in any browser, it’ll bring up a similar diagnostic screen and you can see what’s going on. In this example, you can see that the app is using a large buffer: about 125 Mb of video is buffered, which is about 5 minutes ahead of playback. If something happens to the network, the app has plenty of content buffered to keep playing before it has to fetch additional files. The screen also shows throughput at the bottom, which is interesting. This is Netflix’s calculation of the network throughput, based on the speed it gets when it downloads chunks from the network. If the app is unable to keep the buffer full, it’ll switch to a lower-resolution video and start a new buffer, and sometime after that, the bitrate it’s actually playing will switch down to that lower resolution and quality. Similarly, if the throughput is high enough to play a higher resolution than the current one, the app will start a new buffer, begin fetching chunks from the higher-resolution file and, at some later point in playback, switch to the higher resolution.
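To make the throughput calculation and the up-shift/down-shift decision concrete, here is a generic throughput-based sketch. It is not Netflix’s algorithm, which is proprietary; the bitrate ladder, the `safety` margin and both function names are assumptions made up for this example.

```python
def estimate_throughput_mbps(chunk_bytes, download_seconds):
    """Throughput observed while downloading one chunk, in megabits per second."""
    return chunk_bytes * 8 / download_seconds / 1_000_000


def pick_bitrate(ladder_mbps, throughput_mbps, safety=0.8):
    """Pick the highest encoding that fits under a fraction of measured throughput.

    ladder_mbps: available encodings in Mbps (hypothetical values).
    safety: hypothetical headroom factor so the buffer can still refill.
    Falls back to the lowest encoding when nothing fits.
    """
    affordable = [b for b in sorted(ladder_mbps) if b <= throughput_mbps * safety]
    return affordable[-1] if affordable else min(ladder_mbps)


if __name__ == "__main__":
    ladder = [0.5, 1.0, 3.0, 5.0]
    measured = estimate_throughput_mbps(chunk_bytes=2_500_000, download_seconds=4)
    print(measured, pick_bitrate(ladder, measured))
```

In this example a 2.5 MB chunk that took 4 seconds to arrive gives an estimate of 5 Mbps, and the sketch settles on the 3 Mbps encoding because 5 Mbps would leave no headroom to refill the buffer.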
The details of this algorithm are proprietary to Netflix; it’s their secret sauce, if you want to call it that. What it’s trying to do is maximize the quality of video the user sees and keep it as close as possible to the available capacity of the network, so that the user gets the most high-def experience they can have. It’s also trying to minimize the number of up-shifts and down-shifts in resolution during playback, as these can hurt the user’s overall experience. The basic intention behind the algorithm remains to keep the buffer full and avoid the much-hated “Netflix buffering” messages, which users detest and which can affect their preference for Netflix.
Important Takeaway for WISPs
– A shaper configured with bursting can actually cause Netflix to misestimate the throughput when the video starts. If a WISP offers, say, a 5 Mb plan and lets subscribers burst to 6 Mb for a little while, Netflix will detect that higher limit and start downloading above the actual 5 Mb plan speed. This will result in extra shifting up and down of resolution during playback that otherwise wouldn’t have happened. Playback can typically go from lower quality to higher and back to lower. Since the subscriber can notice this, it might end up hurting their QoE and their overall experience with the ISP.
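The effect of the burst can be sketched with a few lines of arithmetic. This is a simplified model under stated assumptions: the shaper is reduced to a fixed burst window (real shapers are usually more involved), the 30-second window and the encoding ladder are invented for illustration, and the player is assumed to pick the highest encoding at or below the rate it currently sees.

```python
def shaper_rate_mbps(t_s, plan_mbps=5.0, burst_mbps=6.0, burst_window_s=30.0):
    """Simplified shaper: burst rate for the first burst_window_s seconds, then plan rate.

    The 5 and 6 Mbps figures come from the example in the text; the window
    length is a hypothetical value.
    """
    return burst_mbps if t_s < burst_window_s else plan_mbps


def highest_fitting(ladder_mbps, measured_mbps):
    """Highest encoding at or below the measured rate (lowest one if none fits)."""
    fits = [b for b in sorted(ladder_mbps) if b <= measured_mbps]
    return fits[-1] if fits else min(ladder_mbps)


if __name__ == "__main__":
    ladder = [3.0, 5.0, 5.8]   # hypothetical encodings, in Mbps
    for t in (0, 10, 40, 60):
        rate = shaper_rate_mbps(t)
        print(t, rate, highest_fitting(ladder, rate))
```

During the burst window the player sees 6 Mbps and picks the 5.8 Mbps encoding, which the 5 Mbps plan cannot sustain, so it has to shift back down to 5.0 once the burst ends.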
Netflix isn’t a stream actually
It’s important to understand that Netflix isn’t a real-time stream. Referring to the following Wireshark capture: if Netflix were a typical video stream of, say, 5 Mb, it would ideally be a flattish, stable line along the 5 Mb mark (like a Skype stream). That’s not how Netflix works. It downloads chunks of files and tries to keep the buffer full. You can see in this image that at the beginning of the video, there’s high usage as the app tries to fill the buffer. You can see it using 3 different flows (color coded) to maximize throughput. It goes idle for a little while as the video plays. When the buffer gets some space, it shows spiky behavior, using the TCP streams (mostly at the same time) to fill the buffer. This example is based on one device on one network. It’s important to understand that the behavior may vary with different devices and network conditions, and over time as Netflix does more research and changes its algorithm.
Netflix will top out the available capacity
This is an experiment we did. We started a shaper at 3 Mb and then started a Netflix show. After a few minutes, we increased the shaper rate to 5 Mb. In this case, Netflix didn’t choose the higher encoding: it didn’t make the up-shift decision, so it kept playing at a bit rate under 3 Mb. We were trying to see what would happen when it was filling the buffer with video encoded well below the available capacity of the network.
You can see that it still uses all the available link capacity when it’s fetching chunks of video, but not consistently. The buffer fills in the beginning, then the video plays for a little while, the buffer gets some room, usage goes up to 5 Mb and then drops again. The point is that while it’s filling the buffer for tens of seconds at a time, it’s not leaving room for interactive traffic like VoIP or gaming to get through.
This confirms the known QoE insight that, while Netflix is filling its buffer, interactive packets from a VoIP call or a game don’t get through promptly; they wait behind Netflix traffic in the pipe.
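The on/off pattern in this experiment follows from simple arithmetic, sketched below. The 2.5 Mbps video bitrate and the 20 seconds of video fetched per burst are assumed figures for illustration (the experiment only says the bitrate was under 3 Mb), and both function names are hypothetical.

```python
def duty_cycle(video_mbps, link_mbps):
    """Fraction of time the link is saturated once the buffer is full.

    The player must download video_mbps on average but fetches at link_mbps,
    so it is busy for video_mbps / link_mbps of the time.
    """
    return min(1.0, video_mbps / link_mbps)


def burst_seconds(video_seconds, video_mbps, link_mbps):
    """How long the link stays saturated while fetching video_seconds of video."""
    return video_seconds * video_mbps / link_mbps


if __name__ == "__main__":
    # Assumed: 2.5 Mbps encoding on the 5 Mbps shaper from the experiment.
    print(duty_cycle(2.5, 5.0))         # share of time the link is full
    print(burst_seconds(20, 2.5, 5.0))  # seconds of saturation per 20 s of video
```

Under these assumptions the link is completely full half of the time, in bursts of about 10 seconds, even though the average rate is only half of the plan.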
Conclusion – key learnings for network operators
- Netflix uses multiple TCP connections and TLS, so it’s not possible to limit the number of devices or streaming sessions, even with DPI-based platforms.
- Netflix videos are variable-bitrate encoded (and the bitrate depends on the genre of the movie, amongst other things), so it’s not possible to limit resolution (to standard def, for example) with network policy.
- Netflix downloads in short bursts at full link rate, which can negatively impact other traffic like gaming packets or VoIP. A strategy to fix QoE problems associated with Netflix behavior is to leverage modern queuing technologies such as FQ-CoDel.
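To show why per-flow queuing helps interactive traffic, here is a toy comparison of a single FIFO queue against a plain per-flow round robin. This is not FQ-CoDel itself: it only imitates the flow-queuing idea and leaves out the CoDel delay management and the handling of new flows. The packet sizes, queue depth and function names are assumptions for the example.

```python
from collections import OrderedDict, deque


def fifo_order(packets):
    """Single shared queue: packets leave in arrival order."""
    return list(packets)


def round_robin_order(packets):
    """One queue per flow, served one packet per flow in turn."""
    flows = OrderedDict()
    for flow, size in packets:
        flows.setdefault(flow, deque()).append((flow, size))
    order = []
    while flows:
        for flow in list(flows):          # copy keys so we can delete while looping
            order.append(flows[flow].popleft())
            if not flows[flow]:
                del flows[flow]
    return order


def wait_ms(order, target_flow, link_mbps):
    """Milliseconds before the first packet of target_flow starts transmitting."""
    bits_ahead = 0
    for flow, size_bytes in order:
        if flow == target_flow:
            return bits_ahead / (link_mbps * 1000)   # Mbps * 1000 = bits per ms
        bits_ahead += size_bytes * 8
    raise ValueError("flow not in queue")


if __name__ == "__main__":
    # Hypothetical queue: 50 full-size video packets, then one small VoIP packet.
    queue = [("netflix", 1500)] * 50 + [("voip", 200)]
    print(wait_ms(fifo_order(queue), "voip", 5.0))
    print(wait_ms(round_robin_order(queue), "voip", 5.0))
```

On a 5 Mbps link the VoIP packet waits 120 ms behind the video burst in the FIFO case, but only 2.4 ms (one video packet) when each flow gets its own queue.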
In the end, it’s important to realize that the more capacity Netflix can use, the better resolution/picture quality it’ll offer to the subscriber. From an operator or business point-of-view, while you want to provide a network that offers a great Netflix experience, you do have network constraints in terms of available capacity and subscriber plans. However, artificially limiting the capacity for Netflix and other applications isn’t the right way to go from a QoE perspective. Latest queuing techniques like FQ-CoDel allow ISPs to offer a good experience to their subscribers within their enforced plan while ensuring that interactive applications aren’t negatively impacted by high-bandwidth applications like Netflix or device updates. Happy Netflix binging!
Preseem is a QoE monitoring and optimization platform for WISPs.