Or to put it another way: Why does a small change in network latency have such a large effect on how long it takes to do things?
Latency is everywhere - it's built into the universe - and it affects everything. Just ask my friend Dr Matt Bray about how latency affects music.
But back to the network conversation. I have a rule of thumb that goes along the lines of "for every millisecond of round-trip latency, it will take one second longer to transfer a megabyte of data". That sounds outrageous, and there are cases where it isn't that bad - but there are a lot of situations where it can be that significant, or worse. Go with me here...
If we assume that the average payload of a packet is 1,000 bytes (yes, 1,500 is the normal MTU, but the usable payload is almost always a little lower - we'll come back to this), then it takes 1,000 packets to transfer a megabyte. Nice round numbers. In reality it's 1,024 bytes to a kilobyte and 1,024 kilobytes to a megabyte, and we have an extra 500 bytes (or so) of packet to play with, so the numbers seem a little off - but other overheads compensate, and everything still stacks up in a nice (human-consumable) way.
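If you want to poke at the packet arithmetic yourself, here's a minimal sketch (plain Python; the function name and the round numbers are mine):

```python
# Back-of-the-envelope packet counts for the rule of thumb.
MEGABYTE = 1_000_000  # round "human" numbers, not 1,048,576

def packets_needed(data_bytes: int, payload_bytes: int = 1_000) -> int:
    """How many packets to move data_bytes with a given payload per packet."""
    return -(-data_bytes // payload_bytes)  # ceiling division

print(packets_needed(MEGABYTE))         # 1000 packets at 1,000-byte payloads
print(packets_needed(MEGABYTE, 1_460))  # 685 packets at a typical TCP MSS
```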
Assuming we're dealing with a "chatty" protocol (I'm looking at you, CIFS - but there are many others), each packet that is sent from A to B requires a response to be sent back from B to A before the next packet is sent from A to B. Yes, there are many optimisations that can happen here, but let's look at the worst case first. The response is probably going to be a TCP ACK, but there could also be an application-layer ACK - which adds to the latency, because two round trips are then required for some (or most) packets. For the purposes of this discussion, let's assume TCP-layer only and therefore one round trip.¹
If A and B are directly connected to each other (back-to-back cables), then we will assume (again, for the sake of argument) that the latency between them is zero. Perfect, ideal circumstances. This never actually happens (because there are operating system overheads, etc.), but if it did, our megabyte of data would take zero seconds to transfer. In truth it wouldn't, because there are serialisation delays as well, but I'm going to ignore those here: they are the same regardless of the latency between A and B, and we're purely interested in that A-to-B latency.
If we now increase the latency (distance, perhaps? Operating system overhead? Routing/switching delays when forwarding packets? Firewall inspection delays? Serialisation delays? TLS negotiation overhead? File transfer protocol negotiation overhead?) so that the round-trip time is a single (just one!) millisecond, then transferring that single megabyte now takes a whole second: 1,000 packets × 1 millisecond = 1 second. And remember that 1 millisecond of round-trip time is only half a millisecond each way - which seems like not very much at all - and that assumes zero processing time for the ACK to be returned.
If we increase the round-trip latency to two milliseconds, it now takes two seconds. 20 milliseconds = 20 seconds. And by most measures, 20 milliseconds is not a lot of time.
If we are transferring 100 megabytes and the latency is 10 milliseconds (neither of which is unreasonable - in fact, both numbers are small), then it will take 1,000 seconds (just under 17 minutes). Ouch.
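Here's the same worst-case arithmetic as a sketch - a model of one packet per round trip, not a benchmark:

```python
# Worst case: one packet in flight per round trip (the "chatty" model).
def transfer_seconds(data_bytes: int, rtt_ms: float,
                     payload_bytes: int = 1_000) -> float:
    packets = -(-data_bytes // payload_bytes)  # ceiling division
    return packets * rtt_ms / 1_000            # one full RTT per packet

print(transfer_seconds(1_000_000, 1))     # 1.0 second: 1 MB at 1 ms RTT
print(transfer_seconds(1_000_000, 20))    # 20.0 seconds at 20 ms RTT
print(transfer_seconds(100_000_000, 10))  # 1000.0 seconds (~17 min): 100 MB at 10 ms
```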
In the real world these numbers may not stack up. First, (as mentioned earlier) we can put more data in each packet. Moving from 1,000 bytes in each packet to 1,500 bytes gives us a 50% jump in efficiency - excellent, and that's not even using jumbo (8,500-9,000 byte) packets. But you can't count on a larger MTU, especially over the Internet, and there are plenty of places where the MTU is lower than 1,500 bytes. If we use a TCP sliding window (providing we're using TCP), then we can have multiple packets in flight before requiring an ACK - much more efficient (see the sketch below). And if the application is moderately intelligent, we can transfer data in multiple parallel streams (assuming we have the bandwidth available). A good example of this is how the AWS tools manage parallel transfers with S3.
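To see why the sliding window matters so much, compare throughput as the amount of unacknowledged data in flight grows - a simplistic model (window always full, no loss) with illustrative numbers of my choosing:

```python
# Simplistic sliding-window model: one window's worth of data per round trip.
def throughput_mbps(window_bytes: int, rtt_ms: float) -> float:
    return (window_bytes * 8) / (rtt_ms / 1_000) / 1_000_000

print(throughput_mbps(1_000, 10))      # ~0.8 Mb/s: one 1,000-byte packet per RTT
print(throughput_mbps(64 * 1024, 10))  # ~52 Mb/s: a classic 64 KB TCP window
print(throughput_mbps(64 * 1024, 1))   # ~524 Mb/s: same window at 1 ms RTT
```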
All of those things mean that for an ideal protocol/application that can fill the pipe, my calculation is way off base. And in the real world these optimisations are definitely doable, so much better throughput is achievable.
However, also in the real world, it is very rare to see this type of perfect behaviour. You're usually doing something like transferring lots of files with NFS or CIFS or something else that isn't optimised for higher latency and requires acknowledgement of each packet. Or there could be packet loss, which leads to retransmission, which leads to more time taken. (I'm not talking about packets being corrupted - that's reasonably rare - although the outcome, retransmitted packets, is the same.)
Take a case where many small files are being transferred to a storage service (AWS S3 is once again a good example). There might be millions of files that are each only a few kilobytes in size. So: lots of data - but also lots of individual TCP/TLS/protocol sessions to be negotiated. Each file means TCP setup, TLS setup, and AWS authentication and authorisation, and only then can you start transferring the file. All of those things are affected by the latency.
You could easily spend hundreds of milliseconds and dozens of kilobytes before any "real work" is done - and that's time that is not spent transferring data. Some protocols (HTTP/2 and later, for example) keep the TCP session alive, which is very helpful. Protocols that create a new TCP session every time waste a lot of time and bandwidth, and nothing can be done about that unless you design your own protocol which sends multiple files through an already-established TCP/TLS session - or use an existing protocol that already does that.
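Here's a rough sketch of how that setup cost adds up - the round-trip counts are assumptions of mine for illustration (one RTT for the TCP handshake, two for a TLS handshake, one for an authenticated request), not measurements:

```python
# Per-file overhead when every small file gets its own TCP+TLS session.
SETUP_RTTS = 1 + 2 + 1  # TCP + TLS handshakes + authenticated request (assumed)

def total_hours(files: int, rtt_ms: float, transfer_rtts_per_file: int = 4) -> float:
    """Serial wall-clock time; 4 transfer RTTs ~= a 4 KB file in the chatty model."""
    rtts = files * (SETUP_RTTS + transfer_rtts_per_file)
    return rtts * rtt_ms / 1_000 / 3_600

# One million small files at 10 ms RTT, one at a time: setup alone is half the time.
print(total_hours(1_000_000, 10))  # ~22 hours
```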
But for those protocols that already exist and are not optimised, my off-the-cuff calculation is much closer to reality.

Summary: In ideal circumstances, where large files are being transferred using a well-designed, efficient file transfer protocol, you'll get far better results than my rule of thumb suggests. But there are cases where it's much closer to accurate than you might think - and they come up a lot more often than you'd expect.
Addendum: The common belief is that light travels at 300,000 kilometres per second and that's how to calculate latency. In fibre-optic and copper cabling, signals travel about a third slower than that. Network paths are generally a combination of both, and it's difficult (ok, impossible) to figure out what is being used on any particular non-trivial link, but in general anything more than a hundred metres is probably going to be mostly fibre, with a bunch of copper in between. For cross-country or international links, fibre will be the majority. Remember that every network device in the path (router, switch, firewall, etc.) adds latency too, and that not all of them are visible to you when you're doing things like traceroute.
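For the propagation component alone, the arithmetic is simple - a sketch assuming signals in fibre at roughly two-thirds of c (about 200,000 km/s), with distances picked for illustration:

```python
# Propagation delay only - no device, queueing, or serialisation delays.
FIBRE_KM_PER_S = 200_000  # roughly 2/3 of the speed of light in a vacuum

def rtt_ms(distance_km: float) -> float:
    """Round-trip propagation time over a fibre path of the given length."""
    return 2 * distance_km / FIBRE_KM_PER_S * 1_000

print(rtt_ms(100))     # 1.0 ms round trip over 100 km
print(rtt_ms(4_000))   # 40.0 ms over a long domestic path
print(rtt_ms(15_000))  # 150.0 ms over an intercontinental path
```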
Plus, network links aren't always in a straight line. They may meander around significant geographical features; or traverse oceans in ways that you don't know about; or a network outage may be sending packets the long way around the globe (permanently or temporarily). All of these things can easily affect latency - and the latency may not be static; it could change at any time. If you're using a satellite service like Starlink or Amazon Leo (formerly Kuiper), your latency will vary from minute to minute as the path your packets take across the satellite mesh changes - because the mesh itself is moving from your perspective.
If you need a tool to estimate what a change in latency will do for you, check out this one built by my friend Byron Pogson.
¹ If the application does do its own ACK, then the dire predictions start to look quite realistic, because the application's ACK forces another (maybe more than one) TCP round trip. ↩