Connections Over UDP

We’ve now merged something pretty cool: @AudriusButkevicius’ work on integrating the KCP connection type into Syncthing! This gives us connections over UDP for the first time, and I’d like to share what you can expect from it.

NAT Traversal

This is the one and only reason for doing this. With UDP and STUN we can establish direct connections between devices that are both behind NAT gateways, without any incoming ports open. This means more connections will be direct instead of relayed, and they will be established faster.
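To illustrate the principle (this is just a sketch, not Syncthing’s actual code), here is roughly what UDP hole punching looks like in Go. Each side learns its public ip:port via STUN and exchanges it with the peer out of band (Syncthing has discovery for that part); then both sides keep sending datagrams from one and the same local socket towards the other side’s public endpoint until traffic starts flowing:

    // Hole punching sketch: run on both peers, giving your local port and the
    // other side's public ip:port (learned via STUN, exchanged out of band).
    package main

    import (
        "fmt"
        "net"
        "os"
        "time"
    )

    func main() {
        localPort, peerAddr := os.Args[1], os.Args[2] // e.g. "22020" and "198.51.100.7:22020"

        laddr, err := net.ResolveUDPAddr("udp", ":"+localPort)
        if err != nil {
            panic(err)
        }
        raddr, err := net.ResolveUDPAddr("udp", peerAddr)
        if err != nil {
            panic(err)
        }

        // One socket for both sending and receiving, so the NAT mapping our
        // outgoing packets create is the same one the peer's packets hit.
        conn, err := net.ListenUDP("udp", laddr)
        if err != nil {
            panic(err)
        }
        defer conn.Close()

        // Keep poking the peer's public endpoint. The first packets are likely
        // dropped by the remote NAT, but they open a mapping in our own NAT so
        // the peer's packets can get through, and vice versa.
        go func() {
            for range time.Tick(time.Second) {
                conn.WriteToUDP([]byte("punch"), raddr)
            }
        }()

        buf := make([]byte, 1500)
        for {
            n, from, err := conn.ReadFromUDP(buf)
            if err != nil {
                panic(err)
            }
            fmt.Printf("got %q from %v\n", buf[:n], from)
        }
    }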

Performance

There is a recurring theme that connections over UDP will give better performance than TCP. This is not the case - not in general, and very much not in our case. I know this will come up, so I want to take the time to walk through why it is not the case for us.

Certainly there are cases where a protocol over UDP will outperform TCP. It will do so in specific environments unsuited for TCP, and when the UDP-encapsulating protocol is optimized for exactly that environment. For example in very high performance supercomputer clusters, and in situations with very high latency and packet loss. Those are two vastly different environments though, and the protocols suited for one will perform very badly or not at all in the other.

However, TCP is typically faster for most things in between. There are a number of reasons:

  • Lack of a magic bullet. UDP is “faster” than TCP because it does not do ACKs or resends and will stream packets at whatever pace you decide instead of using windowing and congestion control. However, we need the resends, and the windowing, and the congestion control, and so on to make sure data arrives safely and to decide the pace at which we can send packets without dropping them immediately. This means that essentially everything TCP does must be implemented again, but now on top of UDP.

  • Optimization. TCP has been studied, tweaked and optimized by literally hundreds of people over the last 35 years. The on-top-of-UDP protocol we use has had maybe a person-year or two of engineering put into it, best case.

  • Optimization, again. TCP is implemented in the kernel and, in part, in the network card. When using TCP we can make one call to the kernel to hand it a large buffer of data and say “segment and package this as you see fit”. When using UDP, because we have to do that segmenting and packaging ourselves, we hand each packet to the kernel individually. Each call to the kernel is expensive and the overhead becomes very significant, especially in Go (see the sketch after this list). The kernel’s TCP stack is also highly optimized for performance, while the kernel’s UDP stack typically is not.
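To make the last point concrete, here is a sketch (again, not Syncthing code) of the difference: sending a large buffer over TCP is a single Write that the kernel segments for us, while over UDP we have to cut the data into roughly MTU-sized pieces ourselves and make one system call per datagram:

    package sketch

    import "net"

    const mtuPayload = 1350 // conservative per-datagram payload size

    // One call; the kernel (and often the NIC) handles segmentation.
    func sendTCP(c *net.TCPConn, data []byte) error {
        _, err := c.Write(data)
        return err
    }

    // One WriteToUDP, i.e. one syscall, per datagram, plus whatever userspace
    // bookkeeping the reliability layer on top of UDP needs for each of them.
    func sendUDP(c *net.UDPConn, raddr *net.UDPAddr, data []byte) error {
        for len(data) > 0 {
            n := mtuPayload
            if len(data) < n {
                n = len(data)
            }
            if _, err := c.WriteToUDP(data[:n], raddr); err != nil {
                return err
            }
            data = data[n:]
        }
        return nil
    }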

All in all, you should expect KCP connections to have lower performance than TCP. Syncthing still defaults to TCP when it can be established, falling back to KCP when that is the only option. A KCP connection will however be significantly better than using a relay, and being able to transparently connect through NAT devices significantly increases Syncthing’s “fire and forget” ability.

Usage

The KCP code will be in v0.14.25-rc.1, but disabled by default. It’s immature and has several parameters that will need tuning before it’s ready for general consumption. Don’t use this in production. If you want to participate in the testing, you are however very welcome to do so. You will need to feel comfortable editing advanced settings:

  • defaultKCPEnabled: Set this to enable KCP support.
  • kcpCongestionControl: Probably don’t disable this.
  • kcpFastResend, kcpNoDelay, kcpReceiveWindowSize, kcpSendWindowSize, kcpUpdateIntervalMs: These may require tweaking for performance; see the sketch below for roughly what they correspond to in the underlying library.
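For orientation, these options roughly correspond to knobs on the kcp-go library that the KCP support is built on. A minimal sketch of dialing a KCP session directly with that library (this is not how Syncthing wires it up internally, and the values are illustrative rather than our defaults):

    package main

    import (
        "log"

        kcp "github.com/xtaci/kcp-go"
    )

    func main() {
        // No FEC and no built-in encryption (nil block cipher, 0/0 shards);
        // Syncthing runs TLS on top of the connection anyway.
        sess, err := kcp.DialWithOptions("198.51.100.7:22020", nil, 0, 0)
        if err != nil {
            log.Fatal(err)
        }
        defer sess.Close()

        // nodelay, update interval in ms, fast resend threshold, and a flag
        // that disables congestion control when set to 1. These are the knobs
        // behind kcpNoDelay, kcpUpdateIntervalMs, kcpFastResend and (inverted)
        // kcpCongestionControl.
        sess.SetNoDelay(0, 25, 2, 0)

        // Send and receive window sizes in packets, i.e. kcpSendWindowSize
        // and kcpReceiveWindowSize.
        sess.SetWindowSize(128, 128)

        if _, err := sess.Write([]byte("hello over KCP")); err != nil {
            log.Fatal(err)
        }
    }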

You may also need to force a KCP connection by setting the device address statically to kcp://<ip>:22020 or something similar - Syncthing will by default prefer TCP if it’s able to establish a TCP connection, for all the reasons mentioned above.
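For example, the device’s entry in config.xml (the same address can be set in the device editor in the GUI) would look roughly like this, with a placeholder device ID and address:

    <device id="DEVICE-ID-HERE" name="other-box">
        <address>kcp://198.51.100.7:22020</address>
    </device>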

We will phase in KCP in a future release candidate, and then in a stable release, once it has proven itself and has been properly tuned for Syncthing.


We should probably trial smux as the multiplexer again, as yamux had a 27 Mbps cap. I’ll send a PR sometime for you to run benchmarks, hopefully before the RC.


There is weird stuff going on with the bit rates at the moment anyway. I updated the kcp library yesterday as you recommended, and it reduced the rate significantly. Or at least, it reduced the “ramp up” rate - a transfer for me starts very slowly, then ramps up to hundreds of KB/s and then MB/s over a minute or two. Lowering the update interval seemed to get some of that back, but it’s still way slower than it was before. I probably need to make some graphs, or the “sum total” benchmark will be completely misleading.

But really, I expect to iterate on this with it off by default over at least several RCs, so there’s no rush to patch everything up now.

That would be awesome, and then we could bisect kcp.

This is the pull request. It was not linked above, so I’m adding it here:


Wat. I was thinking more like the usual perf output fed into R or gnuplot. :expressionless:

That would do too I guess :blush:

And yet, while completely off topic here, it would be amazing for a non-GUI management panel… :wink:


@calmh now that QUIC is becoming a thing, would that be something to look into instead of kcp? Considering it’ll be standardised and all.

Sure

Also, the Go QUIC library now supports what we need, which is dialing and listening on the same underlying connection, I think.
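For the record, with today’s quic-go API (github.com/quic-go/quic-go, which has changed quite a bit since this discussion) that looks roughly like the sketch below: one UDP socket backs both the listener and outgoing dials. The certificate and ALPN name here are throwaway values just to keep the sketch self-contained:

    package main

    import (
        "context"
        "crypto/ecdsa"
        "crypto/elliptic"
        "crypto/rand"
        "crypto/tls"
        "crypto/x509"
        "log"
        "math/big"
        "net"
        "time"

        "github.com/quic-go/quic-go"
    )

    // selfSignedTLS builds a throwaway certificate so the sketch runs on its own.
    func selfSignedTLS() *tls.Config {
        key, _ := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
        tmpl := &x509.Certificate{
            SerialNumber: big.NewInt(1),
            NotBefore:    time.Now(),
            NotAfter:     time.Now().Add(24 * time.Hour),
        }
        der, _ := x509.CreateCertificate(rand.Reader, tmpl, tmpl, &key.PublicKey, key)
        return &tls.Config{
            Certificates:       []tls.Certificate{{Certificate: [][]byte{der}, PrivateKey: key}},
            InsecureSkipVerify: true,                    // sketch only
            NextProtos:         []string{"quic-sketch"}, // made-up ALPN name
        }
    }

    func main() {
        udpConn, err := net.ListenUDP("udp", &net.UDPAddr{Port: 22020})
        if err != nil {
            log.Fatal(err)
        }

        // One UDP socket backs both the listener and any outgoing dials.
        tr := &quic.Transport{Conn: udpConn}
        tlsConf := selfSignedTLS()

        ln, err := tr.Listen(tlsConf, nil)
        if err != nil {
            log.Fatal(err)
        }
        go func() {
            for {
                conn, err := ln.Accept(context.Background())
                if err != nil {
                    return
                }
                log.Println("accepted", conn.RemoteAddr())
            }
        }()

        // Dialing out reuses the same socket, so the NAT mapping opened by our
        // outgoing traffic is also where incoming connections arrive.
        raddr, _ := net.ResolveUDPAddr("udp", "198.51.100.7:22020")
        if _, err := tr.Dial(context.Background(), raddr, tlsConf, nil); err != nil {
            log.Println("dial:", err)
        }
    }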