Is there a specific reason why syncthing explicitly turns off TCP_NODELAY?

Just genuinely curious what the practical effects of turning it on would be (if any).

I think this disables Nagle's algorithm, which I think improves throughput on high-latency links.

That would be the case for SetNoDelay(true): https://pkg.go.dev/net#TCPConn.SetNoDelay
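
For reference, a minimal sketch of what flipping that flag looks like in Go; the dial target is made up and this isn't Syncthing's actual connection setup:

```go
package main

import (
	"log"
	"net"
)

func main() {
	// Illustrative address; Syncthing's real dialing code is more involved.
	conn, err := net.Dial("tcp", "example.com:22000")
	if err != nil {
		log.Fatal(err)
	}
	defer conn.Close()

	// Go enables TCP_NODELAY by default. Calling SetNoDelay(false) turns
	// it back off, which re-enables Nagle's algorithm on the socket.
	if tcpConn, ok := conn.(*net.TCPConn); ok {
		if err := tcpConn.SetNoDelay(false); err != nil {
			log.Fatal(err)
		}
	}
}
```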

I don’t know if it’s still the case, but I think at one point we used to do several small network writes (think message header, then message). There’s no interactivity requirement here so letting the OS pack this into larger packets is generally advantageous (nodelay = false).
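
Roughly this pattern, for illustration; the 4-byte length prefix is an assumption here, not the actual BEP framing:

```go
package main

import (
	"encoding/binary"
	"net"
)

// writeMessage issues two separate small writes: a length header, then
// the body. With Nagle enabled (SetNoDelay(false)) the kernel may pack
// both into a single packet; with TCP_NODELAY each write could leave
// the machine as its own undersized packet.
func writeMessage(conn net.Conn, body []byte) error {
	header := make([]byte, 4)
	binary.BigEndian.PutUint32(header, uint32(len(body)))
	if _, err := conn.Write(header); err != nil { // small write #1
		return err
	}
	_, err := conn.Write(body) // small write #2
	return err
}
```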

You’d have to ask me in 2014, and I don’t remember more than above.

Is there a specific reason why you’re concerned about this?

I was curious how syncthing deals with high latency networks, started digging in the code and found this.

Wouldn't this slow down transfers if syncthing sends lots of small messages on high-latency networks, or are our writes usually big enough not to cause any problems here?

I don’t think so, because what generally happens is that we send out a bunch of requests and process data as it comes in, trying to keep several tens of megabytes of data in flight at any given moment. I guess it’s possible that if all we need is a single 200 byte block this might get delayed a few hundred milliseconds or whatever, but I doubt this is a common scenario.
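
For anyone curious what "keeping data in flight" means mechanically, here's a rough sketch of the idea; the function names and the 32 MiB budget are invented for illustration and don't reflect Syncthing's actual request scheduling:

```go
package main

import (
	"context"

	"golang.org/x/sync/semaphore"
)

const maxInFlight = 32 << 20 // hypothetical 32 MiB budget of outstanding requests

// fetchBlocks pipelines block requests instead of waiting one round trip
// per block, so a high-latency link stays busy. requestBlock is a
// stand-in for the real protocol request/response call.
func fetchBlocks(ctx context.Context, blockSizes []int64, requestBlock func(i int)) error {
	sem := semaphore.NewWeighted(maxInFlight)
	for i, size := range blockSizes {
		if err := sem.Acquire(ctx, size); err != nil {
			return err
		}
		go func(i int, size int64) {
			defer sem.Release(size)
			requestBlock(i) // blocks until the response for block i arrives
		}(i, size)
	}
	return nil
}
```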

You could of course analyze it and come to another conclusion.

I’m going to run a test using a build with TCP_NODELAY turned on at the weekend.

This would also affect relay connections, or am I wrong?

All connections get the same settings.

Compiled 1.13.1 with nodelay and started up a node with it. No noticeable differences so far for connections over LAN or WAN. I'll test relay connections tomorrow.

Edit: nodelay seems to be a little bit faster if the node is the receiver (~142–149 Mbps vs. 148–154 Mbps).


I ran another test syncing a larger file to my phone over 4G. Regular syncthing without nodelay was able to saturate my uplink while the modified build was a little bit slower.

So @calmh's point that Syncthing is doing small network writes might still be the case. I guess this might also negatively affect QUIC transfers?

Maybe? tcpdump or Wireshark to clarify. 🙂

(digging up the old thread for context)

Interesting article: https://withinboredom.info/blog/2023/01/03/the-cargo-cult-of-tcp_nodelay-when-to-use-it/

Further, when mixed with Delayed Acknowledgements, if both ends of the connection are using Nagle’s, then things will go badly.

We have Nagle's enabled on both sides, and delayed ACKs are AFAIK the default, at least on Linux.


Previous article: Golang is evil on shitty networks – Somewhere Within Boredom

It seems like you should either turn off TCP_NODELAY on one side only, or disable Nagle's on both sides while buffering writes at the application layer.
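
The "buffer at the application layer" half would look roughly like this; a sketch assuming simple length-prefixed messages, not a proposal for Syncthing's actual code:

```go
package main

import (
	"bufio"
	"log"
	"net"
)

func main() {
	conn, err := net.Dial("tcp", "example.com:22000") // illustrative address
	if err != nil {
		log.Fatal(err)
	}
	defer conn.Close()

	// With TCP_NODELAY on, the kernel no longer coalesces small writes,
	// so the application has to do it: buffer the pieces and Flush once
	// per complete message.
	conn.(*net.TCPConn).SetNoDelay(true)
	w := bufio.NewWriterSize(conn, 64<<10) // hypothetical 64 KiB buffer

	header := []byte{0, 0, 0, 5} // made-up 4-byte length prefix
	body := []byte("hello")
	w.Write(header) // buffered, nothing on the wire yet
	w.Write(body)   // still buffered
	if err := w.Flush(); err != nil { // one write() down to the socket
		log.Fatal(err)
	}
}
```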

My old test showed no difference on low latency connections. I wonder if turning on TCP_NODELAY would improve performance on high latency connections.

If we're lazy, the current setup sounds good to me from what I'm reading. If we were to be non-lazy, we'd enable NODELAY again and make sure to buffer protocol writes optimally.

I only skimmed it, but doesn't it say it only matters if you have sub-MTU-sized chatter? That's super rare in Syncthing; it really only happens for stuff that's not speed-sensitive (pings, cluster configs, stuff like that). Actual file transfer should always produce bigger packets, in which case there's no difference, right?

Turns out that Wikipedia covers this quite nicely: https://en.wikipedia.org/wiki/Nagle%27s_algorithm
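
The gist of the pseudocode there, transcribed into Go-flavored form; this logic lives in the kernel's TCP stack, so the code is purely illustrative:

```go
package main

// nagleSend transcribes the decision logic from the Wikipedia pseudocode
// for Nagle's algorithm: full segments go out immediately, small data is
// held back only while an ACK is outstanding.
func nagleSend(data []byte, mss, windowSize int, unackedInFlight bool, send, queue func([]byte)) {
	switch {
	case windowSize >= mss && len(data) >= mss:
		send(data[:mss]) // a full segment is ready: send it now
	case unackedInFlight:
		queue(data) // small data while an ACK is outstanding: hold it back
	default:
		send(data) // nothing in flight: small data goes out right away
	}
}
```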

I'm also not sure how this works with our protocol. What happens if a BEP message spans multiple packets but only partially fills the last one? Do we then have to wait for the ACK before we can send the next message?

And if we don’t properly buffer currently, how does this affect QUIC?

All good questions which someone could look into.