Syncthing seemingly using jumbo frames

I'm using Syncthing via SyncTrayzor on two Windows 10 machines separated by a long distance (>250 ms RTT). It was working well for pushing data from Secondary to Primary over a Wireguard VPN tunnel: path MTU discovery worked and speeds were decent (several MB/s).

Now, trying to sync some other data in the opposite direction, from Primary to Secondary, something odd is happening. The client on the Primary side is frequently sending very large packets above the MTU limit; these obviously get fragmented, and with the high RTT they are often reordered or dropped. Speeds are at a crawl, typically in the 30-40 kB/s range.

It's unclear why the TCP packets being sent are so large. Wireshark frequently shows huge packets, even 10-14 kB TCP packets.

I tried bypassing the Wireguard VPN tunnel and going with a straight port forward, but the outcome is essentially the same.

The Primary side's Windows 10 machine has two network interfaces, one of which is for a special secondary LAN used for file sharing. That network was using jumbo frames, but it could not be used by Syncthing for any data transfer. I have removed the jumbo frame configuration from that network and all interfaces, with no change in Syncthing's behaviour.

What could be causing Syncthing to send such large packets, completely ruining my throughput?

I suspect you are mixing up TCP segments, IP packets, and actual on-the-wire Ethernet frames. In any case, this is outside of Syncthing's control – we just open a normal TCP connection.
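Roughly speaking, this is all the application layer does. A minimal sketch in Go (simplified and hypothetical, not Syncthing's actual code, with a made-up peer address): the program hands the kernel a byte stream, and the OS and NIC decide how that stream is cut into segments and frames on the wire.

```go
package main

import (
	"log"
	"net"
)

func main() {
	// Hypothetical peer address, for illustration only.
	conn, err := net.Dial("tcp", "peer.example.com:22000")
	if err != nil {
		log.Fatal(err)
	}
	defer conn.Close()

	// The application writes whatever buffer size is convenient
	// (128 KiB here). The kernel's TCP stack, possibly helped by the
	// NIC's segmentation offload, decides how this byte stream is
	// split into on-the-wire segments; the application never sees or
	// controls the packet sizes.
	buf := make([]byte, 128*1024)
	if _, err := conn.Write(buf); err != nil {
		log.Fatal(err)
	}
}
```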

A Wireshark trace on the receiving side shows frequent reassembly and some “TCP out-of-order” packets.

From a trace taken on the sending side, sorted by packet length:

Looks like the sending-side packets are at multiples of 1460 bytes, plus 54 bytes of overhead, when the MTU is 1500.
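As a sanity check on those numbers (my own arithmetic, nothing from the trace beyond the lengths above): with an MTU of 1500, the TCP payload per segment is 1500 minus a 20-byte IP header and a 20-byte TCP header, i.e. 1460 bytes, and the 54-byte overhead is the 14-byte Ethernet header plus those same two headers. A quick Go snippet with the breakdown:

```go
package main

import "fmt"

func main() {
	const (
		mtu    = 1500 // interface MTU
		ipHdr  = 20   // IPv4 header, no options
		tcpHdr = 20   // TCP header, no options
		ethHdr = 14   // Ethernet header
	)

	mss := mtu - ipHdr - tcpHdr // maximum TCP payload per segment
	overhead := ethHdr + ipHdr + tcpHdr

	fmt.Println("MSS:", mss)           // 1460
	fmt.Println("overhead:", overhead) // 54

	// A 10-14 kB "packet" in the capture is then a whole number of
	// such payloads plus one set of headers, e.g. 7 segments' worth:
	fmt.Println("7*MSS + overhead:", 7*mss+overhead) // 10274
}
```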

Yeah, so I expect those are TCP segments.

The packets are large as captured in Windows, and they are not fragmented yet. They will need to be fragmented somewhere (likely before Windows transmits them to the router, which has an MTU of 1500 on its local interface), which is what I want to avoid. You're saying, then, that this is purely a Windows 10 TCP issue?

Reducing the MSS only changes the multiple at which the packets are created, e.g. an MSS of 1400 bytes creates packets at multiples of 1360 bytes plus overhead. The result is still shitty performance.

I don’t know, especially not on Windows, but we don’t do anything in particular to affect this in one direction or the other.

Why do you want to avoid that? Making many 1500-byte syscalls is much more expensive than letting the kernel fragment a large 20 kB packet.

To be honest, I am not even convinced this happens in the kernel; I suspect the kernel just fills the NIC buffers and lets it do its thing.

I don’t think you’ll find an application that ever deliberately passes sub-1500 bytes to the kernel.
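To make the syscall-cost point concrete, here is a rough sketch of the two write patterns being compared (hypothetical buffer sizes, not what Syncthing actually does). Both result in roughly the same MSS-sized segments on the wire; the second just pays the per-syscall overhead about 45 times more often.

```go
package main

import (
	"log"
	"net"
)

// sendOneLargeWrite hands the kernel a single large buffer and lets the
// TCP stack (and any segmentation offload) cut it into segments.
func sendOneLargeWrite(conn net.Conn, data []byte) error {
	_, err := conn.Write(data)
	return err
}

// sendInMTUSizedWrites makes one syscall per ~1448-byte chunk. The wire
// traffic ends up much the same, but the syscall overhead is paid once
// per chunk instead of once per buffer.
func sendInMTUSizedWrites(conn net.Conn, data []byte) error {
	const chunk = 1448
	for off := 0; off < len(data); off += chunk {
		end := off + chunk
		if end > len(data) {
			end = len(data)
		}
		if _, err := conn.Write(data[off:end]); err != nil {
			return err
		}
	}
	return nil
}

func main() {
	// Hypothetical peer address, for illustration only.
	conn, err := net.Dial("tcp", "peer.example.com:22000")
	if err != nil {
		log.Fatal(err)
	}
	defer conn.Close()

	data := make([]byte, 64*1024)
	if err := sendOneLargeWrite(conn, data); err != nil {
		log.Fatal(err)
	}
	if err := sendInMTUSizedWrites(conn, data); err != nil {
		log.Fatal(err)
	}
}
```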


Perhaps this “inexpensive” way, which results in packet fragmentation, is good on low-latency, low-drop-rate links. I am not convinced that it's equally optimal for high-latency links with a higher risk of dropped packets.

With some more checking, it appears that my Secondary side also sends these large packets while achieving several MB/s towards Primary, so it’s odd that the low throughput is only in one direction, from Primary to Secondary.

Both run Windows 10 with the same TCP settings, as far as I have checked. The main differences are that Secondary is a physical machine with the TcpWindowSize registry key set to 2097152, while Primary is virtualized in VMware ESXi. I mirrored the TcpWindowSize setting on the Primary system, but it made little difference.
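For what it's worth, a back-of-the-envelope check of what the window size alone would allow over this link (my own numbers: roughly 250 ms RTT, and assuming TCP actually uses the configured 2097152-byte window):

```go
package main

import "fmt"

func main() {
	const (
		rtt    = 0.250     // seconds, roughly the 250 ms RTT between the sites
		window = 2097152.0 // bytes, the TcpWindowSize value on Secondary
	)

	// TCP throughput is bounded by roughly window / RTT.
	fmt.Printf("max ~%.1f MB/s with a %.0f-byte window\n", window/rtt/1e6, window)

	// Working backwards: the observed 30-40 kB/s corresponds to a tiny
	// effective window.
	observed := 35000.0 // bytes/s, midpoint of 30-40 kB/s
	fmt.Printf("30-40 kB/s implies an effective window of only ~%.0f bytes\n", observed*rtt)
}
```

That ceiling (around 8 MB/s) roughly matches the several MB/s seen in the good direction, so the slow direction behaves as if its effective window is far smaller than the configured one.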

I guess I am not sure this is a Syncthing issue anymore.

I don't know much about VMware, but I know that in KVM, unless you are using VirtIO drivers, network performance will be abysmal for exactly that reason: the virtualized driver passes 1500-byte packets down to the real driver one by one, causing a large amount of context switching (not your usual process context switching, but something that does various checks to make sure you are not trying to escape the KVM context, etc.).
