Syncthing crippling internet speeds? (again)

As the last thread I raised is now locked, I needed to start a new one. I have taken everything said in the last thread (Syncthing crippling internet speeds?) on board. However, since November we have had terrible Exchange server issues: lots of socket errors and emails failing to sync. I went through everything I could think of. Then on Friday, due to comments about slow internet on site, I throttled Syncthing. Since then, no email issues.

I’m convinced there is something with the QUIC protocol that at full speed causes router issues. I have now seen the same thing happen at three different sites, three different routers but all showing the same issues.

Maybe a sync between just two St devices is fine. However, I sync to and from multiple offsite devices, and I think that is probably why the routing of the packets is, in effect, creating a DoS scenario.

I may be targeting the QUIC protocol unfairly, as it might be something else entirely (a bug, maybe?). But having used St for many years without any speed problems, the last few months have certainly seen a noticeable increase in speed-related calls. The threads on here that also talk about speeds bear this out too.

I don’t know what the answer is, all I can do is report to everyone what I am currently experiencing and maybe this will help others.

We have pretty much ruled out QUIC as the culprit in your previous thread. It can be seen directly in your own screenshot:

Connection Type: TCP WAN

Could you test your connection with https://fast.com/ ? The service is provided by Netflix and therefore difficult for ISPs to trick.

I did the test. This site has two WAN connections with load balancing enabled.

I accept that maybe QUIC isn’t responsible. I only suggested it because other internet searches on QUIC mention things like high traffic load. But there is something going on that was introduced in the last few months. The systems that are having issues have been unchanged for 2-3 years, then in the last few months I’m getting offices complaining of bad internet speeds, so I have had to throttle right back to improve their speeds.

It’s as if there’s far more network chatter than actual file transfer. For example, on small files, it has very little impact on speeds, but get into very large files and everything crawls to a stop as if St is sending huge volumes of metadata repeatedly (if metadata is the requests for actual data) and very little in terms of actual file transfer.

I think a one to one test over the internet won’t show much, but maybe several test devices on different WANs and 800Gb+ file syncs might show the issue?

Could you also hit the “Show more info” button? The upload speed is crucial.

I’m still not convinced by your theory. Syncthing only uses a normal TCP connection. Have you noticed such a discrepancy between the speed reported by Syncthing and that shown in Task Manager?

All my probable thoughts on what’s directly causing the speed issues are speculation but are wholly related to Syncthing. Have St running with no limits and I see significant bandwidth issues. Throttle back St and those issues go away.

But the mail server issue I was getting felt like the (Draytek) router wasn’t able to handle the volume of data packets going through it to the two St devices located on the LAN.

So my first thread asked why the download speed is so badly affected when I have no throttling on the upload. This thread now seems to suggest that when Syncthing is going at 100%, there is so much (fragmented?) packet data that the router might not be able to handle all the requests and ends up slowing the overall bandwidth.

Of the affected offices/routers, the mail server one is around 9 months old. The one in the first thread was weeks old when I posted, and I have recently replaced a router at another site that complained about speed on the 5th. But the only sure way for the offices to have fast download speeds is to throttle St to at least half of the rated upload speed (usually 9Mbps), something I have never had to do before.
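For anyone wanting to apply the same kind of throttle, the arithmetic can be sketched roughly as below. This assumes the 9Mbps figure is the rated upload speed and that the rate-limit setting is entered in KiB/s (which is how I understand Syncthing's maxSendKbps option to work); the helper name is made up for illustration.

```python
def upload_limit_kib_per_s(rated_upload_mbps: float, fraction: float = 0.5) -> int:
    """Convert a share of the rated upload speed (megabits/s) into KiB/s.

    Assumption: the limit field expects kibibytes per second.
    """
    if rated_upload_mbps <= 0:
        raise ValueError("rated_upload_mbps must be positive")
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    # megabits/s -> bits/s, then keep only the chosen share of the link
    bits_per_second = rated_upload_mbps * 1_000_000 * fraction
    # 8 bits per byte, 1024 bytes per KiB; round down to stay under the target
    bytes_per_second = bits_per_second / 8
    return int(bytes_per_second // 1024)


if __name__ == "__main__":
    # Half of a 9 Mbps upload link
    print(upload_limit_kib_per_s(9))
```

Rounding down is deliberate, so the configured limit never ends up above the intended share of the link.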

Little discrepancy between the St speed and Network speed

Does the router have any traffic analysis capabilities?

Nothing in depth. I might run Wireshark at an affected office in the morning and see if it shows anything.

Ran Wireshark on one of the sites so I could reproduce the speed issue. I can upload the log if required, but it’s mostly:

TCP Out of order / TCP retransmission / TCP Dup ACK

Which might explain the excess packets and little actual data. There are also a lot of UDP packets, so are QUIC and TCP being used together to sync? E.g. UDP for metadata, TCP for file data?
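To put a number on "mostly" rather than eyeballing the capture, the Info column of an exported capture could be tallied with something like the following. A rough sketch only: it assumes you feed it the Info text of each packet (for example from a CSV export), and it matches on the bracketed labels Wireshark normally shows for these conditions. The function name is hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def summarize_tcp_trouble(info_lines: Iterable[str]) -> Tuple[Dict[str, int], float]:
    """Count packets flagged as out-of-order, duplicate ACK or retransmission.

    Returns the per-category counts and the share of all packets that
    carried one of those flags.
    """
    counts: Counter = Counter()
    total = 0
    for info in info_lines:
        total += 1
        text = info.lower()
        if "tcp out-of-order" in text:
            counts["out_of_order"] += 1
        elif "tcp dup ack" in text:
            counts["dup_ack"] += 1
        elif "retransmission]" in text:
            # also catches the fast and spurious retransmission labels
            counts["retransmission"] += 1
    flagged = sum(counts.values())
    share = flagged / total if total else 0.0
    return dict(counts), share


if __name__ == "__main__":
    sample = [
        "[TCP Retransmission] 22000 > 51234 [ACK] Seq=1 Ack=1 Len=1400",
        "[TCP Dup ACK 10#1] 51234 > 22000 [ACK] Seq=1 Ack=1 Len=0",
        "22000 > 51234 [ACK] Seq=1401 Ack=1 Len=1400",
    ]
    print(summarize_tcp_trouble(sample))
```

Comparing the flagged share between an unthrottled and a throttled capture would show whether the errors really scale with load.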

I’ve just rerun Wireshark on a throttled sync; it’s mostly UDP packets and virtually no errors.

It’s either a TCP connection or a QUIC connection, and which one is in use should be displayed in the UI.

Did you apply a filter to only capture traffic of the related clients? A mixture of QUIC/TCP might just be an indication that multiple clients are at work or you’re capturing other traffic.
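If it helps, a display filter limited to the two endpoints can be built like this. Just a sketch: the helper name is made up, the addresses in the example are documentation placeholders rather than your hosts, and it only handles IPv4.

```python
import ipaddress


def wireshark_pair_filter(host_a: str, host_b: str) -> str:
    """Build a Wireshark display filter matching traffic between two IPv4 hosts."""
    a = ipaddress.ip_address(host_a)  # raises ValueError on a malformed address
    b = ipaddress.ip_address(host_b)
    if a.version != 4 or b.version != 4:
        raise ValueError("this sketch only handles IPv4 addresses")
    # ip.addr matches either source or destination, so requiring both
    # addresses keeps only packets exchanged between the two hosts
    return f"ip.addr == {a} && ip.addr == {b}"


if __name__ == "__main__":
    print(wireshark_pair_filter("192.0.2.10", "198.51.100.20"))
```

Pasting the resulting string into the display filter bar should hide everything except traffic between those two addresses.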

To narrow things down a bit, I’d run an iperf3 connection test between the two Syncthing hosts which seem to cripple your bandwidth. I’d bet you can replicate the result without Syncthing. This all smells like a general network problem.
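If you run the client with JSON output (iperf3 -c <host> -J), the numbers worth comparing can be pulled out with a few lines of Python. A minimal sketch, assuming a TCP test and the usual "end" / "sum_sent" / "sum_received" layout of the report; the retransmit count is not reported on every platform, so it is treated as optional. The function name is hypothetical.

```python
import json
from typing import Any, Dict


def summarize_iperf3(report_text: str) -> Dict[str, Any]:
    """Extract throughput and retransmits from an iperf3 JSON report (TCP test)."""
    report = json.loads(report_text)
    sent = report["end"]["sum_sent"]
    received = report["end"]["sum_received"]
    return {
        "sent_mbps": round(sent["bits_per_second"] / 1e6, 2),
        "received_mbps": round(received["bits_per_second"] / 1e6, 2),
        # None when the sending platform does not report retransmits
        "retransmits": sent.get("retransmits"),
    }


if __name__ == "__main__":
    import sys

    # Usage: iperf3 -c <host> -J | python this_script.py
    print(summarize_iperf3(sys.stdin.read()))
```

A large gap between sent and received throughput, or a high retransmit count, would point at the link itself rather than at Syncthing.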

I will look at iperf shortly, but the Wireshark capture shows this when St is unlimited. 192.168.5.200 is the RO end, 51.148.154.142 is the SO end. Only St traffic would be flowing between these IPs.

when I throttle back…

I don’t think there’s a network problem, as it’s happening at 3 sites that are on different ISPs and use different brands of router.

No filters applied

One is a TCP connection apparently suffering massive packet loss, the other is a QUIC connection. This seems to at least indicate it’s nothing related to QUIC – rather, that when there’s a lot of traffic the connection quality collapses and everyone suffers.

Set up iperf. There are some TCP drops, but nothing on the scale of St. I have run the test several times and this is about average.

Not as many retries, but this time seems to be in pulses, most likely due to the tests…

But I agree with Jakob, more traffic means the quality drops. With my new toy I’m going to have a play on different networks to see if it helps highlight the issues, e.g. faulty switch(es) etc.

I’m a bit puzzled why the connection switches from TCP to QUIC. Did you configure port forwarding on your router?

I did port forwarding for both protocols on the RO end just to see what was passing through. But I have often seen a connection drop, go to QUIC, run for a few minutes, then drop and go back to TCP. This was not on any of the sites that are currently in question.
