sync rate degradation, no visible CPU/RAM/Network problem

vadsu · November 16, 2022, 5:17pm

Hi, I’m observing a drastic sync rate degradation few minutes after syncthing is restarted. My setup is full mesh of 5 intel 24 cores machines, 2 LANs (so some machines share LAN), all intranet. The median files’ size is ~1.6k, not much variation. When I restart syncthing, I see the files are synced at thousands/tens of thousands per second. After the initial scan, the CPU usage is significant (~14 cores of 24 available), RAM usage is insignificant (few %), network is ~25Mbps of 10G. Few minutes later, the sync rate drops to <=200 files/sec, CPU usage drops to few %. I ran iperf while rate was low and didn’t see neither severe rtx nor other connectivity issues (I also tried to run it when the rate was high and could see no differences). No iptables, no firewalls, no NATs, no relays etc. I looked at the closed thread:

but couldn’t see any proposal… Any pointers?

calmh · November 16, 2022, 6:16pm

All other bottlenecks eliminated, my hunch would be the database. Synced files result in database transactions/writes, perhaps initially those are fast because they go to buffer/WAL/similar, then after a while compacting kicks in. That might manifest as I/O wait on the database disk. But I don’t know, if it’s not obvious you may need to do lots of tracing, profiling, etc to figure out what’s going on.