I have 10 folders, each of which uses 100% of one core, so the initial scanning of all folders uses 10 cores. This is fine for all folders except one of them, which has 2 million files (150 GB).
The servers I’m using Syncthing on have 48 cores and are not meant to do anything except sync data between them, so I’m fine with using all 48 cores for initial scanning and rescanning, which takes ages.
Initial scanning took almost a week, but rescanning that one folder takes 9 hours!
And when I look at the CPU usage of Syncthing on these two servers, I only see one core being used by Syncthing, and its usage is 100%.
They’re using a battery-backed RAID controller, and I/O utilization is nearly 1% right now.
Almost 4 GB of memory is in use, and the servers have 128 GB of memory.
All this leads me to think that Syncthing would definitely be faster if it could use more CPU cores.
I’m running syncthing using systemd on Linux.
I’ve tried these without a luck:
Setting Environment in the systemd unit file: GOMAXPROCS="48"
Disable Temp Indexes
Increase hashers to 100 / Increase copiers to 100 / Increase Max Concurrent Writes to 100
Increase hashers to 2048
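For reference, I set the environment variable via a systemd drop-in roughly like this (the unit name and path here are just an example; adjust to your actual service):

```ini
# /etc/systemd/system/syncthing@myuser.service.d/override.conf
# (example path; modern Go releases already default GOMAXPROCS to the
#  number of available cores, so setting it can only cap parallelism)
[Service]
Environment="GOMAXPROCS=48"
```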
I don’t think it’s the best idea to fiddle around with the settings like that. For example, GOMAXPROCS is used to limit the number of CPU cores, not to increase it. The other options you listed also seem quite unrelated to “rescanning”. Especially things like fsync, temp indexes, copiers, and max concurrent writes have no relation to scanning at all. I’d suggest reverting all of them to their default values.
Basically, the initial scanning consists of hashing files, which can be both CPU and I/O intensive. The rescanning phase is only about checking existing files for size and timestamp, which doesn’t require much CPU, unless the files have changed and need to be re-hashed.
I/O is a different story though. Does the RAID consist of SSDs or HDDs?
Correct, I was just exhausted and tried everything I thought could be even remotely related to scanning. I have since reverted everything to the default values, except temp indexes. As these files are less than 100 KB on average, I don’t think temp indexes would help me.
The RAID consists of HDDs, but as I/O utilization is at 1%, I don’t think that’s the issue.
During initial scanning I saw about 10 disk IOPS, and during rescanning about 50.
Syncthing using only one core is the bottleneck, nothing else.
Hashing is concurrent (following the number of hashers etc., which you’ve already tweaked); however, walking for changes is single-threaded (per folder). So that’s a few million stat calls plus a few million database accesses; presumably it’s the database access that takes most of the CPU time. Having the database on an SSD is important, as database latency will also be a limiting factor.
I don’t think it would be super tricky to make that part concurrent with a work queue of some kind either, but there’s little benefit for the vast majority of users. It probably wouldn’t get rejected if you wanted to dig into it and file a PR.
Can you elaborate on making database access concurrent with “a work queue of some kind”?
What can be concurrent with that?
Also, after reading the contribution guidelines, I presume I should open an issue first prior to making a PR, right?
Last but not least, what do you think about putting the Syncthing database on tmpfs?
It’s approximately 2 GB right now, and we have lots of free memory.
And these servers won’t get rebooted too often, and if we have to reboot one of them and Syncthing has to rescan, that’s fine as long as it’s fast enough (less than an hour).
I searched a little and I’m thinking of something like this: Limit database writes - #5 by Guz
The database is not a cache; what happens when you have files that differ between two devices is entirely dependent on database state. Erase the database on one side or the other at the wrong/right time and you can get anything from a sync conflict to an overwrite in the unexpected direction.
The database supports concurrent reads, and the filesystem supports concurrent stats, so I could see it being a gain for you to have one walk routine that feeds filenames to a queue and multiple processing routines handling the files from that queue.
If you only have either the RAID or tmpfs, I’d absolutely recommend loading the db into tmpfs (and persisting it when stopping Syncthing). I think I still do that using anything-sync-daemon on my home server (edit: yep, I do), because in the past I only had a crappy platter drive in my device, and it made a huge difference. As Jakob wrote, that means you might lose all changes to the db since you last synced it back to disk, which can then cause conflicts when starting again, but I believe that’s a worthwhile tradeoff for fast db access.
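A very rough sketch of the copy-in/copy-out idea with a plain systemd drop-in (all paths and the unit are hypothetical; anything-sync-daemon handles crash safety, permissions, and ordering that this naive version ignores, and Syncthing would also need to be pointed at the tmpfs location, e.g. via a symlink):

```ini
# Hypothetical drop-in; /run is tmpfs on most systemd distributions.
[Service]
ExecStartPre=/usr/bin/mkdir -p /run/syncthing-db
ExecStartPre=/usr/bin/rsync -a /var/lib/syncthing/index-v0.14.0.db/ /run/syncthing-db/
ExecStopPost=/usr/bin/rsync -a /run/syncthing-db/ /var/lib/syncthing/index-v0.14.0.db/
```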
I just tried tmpfs and rescanned a folder, and nothing changed compared to the HDDs. Prior to doing that, I checked whether the database was already in the memory cache using vmtouch, and 98.8% of the index-v0.14.0.db directory was already in the page cache.
It seems like putting the database on tmpfs does not help, in my scenario at least.
Jul 05 17:05:44 hostname syncthing: [XXXXX] INFO: Ready to synchronize "directory" (aaaaa-bbbbb) (sendreceive)
Jul 05 17:07:55 hostname syncthing: [XXXXX] INFO: Completed initial scan of sendreceive folder "directory" (aaaaa-bbbbb)
Jul 05 20:08:12 hostname syncthing: [XXXXX] INFO: Ready to synchronize "directory" (aaaaa-bbbbb) (sendreceive)
Jul 05 20:10:22 hostname syncthing: [XXXXX] INFO: Completed initial scan of sendreceive folder "directory" (aaaaa-bbbbb)