CPU Usage on large folder

It is the way it is, and will most likely remain the way it is, because that provides the most benefit to the most people.

I think what I’m mostly wondering is whether this is currently a limitation in syncthing (such that the inotify daemon cannot tell it what it knows) or just an implementation choice in the inotify daemon.

A scan is only expensive if a lot of stuff has changed. Otherwise it’s essentially a stat call per file, which is not expensive on most filesystems.

If you are using some fancy filesystem (btrfs, for example), stat may cause high CPU usage in the kernel, not in the syncthing process.

Just using ext4 here, and very little changes in the directory (mostly just renaming < 10 files)

A maildir contains essentially all files in a single directory level, and potentially a lot of files. This is rather niche and counter to what directories were invented for in the first place. Syncthing-inotify isn’t really optimized for this use case - potentially if there was a tweakable for “number of files that need to change before we fall back to just scanning parent directory” you’d want to set that way up. As it is, unless you really need the realtime replication, I’d just set the scan interval to an hour and be done with it. I run with daily scans on my big folders…
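For reference, the scan interval is set per folder in Syncthing’s config.xml via the rescanIntervalS attribute (seconds between full scans); the folder id and path below are just placeholders:

```xml
<!-- Hourly rescan for a maildir folder; 86400 would give the
     daily scans mentioned above. -->
<folder id="mail" path="/home/user/Maildir" rescanIntervalS="3600">
</folder>
```

The same setting is exposed in the web GUI under the folder’s advanced options.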

Yeah, long time between scans is what I’m using on my laptop. Server can handle the load, and long scan time isn’t (in theory) an issue on the laptop. Just trying to understand the nature of the issues, etc, as I experiment with this.

A profile could tell, but at a guess: looking at the lots and lots of files in that directory causes lots and lots of stat() calls that are not cached (because there are too many to cache for long) and lots and lots of database lookups. The latter are sort of expensive when we’re talking hundreds of thousands of them.

Ooh. So it does a stat on each file and compares to the last stat that is in the DB? That makes sense for why the scan is expensive.

I am encountering a similar issue to the one mentioned. Are you aware of this: https://facebook.github.io/watchman/ ? I’m not sure whether it would help here.

As I understand, the problem isn’t about the scan process, but about figuring out whether the scanned files are in the database. The project you linked doesn’t help with that.

This topic was automatically closed 30 days after the last reply. New replies are no longer allowed.