I’m having an issue with Syncthing using up all the available RAM on a Synology NAS and being killed by the OS.
The NAS has 16 GB of RAM installed; alongside some smaller folders, I have one particularly large folder set up in Syncthing - roughly 7 million files at ~46 TB. This folder takes several days to complete scanning, and I think Syncthing is being killed before the scan completes.
The earliest record I can find of Syncthing being killed is 2018-04-02, during the 0.14.46rc2 release phase. There are no records of Syncthing being killed during the months I had it installed before that date, and I’ve had this large folder set up since my first Syncthing install.
It’s syncing your large folder, which means that at the moment it has the metadata for those 31 TB of files in RAM. That’s unfortunate for a huge folder where lots of it is out of sync, but that’s how it is at the moment. Once the initial sync of that folder has completed, the RAM usage will be much lower.
It’s not scanning though, it’s syncing. The two are different and have entirely different memory profiles: the amount of memory required while syncing is roughly proportional to the number of files that are out of sync. Unless you’re saying that the other side is the one doing the scanning and getting killed?
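To put rough numbers on that, here’s a back-of-envelope sketch. The per-block and per-file byte counts are assumptions for illustration, not Syncthing’s actual internal sizes; the point is just that the block lists of the out-of-sync files dominate:

```go
package main

import "fmt"

// Rough estimate of sync-queue metadata held in RAM.
// The per-block and per-file byte counts below are assumptions
// for illustration, not Syncthing's actual internal figures.
func main() {
	const (
		blockSize     = 128 << 10 // 128 KiB block size
		bytesPerBlock = 48        // assumed: 32-byte hash + offset/size bookkeeping
		bytesPerFile  = 200       // assumed: name, flags, version vector, etc.
	)

	outOfSyncFiles := 1_000_000   // files waiting to be pulled
	avgFileSize := int64(5 << 20) // assume a 5 MiB average file size
	blocksPerFile := avgFileSize / blockSize

	perFile := bytesPerFile + blocksPerFile*bytesPerBlock
	total := int64(outOfSyncFiles) * perFile

	fmt.Printf("~%d blocks per file, ~%d bytes of metadata per file\n", blocksPerFile, perFile)
	fmt.Printf("~%.1f GiB of metadata for %d out-of-sync files\n",
		float64(total)/(1<<30), outOfSyncFiles)
}
```

With those assumed numbers, a million out-of-sync files of average size come out around 2 GiB of queue metadata; scale the inputs to your own folder and the proportionality is the point, not the exact figure.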
You can certainly file a request for it, but a folder in the tens of terabytes is, to say the least, a somewhat niche use case. It’s also probably not going to be trivial, as the reason for keeping the metadata in RAM is to be able to sort the queue according to the configured criteria. Potentially we could add it as a sorting variant: “I don’t care, keep the metadata out of RAM”.
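As a rough sketch of what that variant might look like (purely illustrative, not how the puller is actually structured; all names here are hypothetical), the queue would hold only file names in RAM and re-read each file’s metadata from the database when the item is actually pulled, at the cost of not being able to sort by size, age and so on without extra database reads:

```go
package main

import "fmt"

// Sketch only: a pull queue that keeps just file names in RAM and looks
// the metadata up again when an item is processed.

type fileMeta struct {
	Size   int64
	Blocks []string // block hashes
}

type metadataStore interface {
	FileInfo(name string) (fileMeta, bool)
}

type lightQueue struct {
	names []string      // only names in RAM, in scan order
	db    metadataStore // metadata stays on disk until needed
}

func (q *lightQueue) Push(name string) { q.names = append(q.names, name) }

// Pop returns the next file and fetches its metadata from the store on
// demand. Sorting by size, age, etc. would need extra store reads, which
// is exactly the cost the all-in-RAM queue avoids.
func (q *lightQueue) Pop() (string, fileMeta, bool) {
	if len(q.names) == 0 {
		return "", fileMeta{}, false
	}
	name := q.names[0]
	q.names = q.names[1:]
	meta, ok := q.db.FileInfo(name)
	return name, meta, ok
}

// mapStore stands in for the on-disk database in this sketch.
type mapStore map[string]fileMeta

func (m mapStore) FileInfo(name string) (fileMeta, bool) { f, ok := m[name]; return f, ok }

func main() {
	db := mapStore{"big.iso": {Size: 4 << 30, Blocks: []string{"h1", "h2"}}}
	q := &lightQueue{db: db}
	q.Push("big.iso")
	if name, meta, ok := q.Pop(); ok {
		fmt.Println(name, meta.Size, len(meta.Blocks))
	}
}
```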
=> So there are a lot of things that have changed, on this device, compared to what we have in the database. Or, the files are new (not in the database already). These files should be synced to other devices.
Or, things haven’t changed locally, but just reading the file metadata (listing directories, checking modification times, etc.) and the database takes a long time because there’s a lot of it, and disk access is slow and/or not cached in RAM.
=> In addition to the changes detected above (if any), there are a lot of files that are older than the files on other devices. These need to be synced to this device from someone else.
Scanning does not result in things to sync. Scanning results in things for others to sync.
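Roughly speaking, a scan looks like this (a simplified sketch, not the actual walker code): walk the folder, compare each file’s size and modification time against the database, and only rehash and announce the entries that differ. Even on the “nothing changed” path, millions of files mean millions of stat() calls and database lookups.

```go
package main

import (
	"fmt"
	"io/fs"
	"path/filepath"
	"time"
)

// dbEntry stands in for the database record of a previously scanned file.
type dbEntry struct {
	Size    int64
	ModTime time.Time
}

// scan walks root and reports paths whose size or mtime differ from the
// database, or which are new. Unchanged files still cost a stat() and a
// database lookup each, which is why scanning millions of files is slow
// even when nothing has changed.
func scan(root string, db map[string]dbEntry) ([]string, error) {
	var changed []string
	err := filepath.WalkDir(root, func(path string, d fs.DirEntry, err error) error {
		if err != nil || d.IsDir() {
			return err
		}
		info, err := d.Info() // stat: size + modification time
		if err != nil {
			return err
		}
		prev, known := db[path]
		if !known || prev.Size != info.Size() || !prev.ModTime.Equal(info.ModTime()) {
			// Only now would the file be rehashed and announced to peers.
			changed = append(changed, path)
		}
		return nil
	})
	return changed, err
}

func main() {
	changed, err := scan(".", map[string]dbEntry{})
	fmt.Println(len(changed), "changed or new files, err:", err)
}
```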
In my case (on this particular device), this is definitely why scanning takes a long time - not because the content has changed a lot.
Sorry - I’d just initiated a Syncthing restart: after correcting the Ignore Patterns, I saw that the RAM usage hadn’t dropped and, rather than wait for the OS to kill it, I jumped straight in and restarted.
From previous experience, I expect the RAM usage will be up in the 4-5 GB range in an hour or two whilst it’s scanning. Not high enough to be killed on this machine - but is that to be expected just during the scan phase? I’ll grab a heap profile if that helps.
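For reference, as I understand it a Go heap profile is just a snapshot of live allocations; the generic way to capture one looks like the sketch below (this is the standard runtime/pprof mechanism, not Syncthing’s own profiling switches, which are documented separately):

```go
package main

import (
	"log"
	"os"
	"runtime"
	"runtime/pprof"
)

// Write a heap profile to heap.pprof. Analyse it afterwards with
// `go tool pprof heap.pprof` to see which allocations hold the memory.
func main() {
	f, err := os.Create("heap.pprof")
	if err != nil {
		log.Fatal(err)
	}
	defer f.Close()

	runtime.GC() // get up-to-date allocation statistics
	if err := pprof.WriteHeapProfile(f); err != nil {
		log.Fatal(err)
	}
	log.Println("heap profile written to heap.pprof")
}
```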
16 gig of RAM is very little for a NAS with tens of TB of files, if you expect to actually access a large number of those files. 16 gig isn’t enough for my little desktop machine. The operating system will always access files through RAM, so that repeated access only hits RAM. For that to be efficient you’ll want a lot of RAM if your server accesses many files or huge files.
For that kind of server I would want more like 256 GB of RAM, although that’s expensive.
In any case, what you could do for the initial scan is to set up an awful lot of swap. There’ll be a lot of thrashing and it won’t be fast, but it’ll get the job done, and you can get rid of all that swap later.
It makes the block list smaller so the metadata overhead is smaller everywhere. The block list isn’t kept in RAM for more than a couple of files at a time, but if the files are large this can still be significant.
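As a rough worked example of that effect (assuming ~48 bytes of metadata per block entry - roughly a 32-byte hash plus offset/size bookkeeping - which is an approximation, not an exact figure): for the same amount of data, making the blocks larger cuts the number of block entries, and therefore the block-list metadata, proportionally.

```go
package main

import "fmt"

// Rough block-list size comparison for one large file at different block
// sizes. The 48 bytes per block entry is an assumption for illustration.
func main() {
	const bytesPerBlockEntry = 48
	fileSize := int64(8) << 30 // an 8 GiB file

	for _, blockSize := range []int64{128 << 10, 1 << 20, 16 << 20} {
		blocks := fileSize / blockSize
		meta := blocks * bytesPerBlockEntry
		fmt.Printf("block size %6d KiB: %7d blocks, ~%d KiB of block metadata\n",
			blockSize>>10, blocks, meta>>10)
	}
}
```

With those assumptions, an 8 GiB file goes from ~65k block entries (~3 MiB of metadata) at 128 KiB blocks down to a few hundred entries (~24 KiB) at 16 MiB blocks.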