I have now seen this for the second or third time:
Syncthing suddenly just isn't running anymore.
This time I checked: sudo service syncthing@sync status
It gave this output:
● firstname.lastname@example.org - Syncthing - Open Source Continuous File Synchronization for sync
Loaded: loaded (/lib/systemd/system/syncthing@.service; enabled)
Active: inactive (dead) since Mo 2018-04-09 00:14:26 UTC; 10h ago
Process: 4304 ExecStart=/usr/bin/syncthing -no-browser -no-restart -logflags=0 (code=killed, signal=PIPE)
Main PID: 4304 (code=killed, signal=PIPE)
In the journal there was no error message at the end.
What could be causing this?
I have one suspect:
I have one folder which is VERY big (more than a million files, >100 GB).
I have seen this before: while scanning this folder, Syncthing stopped working. My guess is that the system runs some kind of is-alive check to verify that the service is still running, and that when the scan takes too long, Syncthing doesn't respond or generate a heartbeat (or whatever the check looks for), so the service gets killed.
When I paused this particular folder, the service survived.
When I instead paused it until all other folders were scanned (after a restart) and only then activated it, Syncthing would also survive.
Anyway … it seems connected. (But of course this could be coincidence as well …)
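For what it's worth, the usual mechanism for this kind of is-alive check on systemd systems is the service watchdog (WatchdogSec=). Whether the packaged syncthing@.service enables a watchdog at all is an assumption here; the sketch below only illustrates the mechanism the hypothesis describes, using the coreos/go-systemd package.

// A minimal sketch of the hypothesized "is-alive check": systemd's
// service watchdog. If a unit sets WatchdogSec=, the process must ping
// systemd at intervals or systemd kills it. Whether Syncthing's packaged
// unit configures this is an assumption; this only shows the mechanism.
package main

import (
	"time"

	"github.com/coreos/go-systemd/v22/daemon"
)

func main() {
	// WATCHDOG_USEC is only set if the unit configures a watchdog.
	interval, err := daemon.SdWatchdogEnabled(false)
	if err != nil || interval == 0 {
		return // no watchdog configured, nothing to ping
	}
	// Ping at half the timeout, as the systemd documentation recommends;
	// a long stall (e.g. a blocking scan) would miss the deadline.
	for range time.Tick(interval / 2) {
		daemon.SdNotify(false, daemon.SdNotifyWatchdog)
	}
}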
I actually had a similar problem with the 32-bit dev build (from another ticket) on my Windows machine this morning. There was a popup from Windows saying Syncthing had a memory problem. I didn't think more of it at the time, but Syncthing was using almost 1 GB of RAM.
Could you add a proper error message in that case, so it's easy to understand?
Is there any way to work around this? I mean, the machine is 32-bit with low RAM; I can't change that. Could you maybe optimize / reduce the memory usage?
If it dies due to a failure to allocate, there is a very verbose panic message; it just wasn't captured by systemd, or you didn't see it. If it gets killed by Linux due to OOM, there is no chance to emit a message, but the dmesg log will have a message from the kernel about it.
Generally speaking it's fairly optimized as is. The known exceptions are very large files (which will soon get better with variable block sizes) and large folders where a lot has changed, because all of the changed files get queued.
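To put a rough number on that second case for a folder like the million-file one reported above (the per-entry size is an assumed figure for illustration, not a measured Syncthing value):

// Back-of-envelope for the memory spike from queueing metadata for every
// changed file in a large folder. bytesPerEntry is an assumption, not a
// measured size of Syncthing's FileInfo.
package main

import "fmt"

func main() {
	const (
		changedFiles  = 1000000 // the folder size reported above
		bytesPerEntry = 300     // assumed: file name plus metadata per entry
	)
	fmt.Printf("queued metadata alone: ~%d MiB\n",
		changedFiles*bytesPerEntry/(1<<20))
}

That works out to roughly 286 MiB for the queue alone, which is already a large chunk on a low-RAM 32-bit machine.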
[998745.207835] Out of memory: Kill process 21850 (syncthing) score 826 or sacrifice child
[998745.207975] Killed process 21850 (syncthing) total-vm:1340936kB, anon-rss:856940kB, file-rss:0kB, shmem-rss:0k
Would it make sense to use a disk-spilling queue in both the scanner and the puller? The scanner is a bit more problematic, as we store a file info (without blocks, though), not just the file name. It seems pretty doable to use an adjusted version of the index sorter for this. It's not the nicest solution considering the disk I/O discussions going on, but it's better than OOM crashes, and short of walking the folders twice I don't see a way to provide progress updates without intermediately storing the files to be processed. I'd definitely propose using some heuristic criterion for spilling, not a fixed max size, to prevent spilling on systems that can take the memory spike.
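A rough sketch of what that could look like, under simplifying assumptions: all pushes happen before the pops start (scan first, then process), a fixed in-memory limit instead of the heuristic argued for above, and hypothetical names rather than Syncthing's actual scanner/puller types.

// Disk-spilling FIFO queue sketch: items stay in memory up to a limit,
// then overflow into a gob-encoded temp file. Assumes pushes finish
// before pops begin. All names here are hypothetical.
package main

import (
	"encoding/gob"
	"io"
	"io/ioutil"
	"os"
)

// Item stands in for a FileInfo without blocks.
type Item struct {
	Name string
	Size int64
}

type SpillingQueue struct {
	mem   []Item
	limit int // items kept in memory before spilling starts
	file  *os.File
	enc   *gob.Encoder
	dec   *gob.Decoder
}

func NewSpillingQueue(limit int) *SpillingQueue {
	return &SpillingQueue{limit: limit}
}

// Push keeps items in memory until the limit is hit, then appends
// the overflow to the spill file.
func (q *SpillingQueue) Push(it Item) error {
	if q.file == nil && len(q.mem) < q.limit {
		q.mem = append(q.mem, it)
		return nil
	}
	if q.file == nil {
		f, err := ioutil.TempFile("", "spill-queue")
		if err != nil {
			return err
		}
		q.file, q.enc = f, gob.NewEncoder(f)
	}
	return q.enc.Encode(it)
}

// Pop drains the in-memory items first, then reads the spilled ones
// back from disk in the same order they were pushed.
func (q *SpillingQueue) Pop() (Item, bool) {
	if len(q.mem) > 0 {
		it := q.mem[0]
		q.mem = q.mem[1:]
		return it, true
	}
	if q.file == nil {
		return Item{}, false
	}
	if q.dec == nil {
		// Switch the spill file from writing to reading.
		if _, err := q.file.Seek(0, io.SeekStart); err != nil {
			return Item{}, false
		}
		q.dec = gob.NewDecoder(q.file)
	}
	var it Item
	if err := q.dec.Decode(&it); err != nil {
		// Treat any error (normally io.EOF) as end of queue; clean up.
		q.file.Close()
		os.Remove(q.file.Name())
		q.file, q.dec = nil, nil
		return Item{}, false
	}
	return it, true
}

Since the gob stream is written and read strictly sequentially, FIFO order is preserved, and once spilling kicks in the steady-state memory cost is just the fixed in-memory slice plus the codec buffers.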