Version: syncthing v0.14.46 “Dysprosium Dragonfly” (go1.10.1 linux-arm) deb@build.syncthing.net - on a Raspi
I have now seen this for the second or third time:
Syncthing suddenly just not running anymore.
This time I checked: sudo service syncthing@sync status
With this output:
● syncthing@sync.service - Syncthing - Open Source Continuous File Synchronization for sync
Loaded: loaded (/lib/systemd/system/syncthing@.service; enabled)
Active: inactive (dead) since Mo 2018-04-09 00:14:26 UTC; 10h ago
Docs: man:syncthing(1)
Process: 4304 ExecStart=/usr/bin/syncthing -no-browser -no-restart -logflags=0 (code=killed, signal=PIPE)
Main PID: 4304 (code=killed, signal=PIPE)
In the journal there was no error message at the end.
What could be causing this?
I have one suspect:
I have one folder which is VERY big (more than a million files, >100GB).
I have seen this before: while scanning this folder, Syncthing stopped working. My guess is that the system has some kind of is-alive check to verify that the service is still running, and that when the scan takes too long the service stops responding, or fails to generate a heartbeat, or whatever it is that the check looks for. As a result the service gets killed.
When I paused this particular folder, the service survived.
When I first paused it until all other folders had been scanned (after a restart) and only then activated it, Syncthing would also survive.
Anyway … it seems connected. (But of course this could be coincidence as well …)
I actually had a similar problem with the 32-bit dev build (from another ticket) on my Windows machine this morning. There was a popup from Windows saying Syncthing had a memory problem. I didn't think more about it at the time, but Syncthing was using almost 1 GB of RAM.
Two questions:
Could you add a proper error message in that case, so it's easy to understand what happened?
Is there any way to work around this? The machine is 32-bit with little RAM, and I can't change that. Could you perhaps optimize/reduce the memory usage?
If it dies due to a failure to allocate memory, there is a very verbose panic message; I guess it just wasn't captured by systemd, or you didn't see it. If it gets killed by Linux due to OOM, there is no chance to emit a message, but the dmesg log will have a message from the kernel about it.
Generally speaking it’s fairly optimized as is. The known exceptions are very large files (which soon will be better with variable block size) and large folders where a lot has changed (because all of the changed files get queued).
[998745.207835] Out of memory: Kill process 21850 (syncthing) score 826 or sacrifice child
[998745.207975] Killed process 21850 (syncthing) total-vm:1340936kB, anon-rss:856940kB, file-rss:0kB, shmem-rss:0k
This disables the cache used to calculate scan progress, so your scans won't show progress, but they also won't consume as much memory.
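For reference, this is a per-folder setting in Syncthing's config.xml. A hedged sketch of what the fragment might look like (the folder id, path, and surrounding attributes are placeholders; check the docs for your version for the exact option semantics):

```xml
<!-- Fragment of Syncthing's config.xml; folder id/path are illustrative. -->
<folder id="big-folder" path="/mnt/data/big" type="sendreceive" rescanIntervalS="3600">
    <!-- A negative value disables scan-progress reporting (and its cache);
         0 means the built-in default interval. -->
    <scanProgressIntervalS>-1</scanProgressIntervalS>
</folder>
```

The same option is also reachable via the GUI under the folder's advanced settings.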
There are a few other settings that could reduce memory usage, yet I suspect it shouldn’t be this high to start with. Can you try redownloading the binary to make sure it’s not corrupt?
I don’t think that could / should be a problem. Did this ever happen?
I use apt-get to install it. How should it be corrupted without being totally broken?
Would it make sense to use a disk-spilling queue in both the scanner and the puller? The scanner is a bit more problematic, since we store a full file info (without blocks, though), not just the file name. It seems quite doable to use an adjusted version of the index sorter for this. It's not the nicest solution considering the ongoing disk I/O discussions, but it's better than OOM crashes, and short of walking the folders twice I don't see a way to provide progress updates without intermediately storing the files to be processed. I'd definitely propose using a heuristic criterion for spilling rather than a fixed maximum size, to avoid spilling on systems that can absorb the memory spike.
I had to move Syncthing from my NAS box (4 GB) to the main PC (16 GB) to get it working. And now, of course, watching for changes does not work (on CIFS mounts).