First sync taking days on Synology NAS, using 1 GB of RAM, stuck at 500 items out of 769

I’m running Syncthing v0.10.11 on a Synology DS214+ (1.33 GHz dual-core ARMv7 Armada XP CPU & 1 GB RAM) using the package found on the cytec.us repository: this machine holds my “folder master” copy, let’s call it machine A.

I’m also running the same Synology package on a DS214play (1.6 GHz dual-core i686 Intel Atom CE5335 CPU & 1 GB RAM); let’s call it machine B.

I already have a folder that exists on both machine A and B (original sync done with rsync), containing mainly videos of about 700 MB per file, for a total of 512 GB of data in 769 files and 476 folders.

I have set up the folder to be “Folder Master” on machine A and not on machine B.

After over 2 days of letting the 2 machines do nothing but the sync (no other activity on the machines), I find that the folder is still being synced, and machine A seems to be stuck at “500 items, ~175 GiB” when looking at the web GUI. (I’ve been trying to minimize use of the GUI since I understood it puts extra load on the machine…)

Still in the GUI, Syncthing is using a lot of resources:

Download Rate     0 B/s (91.5 MiB)
Upload Rate     0 B/s (27.1 MiB)
RAM Utilization     986 MiB
CPU Utilization     155%
Global Discovery     OK
Version     v0.10.11

I’ve seen the RAM usage go over 1 GB (which did not seem right given the machine only has 1 GB…)

When digging a little, I found that there are a lot of “panic” logs in the var folder… Both on machine A and B…

  • Here is a list of the files inside the var folder on machine A: http://pastebin.com/X1YnwRxE
  • Here is the content of the panic-20141210-190739.log on machine A: http://pastebin.com/9m80LZJA (I’ve picked a date a while back since I’m currently trying to sync other things between machine B and machine C, so these are the logs from a time when only A and B were syncing…)
  • Here is a list of the files inside the var folder on machine B: pastebin : R0QtrbHr
  • Here is the content of the panic-20141210-192422.log file on machine B: pastebin : FLd0VyQ7 (I’m a new user, so I can’t add that many pastebin links; just copy/paste the ID in bold after the pastebin URL.)

I really don’t understand these panic logs, and I don’t understand why it takes so much memory… It would be great if someone could help me figure out what to do to finally be able to sync this folder, which is taking days to be scanned and seems to not have been making progress for about a day now…

Thank you in advance

Sounds like Syncthing needs too much memory and crashes because of that. Since you say that you only have 1 GB but Syncthing reports more, this could be the issue. Maybe increasing swap size could help (the RAM usage of over 1 GB could be because there is some swap, but not enough). It could be that the 32-bit or ARM build just can’t handle that much (at least there is an issue for 32-bit Windows where it crashes with fatal error: runtime: cannot map pages in arena address space even if there is more free RAM); maybe @calmh or @AudriusButkevicius knows if this is the same for Linux :wink:
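
Checking and adding swap on a plain Linux box usually looks roughly like this (just a generic sketch; DSM manages its own swap, so on a Synology the paths and steps may need adapting or may not be advisable at all):

    # check current memory and swap totals
    free
    # create and enable a 2 GB swap file (path and size are only examples)
    dd if=/dev/zero of=/volume1/swapfile bs=1M count=2048
    chmod 600 /volume1/swapfile
    mkswap /volume1/swapfile
    swapon /volume1/swapfile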

1 GiB != 1GB

Alex is right that you should perhaps try to set up some swap, just to see if that helps. Can you also run file syncthing to see whether it’s a 32-bit or a 64-bit binary? The package maintainer has removed the label from the version string, but I assume it’s 32-bit since the device is quite lightweight.

Half a terabyte is quite a lot of data, but I am not sure it should be hitting 1 GB. Have the machines been running previous versions of Syncthing? There were some issues previously which corrupted the database, causing high CPU/memory usage with later versions.

Otherwise, I suggest you run Syncthing with the heap profiler enabled (see -help) for a while (until it crashes) and then provide the binary together with the profile data for us to look at.
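
For 0.10.x that would be something along these lines (a sketch, assuming the STHEAPPROFILE environment variable that comes up later in this thread is what enables the profiler, and using the binary and home paths from later in the thread):

    # run in the foreground as the syncthing user; heap profiles are then
    # written to the working directory as .pprof files (exact names may differ)
    STHEAPPROFILE=1 /usr/local/syncthing/bin/syncthing -home /usr/local/syncthing/var/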

It’s too early for me to do the algebra of how big the index should be for 512 GB, but it works out to at least around 64 bytes of index data per 128 KiB of file data.
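
In other words, as a rough lower bound from that ratio (assuming the 128 KiB block size):

    512 GiB / 128 KiB       = 4,194,304 blocks
    4,194,304 blocks × 64 B ≈ 256 MiB of index data, at minimum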

The machines have never been running another version; I installed it from the synoPackageRepo and immediately restarted Syncthing to upgrade to this version before doing any other configuration.

I did not know that “file” command, that’s very useful! Here are the results:

  • machine A:

    0 ✓ root@machine1 /var/packages/syncthing/target/bin $ file syncthing
    syncthing: ELF 32-bit LSB executable, ARM, EABI5 version 1 (SYSV), statically linked, stripped
    Segmentation fault (core dumped)

  • machine B:

    0 ✓ root@machine1 /var/packages/syncthing/target/bin $ file syncthing
    syncthing: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), dynamically linked (uses shared libs), stripped
    Segmentation fault (core dumped)

That segfault in file is not encouraging…

Regarding the RAM and swap, I checked: running htop shows the Mem/Swp values below, and just for information, /proc/cpuinfo shows the following BogoMIPS:

  • machine A : Mem = 1009MB & Swp = 2047MB & bogomips = 1332.01
  • machine B : Mem = 699MB & Swp = 2467MB & bogomips = 3200.23

For now, I’ve decided to try syncing music instead of movies (smaller files, faster hashing?). Same situation here: the folders already exist, the initial sync was done using rsync, and they contain 80.98 GB in 10911 files and 938 folders.

My fear is that I’ve been adding/removing directories in the web GUI to try different settings several times before they could finish syncing… My index directory is now 422.3 MB on machine A and 595.0 MB on machine B. (By the time I finish this reply, it has gone up to 488.6 MB and 598.5 MB.)

It’s looking better with these music files. I also added the following ignore patterns on both sides:

**@eaDir
**@eaDir**

I did this because Synology indexes media files and creates these @eaDir folders that I do not care about syncing…
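
If I remember right, the same patterns can also live in a .stignore file at the root of the synced folder instead of being entered through the GUI; for the music folder shown further down that would be a file /volume1/music/.stignore containing:

    **@eaDir
    **@eaDir**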

It’s now looking better, almost 2 hours without a panic log, so I’m waiting a bit before I restart the app with “/usr/local/syncthing/bin/syncthing --home /usr/local/syncthing/var/ STHEAPPROFILE” and I will post the logs here once it’s done.

500MB for 80GB does not sound right. It might have the old 512GB index there too, plus some trash as you were adding and removing folders, so you might want to nuke it.

I think it works out to approximately 1–2 MB of index per 4 GB of data, hence 500 MB is like 2 TB.

Oh, OK. Could you please help me make sure how to “nuke” the index? I’d try this:

  • shut down Syncthing
  • go into the folder /var/packages/syncthing/target/var/index
  • rm ./*.ldb
  • should I also remove these files: CURRENT LOCK LOG LOG.old MANIFEST-041384
  • start Syncthing

But maybe I need to use the “-reset” option I saw in the help section…

Shut Syncthing down, and just rm -rf /var/packages/syncthing/target/var/index
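
The full sequence would be roughly this, assuming Syncthing was started by hand over ssh (adjust the stop/start steps if the DSM package manages it):

    # stop the running instance
    killall syncthing
    # wipe the whole index database (it is rebuilt on the next scan)
    rm -rf /var/packages/syncthing/target/var/index
    # start it again (run this while logged in as the syncthing user),
    # using the home path mentioned elsewhere in this thread
    /usr/local/syncthing/bin/syncthing --home /usr/local/syncthing/var/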

Well, for the music directory (80.98 GB in 10911 files and 938 folders), it finally completed the full scan on both machineA and machineB.

In fact, I might have about 40 files on machineA that are not already on machineB, just to be fully honest.

On machineA, in the GUI I have this for the folder:

Up to date
Folder Path  /volume1/music/
Global State 11850 items, ~81.0 GiB
Local State 11850 items, ~81.0 GiB
Folder Master Yes
Rescan Interval 90000 s
Shared With machineB

and this for the remote device:

machineB  Syncing (1%)
Download Rate 0 B/s (22.5 MiB)
Upload Rate 0 B/s (22.7 MiB)
Address BB.BB.BB.BB:49970
Version v0.10.11
Folders music

So it’s stuck at 1%, but with no CPU usage and no network usage. I was expecting it to send the 40 missing files to the remote location at my top upload speed of 100 KiB/s, but it’s not…

And when I look at the GUI on machineB, for the folder it says:

Syncing (0%)
Folder Path /volume1/music/
Global State 11850 items, ~81.0 GiB
Local State 11789 items, ~80.3 GiB
Out Of Sync  11742 items, ~80.3 GiB
Rescan Interval 90000 s
Shared With machineA

and the remote device shows up like this:

machineA Up to Date
Download Rate 0 B/s (22.7 MiB)
Upload Rate 0 B/s (22.5 MiB)
Address AA.AA.AA.AA:22000
Version v0.10.11
Folders music

And I get the following notice:

15:15:02: Folder "music" isn't making progress - check logs for possible root cause. Pausing puller for 1m0s.
15:17:44: Folder "music" isn't making progress - check logs for possible root cause. Pausing puller for 1m0s.
15:20:26: Folder "music" isn't making progress - check logs for possible root cause. Pausing puller for 1m0s.
15:23:08: Folder "music" isn't making progress - check logs for possible root cause. Pausing puller for 1m0s.
15:25:51: Folder "music" isn't making progress - check logs for possible root cause. Pausing puller for 1m0s.

I’ve checked the logs; on both machine A and B, there is nothing there:

0 ✓ root@machineB /var/packages/syncthing/target/var $ ls -al
drwx------    4 syncthin root         12288 Dec 12 13:10 .
drwxrwxrwx    5 syncthin root          4096 Dec  9 08:36 ..
-rw-r--r--    1 syncthin users         1411 Dec  7 11:49 cert.pem
-rw-------    1 syncthin users         2524 Dec 12 10:52 config.xml
-rw-------    1 syncthin root           785 Oct 30 20:02 config.xml.v2
-rw-------    1 syncthin users         1275 Dec  7 11:49 config.xml.v6
-rw-r--r--    1 syncthin users          247 Dec 12 09:48 csrftokens.txt
-rw-r--r--    1 syncthin users         1411 Dec  7 11:50 https-cert.pem
-rw-------    1 syncthin users         2455 Dec  7 11:50 https-key.pem
drwxr-xr-x    2 syncthin users         4096 Dec 12 15:10 index
-rw-------    1 syncthin users         2455 Dec  7 11:49 key.pem
drwxr-xr-x    2 root     root         12288 Dec 12 13:07 old-panic-logs

At least the indexes are of decent size: 132.4 MB on machineA and 118.9 MB on machineB…

Any idea on how to find a solution?

The logs are being printed to stderr of the application.

Just to double check:

Do you by any chance keep the web GUI open? The web GUI takes a lot of processing power and on slower machines can slow down progress to a crawl; e.g. if I have the web GUI open on my Raspberry Pi then syncing basically doesn’t work/progress.

OK, I have stopped the Syncthing that was started with start-stop-daemon and started it myself over ssh.

I was indeed keeping the GUI open, which I will now close. I’ll be back with logs hopefully soon :smiley:

When launching Syncthing, I found the issue: on machineB, it was trying to chmod a file, but the syncthing user did not have permission to do that… I solved it with a “chmod -R g+w .”, and it has been syncing for a while now, and it seems to do the right thing… No logs anymore on the ssh prompt!

Only one strange thing: on machineB, the GUI shows what I expected for the folder:

Syncing (99%)
Global State 11850 items, ~81.0 GiB
Local State 11827 items, ~80.8 GiB
Out Of Sync 23 items, ~175 MiB

Only the device shows Syncing (1%) and I don’t understand why…

But on machineA, the “master”, the figures look very strange to me:

Up to Date
Global State 11850 items, ~81.0 GiB
Local State 11850 items, ~81.0 GiB
Out Of Sync 11789 items, ~80.3 GiB
Folder Master Yes

Why is almost everything out of sync?

During that sync, memory usage was acceptable, so I’ll wait a bit for the last few files to be synced, and then I’ll add the video folder again with its 512 GB in 769 files and 476 folders. I will keep you up to date!

In any case, I wanted to thank you very much for helping me and let you know I really appreciate how quickly you responded!

It is most likely because machine A is the master, and machine B has modified mtimes or permissions, which A refuses to sync (since it’s master), so it looks out of sync in the global picture.

You might be right. I’ve hit “Override Changes” on the master, and now I’m seeing a lot of logs on the slave:

[3767M] 17:14:06 INFO: Puller (folder "music", file "some/folder/relative/path/file.flac"): shortcut: chmod /some/folder/absolute/path/file.flac: operation not permitted

I don’t know what it’s trying to do, and when I look in detail at some of the files in these logs, I see this:

ls -al file.flac
-rwxrwxr-x    1 toxic    users       651090 Mar 24  2007 file.flac

I’m running the syncthing program as the “syncthing” user, and in the /etc/group file, I have this:

users:x:100:syncthing

So I believe it should be ok… I do not understand what’s going wrong here…

Well, do a chmod yourself as that particular user to see if it works, or sync with “Ignore Permissions”.
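
Something like this, run as root, would show whether the syncthing user can actually chmod one of the affected files (path taken from the log line above). Note that a plain chmod only succeeds for the file’s owner or root; group write access alone isn’t enough, which would explain the “operation not permitted” since the file is owned by toxic rather than syncthing:

    # try the same kind of chmod the puller attempts, as the syncthing user
    su -s /bin/sh -c 'chmod 775 /some/folder/absolute/path/file.flac' syncthing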

OK, it took me a while, but the issue has reappeared now that I’ve set up the huge video directory again.

To summarize: for now, I have shut down Syncthing on machine B, which had the music folder synced but with chmod errors (I tried chmod myself as that user and it says permission denied, so I’ll solve that issue a little later now that I know what it is).

Now that machineB no longer runs Syncthing, I hoped that machineA, which has a very small ARM CPU, wouldn’t be slowed down by machineB trying to sync the indexes before the folder could be fully indexed on machine A.

So I added my video folder to machineA, selecting “Folder Master”, “Ignore Permissions”, and the ignore patterns "@eaDir" and "@eaDir**". I did not select any remote device to share this folder with; I wanted it to be indexed before I shared it… Syncthing is running on machineA, but launched from an ssh terminal as the syncthing user with the command:

/usr/local/syncthing/bin/syncthing --home /usr/local/syncthing/var/ STHEAPPROFILE

I let it run through the night, and I got some panic logs:

  • Some panic logs: http://pastebin.com/2Hbv83hw
  • Here is the output of my console: http://pastebin.com/kWm4pAFw
  • This is the panic-20141213-200231.log file (the first log file since I added the movie folder to be synced): http://pastebin.com/9TzdDGM0

The other files don’t say much more; only the last one (panic-20141214-074928.log) seems to be incomplete, because it only contains this:

...
Panic at 2014-12-14T07:49:32Z
fatal error: runtime: cannot map pages in arena address space

goroutine 37 [running]:

and nothing more…

Right now machineA is being slowed down by an automatic (rsync) backup to USB 3, so the CPU spends 80% of its time in iowait, but Syncthing is still responding; it is not using much CPU because rsync is eating it all, but it is nonetheless using 1.00 GiB of RAM… And when the local state finally shows up in the web UI, it shows that it has only indexed ~36.6 GiB for 100 items…

One improvement, though: since I trashed the index folder, it is now at a reasonable size of 177.2 MB, and that is for the whole music folder (81 GB in 10911 files and 938 folders) plus the few video files it has indexed so far…

If you have any more ideas on how to finally get this huge directory synced… As a reminder, the video folder is now 1.3 TB in 1806 files and 1176 folders…

I also had a side question. Assuming we solve the issue and I take the time to do the first indexing, afterwards I expect to add or move about 1 file (~900 MiB) every day in a subfolder of this synced video folder. It seems to me that once the first indexing has been done, rescanning after a restart is much faster. So could you please calm my fears and tell me that a full re-index is not required very often in normal use? Is it a bad idea in itself to try to use Syncthing for such a big folder? Or on such an underpowered machine?

Thank you in advance!

I had one idea that might help… I’ve looked at the output of the lsof command once, and I know that Syncthing opens several files at the same time while syncing…

Maybe there is a way to set the maximum number of indexing threads to 1, because I know that this CPU is very slow, but I also know that disk IO is pretty good (it only sits in iowait when using USB…). Maybe hashing such big files uses a lot of memory, and doing only one hash at a time would help…

Or maybe it would be a waste since this CPU has 2 cores, even if they are small… It was just an idea…

Your issue is obviously a huge folder on a calculator-like device.

I don’t think there is a lot that could be done here. You can switch off scan caches and so on, but these are nothing compared to the actual index size.

There isn’t a way to stop parallel hashing; that might help, but I guess you’ll have to either raise an issue on GitHub and wait until someone cares enough, or do a custom build yourself. You can also try splitting your large folder into multiple smaller folders, adding them one by one and letting them index one by one, and then setting rescan intervals different enough that they don’t all trigger a rescan at the same time.

I feared so… Thank you anyway for the help! I won’t bother creating a custom build or wasting more of your time on this…

I’m going to investigate unison for this purpose; I read it uses rsync more cleverly than I do to achieve two-way sync…

And if that too does not work, I’ll stick to one-way sync with rsync, because I know this is a calculator-like CPU, but it’s low-power enough that I can leave it on all the time with no noise, no heat, and a limited electricity bill. So I’ll keep this thing and give up on shiny features like Syncthing…

Thank you anyway !

As I suggested, splitting things into smaller pieces might work, or maybe doing the indexing on a desktop and then copying the index across onto the calculator.
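
A very rough sketch of that second idea, assuming the desktop instance is configured with the same folder ID and the same folder contents, and with no guarantee that the resulting index is actually usable on the NAS:

    # on the desktop: let Syncthing scan the folder once using a throwaway home dir, then stop it
    syncthing -home ~/st-index-build
    # with Syncthing stopped on the NAS, copy the freshly built index across
    rsync -a ~/st-index-build/index/ root@machineA:/var/packages/syncthing/target/var/index/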