Using Syncthing with about 20TB and more

Matthias · December 17, 2018, 4:15pm

Hi,

This is my first post here so a hello to everyone!

We’re using Syncthing on two Linux servers, currently with two folders in sync and almost 21 TB in total. So far the setup works, and right now I am watching (with fingers crossed) the files still to be synced – about 20 GB left for today. It always takes a while, because the connection currently gives us at most 30 Mbit/s, and even that is not always available.

Since we haven’t done much tuning besides changing the inotify limits and the CPU power used, I am now starting to exclude problematic directories and to look at what does not work on a daily basis.
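For readers wondering what checking the inotify limits looks like in practice: on Linux the current values are exposed as plain files under /proc/sys/fs/inotify/. The following is only a minimal sketch that reads them. The helper names and the "one watch per directory" estimate are assumptions for illustration and are not taken from the poster's setup.

```python
from pathlib import Path

# Standard location of the inotify sysctl values on Linux.
INOTIFY_DIR = Path("/proc/sys/fs/inotify")


def read_inotify_limits(base=INOTIFY_DIR):
    """Return {name: int} for each inotify limit file readable under base."""
    limits = {}
    for name in ("max_user_watches", "max_user_instances", "max_queued_events"):
        try:
            limits[name] = int((Path(base) / name).read_text().strip())
        except (OSError, ValueError):
            # File missing or unreadable (e.g. not on Linux): skip it.
            continue
    return limits


def watches_sufficient(limits, directories_to_watch):
    """Rough check, assuming one inotify watch is needed per watched directory."""
    return limits.get("max_user_watches", 0) >= directories_to_watch


if __name__ == "__main__":
    print(read_inotify_limits())
```

Raising the limit itself is a sysctl change (fs.inotify.max_user_watches) and is deliberately left out of the sketch.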

Be prepared for incoming questions =)

Greetings, Matthias

ellnic · December 24, 2018, 11:00pm

Since you’re in your tuning phase, don’t forget to enable large blocks.

xor-gate · December 25, 2018, 1:03pm

That’s an insane amount of data, hope you don’t smoke Syncthing or your kernel with all those inotify file descriptors

Matthias · December 27, 2018, 1:32pm

Thanks for that, it wasn’t enabled until now – there are lots of files bigger than 256 MiB.
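For context on why large blocks matter for files of this size: without the option, files are hashed in fixed 128 KiB blocks, while with it the block size grows with the file. The sketch below mirrors that scaling as the Syncthing documentation describes it (128 KiB up to 16 MiB, in powers of two). The target of roughly 2000 blocks per file and all names here are assumptions for illustration, not Syncthing's actual code.

```python
KIB = 1024
MIB = 1024 * KIB

MIN_BLOCK_SIZE = 128 * KIB       # block size used without large blocks
MAX_BLOCK_SIZE = 16 * MIB        # documented upper bound with large blocks
DESIRED_BLOCKS_PER_FILE = 2000   # assumption: target block count per file


def large_block_size(file_size):
    """Pick the smallest power-of-two block size that keeps the block count
    below the assumed target, capped at MAX_BLOCK_SIZE."""
    block_size = MIN_BLOCK_SIZE
    while (block_size < MAX_BLOCK_SIZE
           and file_size >= DESIRED_BLOCKS_PER_FILE * block_size):
        block_size *= 2
    return block_size


def block_count(file_size, block_size):
    """Number of blocks needed to cover file_size (ceiling division)."""
    return -(-file_size // block_size)


if __name__ == "__main__":
    for size in (100 * MIB, 300 * MIB, 4096 * MIB):
        bs = large_block_size(size)
        print(size // MIB, "MiB ->", bs // KIB, "KiB blocks,",
              block_count(size, bs), "blocks")
```

Under these assumptions a 300 MiB file drops from 2400 blocks of 128 KiB to 1200 blocks of 256 KiB, which means fewer block entries to track per file.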

Matthias · December 27, 2018, 1:49pm

Yes, currently in total there are 7.696.359 files in sync =)

About 7 GB of RAM are in use right now, and – if correct – 55% CPU in use from the Syncthing process. But the server is ‘only’ in use for sharing via Samba for about 20–30 users, so it should be OK, and the CPU is a Xeon E5-2640 v4 @ 2.40GHz with 20 cores with lots of RAM, on a ZFS pool.

I am thinking of using cpulimit, though, and am reading through its manual. Currently the nice level of the Syncthing processes is 11 (not 9 like in the docs?).

Matthias · March 26, 2019, 10:04am

Hi all,

I just wanted to report back: we have been using the setup since December, and since enabling large blocks the speed is amazing (considering the total amount of data and that we are still on the 30 Mbit connection).

One problem we see from time to time (now about every two weeks) – when there are lots of file changes at once the syncing stops, but I do not see anything wrong in the GUI. Then I restart the service on both sides and everything’s OK again.

As an example: I deleted a folder with thousands of small XML files on one side, and some hours later or on the next day the sync was not working anymore (the deletion itself took a long time on the server, and rm or find ... -delete didn’t work anymore in the directory itself).

I am now thinking of a script that creates files with random content and compares them on both sides, say every hour, and sends a notification if they are not the same.
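Such a canary check could look roughly like the sketch below. It is only an illustration of the idea: the file name `.sync-canary`, the helper names, and the assumption that the remote folder is reachable as a local path (for example through a network mount) are all hypothetical, and the notification step is left out.

```python
import hashlib
import os
from pathlib import Path

CANARY_NAME = ".sync-canary"  # hypothetical file name, not a Syncthing convention


def write_canary(folder, size=4096):
    """Write fresh random content into the canary file and return its SHA-256."""
    data = os.urandom(size)
    (Path(folder) / CANARY_NAME).write_bytes(data)
    return hashlib.sha256(data).hexdigest()


def file_digest(path):
    """SHA-256 of a file, read in chunks so large files are fine too."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def canary_in_sync(local_folder, remote_folder):
    """True only if both canary files exist and have identical content."""
    local = Path(local_folder) / CANARY_NAME
    remote = Path(remote_folder) / CANARY_NAME
    if not (local.is_file() and remote.is_file()):
        return False
    return file_digest(local) == file_digest(remote)
```

A scheduler such as cron would call `write_canary` on one side and, after allowing enough time for the sync, `canary_in_sync`; a `False` result is the cue to send the notification.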

imsodin · March 26, 2019, 12:15pm

Did you check Syncthing’s logs and system performance metrics when the sync stops (e.g. memory usage)? Seeing nothing wrong in the GUI means it responds normally, shows everything as up to date, but it isn’t? Really any info on it would be good, because while it’s nice that it works generally fine for you, it would be even nicer to fix that for you and everyone else

Matthias · March 28, 2019, 4:21pm

Of course I am trying to find out what is happening, and I am currently reading through the Syncthing debugging help article, since I don’t see much in journalctl for the time frame in which the sync stopped (for example yesterday between 18:00 and 22:00).

imsodin · March 28, 2019, 5:31pm

I didn’t mean to accuse you of not trying to assess the problem, I was just routinely asking for details. Please don’t take any questions that seem obvious/trivial as a slight, that’s just how support works: I am just trying to help and I can’t know what you already did.

The first thing that piques my interest is your statement that nothing looks wrong in the GUI. As you have unsynced files, do all involved folder and device statuses show up-to-date, and do the local/global item counts on both sides match? Does a manual scan or pause/unpause of an involved folder show any change?

And to obtain meaningful logs: when the sync stalls (or has stalled), enable the model debug facility (GUI: actions->logs) and then pause/unpause an affected folder; that should trigger the relevant processes (connection, scan and pull).

Matthias · March 28, 2019, 8:12pm

Oh, you didn’t accuse me at all =) I even wanted to delete the opening words of my last post, “Of course”, but got an error message that the body of the message is the same or something like that.

Yes, in the GUI everything looks up-to-date in such a case. But I haven’t looked at the item counts, and haven’t clicked on pause/unpause then. I can try this the next time!

I have now enabled the model debug facility – maybe connections is also interesting?

imsodin · March 28, 2019, 8:53pm

I’d first check the item counts and try hitting scan and/or pause/unpause. Connection logging is usually verbose enough at the default level to see whether there is any problem at all, so if there’s nothing connection-related in the logs now, you don’t need to activate that.
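The item-count check can also be done outside the GUI through Syncthing's REST API, whose `/rest/db/status` endpoint reports fields such as `globalFiles`, `localFiles` and `needFiles` per folder. The sketch below is a hedged illustration of comparing two devices that way. The helper names are made up, and the base URLs, API keys and folder ID would have to come from the actual setup.

```python
import json
import urllib.parse
import urllib.request


def fetch_folder_status(base_url, api_key, folder_id):
    """GET /rest/db/status for one folder on one device and return the JSON."""
    query = urllib.parse.urlencode({"folder": folder_id})
    request = urllib.request.Request(
        f"{base_url.rstrip('/')}/rest/db/status?{query}",
        headers={"X-API-Key": api_key},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def describe_mismatches(name_a, status_a, name_b, status_b):
    """Return human-readable findings; an empty list means the counts agree."""
    problems = []
    if status_a.get("globalFiles") != status_b.get("globalFiles"):
        problems.append(
            f"global file count differs: {name_a}={status_a.get('globalFiles')} "
            f"{name_b}={status_b.get('globalFiles')}"
        )
    for name, status in ((name_a, status_a), (name_b, status_b)):
        if status.get("localFiles") != status.get("globalFiles"):
            problems.append(
                f"{name}: local files {status.get('localFiles')} "
                f"!= global files {status.get('globalFiles')}"
            )
        if status.get("needFiles", 0):
            problems.append(f"{name}: still needs {status['needFiles']} files")
    return problems
```

With the default GUI address this would be called as `fetch_folder_status("http://127.0.0.1:8384", key, folder_id)` on each server, and the two results passed to `describe_mismatches`.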

Matthias · March 29, 2019, 12:41pm

Hm, I had such an issue this morning. The GUI looks the same on both sides (global item count). I paused and unpaused, and the sync stops after a while, at the same sync % / file count. In the log on both sides there is:

2019-03-29 13:30:09 progress emitter: timer - looking after 1
2019-03-29 13:30:09 progress emitter: nothing new

I don’t know whether stopping and starting the Syncthing service would show the same behaviour? I haven’t done that so far. If I pause and unpause now, it syncs again for a while until it stops.

But I haven’t had a case like this in the last few weeks.

imsodin · March 29, 2019, 12:49pm

Those log entries are normal.

I am not quite sure I understand: before pausing, both sides were up-to-date, and afterwards both devices show syncing progress until it stops at the same % on both? What’s in the out-of-sync list? No failed items?
That state with the same % sounds quite weird, as syncing happens independently on both devices, i.e. not linked. Probably there’s some misunderstanding on my side. Could you please take screenshots?

Matthias · March 29, 2019, 12:56pm

Yes, before pausing, I looked at the GUI, and it seemed to be OK, but I just copied a file on one side, and it wasn’t recognized. Maybe inotify / kernel is the problem?

Here is a screenshot of the current state:

Matthias · March 29, 2019, 1:16pm

I would now restart the service (on the right side), and if it is like in the past, it would be OK again. Currently it isn’t syncing; it has stopped at the 3633 items.

AudriusButkevicius · March 29, 2019, 1:27pm

You have failed items that are potentially causing this, fix the issue there first.

imsodin · March 29, 2019, 2:26pm

If you suspect inotify/watch for changes, you can check by hitting the rescan button, which would pick up anything watch for changes missed. And generally what Audrius said.
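The rescan button has a REST counterpart, `POST /rest/db/scan`, which can be handy for scripting this check. The following is a minimal sketch under the assumption that the GUI/API listens on the given base URL and that a valid API key is available; the function names are illustrative only.

```python
import urllib.parse
import urllib.request


def build_rescan_request(base_url, api_key, folder_id, sub_path=None):
    """Build the POST /rest/db/scan request for a folder (optionally a subpath)."""
    params = {"folder": folder_id}
    if sub_path:
        params["sub"] = sub_path  # limit the scan to one subdirectory
    url = f"{base_url.rstrip('/')}/rest/db/scan?{urllib.parse.urlencode(params)}"
    return urllib.request.Request(
        url, data=b"", method="POST", headers={"X-API-Key": api_key}
    )


def trigger_rescan(base_url, api_key, folder_id, sub_path=None):
    """Send the request and return the HTTP status code."""
    request = build_rescan_request(base_url, api_key, folder_id, sub_path)
    with urllib.request.urlopen(request, timeout=30) as response:
        return response.status
```

If a file copied into the folder only shows up after such a rescan, that points at the watcher (inotify) having missed the change rather than at the sync itself.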

Matthias · March 29, 2019, 2:27pm

Hm, the failed items all reflect differing states of folder contents, e.g. files that were deleted on A but are still on B, and the error message is then:

peers who had this file went away, or the file has changed while syncing. will retry later

I’ll sync those folders manually, and let’s see what happens then.

imsodin · March 29, 2019, 2:42pm

These are error messages from out-of-sync items; we are talking about the failed items.

Matthias · March 29, 2019, 3:39pm

I only have failed items now on one side, and these are all with this error message – not from out-of-sync items.