Best practice for adding nodes with existing data copy in 10 TB range.

Is there a how-to on how to add nodes that already have the data that needs to be kept in sync? Simply adding them all at once causes them to all spend a very long time exchanging indexes just to conclude they all have the same file set.

I did this with a 5-node system, each node with the same 8TB of data, most on 100kbps upload links. After a week they were still sorting, so I had to intervene and stop things.

This seems to be the same problem as discussed in this issue:

From how I understand that discussion, the fastest way is to add nodes one by one: create the folder and scan it on the first node, then share the folder with a second node and wait until they are in sync. After that, proceed the same way with each additional node, one after the other.

I think the problem is that if you first create and scan the folders on all nodes and then connect them, each node needs to exchange indexes and resolve “conflicts” with every other node. “Conflicts” in quotes because they will turn out to be identical, but at the beginning all sides have copies with the same version vector (meaning the first version Syncthing ever saw), so they need to be checked for conflicts.

Indeed - that approach doesn’t seem to scale well. You can only add one node at a time, and each addition can take many days to complete. Should you mistakenly add more than one at a time, things slow down by at least a linear, perhaps geometric, factor. This is doubly hard to coordinate in a group setting.

There might be some benefit in pausing connections to other nodes or stopping the service on already joined nodes as well so they don’t interact with the global model.

I speculated that the following approach might help and so tried it:

  1. Deploy and establish the trust between all the devices
  2. On each host pause connections to all other nodes
  3. Select one host to be first, and create folder shares on it as ‘Send Only’
  4. Share these folders with one other host and un-pause connections between these two.
  5. Allow the new host time (days likely) to scan and possibly decide it’s out of sync.
  6. Hit ‘Override Changes’ on the initial host until they agree they are up to date.
  7. Repeat on successive hosts

At this point all devices are up to date with respect to the initial host, but they are not sharing between themselves.

  1. On host two, un-pause connections to host three.
  2. Allow them to finish comparing notes and proceed to the next host.
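For what it’s worth, the pause/resume/override steps in the procedure above can be scripted against Syncthing’s REST API - the /rest/system/pause, /rest/system/resume and /rest/db/override endpoints are real, but the device IDs, folder ID and API key below are placeholders you have to fill in. A dry-run sketch:

```python
# Dry-run sketch of scripting the node-by-node bring-up against
# Syncthing's REST API. Endpoints are real; device IDs, folder ID
# and API key are placeholders.
import urllib.request

API = "http://localhost:8384/rest"
API_KEY = "your-api-key"   # from Actions > Settings in the GUI
DRY_RUN = True             # set False to actually issue the calls

def call(method, path):
    """Issue (or, in dry-run mode, just describe) a REST call."""
    url = API + path
    if DRY_RUN:
        return f"{method} {url}"
    req = urllib.request.Request(url, method=method,
                                 headers={"X-API-Key": API_KEY})
    return urllib.request.urlopen(req).read()

# Step 2: pause connections to every peer not yet brought in
print(call("POST", "/system/pause?device=PEER-DEVICE-ID"))
# Step 4: un-pause the single peer to sync next
print(call("POST", "/system/resume?device=NEXT-DEVICE-ID"))
# Step 6: on the send-only node, override remote changes
print(call("POST", "/db/override?folder=my-folder-id"))
```

Run once per peer, waiting for the in-sync state between resumes.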

I made the mistake of getting to step 7 (after a week) and then un-pausing multiple host connections at once. I’ll see if I can recover from this.

I kinda feel like I’m shaking a dead chicken over an idol here and there might be no connection to reality, but it’s at least a process you can launch and come back to see how it worked.

After 24 hours, the smallest (in home folder size) of the hosts filled up the home partition with 30G of ~/.config/syncthing data and the process stopped servicing web requests. I created a symlink to a larger partition and restarted.

The other hosts are still driving disk IO to index and temp-index files as fast as they can, hitting the CPU hard, and are around 13G in database size. The ‘initial’ node is at 17G, though without any significant IO or CPU load.

The smaller one was on an SSD and was somewhat faster.

See the related question “Migrate local sync directory to a new local location?”.

I was already using the “send only” trick, but even then, it seems the content of the entire shared folder is being re-transmitted before the other device is happy. The local send-only node is happy because it reports “Up To Date” almost immediately.

I read your experimentation as saying you’re still enduring a re-send of all the data. Is there no way to locally (on both sides) hash the entire directory and compare hashes - then only sync anything that is different, or declare sync victory 100% complete? If the entire folders do not match, crawl down the directory tree and at least eliminate any sub-folders that ~are~ identical.
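As an illustration of that hash-and-compare idea, here is a minimal sketch - this is not something Syncthing does, the function names and recursion strategy are my own, and it naively re-reads files on every call:

```python
# Sketch of the proposed "compare whole-tree hashes, recurse on
# mismatch" scheme. Identical subtrees are pruned immediately;
# only differing directories are reported for further syncing.
import hashlib
from pathlib import Path

def tree_hash(root):
    """One digest per tree: relative paths plus file content hashes."""
    h = hashlib.sha256()
    for p in sorted(root.rglob("*")):
        if p.is_file():
            h.update(str(p.relative_to(root)).encode())
            h.update(hashlib.sha256(p.read_bytes()).digest())
    return h.hexdigest()

def differing_dirs(a, b):
    """All directory pairs whose trees differ, skipping identical subtrees."""
    if tree_hash(a) == tree_hash(b):
        return []                     # declare sync victory for this subtree
    diffs = [(a, b)]
    common = sorted({p.name for p in a.iterdir() if p.is_dir()} &
                    {p.name for p in b.iterdir() if p.is_dir()})
    for name in common:
        diffs += differing_dirs(a / name, b / name)
    return diffs
```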

That’s not really the point. The point is to exchange the local view of the data in the order it was discovered. You connect and say “I’ve seen your changes up to X”, and then the other side has to send the changes that were detected after X, in the order they were detected. That means we have to sort the changes by the order they happened (not alphabetical order), which is expensive - especially when the last change the remote side saw was 0, implying we need to replay everything. It’s not about things being the same; it’s about being aware of the history, so that we can do conflict detection and so on.
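A toy model of that replay-by-sequence behaviour (a deliberate simplification, not Syncthing’s actual code):

```python
# Each device numbers changes in the order it discovered them; a peer
# asks for everything after the last sequence number it has seen.
class Index:
    def __init__(self):
        self.seq = 0
        self.changes = []          # (sequence, filename) in discovery order

    def record(self, name):
        self.seq += 1
        self.changes.append((self.seq, name))

    def changes_since(self, last_seen):
        # History is walked by sequence number, not filename. A brand-new
        # peer (last_seen == 0) therefore triggers a full replay, even if
        # its on-disk contents are already identical.
        return [c for c in self.changes if c[0] > last_seen]

idx = Index()
for name in ["zebra.txt", "apple.txt", "mango.txt"]:
    idx.record(name)

print(idx.changes_since(0))   # new peer: everything, in scan order
print(idx.changes_since(2))   # caught-up peer: only the tail
```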

Audrius (@Syncthing Maintainer),

I think you said that Syncthing nodes need to know each other’s history of events. The moved (recreated) sync directory would be the one with no history, and I notice data is flowing ~to~ the remote node (which would have the history), so the data flow seems backwards from what you describe.

But… really that’s tangential. I think the real point is that if any sub-directories (or the entire directory) are the same, then history is irrelevant. That can be a new “time zero”. History prior to identical-ness seems useless information.

What am I missing? It seems awkward to say it’s necessary to accept indexing times of days (as the original OP has - my time is only hours).

Audrius’s comment suggests that there is no initialization state. The ‘first’ scan is just like any other scan, and everything not yet hashed (which is everything) is new and must be transmitted.

That seems like an over-simplification, but is this correct?

As far as bringing pre-seeded nodes up, I was unable to find an optimal path, other than making sure not to bring more than one up at a time.

I have one device on a slow link I am still struggling with.

If you are absolutely certain that the contents are identical, the optimal path is to scan once on one device, then copy the index database to the other device before turning it up. Both the contents and the index are then identical and there is nothing to reconcile, resulting in a pretty quick up-to-date state on both sides when they first connect.

However, any discrepancy between the scanned data and the actual on-disk contents will then be interpreted as a new change - a missing file will become a delete, synced to the other side, for example. Given that this is fairly risky, it is seldom a recommended path to take.

Correct.

No, unless you mean the index data. But that can still be quite a lot, and the cost isn’t in transmitting it but in “reconciling” it, resulting in multiple roundtrips of that index data and a lot of database operations.

Copying the database is a neat trick I hadn’t considered. I’ll give that a try now and note the results.
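For anyone else attempting this, a sketch of what the copy could look like - assuming both nodes run the same Syncthing version, both instances are stopped first, and the data really is byte-identical. The database directory name (index-v0.14.0.db) and the paths are assumptions based on typical Linux installs; verify them on yours:

```python
# Sketch: seed a new node by copying the index database verbatim.
# Assumes identical data, same Syncthing version, and both instances
# stopped. Paths and the directory name are assumptions to verify.
import shutil
from pathlib import Path

SRC = Path.home() / ".config/syncthing/index-v0.14.0.db"       # scanned node
DST = Path("/mnt/newnode/.config/syncthing/index-v0.14.0.db")  # hypothetical

def seed_index(src, dst, dry_run=True):
    """Copy the whole database directory; dry-run by default."""
    if dry_run:
        return f"would copy {src} -> {dst}"
    shutil.copytree(src, dst, dirs_exist_ok=True)
    return "copied"

print(seed_index(SRC, DST))
```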

I am new to Syncthing - it seems generally awesome and just what I am looking for - and I am also looking at 10TB replication on a LAN, but I’ve been having some difficulties.

The speed of scanning the first node is great: it builds the index at the read speed of the drive (160MByte/sec - 17 hours), with the index on a separate SSD. After that, though, it is impossibly slow getting that data onto the next node, going at maybe 12MByte/sec (which would take 10 days) - even slower on a low-end NAS (3MByte/sec - 38 days).

An OS file copy is much faster (110MByte/sec - 1 day), or faster still if direct-attached, but then you have the problem described in this thread: the second node scans the OS-copied files and the indexes fight with each other.

Is there a way to put the second node in a ‘seeding’ or ‘write only’ state (the opposite of master), where it does not automatically scan its files, but expects the files it is pulling to already be on local storage? It could then scan each local file and keep it if it matches, or download/replace it if it doesn’t. The index would then match the order of the first node, eventually delivering a copy of the source index to match the copy of the files.
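To make the idea concrete, here is a sketch of what such a ‘seeding’ check could do per file - purely hypothetical, not an existing Syncthing folder type, and it ignores that Syncthing actually hashes in blocks rather than whole files:

```python
# Hypothetical 'seeding' pass: given the source's index of expected
# hashes, keep local files that already match and fetch the rest.
import hashlib
from pathlib import Path

def seed_check(local_root, source_index):
    """Split the source index into files to keep and files to fetch."""
    keep, fetch = [], []
    for rel, want in source_index.items():
        p = local_root / rel
        if p.is_file() and hashlib.sha256(p.read_bytes()).hexdigest() == want:
            keep.append(rel)      # pre-seeded copy matches: no transfer
        else:
            fetch.append(rel)     # missing or different: pull from source
    return keep, fetch
```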

Copying the index over seems complex, and if there is already another share, you would have to merge the indexes. Also, is it an OS neutral format - can I copy the index from Windows to Linux?

Searching these forums for ‘speed’ brings up lots of posts about quite slow Syncthing performance. I think this is partly due to multi-tasking that thrashes magnetic disks into their worst possible access pattern: small random reads across large files.

I’ve tried playing with the number of copiers and pullers with no real improvement on the seeding performance.

Once it is seeded, I think the performance will be fine, but getting it set up is proving quite challenging.

I don’t think anything is easy in the 10TB range.

Yes :slight_smile: Syncthing seems like a real contender though.

What do you think of the idea of a ‘Receive Only’ folder type as a possible solution for existing copies of data on a node?

Once all the existing data has been used (scanned locally instead of pulled remotely), you could switch the folder type to ‘Send and Receive’ to pick up any new files.

I rsync’d three shared folders, rsync’d the syncthing database and edited the config file on the new host to add the shares, then started it up.

So far, lots of tmp-index-sorter activity and high IO. I do have one difference between the two - one is on a different filesystem and is ignoring permissions.

Eventually it stopped responding to SSH clients and I had to power-cycle it. I’ll try this approach on another replica where the systems are perfectly identical.

This topic was automatically closed 30 days after the last reply. New replies are no longer allowed.