Syncthing for read-only mirror replication

I’m using Syncthing to do read-only content replication for mirrors. I know this isn’t quite the intended use case, but I thought it might be worth writing up what we’ve found so far.

First up, the data. There are four different data sets that I’d like to use with Syncthing, and I’m using it on two of them so far to get some experience. Until recently we were using rsync exclusively, and we had a brief encounter with btsync.

The first set is around 70GB; most files are on the order of 100MB to 1.5GB. The second set is smaller (4GB) with small files - a few hundred KB through tens of megabytes.

The data is being replicated all around the world (East Asia, North America, EU, West Asia, etc), so we’re seeing the worst of the internet’s misbehavior. Things were getting out of hand with rsync: we had to run transfer-rate benchmarks periodically and move the rsync mesh endpoints around to find better traffic characteristics.

I haven’t looked too closely yet (I know that we can see the pending transfer list via the REST API), but when a new bulk directory of thousands of files is scanned, it looks like all the clients try to fetch the same files, in the same order, from the one master that has them, rather than ending up with some sort of p2p flood behavior. Am I imagining this? If so, is there room for adjusting it?
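For reference, this is roughly how I peek at that pending queue. A minimal sketch only, assuming the local GUI listener on port 8384 and an API key from the settings page; I’m using the newer /rest/db/need path here, and older releases exposed it as /rest/need, so adjust for your version:

```python
# Minimal sketch: peek at the pending transfer queue over the REST API.
# Assumes the GUI/REST listener on localhost:8384 and an API key; the
# endpoint is /rest/db/need?folder=... on newer releases (older ones used
# /rest/need). Folder ID and key are placeholders.
import json
import urllib.request

API_KEY = "your-api-key"     # placeholder, from the GUI settings
BASE = "http://localhost:8384"

req = urllib.request.Request(
    BASE + "/rest/db/need?folder=default",
    headers={"X-API-Key": API_KEY},
)
with urllib.request.urlopen(req) as resp:
    need = json.load(resp)

# Dump whatever the queue contains so the pull ordering is visible.
print(json.dumps(need, indent=2))
```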

Our experiments with btsync a while ago did show this behavior (in between crashes) - the master with the new blob of data would distribute different fragments to different peers rather than the same data to every peer.

The second thing that caused a great deal of surprise was that master mode definitely did not do what I expected. What I was looking for was something like the btsync read-only vs read-write semantics. But right now, if a (supposedly) read-only mirror damages a file, the others will replicate the damage. The master merely activates an optional override button that somebody has to go and click.
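A rough sketch of scripting that button click over the REST API, assuming an override endpoint along the lines of the newer POST /rest/db/override?folder=&lt;id&gt; (the exact path may differ on this version), with the API key and folder ID as placeholders:

```python
# Rough sketch: "click" the override button from a script instead of the GUI.
# Assumes an override endpoint like the newer POST /rest/db/override?folder=<id>;
# the path may differ on the version discussed here. Key and folder ID are placeholders.
import urllib.request

API_KEY = "your-api-key"
FOLDER = "mirror-70gb"   # hypothetical folder ID

req = urllib.request.Request(
    "http://localhost:8384/rest/db/override?folder=" + FOLDER,
    method="POST",
    headers={"X-API-Key": API_KEY},
)
urllib.request.urlopen(req).close()
```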

We worked around this in the interim by creating a second staging master, putting all the Syncthing mirrors into read/write/non-master mode, and rsyncing over any changes that appear. (ie: something coredumps into a mirror, it gets replicated, then the rsync goes and removes it and the removal gets replicated. No button click needed.) It does seem sub-optimal, though, when what I think I want is some sort of auto-override behavior.
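For what it’s worth, the repair pass is just a forced rsync from the staging master over each mirror. A sketch of the idea (paths are placeholders): --delete removes anything that shouldn’t be there, and --checksum catches corruption that size/mtime comparisons would miss.

```python
# Sketch of the staging-master repair pass: rsync the pristine tree over the
# Syncthing-managed copy so stray files and damage are reverted, then let
# Syncthing replicate the fix to the rest of the cluster. Paths are placeholders.
import subprocess

SOURCE = "/srv/staging-master/mirror/"   # pristine source of truth (placeholder)
DEST = "/srv/syncthing/mirror/"          # Syncthing-managed copy (placeholder)

subprocess.run(
    [
        "rsync",
        "-a",           # archive mode: preserve perms, times, symlinks
        "--delete",     # remove extras (e.g. that stray core dump)
        "--checksum",   # compare content, not just size/mtime, to catch damage
        SOURCE,
        DEST,
    ],
    check=True,
)
```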

The third discrete data set is around 2TB, about a million files. We’re replacing about 200GB of these files per day and waiting for the (rsync-based) mirrors to catch up before moving symlinks around. The drop size of new files is around 50GB at a time. I have suspicions that this will be too much abuse for Syncthing and too far outside its design intent. btsync wouldn’t have worked either - it would have spent all its time rehashing. This volume is highly dynamic; think of OS package build farms feeding this.

The fourth set is around 1.5TB, 2.7 million files and is mostly static, with another 3TB / 6 million files on the side that are almost completely static.

I’m pretty sure the data profile of set #4 would be just fine. It would presumably take a while to index and converge, but I haven’t actually tried it yet. I’m really worried about the read-only replica problem on #4 though. I don’t have the space to keep a non-Syncthing source of truth online and continuously rsync away any changes in the remote replicas.

The reason I’m looking for something better than an rsync mesh is that I want something that automagically adapts to internet bottlenecks so I can stop babysitting it. When switching rsync endpoints around is the difference between 10KB/sec and 40MB/sec of throughput, we have to do it.

So, that’s what I’m trying to do. I realize that this isn’t quite Syncthing’s goal (ie: make the replicas look EXACTLY like the masters, no matter what, and immediately undo local changes). Is this something I can expect to have to fight with Syncthing over, or are there ways it could be tweaked to get behavior closer to what I’m after?

Looking over the REST docs, it looks like /rest/completion might give me the status info that I would need to drive some of the state changes for data set #3 (or wherever it has moved to).
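Something along these lines is what I have in mind for the symlink dance on data set #3. A sketch only, assuming the completion endpoint takes a folder and a device (newer docs call it /rest/db/completion), with the API key, folder ID and device IDs as placeholders:

```python
# Sketch: poll replica completion before flipping symlinks for data set #3.
# Assumes a completion endpoint taking device= and folder= parameters
# (/rest/db/completion on newer releases). All IDs below are placeholders.
import json
import urllib.request

API_KEY = "your-api-key"
BASE = "http://localhost:8384"
FOLDER = "pkgdrop"                         # hypothetical folder ID
MIRRORS = ["DEVICE-ID-1", "DEVICE-ID-2"]   # hypothetical device IDs

def completion(device: str) -> float:
    url = f"{BASE}/rest/db/completion?device={device}&folder={FOLDER}"
    req = urllib.request.Request(url, headers={"X-API-Key": API_KEY})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["completion"]   # percentage; 100.0 means in sync

if all(completion(d) >= 100.0 for d in MIRRORS):
    print("all mirrors caught up; safe to move the symlinks")
```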

TL;DR - I’m looking for more randomness in the pull order from clients to maximize p2p throughput, and read-only slaves that can never initiate changes into the cluster (or some other way of simulating that).

Any other thoughts? Am I even looking in the right place?

Yes, currently it pulls files in alphabetical order. Also, currently only peers that have the full file are able to participate in the seeding.

I have a pending pull request which allows peers that have only parts of the file to participate in the seeding (it’s fairly primitive and costly, but should work for most cases), and which also randomizes the order in which files and blocks get downloaded.

I think a new write-only (wo) folder type has been requested quite a few times. It’s not rocket science and fairly easily doable (basically you need to strip the scanning/reading/r part from the current rw folder implementation), given that you are willing to invest some time to get it working.

Ah, good to know. I’ll go and poke around the pull requests to see what it looks like. The randomized file order is the part that sounds particularly interesting.

As for only peers that “have the full file” being able to seed - I think that’s quite OK for my use cases.

On the ro folder thing: scanning would still be needed. When a local change is found, it would have to be undone and reverted back to the cluster version rather than distributed to the cluster. I will spend some quality time with the tree this weekend and see what I can come up with.

@DarkHelmet433, have you thought of using a BitTorrent-protocol solution for your project?

We are aware of related work, and this has been suggested multiple times. The use of the BitTorrent protocol falls outside the scope of this project, as explained in previous posts.

Ah, my mistake - I was suggesting that @DarkHelmet433 look into using the BitTorrent protocol for his project.

No, I’m happy with Syncthing as a project and the current protocol implementation (so far as I understand it) seems well-designed!

No, my bad, sorry. Thanks for your answers.

@DarkHelmet433 Also have a look at https://github.com/russss/Herd - it might be just the thing you’re looking for.

My brain kept thinking about this, so here’s one way you might be able to use Herd:

I know that your data sets are highly dynamic; perhaps one way you could use Herd is to set up a file watcher that tracks your data sets on the master server and bundles changes into discrete torrents that are immediately pushed out to the mirror hosts via Herd.

Upon successful torrent download, the mirror might run a local rsync to merge the changes from that torrent into the local copy of the data set on the mirror.

Depending on how fast your master data set changes, your file watcher could trigger on change (inotify?) or poll every X seconds. Either way, your mirrors will lag behind the master by the time it takes to:

  1. Build torrent file on the server
  2. Transfer torrent to mirrors
  3. Merge changes into local copy on mirror

You might even have a local cache time set on each mirror to determine how long to keep each torrent for seeding to the other mirror peers. Each torrent could be held for X hours before deletion. That way you merge new changes immediately, but hold torrents until most of the swarm has received them. (I guess a tracker server could handle this, but it seems like you might not need it.)

In summary, you might be able to roll your own alternative to BitTorrent Sync using a combination of Herd for data transfers and rsync for merging the data on the mirrors.
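A very rough sketch of that loop in Python, just to make the moving parts concrete. The watched path, bundle staging area and polling interval are all placeholders, the Herd invocation is left as a stub because I haven’t checked its CLI, and the mirror-side merge is a plain rsync:

```python
# Rough sketch of the watch -> bundle -> push -> merge loop described above.
# Paths, the polling interval, and the Herd invocation are placeholders.
import shutil
import subprocess
import time
from pathlib import Path

WATCH_DIR = Path("/srv/master/dataset")    # master copy (placeholder)
BUNDLE_DIR = Path("/srv/master/bundles")   # staging area for change bundles
POLL_SECONDS = 60                          # simple polling; inotify would work too

def changed_since(mark: float) -> list[Path]:
    """Files modified since the last pass (crude stand-in for a real watcher)."""
    return [p for p in WATCH_DIR.rglob("*") if p.is_file() and p.stat().st_mtime > mark]

def push_bundle(files: list[Path]) -> None:
    """Master side: snapshot the changed files and hand them to Herd."""
    bundle = BUNDLE_DIR / f"bundle-{int(time.time())}"
    for f in files:
        target = bundle / f.relative_to(WATCH_DIR)
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(f, target)            # snapshot of the changed file
    # Hand the bundle to Herd here; the exact invocation depends on its CLI.
    # subprocess.run(["herd", ...], check=True)

def merge_on_mirror(bundle: Path, dataset: Path) -> None:
    """Mirror side: fold a completed bundle download into the local data set."""
    subprocess.run(["rsync", "-a", f"{bundle}/", f"{dataset}/"], check=True)

mark = time.time()
while True:
    time.sleep(POLL_SECONDS)
    files = changed_since(mark)
    mark = time.time()
    if files:
        push_bundle(files)
```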