On minimizing upload traffic


#1

I have two “remote devices” on my computer, and both devices peer directly with each other. Will SyncThing upload the same data to both devices simultaneously, or will it upload to one device first before uploading to another, or will it split data 50/50?

Context: I’m moving from Dropbox, and I’m trying to use SyncThing in a star configuration with two central servers for redundancy. That is, every PC syncs with two “remote devices”.

However, at home, I have a quite poor connection, in terms of upload speed (it can technically reach 70 kB/s up, but I have to limit apps to slightly lower than that, to leave space for TCP ACKs).

So my concern is that if I have two “remote devices”, then all uploads from my home computer will take twice as long because they’ll have to be duplicated to both remotes.


(Evgeny Kuznetsov) #2

Twice the upload from your home PC will only take place if your two servers can’t connect to each other. If the servers are connected between themselves, then naturally they will exchange data on the folders they both have, and so server A can get the data from server B that it didn’t get from your home computer X.


(Evgeny Kuznetsov) #3

However, there’s no way to explicitly require this behaviour. Say you have servers A and B, and your computer X that has the data. X starts sending data to A and B. If at some moment A and B have different sets of data, A can get data from B and B can get data from A. But there is no way to tell X not to send the same data to both A and B.

If the amount of traffic is a concern here, you may first only connect X to A, wait for all the data to be synced, then connect A to B, wait for B to get all the data, and only then connect X to B - they will see they have all the same data and go on syncing as expected.


(Simon) #4

No need for that: You can’t require it, but it will happen due to random pull order. The chance that both remote devices get the same data is vanishingly small.


(Evgeny Kuznetsov) #5

True.

But just for the sake of answer completeness: say device X is on a metered (and very expensive) network, and the connection between A and B is relatively slow. There’s a big chance that, for an amount N of data to be synced from X to A and B, the upload traffic of X will be much closer to 2N than to N.
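
To get a feel for the range between N and 2N, here is a toy simulation. It is only a sketch under stated assumptions, not Syncthing's actual scheduler: the function name `simulate_upload`, the tick model, and the `ab_rate` parameter (blocks per tick that A and B can exchange in each direction) are all made up for illustration.

```python
import random


def simulate_upload(n_blocks, ab_rate, seed=0):
    """Count blocks uploaded by X until servers A and B both hold everything.

    Toy model (an assumption, not Syncthing's scheduler): per tick, each
    server first pulls up to `ab_rate` blocks from the other server, then
    pulls one randomly chosen missing block from X.
    """
    rng = random.Random(seed)
    all_blocks = set(range(n_blocks))
    have = {"A": set(), "B": set()}
    uploaded_by_x = 0

    while have["A"] != all_blocks or have["B"] != all_blocks:
        # Server-to-server exchange: costs X nothing.
        snapshot = {name: set(blocks) for name, blocks in have.items()}
        for me, peer in (("A", "B"), ("B", "A")):
            from_peer = sorted(snapshot[peer] - snapshot[me])
            rng.shuffle(from_peer)
            have[me].update(from_peer[:ab_rate])

        # Pull from X: every block fetched here is upload traffic for X.
        for me in ("A", "B"):
            missing = sorted(all_blocks - have[me])
            if missing:
                have[me].add(rng.choice(missing))
                uploaded_by_x += 1

    return uploaded_by_x


n = 200
print(simulate_upload(n, ab_rate=0))   # A and B cannot exchange: exactly 2 * n
print(simulate_upload(n, ab_rate=10))  # A and B exchange freely: between n and 2 * n
```

In this model the result can never drop below N (every block has to leave X at least once) and never exceed 2N (each server takes each block from X at most once). Where it lands in between depends on how quickly A and B can pass blocks to each other.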


#6

Thanks, I think I’ve read somewhere about this (yes they’re connected to each other), but I didn’t quite realize until now that the actual data transfer was pull-based, not push-based. So that means the traffic depends entirely on what the servers decide to pull, not on the uploading client’s decisions?

But I wonder if the servers will actually prioritize syncing from each other, rather than syncing from my slow device first. (Do I need to configure priorities?)

I think I’ve seen the pull order configuration, is it block-based or whole-file-based? (Random block pull order sounds like it’ll do the job just fine; but if it’s whole-file, then it probably won’t help when I’m syncing just a few large files.)

In my case the connection between A and B is practically always at least 10x faster than X-to-A or X-to-B. (Though it is variable above that.)


(Evgeny Kuznetsov) #7

Correct. The remote machine advertises what it has, and the client pulls what it needs. Somewhat like BitTorrent.


(Audrius Butkevicius) #8

It’s block based, and prioritisation will happen naturally, as requests are sent to the least busy device (in terms of outstanding requests), and the slow device will have plenty of outstanding requests. However, this will only kick in quite late, as the probability that the other device already has the block you are about to fetch is initially low. Also, every now and then you’ll stumble on a block that only the slow device has.
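
The "least busy device" rule can be sketched roughly as follows. This is an illustration only, not Syncthing's code (which is written in Go); `pick_source`, `holders`, and `outstanding` are hypothetical names.

```python
def pick_source(block, holders, outstanding):
    """Pick, among devices that have `block`, the one with the fewest
    outstanding (unanswered) requests."""
    candidates = holders[block]
    return min(candidates, key=lambda device: outstanding[device])


# X is the slow device, so requests to it pile up; B answers quickly.
holders = {"block-1": ["X", "B"], "block-2": ["X"]}
outstanding = {"X": 12, "B": 1}

print(pick_source("block-1", holders, outstanding))  # B
print(pick_source("block-2", holders, outstanding))  # X
```

The second call shows the caveat: a block that only the slow device has must still come from it, however busy it is.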


(Simon) #9

Both, however the file pull order is configurable (e.g. smallest first, newest first, …).


#10

Makes sense. Some extra transfer is fine with me (I’m not paying for it).

Aha, so the configuration I’ve seen (file-based) is a completely separate thing from the order that imsodin mentioned (block-based).

Actually, now that you mention it, I think “smallest first” file order might help me a lot – that way I can edit my notes/code and have it propagate quickly even though the large images/data files are still in the queue.
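
As a small illustration of what a “smallest first” order means for a queue of pending files (a sketch only; the file names, sizes, and the `smallest_first` helper are made up, and this is not how Syncthing represents its queue):

```python
def smallest_first(queue):
    """Return the pending (name, size_in_bytes) entries ordered by size."""
    return sorted(queue, key=lambda entry: entry[1])


pending = [
    ("photos/raw-0001.dng", 25_000_000),
    ("notes/todo.txt", 2_000),
    ("code/main.py", 8_000),
]

for name, size in smallest_first(pending):
    print(name, size)
# notes/todo.txt 2000
# code/main.py 8000
# photos/raw-0001.dng 25000000
```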

Thanks for the explanation. I think I will start with the default configuration (with both remote servers active) and just see how the pull logic behaves, do some measurements.

If it actually ends up being annoyingly close to 2X, then I’ll probably use your suggestion and keep only one server as primary and the second disabled (backup)… or just live with it.


(Audrius Butkevicius) #11

Once it starts downloading something, it won’t scan, nor will it adjust the queue. So once the big files are in flight, nothing will happen until they are finished.


(Evgeny Kuznetsov) #12

I’d say it’s highly unlikely on your configuration, given that your X–A and X–B connections are at least an order of magnitude slower than the A–B one. But if testing doesn’t cost you much, giving it a try is always the best way to figure things out.


#13

Ah. Kind of a shame then.

But is the queue & scanning global or is it per-folder? Since SyncThing allows multiple “folders” with independent configurations, I guess I could have one for notes/small files, and another for storage/large files.

It doesn’t (it’s fixed-rate), so yeah, I’ll just do some tests and see how it works.


On a related note: it sounds like “outgoing rate limit” is either global (affecting LAN sync too), or per-device. I was looking for a way to limit (serverA + serverB) to 60 kB/s in total, while still potentially having fast sync between same-LAN peers. I guess that’s not really possible? But I think it should be fine if I set both serverA and serverB to half the speed, i.e. 30 kB/s each.

(Why do I want to limit the already-slow uplink even more? I found that capping upload to 90%–95% mostly avoids causing heavy lag and doesn’t compete with other applications as much, whereas uploading anything at full speed really hogs the connection, regardless of protocol.)
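
The arithmetic behind a static split can be written down as a short sketch. The helper name `split_rate_limit` and its parameters are hypothetical, and the 90% headroom figure is just the lower end of the range mentioned above.

```python
def split_rate_limit(uplink_kbs, headroom_percent, n_devices):
    """Return (per_device_cap, total_budget) in kB/s, using integer math.

    uplink_kbs:       measured maximum upload speed
    headroom_percent: share of the uplink that sync traffic may use
    n_devices:        number of remote devices sharing the budget
    """
    total_budget = uplink_kbs * headroom_percent // 100
    return total_budget // n_devices, total_budget


per_device, total = split_rate_limit(70, 90, 2)
print(per_device, total)  # 31 63
```

One trade-off of a static split: whenever only one of the two servers is pulling, the upload runs at half of the budget, because the idle server's share is not handed over to the other one.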


(Jakob Borg) #14

Per folder. There’s a separate setting whether rate limiting affects same-LAN at all, which defaults to off.