Scalability Question

I have searched and read forum posts about scalability, specifically how many connections per node are practical. From what I see, the consensus is something like “10 is OK, 50 is pushing it, and 100 is probably too much”. Is this about right?

Background: I am interested in pursuing a scalable network that might grow to 100 or 1000 sites. Most of the data flow would be 1-to-many, so it would work sort of like a private content delivery network. I understand that a fully meshed network of 1000 nodes is not practical, so how should we configure a “backbone” hierarchy?

E.g. to grow to 1000 sites, what should our baseline hierarchy be: 1x1000, 1x10x100, 1x10x10x10, or 1x30x30?

We can add “mirror” sites, partially mesh it, etc., but some rule of thumb for how large “N” can be in a 1xN connection setup would be very helpful in planning.

Qualitative feedback is better than none. If we were to run some experiments to gather quantitative performance data, does anyone have ideas on how to simulate lots of nodes (e.g. 100) without actually having to pre-invest in that many computers?

If I read the data (Statistics) correctly, there is at least one cluster with 464 nodes (464 nodes sharing the same folder?) and at least one device managing 18 TB of data.

and 116.0 GiB/s hashing performance :sunglasses: :open_mouth:

I have no doubt Syncthing can scale up to large networks. Question is how best to construct the network connection topology…

At one extreme, every node is connected to every other node (N-1 connections per node, ~N^2/2 connections total).

At the other extreme, each node is “daisy chained” to the next (1 or 2 connections per node, ~N connections total).
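
To put rough numbers on these extremes and one in-between hierarchy, here is a quick back-of-envelope sketch (the fan-outs are just the examples mentioned above, nothing Syncthing-specific):

```python
# Back-of-envelope connection counts for the topologies discussed above.

def full_mesh(n):
    # every node connects to every other node: n*(n-1)/2 total connections
    return n * (n - 1) // 2

def daisy_chain(n):
    # each node connects to the next: n-1 total connections
    return n - 1

def tree(fanouts):
    # e.g. [10, 10, 10] = 1 root hub, 10 regional hubs, 100 sub-hubs, 1000 leaves
    nodes, edges, level = 1, 0, 1
    for f in fanouts:
        edges += level * f        # each node on this level links to f children
        level *= f
        nodes += level
    return nodes, edges

n = 1000
print("full mesh:  ", full_mesh(n), "connections")    # 499500
print("daisy chain:", daisy_chain(n), "connections")  # 999
print("1x10x10x10: ", tree([10, 10, 10]))             # 1111 nodes, 1110 connections
```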

What I want guidance on is how best to connect the in-between cases. How many connections PER node are reasonable before performance significantly degrades?

As a point of general feedback, we have tested Syncthing against a range of other file sharing approaches and Syncthing performs best-in-class in terms of update rates and latency. Scalability is on our “to do” list, thus this inquiry.

My gut says hubs and spokes is what you are after.

All hubs connect to neighbouring hubs; spokes connect only to their hub and to the other spokes of that hub. A hub could potentially be a few nodes for redundancy.

Something like a snowflake topology.

Bear in mind, this is my uneducated gut feeling.

This I feel solves two problems at the cost of latency:

  1. Connection count
  2. Index size kept per device

The size of the hub will define the latency.
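
To make the shape concrete, here is a rough sketch of who connects to whom in such a layout (names and counts are placeholders, and it models the simpler variant where spokes talk only to their hub):

```python
# Sketch of a hub-and-spoke layout: hubs are meshed with each other,
# spokes only connect to their own hub. Device names are made up.

def hub_and_spoke(num_hubs, spokes_per_hub):
    peers = {}  # device -> set of devices it connects to
    hubs = [f"hub-{h}" for h in range(num_hubs)]
    for h in hubs:
        peers[h] = set(hubs) - {h}          # each hub connects to the other hubs
    for h in hubs:
        for s in range(spokes_per_hub):
            spoke = f"{h}-spoke-{s}"
            peers[spoke] = {h}              # a spoke connects only to its hub
            peers[h].add(spoke)
    return peers

layout = hub_and_spoke(num_hubs=10, spokes_per_hub=100)   # ~1000 sites
print(max(len(p) for p in layout.values()), "connections on the busiest node")  # 109
```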


About how many spokes per hub do you think?

Depends on the dataset, and acceptable latency.

OK, thanks for the feedback. Sounds like some testing is in order to get a measure of how fast latency increases as a function of connection count. If anyone has more quantitative experience, thanks in advance.

FYI, our dataset is not likely to be large by most standards: thousands of files total, <1 GB total, ~1 new file per second. But we do care a lot about latency, as updates need to be propagated in near real time (<10 sec latency; faster is better).

(Yes we are basically streaming data via file sharing. Please don’t tell us not to do that; it works great and has a lot of advantages. Syncthing is great at this for point-to-point scenarios thus far).
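
For context, this is roughly how we measure propagation latency: one node writes a small timestamped file every second, and a remote node watches the synced folder and compares arrival time against the timestamp in the file name. A minimal sketch (the folder path is a placeholder, and it assumes the two machines' clocks are reasonably in sync, e.g. via NTP):

```python
# Rough latency probe: run writer() on the source node and reader() on a
# remote node, both pointed at the same synced folder.
import os, time, glob

FOLDER = "/path/to/synced/folder"   # placeholder: the shared Syncthing folder

def writer(count=600):
    # one small file per second, name carries the send timestamp
    for i in range(count):
        ts = time.time()
        with open(os.path.join(FOLDER, f"probe-{i:06d}-{ts:.3f}.txt"), "w") as f:
            f.write(str(ts))
        time.sleep(1)

def reader():
    seen = set()
    while True:
        for path in glob.glob(os.path.join(FOLDER, "probe-*.txt")):
            if path in seen:
                continue
            seen.add(path)
            sent = float(path.rsplit("-", 1)[1][:-4])   # timestamp from the name
            print(f"{os.path.basename(path)}: {time.time() - sent:.2f}s latency")
        time.sleep(0.2)
```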

For testing, you can start multiple Syncthing instances on one device with `syncthing -home="dir"`. To generate keys/config, you can use `-generate="dir"`.
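
For example, something along these lines will generate configs for and start N instances in one go (just a sketch: each instance still needs unique GUI and listen ports in its generated config.xml before they can all run side by side):

```python
# Sketch: generate configs for N local Syncthing instances and start them.
# Assumes `syncthing` is on PATH.
import subprocess, os

N = 10
procs = []
for i in range(N):
    home = os.path.abspath(f"st-node-{i}")
    if not os.path.isdir(home):
        subprocess.run(["syncthing", f"-generate={home}"], check=True)
        # TODO: edit {home}/config.xml here to give this instance unique ports
    procs.append(subprocess.Popen(["syncthing", f"-home={home}", "-no-browser"]))

for p in procs:
    p.wait()
```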

My impression is that in a large network a good solution could be a small-degree expander graph, for several reasons; for example, it would be extremely hard to disconnect the network even with a relatively large number of failed nodes.
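
For what it's worth, a random d-regular graph is a simple way to get such an expander in practice. A sketch using the networkx package (an external library, not part of Syncthing); the resulting edge list would then determine which devices share with which:

```python
# A random 4-regular graph on 1000 nodes is, with high probability, a good
# expander: every node keeps only 4 connections, yet the network stays
# connected even if many nodes fail. Requires the networkx package.
import networkx as nx

n, d = 1000, 4
g = nx.random_regular_graph(d, n, seed=42)
print("connected:", nx.is_connected(g))
print("diameter: ", nx.diameter(g))   # number of hops data may need to travel
# g.edges() could then be used as the device-to-device sharing list
```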


Using the advice to run multiple Syncthing instances on the same system (the -home option), I ran a test with a “master” server and 10 connected nodes, all but one on the same machine. inotify ran on the master instance only. Hub-and-spoke topology.

Updating the master with a new file once a second for 10 minutes, average latency was less than 2 seconds (to the remote node). This is indistinguishable from earlier tests with only a single connection. CPU usage on the iMac (i5) for all 10 Syncthing instances was about 20% (out of the 400% available).

Conclusion: 10 connections by themselves (with a modest number and size of files) are not a significant performance burden. Based on CPU use, I would expect on the order of 100+ connections to be feasible at moderate system load.

It was somewhat tedious to set up each of the 10 child nodes manually. A fully automated, scripted way to add additional nodes would be handy for testing larger numbers of connections.

Overall, Syncthing continues to shine as the best performing file sharing solution we have tested.
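
Side note: the per-node setup can be scripted against the REST API instead of done by hand in the GUI. A rough sketch; the API key comes from each instance's config.xml, the device ID and folder id below are placeholders, and the exact config endpoint and required fields may differ between versions:

```python
# Sketch: add a remote device and share an existing folder with it via the
# REST API, using the long-standing /rest/system/config endpoint.
import requests

HOST = "http://127.0.0.1:8384"          # GUI address of the instance to configure
API_KEY = "..."                          # placeholder: from that instance's config.xml
HEADERS = {"X-API-Key": API_KEY}

cfg = requests.get(f"{HOST}/rest/system/config", headers=HEADERS).json()

new_device_id = "AAAAAAA-..."            # placeholder: device ID of the child node
cfg["devices"].append({"deviceID": new_device_id, "name": "child-node"})
for folder in cfg["folders"]:
    if folder["id"] == "shared-folder":  # placeholder folder id
        folder["devices"].append({"deviceID": new_device_id})

requests.post(f"{HOST}/rest/system/config", headers=HEADERS, json=cfg)
requests.post(f"{HOST}/rest/system/restart", headers=HEADERS)  # apply the change
```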

Has there been any research into practical file-count limitations? I.e., can I sync 100 million+ 10 KB files?

Thanks,

Our application doesn’t require large numbers of files, so we haven’t pushed that in our testing. Perhaps others can answer better. I recall seeing discussions in this vein in other threads.

Thanks @MattM. We have about 1-3 TB of 10 KB files I’d like to sync. I’d be curious to know how much RAM/processing it would take (and whether there are techniques to reduce the requirements).

A lot of small files can be painful, as the overhead of managing them is quite high, and things such as metadata start consuming more space than the actual file data.

Syncthing will probably perform best with payloads in the 1 MB-100 MB per file range. The latency you are seeing can potentially be decreased with a patch I have lined up.

We have had someone manage 10M files, so I’m not sure how 100M would scale. It would probably make the scan lengthy.
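
Just as a back-of-envelope on those numbers; the per-file index overhead used here is an assumption for illustration, not a measured Syncthing figure:

```python
# Purely illustrative arithmetic: index size for a tree of tiny files.
# The ~300 bytes/file of index metadata is an assumption; real overhead
# depends on path lengths, block counts, versions and database overhead.
total_data = 3 * 1024**4          # 3 TiB
file_size  = 10 * 1024            # 10 KiB per file
meta_per_file = 300               # bytes of index metadata (assumed)

n_files = total_data // file_size
print(f"{n_files:,} files")                                     # ~322 million
print(f"~{n_files * meta_per_file / 1024**3:.0f} GiB of index metadata")
```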
