Hi all! I am currently using Syncthing to share a bunch of scientific data (~250 GB, 30,000 files) between just a few machines. I wonder how well this would scale to, say, 100 nodes?
I.e., is there any limitation that would prevent such use? For instance, does required memory or startup time increase a lot with each added node? (Or any other issue you can think of?)
I also wonder if anyone tried pushing Syncthing to the limits, and what was the limit, if any?
Sorry if I missed any previous discussion about this, and I'd appreciate any replies.
Well, this is both encouraging and worrying! I guess that the RAM requirement is related to the huge number of files or the total size of the data? In that case my much smaller database might be OK with a much smaller amount of RAM.
However, does RAM usage also grow significantly (e.g., linearly) with each added node?
Actually, it's the index database which grows linearly, but given that Go does its own memory management, you can expect slight memory growth due to having to deal with more data between GC invocations.
The stats in the 100% column should be read as "the heaviest user has x". It doesn't mean that all of the rows in that column are for the same user!
So, if I understand correctly: at first the network has few nodes and each node is OK with little RAM. Then the network grows, and some nodes with a small amount of RAM may have to drop out. So eventually the network consists of only big badass nodes.
Thanks, I was just wondering about this. It would be nice to see the full set of stats from each of the heaviest users! (heaviest in each category)
Maybe you could try a hierarchical approach? If you have e.g. 100 nodes, group them by 10, then have nodes 00, 10, 20, etc. share "Folder A" with each other, then add a shared folder "Folder B" with the same local path (maybe needs a symlink on the filesystem, if ST doesn't allow using the same path for two shared folders) and share that with the nodes x1-x9.
In short: share the same folder twice, once for the "masters" of each group of ten, and once for nodes x1-x9 of that group.
That should limit the RAM usage to 1/10th on all "slave" nodes, and still only 1/5th on the "master" nodes (which may need to be a bit beefier, RAM-wise) - and they would still all be syncing with each other.
Well, that's a theory at least, I have absolutely no idea if it would work in real life.
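Just to make the topology concrete, here is a minimal sketch of the grouping described above. This is only an illustration of who would share what with whom; the node numbering, group size, and folder names are assumptions for the example, not actual Syncthing configuration:

```python
# Hypothetical sketch of the two-tier topology: 100 nodes numbered 0-99,
# "masters" are nodes 0, 10, 20, ..., 90. The masters all share "Folder A"
# with each other; within each group of ten, the master shares "Folder B"
# (pointing at the same local path) with its nine group members.

def build_topology(n_nodes=100, group_size=10):
    masters = list(range(0, n_nodes, group_size))
    folder_a = set(masters)               # devices sharing "Folder A"
    folder_b = {}                         # per group: master + its members
    for m in masters:
        folder_b[m] = set(range(m, m + group_size))
    return folder_a, folder_b

folder_a, folder_b = build_topology()

# A "slave" node (e.g. 37) shares one folder with 9 peers in its group;
# a master (e.g. 30) shares two folders with ~18 peers total, instead of
# every node sharing one folder with all 99 others.
```

Peer counts per node are what matter for the index database: a slave exchanges indexes with 9 peers instead of 99, which is where the roughly 1/10th RAM estimate comes from.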
@Kirr now that five years have passed, do you have any more experience sharing large scientific data sets via Syncthing? I recently began working with astrophysics and cosmology communities who frequently synchronize large and dynamic data sets, and I have been looking for an excuse to experiment with Syncthing for some of their use cases. Another technology I've been thinking about is Hypercore.