Load balancing with clusters and enough/max

Hello fellow Syncers,

I’m trying to set up a transfer of about 300 GiB of files to ~190 nodes. Among those files are some “big” ones (between 5 and 20 GiB) that I have to update maybe once a month. The environment consists of 9 classrooms/labs with 21 workstations each (one for the teacher and 20 for students). Every classroom is on its own 1 Gbps network, and all the networks are connected to each other. There is also a file server on its own network that holds the master copy of the files I want to sync/share.

I have set up Syncthing as the sharing software and it mostly works, but I have a small problem that I am not sure how to solve.

Right now the teacher computer of each classroom connects to the server (receive only). Internally, each classroom is a mesh (meaning all nodes are connected to all other nodes). This works, but my problem comes when I have to update files. As I said, maybe once a month I have to update files that together amount to about 20 to 30 GiB of data. When I do the update, the teacher computers are maxed out and become somewhat unresponsive to the user (the teacher) for 30 to 60 minutes while the student computers receive the update. So I can’t do updates during teaching hours. I have to “publish” them during off-hours and turn on the computers in each classroom so the update goes through before the next teaching/practice session.
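
For a rough sense of scale, a back-of-the-envelope estimate helps. The sketch below assumes an ideal 1 Gbps link with no protocol overhead and, as a worst case, that the teacher computer alone uploads a full copy to each of the 20 students. `transfer_minutes` is a hypothetical helper for illustration, not part of Syncthing.

```python
GIB = 1024 ** 3  # bytes in one GiB


def transfer_minutes(size_gib: float, link_gbps: float = 1.0, receivers: int = 1) -> float:
    """Minutes needed to push `size_gib` to `receivers` peers over one link.

    Assumes an ideal link (no overhead) and that a single sender uploads
    a full copy to every receiver, which is a worst case for a mesh.
    """
    total_bits = size_gib * GIB * 8 * receivers
    seconds = total_bits / (link_gbps * 1e9)
    return seconds / 60


if __name__ == "__main__":
    for size in (20, 30):
        one = transfer_minutes(size)
        everyone = transfer_minutes(size, receivers=20)
        print(f"{size} GiB: {one:.1f} min to one node, {everyone:.1f} min to 20 nodes")
```

Under these assumptions a 30 GiB update needs a bit over 4 minutes to reach one node, but roughly 86 minutes if one machine has to serve all 20 students by itself. In a real mesh the students also exchange blocks with each other, so the actual time should be lower.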

Looking for a solution, I noticed the enough and max global settings, which I think may be what I’m looking for. But if I have understood things right, if for example I set max to 3 on the teacher computers (the intermediate nodes between the main server and the classroom computers), I can’t be sure that one of the connections is the connection to the server. Is that right? Is there any way I can force the connection to the server to be fixed/permanent if I use max?

I have also thought about alternative layouts. For example, would it be feasible to make the whole network a single mesh (with the connections to the file server in “Receive only” mode) and set the max parameter to 3 or 5? Or would the database space needed on each node be too big to be practical?

Can someone give me some advice on the best way to do this?

Many thanks in advance!

PS: I hope I made it understandable, but if not, just ask me :slight_smile:

Nice write up.

What OSes are on the teacher and student Devices?

Is it only the teacher Devices that have performance issues while syncing the changes?

Do you have a sense as to where the performance bottleneck is? CPU, RAM, disk and network are the typical culprits.

Without knowing why the teacher machines are bogging down, here are some general ideas: Have you experimented with Actions / Settings / Connections / the two Rate Limit settings? I’d also look at the number of connections (you can set this per Device) and potentially using the OS to reduce the priority of the Syncthing process.

Have you seen the docs page on enough/max? It describes an example load balancing scenario somewhat similar to yours (although without the “intermediate” nodes). Maybe something like that could work for you? (Limit the clients to one server to basically let them “fail over” to whatever server is currently online.) Never mind, I just saw that in your case the servers are already overloaded even with clustering, so this probably won’t help.

The whole network is Windows, even the server.

That is a very good question, and I realize right now that I don’t have an answer to it :sweat_smile: . I had assumed that the culprit was the disk, but honestly I don’t know. I will test it myself this week and come back with a real answer. Just as extra information, the workstations have at least 32 GiB RAM and at least two drives: C: is an NVMe drive dedicated to the OS (the Syncthing program and database folder are here) and D: contains the synced folder. Depending on the classroom, D: is a 2GiB mechanical drive or a 2GiB NVMe drive. I don’t remember the exact processors right now, but they are modern i5/i7 models.

Some time ago I played with some settings, but to be honest I don’t remember exactly which ones I changed. I just remember that I didn’t get the expected results and forgot about it. I will also run some tests with these and come back with the results.

Thanks a lot for your input.

Yeah, I saw that, but as you said yourself, I can’t apply it directly to my case. I want to use enough/max, but I want to be sure how to use it in my case (or whether I have to change my syncing network layout).

Thanks anyway for your answer.

With those specs, I’m pretty surprised that you’re having this issue. If the spinning rust machines were bogging down but not the NVMe that would be one thing. But if they’re all bogging down, that makes me lean more towards Rate Limit as the fix.

When I do the tests, can you suggest a good value for it, at least as a starting point? I will certainly try something like 1 to see the effect, but I will probably want to raise that a bit to get better syncing performance.

I wouldn’t suggest setting Rate Limit at 1 KiB/s.

I would start with a blunt instrument. If at 1 MiB/s there are no performance problems, I’d go up to 10 MiB/s and then 100. In your shoes I probably would not want to go above 250 MiB/s.

Are there any network performance concerns? Or is it just the teacher’s PC?

Yes, of course. I meant 1 Mib/s, where b stands for bit, but that is probably too conservative anyway; maybe it is best to start from 10 Mib/s. Sorry for the confusion.

There are no known network problems. Cabling is not the best, but switches and routers were updated last summer.

Note that the Rate Limit is configured in KiB/s – Big B for Bytes :slight_smile: . I sure hope you come back and let us know what you found out!
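
Since the thread mixes bits and bytes, a small conversion sketch may help when filling in the Rate Limit fields. Both function names are hypothetical helpers made up for this illustration; the only thing taken from the discussion is that the setting is expressed in KiB/s.

```python
def mbit_to_kib_per_s(mbit_per_s: float, binary: bool = True) -> float:
    """Convert a rate in Mib/s (binary) or Mbit/s (decimal) to KiB/s."""
    bits_per_unit = 2 ** 20 if binary else 10 ** 6
    bytes_per_s = mbit_per_s * bits_per_unit / 8  # 8 bits per byte
    return bytes_per_s / 1024  # 1024 bytes per KiB


def mib_to_kib_per_s(mib_per_s: float) -> float:
    """Convert a rate in MiB/s to KiB/s."""
    return mib_per_s * 1024


if __name__ == "__main__":
    # 10 Mib/s (bits) is much less than 10 MiB/s (bytes).
    print("10 Mib/s =", mbit_to_kib_per_s(10), "KiB/s")
    # The 1 / 10 / 100 MiB/s ladder suggested above, expressed in KiB/s.
    for step in (1, 10, 100):
        print(step, "MiB/s =", mib_to_kib_per_s(step), "KiB/s")
```

So 10 Mib/s corresponds to 1280 KiB/s, while 10 MiB/s corresponds to 10240 KiB/s, a factor of eight apart.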

Not having worked with these settings myself, I don’t think you can achieve such “force priority device” semantics. But let me ask, would it even help with your problem?

If you make sure that every client in a classroom connects to its associated teacher computer, plus two other student computers, then the teacher computer will still get bogged down by exactly as many connections as it currently gets. You need to leave each client free to choose which max nodes it connects to, otherwise you won’t reduce the load at all.

Thanks for your remarks.

My idea is to set the max setting specifically on the teacher computer, so it serves a maximum of 3 other computers, and to create a kind of “cascade effect” by giving every classroom computer the same setting.

What I want to be sure of is that, on the teacher computer, one of the connections that is available at all times is the connection to the file server (which holds the master copy of the files).
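
To picture the cascade idea, here is a minimal sketch of an idealized model. It assumes the update spreads as a strict tree in which every computer serves exactly `fan_out` others; Syncthing does not guarantee such a shape, and `cascade_depth` is a hypothetical helper, not a Syncthing feature.

```python
def cascade_depth(students: int, fan_out: int) -> int:
    """Hops needed for an update to reach `students` nodes in an ideal tree.

    Level 1 holds `fan_out` nodes served by the teacher computer, level 2
    holds `fan_out ** 2` nodes, and so on (an idealized assumption).
    """
    if fan_out < 1:
        raise ValueError("fan_out must be at least 1")
    reached = 0      # students covered so far
    level_size = 1   # the teacher computer is the root
    depth = 0
    while reached < students:
        level_size *= fan_out
        reached += level_size
        depth += 1
    return depth


if __name__ == "__main__":
    for fan_out in (3, 5, 20):
        print(f"fan-out {fan_out}: {cascade_depth(20, fan_out)} hop(s) to reach 20 students")
```

In this model a fan-out of 3 reaches 20 students in 3 hops (3 + 9 + 27 covers them), while a fan-out of 20 is the current full mesh with a single hop.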

Thanks for the clarification. Though I have no idea how to achieve that without risking a situation of a split subset of student computers which are all “well-connected” but don’t notice any updates from the teacher computer.

Hello again.

I have been able to do some tests and to my surprise, the results are… that I have to do nothing! :smiley:

I am not sure what to say, because I am 100% sure this has been a problem in the past, but right now, without me touching any setting, I was able to sync 45 GiB of data to the whole network without the teachers noticing anything different on their computers (I warned them beforehand, just in case).

So I am going to close this now, but many thanks to @chaos, @Nummer378 and @acolomb for your answers. I am wiser now than before asking :slight_smile:
