Concurrent connections to improve download speed

Hello,

I have searched the forum for a similar question but did not find an answer.

Some download clients allow you to download the same file over several connections at once. In certain scenarios this can increase download speed by a factor of ten or more. In particular, when you are downloading a file from A to B, anywhere between A and B there could be something limiting the speed of a single connection. If, however, you use 10 connections to download the same file, you can observe ten times the speed. This is based on personal observation.

I’m downloading files via Syncthing, from A to B (that is, there are only two peers here for the sake of this post, A and B), and I can see that Syncthing only opens a single connection. Somewhere along the way there is a limit on this connection; I can reproduce the same limit by downloading a single file over HTTP(S). If I download 2 files, or a single file with 2 connections, I can double the speed.

I’m looking into what I can do with Syncthing so that it uses more than one connection between A and B, instead of trying to transfer everything within the confines of a single connection.

Any help is welcome.

1 Like

This is not possible in Syncthing without some hackery*. There was a very heated discussion on this particular topic in https://forum.syncthing.net/t/cannot-find-max-file-size-in-documentation-and-some-other-specs/17286 recently. You may want to read it thoroughly.

*Unless you run multiple instances of Syncthing simultaneously. You still won’t be able to sync the same folder at the same time, but you can sync different ones this way. Please keep in mind that doing so will use more resources and may require some babysitting to keep everything separate. The Docs have the required information on how to use command-line switches for separate configs and such, and you will also find plenty of information on the forums if you search for “multiple instances”.

1 Like

Thank you, I did suspect this is not something the authors would be willing to support, but wanted to make sure. I think that thread is conclusive. Thank you for your help.

1 Like

It’s not that we are not willing to support it.

I think we would be willing to support it if there were a simple library we could use, but there isn’t, and we are not convinced that the benefit, or the number of people this would affect, is worth the effort it would require.

6 Likes

@AudriusButkevicius Thank you for this, Audrius. Yes, this is more or less exactly what I meant. When I said “not willing to support” I did not mean moral support for the feature; I meant you implementing the support in the software. And I meant that, since the developers themselves do not seem to have a use case for this (judging by the linked thread), it is likely that they will not be willing to work on it.

Syncthing is an impressive piece of engineering. I’m pretty sure that while it, like all software on earth barring the simplest, uses dozens of libraries, the core functionality is written by its brilliant authors. The absence of a single ready-made library to build around was not a deterrent to creating it. I think we can agree that supporting multiple connections is possible in principle; there is nothing technically preventing it, library or not. But of course this, the same as any other aspect of this software, is a lot of work. So the question in my head was whether you would be willing to do this work and support this feature, and the answer in my head was that you most likely do not consider it important enough to actually put time into it. And this is what I expressed by saying that I do not think you would be willing to support it.

No matter how I look at it, it seems to me that I was right ;).

Thank you for chiming in, for all your work and for the community support, really appreciate your work, it’s no small feat.

1 Like

This could easily be solved with abstractions inside the source code, though I am not a Go developer and do not know the existing code base.

But in general you can open multiple connections for a single file (if it is big enough). The analogue, if you need one, is the HTTP range request, which allows the server to send portions of a file.
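For illustration only, here is a minimal Go sketch of that range-request idea (the URL and chunk boundaries are made up, and this is not Syncthing code; it assumes the server honours Range headers):

```go
package main

import (
	"fmt"
	"io"
	"net/http"
)

// fetchRange downloads bytes [start, end] of a file over HTTP,
// assuming the server supports Range requests (responds with 206).
func fetchRange(url string, start, end int64) ([]byte, error) {
	req, err := http.NewRequest("GET", url, nil)
	if err != nil {
		return nil, err
	}
	req.Header.Set("Range", fmt.Sprintf("bytes=%d-%d", start, end))

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()

	if resp.StatusCode != http.StatusPartialContent {
		return nil, fmt.Errorf("server did not return a partial response: %s", resp.Status)
	}
	return io.ReadAll(resp.Body)
}

func main() {
	// Hypothetical URL; each such call could run in its own goroutine
	// over its own connection.
	chunk, err := fetchRange("https://example.com/big.file", 0, 1<<20-1)
	if err != nil {
		panic(err)
	}
	fmt.Println("got", len(chunk), "bytes")
}
```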

The solution is conceptually very simple: when you have some binary, you simply disassemble it on one side and assemble it back on the other (you just need a minimum chunk size).

If a file is smaller than the chunk size, you simply send it whole over one of the connections in the pool.
But even with many small files you can still send them over N threads/connections, where each of them sends some file from a queue. So yes, it is multi-threaded programming, but that has not been a problem for many years now.
Then again, I am not sure how this works in the Go world.
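A rough sketch in Go of the disassemble/reassemble idea, just to show what I mean (all names hypothetical, nothing to do with Syncthing internals):

```go
package main

import (
	"fmt"
	"sync"
)

const chunkSize = 4 // tiny on purpose, just for the demo

type chunk struct {
	index int
	data  []byte
}

// split cuts data into fixed-size chunks; the last one may be shorter.
func split(data []byte) []chunk {
	var chunks []chunk
	for i := 0; i*chunkSize < len(data); i++ {
		end := (i + 1) * chunkSize
		if end > len(data) {
			end = len(data)
		}
		chunks = append(chunks, chunk{index: i, data: data[i*chunkSize : end]})
	}
	return chunks
}

func main() {
	payload := []byte("pretend this is a big binary file")
	chunks := split(payload)

	// Pretend each worker is its own connection; here each one just copies
	// its chunk into the right slot of the reassembly buffer.
	out := make([][]byte, len(chunks))
	var wg sync.WaitGroup
	for _, c := range chunks {
		wg.Add(1)
		go func(c chunk) {
			defer wg.Done()
			out[c.index] = c.data // "receive" the chunk, keyed by index
		}(c)
	}
	wg.Wait()

	var reassembled []byte
	for _, part := range out {
		reassembled = append(reassembled, part...)
	}
	fmt.Println(string(reassembled) == string(payload)) // true
}
```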

But at least we know that the feature is in demand.

It’s totally not simple.

First of all, Syncthing is not HTTP, so your oversimplification of things does not do the problem any justice.

You can go read the Syncthing protocol, but there are various edge cases to the problem, namely: what happens if you sent a request, but the connection you sent it over failed? Do you re-send it? How do you know whether the other side even received it?

What if it’s something that you can’t resend, for example index updates, which would potentially cause database corruption? What if the requests you send arrive out of order? Applying index messages out of order is not ok and will lead to database issues.

You’d effectively need to implement something like TCP, with sequence numbers, acking, signalling of lost messages, retries, etc., which is far from simple.
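To make that concrete, here is a hypothetical sketch (in Go; this is not Syncthing's actual protocol code) of just one piece of that bookkeeping, restoring message order when messages can arrive over different connections:

```go
package main

import "fmt"

// envelope is a hypothetical wrapper that would have to accompany every
// message once it can travel over any of several connections.
type envelope struct {
	Seq     uint64 // sequence number, so the receiver can restore order
	Payload string
}

// reorderer buffers envelopes that arrive early and releases them in order.
type reorderer struct {
	next    uint64
	pending map[uint64]envelope
}

func newReorderer() *reorderer {
	return &reorderer{pending: map[uint64]envelope{}}
}

// deliver returns the messages that are now safe to apply, in order.
// A gap (a lost message) stalls delivery forever, which is why you also
// need acks, retransmits and timeouts, i.e. most of TCP.
func (r *reorderer) deliver(e envelope) []envelope {
	r.pending[e.Seq] = e
	var ready []envelope
	for {
		next, ok := r.pending[r.next]
		if !ok {
			return ready
		}
		delete(r.pending, r.next)
		ready = append(ready, next)
		r.next++
	}
}

func main() {
	r := newReorderer()
	fmt.Println(r.deliver(envelope{Seq: 1, Payload: "index update B"})) // nothing yet, waiting for 0
	fmt.Println(r.deliver(envelope{Seq: 0, Payload: "index update A"})) // both, in order
}
```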

2 Likes

On the practical side, for what it’s worth, I used aria2; it supports downloading over multiple connections with many protocols and works quite well. It is point-to-point though, unlike Syncthing, and it’s for a single file, not for syncing folders. Maybe that’s something that can help others in a similar situation.

If you mixed control flow and data transfer in your design, then yes, maybe it is a problem. But I do not know the details of the protocol, so I cannot argue in the terms you used. I do, however, understand how it should be done in general.

Do you re-send it? How do you know if the other side even received it? What if it’s something that you can’t resend, for example index updates, which would potentially cause database corruption?

How do you handle that today? You cannot resend it the same way using a single connection either - if the single connection fails, the result is the same. I do not see any difference.

Updating a file is an atomic operation. You cannot write half of the file first and half later. For example, the app writes a temp file alongside and then replaces the original file. (I would expect a more efficient solution where you open the file exclusively and update it partially, though that is a bit riskier.)
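The usual temp-file-then-rename pattern looks roughly like this in Go (a generic sketch, not Syncthing's actual code):

```go
package main

import (
	"os"
	"path/filepath"
)

// replaceAtomically writes data to a temporary file in the same directory
// and then renames it over the target, so readers never see a half-written file.
func replaceAtomically(path string, data []byte) error {
	tmp, err := os.CreateTemp(filepath.Dir(path), ".tmp-*")
	if err != nil {
		return err
	}
	defer os.Remove(tmp.Name()) // best-effort cleanup if anything below fails

	if _, err := tmp.Write(data); err != nil {
		tmp.Close()
		return err
	}
	if err := tmp.Close(); err != nil {
		return err
	}
	// On the same filesystem, the rename replaces the target in one step.
	return os.Rename(tmp.Name(), path)
}

func main() {
	if err := replaceAtomically("example.txt", []byte("new contents\n")); err != nil {
		panic(err)
	}
}
```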

In any case it is an atomic procedure: you cannot start updating until you have all the required data on the other side.

So it should not matter how that data is sent. Of course, you need to know how much data the remote endpoint should receive. In other words, you announce what you will send, the remote side accepts all the data blocks (over any number of connections), and when the remote sees that it has all the blocks needed for the operation, it performs the update.

There is no need for something like TCP, but of course some work is required.
When the remote end receives data, it needs to aggregate the data from multiple connections until all required blocks have arrived. The source end should send the number of chunks and the size of each (similar in purpose to Content-Length in the HTTP world).
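Something like this hypothetical tracker is what I picture on the receiving end (a Go sketch with made-up names, under the assumption that the sender announces the block count up front):

```go
package main

import "fmt"

// transfer tracks the blocks of one announced file until all have arrived,
// regardless of which connection delivered them.
type transfer struct {
	blocks   [][]byte // one slot per announced block
	received int
}

// newTransfer is built from the sender's announcement: the number of
// blocks (and, in a real design, their sizes and hashes).
func newTransfer(numBlocks int) *transfer {
	return &transfer{blocks: make([][]byte, numBlocks)}
}

// addBlock stores block i and reports whether the file is now complete.
func (t *transfer) addBlock(i int, data []byte) bool {
	if t.blocks[i] == nil {
		t.blocks[i] = data
		t.received++
	}
	return t.received == len(t.blocks)
}

func main() {
	t := newTransfer(3)
	// Blocks may arrive in any order, from any connection.
	fmt.Println(t.addBlock(2, []byte("C"))) // false
	fmt.Println(t.addBlock(0, []byte("A"))) // false
	fmt.Println(t.addBlock(1, []byte("B"))) // true: ready to apply the update
}
```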

P.S. I read it briefly, but it does not look like a detailed description of what happens under the hood. Is this the correct document? https://docs.syncthing.net/specs/bep-v1.html

P.P.S. A dirty workaround which could do the trick:
We could start a Syncthing process for each inner directory, transparently to the user. If you want to sync C:\SomeData and inside it you have Data-A, Data-B and Data-C, then you could internally (inside Syncthing) run the same code for C:\SomeData\Data-A, C:\SomeData\Data-B and C:\SomeData\Data-C, while the user sees only C:\SomeData in the UI. It would need some wrapper code, but probably less than you would expect for multi-connection data transfer. Even though I would benefit from such tricks, from a code point of view it is not a good idea, but as a temporary “solution” it may not be so bad.
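A very rough Go sketch of that wrapper idea (the paths are made up, and the flag for a separate configuration directory is an assumption on my part; check the Syncthing docs for the exact switch):

```go
package main

import (
	"os/exec"
	"path/filepath"
)

func main() {
	root := `C:\SomeData`
	subdirs := []string{"Data-A", "Data-B", "Data-C"}

	var procs []*exec.Cmd
	for _, d := range subdirs {
		// One Syncthing instance per subdirectory, each with its own
		// configuration directory so they do not step on each other.
		// NOTE: "--home" is assumed here; verify the exact flag in the docs.
		home := filepath.Join(root, ".syncthing-"+d)
		cmd := exec.Command("syncthing", "--home", home)
		if err := cmd.Start(); err != nil {
			panic(err)
		}
		procs = append(procs, cmd)
	}

	// Wait for all instances (in reality you would supervise and restart them).
	for _, cmd := range procs {
		_ = cmd.Wait()
	}
}
```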

I also use aria2c, and have even built a version with libssh2 and torrent support for Windows and Linux using the latest code from the repository. I have also increased the default maximum connection count from 16 to 256.

So it is an amazing tool for an “aimed shot”, but for “carpet bombing” we need something like Syncthing.

Everything seems simple when you don’t understand the details. There is of course a connection abstraction in Syncthing. Please feel free to implement multiple connection management underneath that.

1 Like

For the current topic the details don’t matter if the abstraction is correct.

P.S. And as I said already, I am not a Go developer. C# and C++ are OK, but not Go.

Indeed none of this matters unless someone is going to sit down and implement it. :stuck_out_tongue:

5 Likes

Everything seems simple when you don’t understand the details

That’s exactly right. I’m often tempted to provide my “valuable insight” into how something could be done in principle, based on “years of experience as a software engineer”, without understanding the deeper details of a particular technology. More often than not, if I dig deeper I discover that, since I was not familiar with that particular technology and that particular implementation, my general thinking does not apply - the devil is in the details.

So as a rule for myself I try not to tell other people that their job is easy, unless I’m prepared to put my money where my mouth is, roll up my sleeves and implement it myself - because it’s so easy. I usually do not excuse myself with the fact that I don’t know a particular language or framework, because there is a good chance that if I knew them I would realise that it is not easy at all :wink:

4 Likes

This topic was automatically closed 30 days after the last reply. New replies are no longer allowed.