Parallel pulling/rescan?

Hi everyone,

first of all: What a great tool. I am currently setting it up to synchronize two large data sets over an internet link and it is working very well.

My question is about Syncthing’s behaviour when pulling files. In my setup there are very large files that can take a day to pull, alongside very small ones. If a pull of a large file has already started, is it possible to transfer a small file simultaneously? The benefit would be that the large files won’t hold up all the other transfers.

I have already tried the “Smallest First” ordering on the folder, but it seems this ordering does not apply once a transfer is already running.

I guess what I’m looking for is something like this:

  1. Rescan -> find new large file

  2. Start transfer of large file

  3. Rescan some time later -> find a new small file (large transfer is still running)

  4. Transfer small file immediately

  5. Continue large transfer until it is finished.

I’d be happy to read your opinions on whether this is possible with Syncthing, or if you have an idea for a workaround.

Best Regards,

Florian Utzt

It’s parallelized, but at the block level. So in most cases it’ll have at most a handful of files in flight at the same time, and for very large files it will usually be just that one. Currently I don’t think there is a way to do what you want, other than separating larger and smaller files into different folders.

If you restart, it will pull according to the order you configured. Once it’s in motion, changing the setting has no effect.

You can also increase the number of copiers in the advanced config, which makes it pull a few files at a time (up to a maximum of “pullers” requests on the wire across all files).
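For reference, the copiers/pullers knobs live per folder in config.xml (also reachable via the Advanced settings in the GUI). A minimal sketch, assuming the element layout of a 0.12/0.13-era config; the folder id, path, and values here are placeholders:

```xml
<folder id="big-data" path="/data/big-data" rescanIntervalS="60">
    <!-- 0 means "use the internal default"; raising copiers lets more
         files within this folder be worked on concurrently -->
    <copiers>4</copiers>
    <pullers>0</pullers>
    <order>smallestFirst</order>
</folder>
```

Remember to restart (or let the config reload) before the next pull starts, per the note above.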

Lastly, you can open the out-of-sync dialog and bump individual files to the top of the queue (only files that haven’t been started or scheduled yet, though).
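The same bump can be scripted against the REST API’s /rest/db/prio endpoint (the endpoint and its folder/file parameters appear later in this thread; the base URL, API key, folder id, and file path below are placeholders). A sketch that only builds the request, so you can inspect it before sending:

```python
import urllib.parse
import urllib.request

def bump_file(base_url: str, api_key: str, folder: str, path: str):
    """Build a POST to /rest/db/prio, which moves a still-queued file to
    the front of the pull queue (files already in progress are unaffected).
    Send it with urllib.request.urlopen(req)."""
    query = urllib.parse.urlencode({"folder": folder, "file": path})
    return urllib.request.Request(
        f"{base_url}/rest/db/prio?{query}",
        method="POST",
        headers={"X-API-Key": api_key},
    )

req = bump_file("http://localhost:8384", "secret", "default", "docs/small.txt")
print(req.get_method(), req.full_url)
```

Note the caveat above still applies: bumping only helps for files that haven’t been started or scheduled yet.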

Wow. Replies from two developers within 20 minutes. Respect!

I just tried restarting the Syncthing service (I guess that’s what you meant by “restarting”?), but it carried on as before. So I think you’re right, Audrius. Once it’s in motion → rien ne va plus (“no more bets”). :slight_smile:

The “copier” action seems to be described differently in this thread, but I’ll test it out as soon as I can.

Finally, Jakob’s suggestion about using multiple folders might be a good workaround and I’ll look into it. Is there a theoretical/practical limit to the number of folders?

Audrius’ suggestion about increasing copiers is a good one and something I didn’t think of. Do try that; it might give you more file-level parallelization.

Regarding the number of folders: if you are creating them by hand, I think you will give up before Syncthing does. There’s no theoretical limit, but we haven’t tested scaling to astronomical numbers. Some state is kept per folder, so memory usage increases somewhat with the number of folders.

If you changed the pull order, saved and restarted syncthing, it should pick the setting up. If it hasn’t, it’s a bug.

Thanks a lot for the insights. Just tested, but I can’t alter Syncthing’s behaviour by restarting or by changing the folder settings back and forth. The file order even persisted through a program update from 0.12.22 to 0.13, which I did when I was a little desperate :slight_smile: . So at the moment, my conclusion is that once the transfer has started, the order really is final. Maybe I’ll give it another try with a fresh install some time later.

I also tried increasing copiers (first to 10, then to 100), but wasn’t able to produce different results.

What DID work was creating multiple folders, as suggested. There’s still the risk of many files clogging up one folder’s queue, but totally unrelated files in different folders can be transferred in parallel. I’ll stick with that for the moment.

If another idea comes to your mind, please tell me - I’ll be happy to test it out.

Best Regards,

Florian Utzt

Well, that’s not the case for me. I’m trying to sync a 50 GB VM split into 13 files of roughly 4 GB each. Syncthing has created 13 temp files and is constantly running in circles, each time pulling a bit of each file. It’s been like that for 5 hours, and due to this cyclic approach I don’t know what the progress is. I’m not 100% sure, but I’d say it didn’t do it this way in v0.12. I have just disabled temp indexes for that folder (no reason, just a random test) to see if that helps.

So the setting has to be applied before downloading starts; hence you change the setting, save, and restart, and it should work. The only visible change when setting copiers to 100 will be in the out-of-sync modal.

If you changed the pull order, saved and restarted syncthing, it should pick the setting up. If it hasn’t, it’s a bug.

There might be a misunderstanding. I have no doubt that the settings get applied upon restart, but they don’t affect transfers that are already underway.

I took a look at the REST API endpoint /rest/db/need. There are three states: progress, queued, and rest.

  • My large file is in “progress”, since it is being downloaded.
  • Nothing is in “queued”.
  • Everything discovered in the meantime goes into “rest” and stays there until the large file is finished.

-> This behaviour is exactly as stated in the documentation. I was just looking for a way to move the “rest” files up into “queued” (or better yet, into “progress”) before the one large file has finished. It doesn’t seem to be possible (I already checked /rest/db/prio). Which is also not really a problem, more of a nice-to-have.
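For anyone following along, the three buckets are easy to inspect. A sketch using a canned response shaped like the /rest/db/need output described above (the file names are made up; in real use you would GET this from the running Syncthing with an X-API-Key header):

```python
import json

# Canned example of a /rest/db/need?folder=default response body.
sample = json.loads("""
{
  "progress": [{"name": "huge-dataset.bin"}],
  "queued":   [],
  "rest":     [{"name": "notes.txt"}, {"name": "readme.md"}]
}
""")

def by_state(need: dict) -> dict:
    # Map each pull state to the list of file names currently in it.
    return {state: [item["name"] for item in need.get(state, [])]
            for state in ("progress", "queued", "rest")}

print(by_state(sample))
# {'progress': ['huge-dataset.bin'], 'queued': [], 'rest': ['notes.txt', 'readme.md']}
```

Polling this while a large pull runs makes it easy to watch newly scanned files accumulate in “rest”.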

For clarification: this situation also differs from bedosk’s use case, since his VM files would likely all be discovered within a single rescan and would all end up in “queued” or “progress”, not in “rest”. I suspect that if he added one more tiny file before the VM transfer finished, he would have to wait for the 50 GB to transfer before the tiny file follows.

Right, that is a different case. Syncthing starts downloading before it’s aware of everything available. Sadly, there is no workaround for that apart from splitting the data up.

Yes, I agree. Thanks again for your time, guys.

Provisional conclusion: I will live with the behaviour and will use rsync/robocopy to preseed small files if I need them faster than Syncthing can provide. I guess it won’t happen very often.

I looked into automatically (script-based) creating folders in the Syncthing config along with auto-creating ignore files, but decided against it because of the complexity. The risk of creating overlapping folders is too high compared to the small gain in sync comfort. :slight_smile:

This topic was automatically closed 30 days after the last reply. New replies are no longer allowed.