Syncthing retransmits large files on rename

According to the FAQ and this:

Syncthing should detect file renames and not retransmit the files. I am organizing a large collection of lectures (1 GB to 3 GB per file) and renaming is part of the process. I am not talking about copying before deleting (without transmitting) but about retransmitting files after renaming.

Please remove the statement from the FAQ until this is fixed (unless it is not meant to be a feature).

-rsd

I don’t get what you want.

Syncthing should not retransmit renamed files.

From the FAQ: https://docs.syncthing.net/users/faq.html

Syncthing handles renaming files and updating their metadata in an efficient manner. This means that renaming a large file will not cause a retransmission of that file.

This is still true in most cases, yet subject to how many files we are talking about per rename operation, which is probably worth mentioning in the FAQ. Also, if the files live in a different Syncthing folder, this is somewhat expected.

Also, syncthing does not have a good way to indicate that the optimisation was taken, so you’d have to tell us how you verified that.

I am sorry, I wasn't clear enough.

They are in the same Sync folder:

$ find | wc -l
450
$ du -sh
322G    .

All files were 100% synced between the two hosts. I started to rename the files, maybe 5–10 files per minute. Rescan is set to 10m. After a while, the out-of-sync files on the other host went up to 45 GB. The upstream went up (maxed out). Looking at the out-of-sync files (the pop-up list of files), many files were showing 0 B, so I assume it did detect the renaming for those. OTOH, most files were showing their full size (2 GB, 3 GB, …), which matched the upload rate.

Stuff will still go out of sync and show large out-of-sync sizes, regardless of whether it gets renamed or not. The only way to know what exactly is happening is to enable debug logging.

If inotify is enabled, a few files a minute should be OK, as it delays deletes compared to new files appearing.

If it's on and you still see issues, it might be too many changes in a single scan period: the messages indicating changes are capped in size, which can split the addition and the removal of a file over separate messages, in which case the optimisation does not work.

It’s much worse when you rename directories, as then things are definitely split over multiple messages.

If you don’t use inotify, then the bets are somewhat off for large files due to the message splitting business I’ve explained above.

As usual, the answer is not binary, but I suspect that if you rename a single file, it should work as expected.


However, even that should result in copying the files locally and later deleting them, instead of a rename, but not in syncing actual data from a remote device, as we always send additions/changes before deletions. This should only break down when running out of disk space, in which case no new files are created, but deletions go ahead to free up space.

So we’d probably need model debug logs (probably also db for incoming indexes) to see what’s going on in your case.
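For reference, one way to enable those debug facilities is the STTRACE environment variable when launching Syncthing from a shell (the same facilities can also be toggled at runtime under Actions > Logs in the GUI); this is a sketch assuming a plain command-line install:

```shell
# Sketch: start Syncthing with the model and db debug facilities on.
# STTRACE takes a comma-separated list of facility names.
STTRACE=model,db syncthing
```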


I will enable the debugs (model + db), wait for the remaining sync to finish and start a new batch of renaming to try to catch on logs.

Do you think that it would help to:

  1. change the scan interval to something very large,
  2. disable inotify (watch for file changes),
  3. hit Rescan,
  4. rename the files,
  5. rescan again,

to get a useful log?
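For step 4 above, the rename batch might look something like this (the lecture file names are hypothetical stand-ins for the real ones):

```shell
# Sketch of a rename batch over the synced folder; names are made up.
for f in lecture-*.mp4; do
  [ -e "$f" ] || continue   # skip if the glob matched nothing
  mv -- "$f" "renamed-$f"
done
```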

I don’t have any clue how this problem is triggered, so from my perspective do whatever you need to repeat it - for the logs it doesn’t matter.

The more stuff happens in a single scan cycle, the more likely you’ll hit this problem.

You should keep inotify enabled, as it makes the problem worse (by triggering smaller, more frequent scans as it detects changes).

Can you give me a (contrived) outline of how a rename can result in the deletion being sent before the creation? I fail to think of any (unless inotify messes up and only detects the creation long after the deletion, but that would strike me as very odd).

In the inotify case it should never happen, as deletes are delayed, so worst case you end up with a copy (if the block map still works, which we don't know). In the manual scan case it can still happen.

Just for information, my system is Ubuntu 19.10 with 32 GB of RAM. The other host is macOS (I don't know the specs). I do have very long file names, up to 209 UTF-8 characters with accents (which makes them longer byte-wise).

So, how can I send you the logs privately (as they contain sensitive information that I can't post publicly)?

Hello, it seems that I have a very similar problem, but maybe I don't understand how it should behave.

I just installed syncthing on a linux server and on my win10 computer. I set it up to sync a folder in c:\users\username\xxx on the windows machine.

I copied a big file (300MB) in the synced folder.

When I rename the file on the linux machine, the name of the file on windows changes almost immediately (meaning that I can see the .syncthing.xxx.tmp file only for a very short time in the file explorer, then the filename is immediately changed).

When I change the name of the file in Windows and then do some ‘ls’ on the Linux machine, I can see the .syncthing.xxx.tmp file for much longer. I also see the size of the folder growing progressively until it reaches twice the size of my big file, and only then is the initial file (the one with the old name) removed.

I deduced that the rename event is caught correctly by the filesystem watcher on Linux, but it isn't handled correctly on Windows.

This is very disappointing, as it would mean that if I move or rename the files I want to sync, this leads to a very heavy copy load to get them synced, although it would have taken only a couple of seconds to apply the correct changes.

I tried this using SyncTrayzor or Syncthing-GTK integrations with exactly the same results.

Please help me understand how to get it to work on Windows as well.

Thanks!

It probably does not retransmit; it just copies from the other file.

You can track the progress in the web UI, which will tell you whether it's downloading stuff or just copying from the existing file nearby.

Also, all of this is not foolproof; Syncthing will in some cases redownload files.

There is no rename operation at the protocol level, we have to deduce a rename from “add” and “delete” with the same content.

For example, if you renamed a large directory, you'd produce a lot of “deleted” and “added” messages for many files (or simply two large messages for a single large file), which might get broken up into multiple network messages. The other side then gets one of them and starts downloading (or rather copying) before it knows that the other file is no longer needed, because that information arrives in a later message.
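The “same content” part is what makes a rename detectable at all: the content hashes don't change when only the name does. A small illustration, with sha256sum standing in for Syncthing's per-block hashing:

```shell
# A rename leaves the content hash untouched, which is what lets the
# receiving side satisfy the "added" file from the "deleted" one locally.
printf 'lecture data' > old_name.bin
before=$(sha256sum old_name.bin | cut -d' ' -f1)
mv old_name.bin new_name.bin
after=$(sha256sum new_name.bin | cut -d' ' -f1)
[ "$before" = "$after" ] && echo "content unchanged"
```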

Thank you for your answer.

Is it possible that the behaviour differs between Windows and Linux? I was really happy to see how fast the rename occurred on the Windows machine after I changed the name of the file on Linux…

Also, what is the reason for the ‘local’ copy if there is no redownload?

You mentioned tracking the progress in the web UI. I would be very interested in having this kind of tracking in a sort of log file, if possible, in order to make sure I understand the process correctly.

Is there a way to do that?

Thanks

I don’t think the core behaviour is different between Windows and Linux; the filesystem watcher behaviour is different, which leads to different behaviour in this logic.

I am not sure I understand your question about local copy.

There is no user-friendly way to monitor this from the logs. You could enable the model debug facility, which will produce some logs/numbers related to this, but it's not user-friendly or self-explanatory in any way.
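One other option, as a hedged sketch, is the REST events API that the web UI itself uses: ItemStarted and ItemFinished events bracket each file's sync. The address and API key below are assumptions for a default local install:

```shell
# Sketch: poll sync item events once. Get the API key from
# Actions > Settings in the GUI (the value here is a placeholder).
api_key="YOUR-API-KEY"
url="http://localhost:8384/rest/events?events=ItemStarted,ItemFinished&since=0"
curl -s -H "X-API-Key: $api_key" "$url" || true   # tolerate no local instance
```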

By ‘local copy’ I referred to the fact that, in my example, you suggested the file on the Linux server would be copied locally instead of renamed. I wondered what the reason for this copy is.

I’ll give the tracking in the web UI a try.

Thank you so much for your very quick and complete answers.

I’ve explained the reason in my initial reply.