RenameFile operation doesn't seem to be performed everytime when renaming a file in a Syncthing folder

enfuzion · February 9, 2017, 8:56am

I’m studying rename file operation of syncthing recentlly and I found some problems. The funtion renameFile() in source file rwfolder.go dosen’t seem to be executed each time when I rename a file on the opposite machine, and once it is not called, Syncthing will genrate a new file by the name I renamed to and copies data into it instead of just renaming the old one. I traced the bep message and found that sometimes Syncthing receives one bep contains both Add and Delete FileInfos after doing rename and sometimes it receives bep message twice and each one cotains one FileInfo. It seems that the renamefile funtion will be executed exactly as serval FileInfos are received in one time rather than receive them for several times. So I wonder the reason why bep message sometimes could carry serval FileInfos and sometimes just carry one. Is it possible to make Syncthing send both Add and Delete FileInfos in one message everytime when a file is renamed? If it is not, is there any other way to ensure the rename operation will happens on local machine as the same operation happens on the others?

calmh · February 9, 2017, 9:45am

Ah, this is probably timing related. We first scan for additions, then for deletes. Updates are sent to other devices with some delay for batching. If the time between adding the file and seeing the delete is too long, the updates will end up in separate batches. Then the device on the other side just sees the add and can’t do a rename.

Good catch. Not immediately obvious to me how to fix it without potentially delaying updates for a long time.

enfuzion · February 9, 2017, 10:25am

I found that the function updateLocals() in file model.go will run after the scanning process discovered the file changes. It has code like this:

files.Update(protocol.LocalDeviceID, fs) events.Default.Log(events.LocalIndexUpdated, map[string]interface{}{ “folder”: folder, “items”: len(fs), “filenames”: filenames, “version”: files.Sequence(protocol.LocalDeviceID), })

files.Update seems to do the db update job and the events.Default.Log seems to notify the sender to send indexes to other machines. Maybe make the LocalIndexUpdated event be triggered after all db updatejobs are done is a way to keep the file messages could be sent together in once?

calmh · February 9, 2017, 10:34am

In some cases, yes. But given that we hash all files first and then scan for deletes afterwards, it can take arbitrary time between discovering the new file and the delete. For example if you rename a directory with a bunch of ISO files in it. In general, we want to get those index entries out there as soon as they’re completed (-ish…) and not wait for the complete scan to finish.

AudriusButkevicius · February 9, 2017, 10:39am

I am surprised you were not aware of this.

calmh · February 9, 2017, 10:41am

No, I agree, it’s obvious for the cases where there is lots to do. I think the one case where it’s somewhat annoying that it doesn’t work reliably is if a single file is renamed within a folder. This should, ideally, always work.

enfuzion · February 9, 2017, 10:49am

Ok, I get it. Thanks for the reply.

AudriusButkevicius · February 9, 2017, 10:50am

To be honest, the only way I can see to solve this is to do the “was deleted” checks sender side and make sure they end up in the same batch.

calmh · February 9, 2017, 11:00am

Mmm. I doubt the added complexity warrants it, but we could

keep an index of full file hash => FileInfo
when the scanner detects a new file, it checks this index, and
- stat each file with the same hash, in the same folder
- if it doesn’t exist, commit the new entry and the delete together.

But this is still no guarantee. In the index sender we might put the message boundary between the add and the delete. For example, if we’ve already queued enough entries that adding the delete as well to the message would exceed the desired message size, or if the file itself is large enough to warrant it’s own message.

To do it “atomically” we’d need to add a field to the FileInfo itself indicating that it’s a rename. That will probably have vast negative consequences and cause weird situations by itself…

I think this is just a caveat to be aware of - renames are best effort.

Changing this constant to a few seconds instead of zero would probably bring up the hitrate a fair amount though.

AudriusButkevicius · February 9, 2017, 11:45am

I think we could add logic that makes sure that any deletes after the boundry are merged into the same batch.

Yet I think we feed index senders from the database, not from the scanner

calmh · February 9, 2017, 12:03pm

Exactly. So the batching is only half of the equation.

enfuzion · February 14, 2017, 7:47am

I found that Syncthing inotify will sometimes notify the old file before the new file separately when a rename is detected, and if the old file notification is recived by Syncthing firstly, it will send a delete message to the remotes before the the scanning on the addition file is over, so the remote Syncthing could delete the old file on its machine before reciving the addition file message, and this also makes the rename operation mismatched. So I think there may be a litte work to do with inotify to adjust its handling on rename notification to make rename case works properly?

calmh · February 14, 2017, 9:02am

Probably. I bet this is handled differently on all OS:es and correlating it in the inotify layer is going to be painful.