How does syncthing detect files to be synced?

I understand that Syncthing detects file changes through changes in modification time, size or permissions. But what happens if a file’s contents change without any change in modification time, size or permissions? rsync, for example, lets me use the “-c” option in such cases.

Thank you Frank

The change is not detected if nothing changes in the metadata.
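
To illustrate the point, here is a minimal Python sketch of a metadata-only check next to a full content hash. It is not Syncthing’s scanner code; the names `Meta`, `snapshot`, `content_hash` and `demo` are made up for this example. The demo overwrites a file in place and then restores its old mtime, which mimics a tool that deliberately preserves timestamps.

```python
import hashlib
import os
import tempfile
from typing import NamedTuple


class Meta(NamedTuple):
    # The three attributes mentioned in the question (hypothetical container type).
    size: int
    mtime_ns: int
    mode: int


def snapshot(path):
    """Collect only the metadata; cheap, does not read file contents."""
    st = os.stat(path)
    return Meta(st.st_size, st.st_mtime_ns, st.st_mode)


def content_hash(path):
    """Read the whole file and hash it; cost grows with file size."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def demo():
    """Return (metadata_unchanged, content_unchanged) after a hidden write."""
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "container.bin")
        with open(path, "wb") as f:
            f.write(b"A" * 4096)
        before_meta, before_hash = snapshot(path), content_hash(path)

        st = os.stat(path)
        with open(path, "r+b") as f:  # overwrite in place, same size
            f.write(b"B" * 4096)
        # Put the old timestamps back, as a timestamp-preserving tool would.
        os.utime(path, ns=(st.st_atime_ns, st.st_mtime_ns))

        after_meta, after_hash = snapshot(path), content_hash(path)
        return before_meta == after_meta, before_hash == after_hash
```

Calling `demo()` should report the metadata as unchanged while the content hash differs, which is exactly the situation a metadata-based scan cannot see.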

What a pity. It would be great to see such a feature in a future release.

Thank you Frank

What’s the actual use case that would benefit from this?

The use case is, for example, virtual encrypted disks (also called volumes) as created by VeraCrypt/TrueCrypt. These are encrypted files of fixed size, which can be mounted and used like a disk. The file modification time does not change when the file contents change (this is intentional, though it can be switched off).

That’s a very niche use case for which you already have a workaround (disable fudging mtime), so I think it’s suboptimal to spend effort on a feature that already has a workaround and no other use cases.

I guess from your point of view you think it’s important to hide the mtime of the encrypted container, but it’s a false assumption, as last modification time would still be tracked by syncthing, so it’s information that is leaking even if you decide to hide it.

I am not sure that having some things encrypted is really niche today. And actually it is not my decision to keep the modification time unmodified. That is the intentional default for VeraCrypt (and for TrueCrypt). Personally, I do not care, I just kept the default behavior. By the way, I do not quite get why information about the modification time would still leak. I acknowledge that much file syncing or backup software fails in this scenario. Yet I think it is at least worth a note in the documentation that in such a scenario file changes may remain unnoticed (just to avoid a pitfall some people may stumble into). And maybe in the future somebody is interested in adding this feature to Syncthing :wink:

It’s not that it would be hard to add, it’s that it would be painful to the point of uselessness to read and hash your possibly multi-gigabyte encrypted disk image on every scan just to see if it has changed.
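
To make the cost concrete, here is a rough Python sketch of block-wise hashing. Every byte has to be read to produce the hashes, so the work scales with the size of the container regardless of how little actually changed. The block size and the function names are assumptions chosen for the example, not Syncthing’s actual values or code.

```python
import hashlib
import io

BLOCK_SIZE = 128 * 1024  # assumed block size, for illustration only


def block_hashes(stream, block_size=BLOCK_SIZE):
    """Hash a stream block by block; reads the entire stream."""
    hashes = []
    while True:
        block = stream.read(block_size)
        if not block:
            break
        hashes.append(hashlib.sha256(block).digest())
    return hashes


def changed_blocks(old, new):
    """Indices of blocks that differ, or that exist in only one of the lists."""
    n = max(len(old), len(new))
    return [
        i for i in range(n)
        if i >= len(old) or i >= len(new) or old[i] != new[i]
    ]
```

With two hash lists in hand, only the differing blocks would need to be transferred, but producing the second list still means reading the whole file on every check.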

This is not a fringe case. There are lots of times where file contents change without metadata changes. Many databases have that as an option, for example.

A scan should not be necessary to detect changes, even on Windows. On every operating system except Windows, it should be rather trivial. You can tell inotify, for example, to notify you on file writes, and it will do so regardless of whether the modification timestamp on the file changes or not. On Windows you might have to resort to using change journal records. That means you have to sift through a whole volume’s worth of change notifications looking for the ones you are interested in, but it should be possible to do this efficiently. I think the fsnotify peeps were working on change journal support at one point.

Databases don’t update mtimes because databases usually use memory-mapped files and not real files. Memory-mapped files don’t fire inotify events, as it’s effectively modifying memory and not files, so this would not work.

I don’t think either VeraCrypt or mmapped files are real use cases that need solving.

And even if they are real use cases, they are certainly fringe cases – in the context of things that Syncthing syncs.

My files are not multi-gigabyte. But I acknowledge that other people may have files of sizes that render regular hashing prohibitive. Should you ever run out of ideas for what to add to Syncthing, please feel free to reconsider this subject.

inotify does indeed fire on memory-mapped files when they are msynced to disk. Windows update journalling certainly captures database file writes. And both APIs are eminently usable for VeraCrypt containers, which should make detecting file writes without metadata changes pretty simple. Syncthing’s rsync-lite-ish protocol also seems eminently applicable to VeraCrypt containers. Honestly, Syncthing seems like a perfect candidate, technically and philosophically, to embrace the use case of sharing encrypted containers. I’m rather surprised it doesn’t already.

From my tests a few years ago, “inotify” which I guess we are using as a term for many implementations, does not fire consistently across all platforms/implementations.

I still don’t see a use case that can’t get by without this. You already have a workaround for VeraCrypt, so it does not feel like a significant enough argument to spend effort on this.

Syncthing absolutely embraces syncing shared encrypted containers. Lots of people do precisely that. Syncthing just doesn’t embrace detecting changes in setups that intentionally hide the fact that a file changed.

knarf, I have the exact same use case as you (an encrypted container that does not change date upon mod). I have a workaround that I’ve used for years and it works fine. I have a shortcut on my Windows desktop that includes:

%windir%\system32\cmd.exe /k copy file.tc +,,

After I close the repository, this shortcut updates the date and then the file is caught in the sync. It essentially duplicates a Linux touch command. (I also have shortcuts that automate opening and closing the repository, but they are incidental to the point of this thread.)
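
For readers not on Windows, the same idea can be sketched in a few lines of Python. This is only an illustration of the touch approach described above; the function name `touch` and the file name in the usage note are placeholders.

```python
import os


def touch(path):
    """Set access and modification time to now without altering the contents."""
    os.utime(path, None)
```

Running something like `touch("file.tc")` after dismounting the container updates the modification time, so the next scan picks the file up.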

Wouldn’t it be possible to add a full rescan that re-reads files at a set interval? I could even see it being useful to have it fire once a month or so by default and raise a sync conflict if files changed unexpectedly, with the possibility to suppress the warnings for people with encrypted containers and the like. That seems better than doing nothing at all and letting it bite people who might use Syncthing for backups and whatnot.

By default?! Hashing potentially a whole lot of data using tons of resources and causing potentially long timespan without synchronisation is a terrible idea (by default!) - just look around for topics about long scans.

Functionality to recheck files has been brought up, more in the context of detecting accidental data corruption (disk defects and the like). I think that would definitely have merit, not on my roadmap though.

I believe it is a good idea in principle. It could be a per-folder option (not a global one) and, of course, should be disabled by default. A separate counter for full rehashing could be added to the interface, together with a radio button to select the behavior when a difference is found: 1. Treat it as a regular change; 2. Conflict; 3. Replace with the remote copy.

Another option that I would like to see is access to the hash value through the REST API, and potentially to the hash of each piece of a big file, to control partial downloading. For example, I have a huge zip archive and want to extract a few files. I could then sync only the header and the part that contains the data.

It seems such options do not contradict the general concept of Syncthing.

Sorry, but no, Syncthing as a transport is definitely out of scope.

You can implement your own client that speaks the protocol to do that, you can even reuse the go libraries to get you 90% there.

I don’t think the feature would have enough general users to justify the required indefinite maintenance of the code behind it.

The protocol is pretty simple, the protobuf schema allows you to easily generate the protocol messages in close to any language.