Detecting a corrupt file and replacing it using the 'swarm'?

Hi again, just wondering: could Syncthing conceptually detect a corrupt file, flag it internally, and have that Syncthing instance download a valid copy from the other machines? In effect you'd then have an automatic file repair system. Is this a feature a future release of Syncthing could have? If so, I could submit a feature request.

If there were a way to detect a corrupted file we could in principle do that. I'm not sure what it would be, though, apart from things like trying to read it and hoping for I/O errors. But you're on FreeBSD, presumably using ZFS, so you don't have corrupted files. :slight_smile: Or if you do, it's at the application level and I don't see how Syncthing could detect it.

I thought files included a checksum, and software could read the file and calculate the checksum, so a corrupted file would produce a checksum different from the one actually written with the file? I guess I got that wrong. If not, could Syncthing run a cryptographic hash function on each file to generate its own checksum, store that in a DB, etc., and then use that as a 'flag' input to its logic? I was just curious really; thought it was a decent idea.

I actually use UFS. I've considered using ZFS, but my content is mostly low-value stuff and I do backups, etc. (although I guess it's conceivable I've been backing up a corrupted file for years if I never check a file's integrity), so I ended up not doing ZFS yet.

No, you’re right, there is a cryptographic hash of every block in a file. But what differentiates corruption from an intentional change? We can’t tell. A file system with hashes can tell, because it calculated the hash when the file was written.
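
For anyone curious what "a cryptographic hash of every block" looks like in practice, here is a minimal Go sketch of per-block SHA-256 hashing. It is not Syncthing's actual scanner code, and the fixed 128 KiB block size is an assumption for illustration (Syncthing chooses block sizes based on file size). The point it demonstrates: rehashing and comparing against stored hashes only tells you *that* a block's content differs, not *why*.

```go
package main

import (
	"crypto/sha256"
	"fmt"
	"io"
	"os"
)

// blockSize is an assumption for illustration only; Syncthing picks block
// sizes dynamically depending on file size.
const blockSize = 128 << 10 // 128 KiB

// hashBlocks returns one SHA-256 hash per block of the file.
func hashBlocks(path string) ([][sha256.Size]byte, error) {
	f, err := os.Open(path)
	if err != nil {
		return nil, err
	}
	defer f.Close()

	var hashes [][sha256.Size]byte
	buf := make([]byte, blockSize)
	for {
		n, err := io.ReadFull(f, buf)
		if n > 0 {
			hashes = append(hashes, sha256.Sum256(buf[:n]))
		}
		if err == io.EOF || err == io.ErrUnexpectedEOF {
			break
		}
		if err != nil {
			return nil, err
		}
	}
	return hashes, nil
}

func main() {
	// Comparing these hashes against previously stored ones reveals a
	// changed block, but a flipped bit and a deliberate edit look the same.
	hashes, err := hashBlocks(os.Args[1])
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	for i, h := range hashes {
		fmt.Printf("block %d: %x\n", i, h)
	}
}
```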

We could consider any local write to be unauthorised, but then that's just a receive-only folder.

Hmmm, yeah, duhhh, well now I feel like a dummy lol. Thank you. :slight_smile:

I still think that detection of data corruption through comparison of checksums is a very useful feature. It is of course only intended for 'write once' files like photos. A first step might be a Syncthing 'write once' mode in which Syncthing only creates files but never overwrites existing ones. Then Syncthing could send a warning email when a difference in checksums is detected.

The existing file versioning might be a workaround in Syncthing to achieve this bitrot detection with existing tools.


The problem with this approach is that it takes Syncthing further from its original purpose, which is "continuous file synchronisation". A write-once system is probably best handled by a tool that is meant for that purpose, IMO.


I'm sure there's already an abundance of tools that will calculate hashes for a file tree, write them to a file, and then later verify the tree against that file. Wrap that with something that deletes files that fail the check and hits the revert endpoint in Syncthing, and you're good to go.
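
A rough sketch of such a wrapper in Go, under several assumptions: the folder is a receive-only folder, the manifest uses a sha256sum-style format ("hash  relative/path" per line), the GUI listens on the default localhost:8384, and the API key arrives via a made-up SYNCTHING_API_KEY environment variable. The `POST /rest/db/revert?folder=<id>` call is Syncthing's "revert local changes" endpoint, which re-downloads the deleted files from the other devices.

```go
package main

import (
	"bufio"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"io"
	"net/http"
	"os"
	"path/filepath"
	"strings"
)

// loadManifest reads a sha256sum-style manifest: "<hex sha256>  <relative path>".
func loadManifest(path string) (map[string]string, error) {
	f, err := os.Open(path)
	if err != nil {
		return nil, err
	}
	defer f.Close()

	want := make(map[string]string)
	sc := bufio.NewScanner(f)
	for sc.Scan() {
		fields := strings.SplitN(sc.Text(), "  ", 2)
		if len(fields) == 2 {
			want[fields[1]] = fields[0]
		}
	}
	return want, sc.Err()
}

// fileSHA256 returns the hex SHA-256 of a file's contents.
func fileSHA256(path string) (string, error) {
	f, err := os.Open(path)
	if err != nil {
		return "", err
	}
	defer f.Close()
	h := sha256.New()
	if _, err := io.Copy(h, f); err != nil {
		return "", err
	}
	return hex.EncodeToString(h.Sum(nil)), nil
}

func main() {
	// Assumed arguments: folder root, manifest file, Syncthing folder ID.
	root, manifest, folderID := os.Args[1], os.Args[2], os.Args[3]
	apiKey := os.Getenv("SYNCTHING_API_KEY") // assumption: key passed via env

	want, err := loadManifest(manifest)
	if err != nil {
		panic(err)
	}

	corrupted := false
	for rel, sum := range want {
		got, err := fileSHA256(filepath.Join(root, rel))
		if err != nil || got != sum {
			fmt.Printf("check failed for %s, deleting\n", rel)
			os.Remove(filepath.Join(root, rel))
			corrupted = true
		}
	}

	if corrupted {
		// Ask Syncthing to revert the receive-only folder to the cluster's
		// version, so the deleted files get fetched again from other devices.
		req, _ := http.NewRequest("POST",
			"http://localhost:8384/rest/db/revert?folder="+folderID, nil)
		req.Header.Set("X-API-Key", apiKey)
		resp, err := http.DefaultClient.Do(req)
		if err != nil {
			panic(err)
		}
		defer resp.Body.Close()
		fmt.Println("revert requested:", resp.Status)
	}
}
```

Run it from cron after generating the manifest once (e.g. with sha256sum from inside the folder root), and corrupted files get removed locally and pulled fresh from the cluster.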