Preventing Data Corruption using Syncthing

I’ve been looking for a tool that already keeps backup/parity data (e.g. BTSync, CCC, Syncthing) and could use it to provide protection against bitrot, too.

I have a ZFS NAS, which protects against bitrot by comparing the current hash against a stored hash on every read. When the two don’t match, it transparently repairs the damage by reading from the mirror/parity disks and, assuming that read passes the hash test, replacing the corrupted data with it.

I don’t want other devices, which lack this protection, to propagate bitrotted files from their less robust file systems to the ZFS NAS. That would negate the benefit of having it.

Having file versioning (provided by ZFS, BTSync, Syncthing, and backups like Time Machine and Arq) doesn’t do much to help. By the time you realise a file has become bitrotted, it may be too late.

It’s not always even obvious that a file has been damaged when you do access it. Imagine a few bits are flipped in a long plain text document. Over months or years you edit this document, not noticing it has some corruption. When you eventually discover it, all your incremental backups also have it by that time, and even if they didn’t, just reverting to an older version would lose you any legitimate edits you’d made since the rot started.

Syncthing would have to constantly (or at least periodically) recalculate and verify the hash for already synced files, and when it finds a mismatch, replace the file with fresh data from other devices.
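To make that concrete, here is a rough sketch in Python of what such a periodic re-verification pass could look like. It is not how Syncthing works internally; the names file_hash, find_mismatches and known_hashes are made up for illustration, and SHA-256 over the whole file is just an assumption for the example.

```python
import hashlib
from pathlib import Path


def file_hash(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_mismatches(root: Path, known_hashes: dict[str, str]) -> list[str]:
    """List relative paths whose current hash differs from the recorded one.

    known_hashes is a hypothetical record of {relative path: hex digest}
    saved when each file was last synced. Missing files are skipped here.
    """
    mismatched = []
    for rel_path, expected in sorted(known_hashes.items()):
        target = root / rel_path
        if target.is_file() and file_hash(target) != expected:
            mismatched.append(rel_path)
    return mismatched
```

Anything returned by find_mismatches would be a candidate for re-fetching from another device, but a mismatch on its own only says the content changed, not why.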

But it would need to be able to distinguish between bitrot and legitimate edits. Imagine Syncthing is shut down (or has crashed) for a period of time, during which legitimate edits are made. How would Syncthing reliably know that those changes should be propagated, and are not bitrot?

The creator of CCC told me that there are often files which are legitimately changed but have the same size and mtime, making it difficult to know for sure.
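One heuristic for telling the two apart is to treat a content change with unchanged size and mtime as suspicious. A minimal sketch in Python, assuming a hypothetical Snapshot record stored at the last sync (the names Snapshot and classify are mine, not from any tool):

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Snapshot:
    """Hypothetical per-file metadata recorded at the last sync."""
    size: int
    mtime_ns: int
    sha256: str


def classify(old: Snapshot, new: Snapshot) -> str:
    """Guess why a file differs from its last recorded snapshot."""
    if new.sha256 == old.sha256:
        return "unchanged"
    if new.size == old.size and new.mtime_ns == old.mtime_ns:
        # Content changed but the metadata did not: possibly bitrot,
        # possibly a legitimate write that kept the same size and mtime.
        return "suspect"
    # Size or mtime moved, so treat it as a normal edit to propagate.
    return "edited"
```

Given the point above about legitimate changes that keep the same size and mtime, "suspect" should only ever mean "ask a human or compare against another peer", not "overwrite automatically".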

doylep, you might want to look into using another tool dedicated to the task of detecting bitrot, and then manually restoring from other peers, from Syncthing’s version history, or from another incremental backup. I recently discovered https://github.com/ambv/bitrot and intend to give it a workout.
