Bit Rot Prevention Scheme

I believe Syncthing could protect against bit rot. Consider the following scheme, which could be used as a pre-processing step for any synchronization tool.

On each device store a record of the time stamp and checksum for each file. This information will be used to detect bit rot. Consider the following example.

Suppose device A has a corrupted file. Then device A could verify that the file is corrupt by checking against its records (the timestamp would be the same but the checksum would be different). To restore the file, we would then just retrieve it from device B.
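A minimal sketch of that check, assuming a per-file record of modification time and SHA-256 checksum. The names `make_record` and `classify` are hypothetical and are not part of Syncthing:

```python
import hashlib
import os


def file_checksum(path):
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def make_record(path):
    """Record the timestamp and checksum for one file."""
    return {"mtime_ns": os.stat(path).st_mtime_ns, "sha256": file_checksum(path)}


def classify(path, record):
    """Compare a file on disk against its stored record."""
    current = make_record(path)
    if current["sha256"] == record["sha256"]:
        return "unchanged"
    # Contents differ. If the timestamp did not move, nothing legitimately
    # wrote to the file, so treat it as suspected bit rot.
    if current["mtime_ns"] == record["mtime_ns"]:
        return "suspected bit rot"
    return "modified"
```

A file classified as "suspected bit rot" would then be the candidate to re-fetch from device B.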

Obviously, this scheme would take longer to execute, but it wouldn’t need to run as frequently, especially since most synchronization tools don’t synchronize files with the same timestamp, i.e., the corrupted file wouldn’t be propagated unless the file was changed before the next scan for bit rot.

How do we determine the difference between bit rot and intentional change?

Also, if you are concerned with bit rot, why not use tools that are meant to check for bit rot?

The fact that you can do something doesn’t mean you should do it.

We could provide tunneling capabilities over the sync/relay etc protocols, but I don’t think we should.

The timestamp.

The reason we should is that detecting bit rot and fixing bit rot are two different things. Combining a bit rot detection scheme with a synchronization scheme allows us to replace the rotten files with one of the “good” redundant copies in the network.

There are unfortunately lots of cases where the contents change but not the timestamp, enough so that we’ve had to write code to detect this on the fly: memory-mapped files, low-precision timestamps, metadata editors, secure containers, etc.

That’s not to say we couldn’t have a mode where you say “I’m certain the data shouldn’t have changed; check it all and undo all differences”. We kind of already have that in receive only mode, it’s just the scan part that’s lacking. If another tool did that and deleted the bad files, Syncthing would redownload them after someone clicked “revert”.

Hmm, that does seem to be an unfortunate limitation. Overall, though, bit rot would be a rare occurrence. If we were able to limit the scope of this tool to files that we knew should not change, e.g., static files like a raw photo, then we would still be ahead.

One possible mitigation would be to only raise a bit rot error if the timestamp has been recorded into our records and synced across multiple devices (we can’t do anything if we don’t have a backup anyways), for example by only checking for bit rot errors after 24 hours. There may be other timestamp limitations that I’m unaware of, but I imagine this would address a number of them.

Example:

  • Assumptions for this example system:
    • All files are synchronized within 24 hours.
    • Our system checks for bit rot errors on files that have a timestamp older than 24 hours.

Then if file01 gets corrupted at hour 25 (or later) we would be able to catch this error.

This system obviously wouldn’t catch bit rot errors that occurred close to the file’s timestamp, but that should still handle the lion’s share of errors.
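The 24-hour rule from the example could be sketched roughly like this. It assumes records are kept as `path -> (mtime_seconds, checksum)` and that a fresh scan produces the same shape; `rot_suspects` and `GRACE_SECONDS` are made-up names for illustration:

```python
GRACE_SECONDS = 24 * 60 * 60  # assumed window, taken from the example above


def rot_suspects(records, current, now, grace=GRACE_SECONDS):
    """Return paths whose contents changed while the timestamp stayed put.

    Only files whose recorded timestamp is older than the grace period are
    considered, so recently written files are never flagged.
    """
    suspects = []
    for path, (mtime, checksum) in records.items():
        if path not in current:
            continue  # deletions are a separate question, not bit rot
        cur_mtime, cur_checksum = current[path]
        settled = now - mtime > grace
        if settled and cur_mtime == mtime and cur_checksum != checksum:
            suspects.append(path)
    return sorted(suspects)
```

With `file01` recorded at hour 0 and scanned at hour 25, a checksum mismatch under an unchanged timestamp gets reported, while a file recorded at hour 20 is skipped until it has aged past the window.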

There is another feature request to do a full always-rehash scan, to catch modifications when none of the metadata has changed. If we did that, it could be used for bitrot detection by combining it with a receive-only folder mode. So that’s where I’d suggest starting.

It would be awesome if Syncthing could claim some element of bit rot protection. Let me know if I should participate in a GitHub discussion or something.

When bit rot occurs and there is only one source file, Reed-Solomon error correction can be used to repair the rotted bits. Backblaze and MinIO use this to recover from rot.

I also think this would be a cool feature, especially since many of the items I’ve synced are finished projects, photos, movies, music, software packages, disk images, etc. that rarely change. Due to CPU usage, I’d prefer a user-settable interval, and I’d probably choose to re-check every 2-3 months (possibly up to 6 months, depending on the exact content).

I would also agree that it would be best per-folder, so that you could re-check important files more often, as well as not having to suddenly re-hash several TB of data at once.
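As a rough illustration of a per-folder interval, here is a sketch that picks which folders are due for a re-hash. The `folders` mapping and the `folders_due` helper are hypothetical and do not correspond to any existing Syncthing setting:

```python
from datetime import datetime, timedelta


def folders_due(folders, now):
    """Return names of folders whose re-check interval has elapsed.

    folders maps a folder name to (last_check, interval), where last_check
    is a datetime and interval is a timedelta chosen by the user.
    """
    return sorted(
        name
        for name, (last_check, interval) in folders.items()
        if now - last_check >= interval
    )


now = datetime(2024, 1, 1)
folders = {
    "photos": (now - timedelta(days=100), timedelta(days=90)),
    "disk-images": (now - timedelta(days=30), timedelta(days=180)),
}
due = folders_due(folders, now)  # only "photos" has passed its interval
```

Keeping the interval per folder means a large archive can be re-hashed one folder at a time instead of all at once.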

I would propose the ability to mark a folder as an “archive”. If a folder has been marked as archive:

  • New files can be added to the folder and they will be synced to other devices without prompting.
  • If a file is modified or deleted, Syncthing will prompt the user for approval before syncing changes to other devices.

This would prevent bit rot from propagating to backups, as well as preventing things like unintentional deletions or ransomware encryptions from propagating. No need to do any fancy bit rot detection stuff, just allow the user to express intent that files in a given directory are not supposed to change.
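The archive rule could look something like the following sketch, assuming each scan yields a `path -> checksum` mapping. `plan_archive_sync` is a hypothetical helper, not an existing Syncthing function:

```python
def plan_archive_sync(previous, current):
    """Split changes in an archive folder into auto-synced and held changes.

    previous and current map file paths to checksums from two scans.
    New files sync without prompting; modifications and deletions are
    held until the user approves them.
    """
    auto_sync = sorted(path for path in current if path not in previous)
    needs_approval = {}
    for path, old_checksum in previous.items():
        if path not in current:
            needs_approval[path] = "deleted"
        elif current[path] != old_checksum:
            needs_approval[path] = "modified"
    return auto_sync, needs_approval
```

Under this rule a rotted, encrypted, or accidentally deleted file shows up in `needs_approval` and never reaches the other devices unless the user confirms it.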