How/when does Syncthing know it has to rescan files?

Hi, Let’s say:

  • a 100GB folder has been fully scanned (IIRC, ST splits the data into chunks, and a hash is computed for all pieces?)

  • then we shut down Syncthing (i.e., it is no longer listening for changes)

  • we modify a few files

  • we restart Syncthing

How does ST know which files need to be re-scanned? (obviously it doesn’t rescan the whole 100GB folder)

Does it look at the filesystem “last modification datetime”?

Also, what is the format of the database of all hashes? Does it contain anything other than filename, last modification datetime, and hash? Is it easy to display the information inside this database, with Python for example?

I’m curious about how it works!

Thanks in advance.

Here are two relevant links:

https://docs.syncthing.net/users/syncing.html

https://www.kastelo.net/blog/2018-06/syncthing-scanning/

Thanks! This is the answer indeed:

During a rescan (regardless whether full or from watcher) the existing files are checked for changes to their modification time, size or permission bits. The file is “rehashed” if a change is detected based on those attributes, that is a new block list is calculated for the file.
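To make the quoted rule concrete, here is a minimal Python sketch of that kind of check. It is not Syncthing's actual code: the `IndexEntry` record, the `needs_rehash` and `scan` helpers, the in-memory dict used as an index, and the fixed 128 KiB block size are all assumptions made for illustration.

```python
import hashlib
import os
import stat
from dataclasses import dataclass

BLOCK_SIZE = 128 * 1024  # assumed fixed block size, for illustration only


@dataclass
class IndexEntry:
    """Hypothetical per-file record; not Syncthing's real database schema."""
    mtime_ns: int
    size: int
    mode: int
    blocks: list


def block_hashes(path, block_size=BLOCK_SIZE):
    """Return the SHA-256 hex digest of each fixed-size block of the file."""
    hashes = []
    with open(path, "rb") as f:
        while True:
            chunk = f.read(block_size)
            if not chunk:
                break
            hashes.append(hashlib.sha256(chunk).hexdigest())
    return hashes


def needs_rehash(entry, st):
    """True if mtime, size or permission bits differ from the stored record."""
    return (
        entry.mtime_ns != st.st_mtime_ns
        or entry.size != st.st_size
        or entry.mode != stat.S_IMODE(st.st_mode)
    )


def scan(path, index):
    """Rescan one file, rehashing only if its metadata changed.

    Returns True if the file was (re)hashed, False if it was skipped.
    """
    st = os.stat(path)
    entry = index.get(path)
    if entry is not None and not needs_rehash(entry, st):
        return False  # cheap path: only a stat() call, no file content read
    index[path] = IndexEntry(
        st.st_mtime_ns, st.st_size, stat.S_IMODE(st.st_mode), block_hashes(path)
    )
    return True
```

Calling `scan` twice on an unchanged file hashes it only the first time; the second call costs a single `stat()`. That is why restarting after a few edits does not mean re-reading the whole 100GB.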

PS @calmh : The 100GB folder is on a micro-SD card which goes in my phone. Is it possible to:

  • put the micro-SD card in my PC, do the hashing there (10 times faster!), and make sure the hash database is on the micro-SD card (in a .syncthing subfolder I guess?)

  • put the micro-SD card back in the phone, and profit from the already-computed hashes! (no need to recompute the SHA256 on the phone: during a full scan, the phone will notice that no file has changed and that no new SHA256 is required)

This would save at least 10 hours (a process during which the phone often goes to sleep, interrupting the indexing… I tried leaving the phone on for a full night, but it was not successful)

Thanks!

If that card is the only thing in your Syncthing setup then you can do that and move the entire index database from machine a to machine b. Otherwise no, there is no export or import of folder hashes.

Do you mean just copy the index-v0.14.0.db folder from PC to phone?

Don’t these files contain device-specific information (UUID device identifier, etc.)?


Last question (I won’t take more of your time, thanks already :slight_smile: ):

When scanned on the PC, the paths of the files in my folder look like

D:\Android\data\com.nutomic.syncthingandroid\files\music\example.mp3

so the database will be created according to these filepaths, right?

So when moving the index-v0.14.0.db to my phone, will ST on Android understand the files are now in

/storage/extSdCard/Android/data/com.nutomic.syncthingandroid/files/music/example.mp3

and that the files already scanned/hashed on the PC are in fact these same files?

Yeah. The database only includes folder-root-relative paths in an OS independent format, so it’s portable. Assuming the folder ID is the same.
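As a small illustration of why root-relative paths make this portable, here is a Python sketch using the two example paths from the question. It is not how Syncthing itself stores keys: the `relative_key` helper is hypothetical, and treating the `files` directory as the folder root is an assumption for the example.

```python
from pathlib import PurePosixPath, PureWindowsPath


def relative_key(root, full_path, windows=False):
    """Return full_path relative to the folder root, with '/' separators."""
    cls = PureWindowsPath if windows else PurePosixPath
    return cls(full_path).relative_to(cls(root)).as_posix()


# Assumed folder root on the PC: the "files" directory on the card.
pc_key = relative_key(
    r"D:\Android\data\com.nutomic.syncthingandroid\files",
    r"D:\Android\data\com.nutomic.syncthingandroid\files\music\example.mp3",
    windows=True,
)

# The same directory as seen from the phone.
phone_key = relative_key(
    "/storage/extSdCard/Android/data/com.nutomic.syncthingandroid/files",
    "/storage/extSdCard/Android/data/com.nutomic.syncthingandroid/files/music/example.mp3",
)

# Both machines derive the same key, so a record written on one
# can be matched to the file on the other.
assert pc_key == phone_key == "music/example.mp3"
```

The drive letter and the mount point drop out once the path is taken relative to the folder root, which is the property the answer above relies on.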


Once you start playing with the internals of Syncthing please, please take a backup of all affected data before you start.


Completely agree on that, I just can’t help but piggy-back on it:
Once you have any data you care about somewhat (and if you sync it you do), take backups - regular, automated* backups that you know you can restore (i.e. you tested it).

*automated: Completely hands-off automated is great, but an “automated” regular task with reminder and a few steps written down to do manually is also good :slight_smile:

Yes, for sure, I have many, many backups of everything in multiple physical locations, a few hundred km apart. I completely agree with that :slight_smile:


Is there maybe an update on this? On ST Android, it takes a lonnnng time to do an initial scan of a 120 GB folder.

Is there maybe a setting in newer versions of Syncthing that enables a weak hashing mode? I.e. the hashing would be less effective at avoiding unnecessary network transfer (for example when only a few bytes of a file change), but it would be much faster. Does such a “weaker hash algorithm” mode exist?

I think on Android the bottleneck is probably I/O rather than CPU usage, so changing the hashing wouldn’t help much. The “weak hashing” you envision doesn’t exist in Syncthing, see the other thread.