Large Numbers of Files Updated at Once (sparsebundle)

I am trying to decide on the best approach to sync my macOS sparsebundle with Syncthing.

  • use default 8 MB bands (roughly 850 files)
  • use smaller 2 MB bands (roughly 6700 files)
  • use larger 128 MB bands (roughly 54 files)
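
As a rough sanity check on these counts, the number of band files is about the image size divided by the band size. The sketch below assumes a hypothetical 6800 MiB of allocated data (this size is not stated anywhere above; it was chosen because it reproduces the 8 MB and 128 MB figures). With that assumption the 2 MB case comes out nearer 3400 files, so the ~6700 figure presumably reflects a different amount of data.

```python
import math

def band_count(image_mib, band_mib):
    """Upper bound on band files: one file per band that has been written."""
    return math.ceil(image_mib / band_mib)

# Hypothetical amount of allocated data in the sparsebundle (an assumption).
IMAGE_MIB = 6800

for band_mib in (2, 8, 128):
    print(f"{band_mib:>3} MiB bands -> about {band_count(IMAGE_MIB, band_mib)} files")
```

A sparsebundle only creates band files for regions that have actually been written, so a mostly empty image will have fewer files than this estimate.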

As you can see, changing the band size (you can think of bands as stripes or sectors) vastly changes the number of physical files that get synced.

Would it be better to sync only a couple of bands (files) updated at the same time? Or, would it be better to sync several dozen (or hundreds) of bands (files) at the same time with Syncthing?

Would Syncthing be able to handle several dozen modifications to a file while it is syncing that file - and then a dozen more changes within a few hundred milliseconds?

I ask because using Syncthing with .git repos is generally advised against. While I am not syncing .git repos with Syncthing directly, I am considering using Syncthing to sync my sparsebundles to my server - mostly as a backup.

Let’s walk through a use case from a normal day of work:

Large swathes of files will get updated almost instantly under normal use, within a second or two. So, I am trying to decide whether it is better to sync fewer, larger band files or more, smaller ones.

For example, if I do a git pull within a mounted sparsebundle, I can easily get a few thousand files downloaded - which would be spread out across several bands (files). If I use a smaller band size, this could dirty several hundred band files instantly - which Syncthing would then have to sync. If I use a large band size, most likely only one or a couple of band files would change for Syncthing to sync.

The catch is “working”: I work fast. So, after a git pull, I might generate diff files, delete some directories, decide that a branch is junk and delete it, create a new branch, modify version files - all with scripts I’ve written to speed things up.

That workflow could easily “mark dirty” the same band file over and over again within a few seconds, or even several hundred times within one second.

It probably doesn’t matter too much. In the end, Syncthing splits everything into 128 KiB blocks and syncs those. Bands that are too large add overhead because lots of unchanged data has to be hashed and copied (but not transferred). Bands that are too small add metadata overhead, but that’s probably negligible under the circumstances. 2, 8 or 128 MiB all sound like reasonable choices.
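
The trade-off above can be put into rough numbers with a small model. This is only a sketch under stated assumptions, not a description of Syncthing's internals: every changed band file is rehashed in full, only changed 128 KiB blocks are transferred, each write touches a single block, and the batch of write offsets is made up for illustration. The function name `sync_cost` is hypothetical.

```python
BLOCK = 128 * 1024   # block size mentioned above
MIB = 1024 * 1024

def sync_cost(write_offsets, band_size):
    """Return (bands_dirtied, bytes_rehashed, bytes_transferred) for a batch of writes.

    Simplification: every band is treated as full size, and each write is
    assumed to touch exactly one 128 KiB block.
    """
    dirty_bands = {off // band_size for off in write_offsets}
    dirty_blocks = {off // BLOCK for off in write_offsets}
    return len(dirty_bands), len(dirty_bands) * band_size, len(dirty_blocks) * BLOCK

# Hypothetical batch: 100 small writes spaced 16 MiB apart across the image.
offsets = [i * 16 * MIB for i in range(100)]

for band_mib in (2, 8, 128):
    bands, hashed, sent = sync_cost(offsets, band_mib * MIB)
    print(f"{band_mib:>3} MiB bands: {bands:>3} files dirtied, "
          f"{hashed // MIB:>4} MiB rehashed, {sent / MIB:.1f} MiB transferred")
```

In this model the transferred amount is the same for every band size, while the amount rehashed grows with the band size and the number of changed files shrinks.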
