Brainstorming - whether Chrome download files (.crdownload) should be ignored in Syncthing

Any input is appreciated!

tl;dr

yes, probably should ignore .crdownload files

Syncthing default behaviour

The .stignore is blank by default, so these files are synced by default.

Question

whether Chrome download files (.crdownload) should be ignored in Syncthing

Purpose

  1. The time taken for such a download to be usable on a remote disk could be at most t = 2n (download time + sync time, assuming each takes n).

  2. The minimum, if the partial download is sent incrementally while it downloads, could be just above t = n (source: my intuition, but it makes sense).

  3. Recently I have been thinking about this a lot in the shower.
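To put rough numbers on points 1 and 2, here is a toy model. It is only a sketch: the function name and the `overlap` parameter are made up for illustration, and it assumes the sync proceeds at a steady rate.

```python
def time_until_usable(download_s: float, sync_s: float, overlap: float) -> float:
    """Estimate when a download becomes usable on the remote device.

    overlap is the fraction of the sync that happens while the download
    is still running (0.0 = sync starts only after the download ends,
    1.0 = everything is sent incrementally during the download).
    """
    if not 0.0 <= overlap <= 1.0:
        raise ValueError("overlap must be between 0 and 1")
    # Only the part of the sync that did not overlap adds to the total.
    return download_s + sync_s * (1.0 - overlap)


n = 600.0  # hypothetical: 10 minutes to download, 10 minutes to sync
print(time_until_usable(n, n, overlap=0.0))  # no overlap: t = 2n
print(time_until_usable(n, n, overlap=0.9))  # mostly overlapped: a bit above n
```

In practice the overlap can never quite reach 1.0, because the last blocks can only be sent after they arrive, which is why the minimum is "just above" t = n.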

What I know

  1. Chrome downloads are stored in .crdownload files

  2. These are replaced with the actual file when complete

  3. Ignoring them (whether locally or remotely) does not prevent the local Chrome from resuming the file (it doesn’t delete it). But the remote host will only receive finished files and not .crdownload files.

What I think

  1. The files are the sequential raw data of the finished file, up to the % last downloaded, and not in some esoteric format. (KISS principle).¹ ²

    Significance: block reuse applies

  2. The files are renamed to the final filename. It would make no sense for Chrome to copy from the .crdownload into the finished file then delete it (2x space usage momentarily).

    Significance: none. Syncthing probably sees the same delete .crdownload then create actual file regardless.
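A small sketch of why point 1 matters for block reuse. It assumes the partial file really is a plain prefix of the final file, and it uses fixed-size SHA-256 block hashes as a stand-in for Syncthing's block list. The 128 KiB block size and the helper names are assumptions for illustration, not Syncthing's actual code.

```python
import hashlib

BLOCK_SIZE = 128 * 1024  # assumption: 128 KiB blocks


def block_hashes(data: bytes, block_size: int = BLOCK_SIZE) -> list[str]:
    """Split data into fixed-size blocks and hash each one."""
    return [
        hashlib.sha256(data[i:i + block_size]).hexdigest()
        for i in range(0, len(data), block_size)
    ]


def reusable_blocks(partial: bytes, final: bytes, block_size: int = BLOCK_SIZE) -> int:
    """Count blocks of the final file that already exist in the partial file."""
    have = set(block_hashes(partial, block_size))
    return sum(1 for h in block_hashes(final, block_size) if h in have)


# Hypothetical 10-block file; the download stopped 3.5 blocks in.
final = b"".join(bytes([i]) * BLOCK_SIZE for i in range(10))
partial = final[: int(3.5 * BLOCK_SIZE)]
print(reusable_blocks(partial, final))  # the 3 complete blocks can be reused
```

The trailing half block does not match anything, so only whole blocks that were fully downloaded count. If the .crdownload were in some other format, none of the hashes would line up.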

Reasons to keep

  1. Block reuse. Syncthing could reuse the data in a remote .crdownload to piece together the final file. More on “Block reuse caveats” below.

  2. The file could be reused by Firefox or wget as resume data if the download is stopped.¹ ²

Reasons to ignore

  1. File unusable in Chrome if Downloads page record cleared.

  2. File unusable in any browser or tool if original URL lost.

  3. User confusion. So long as the download is in the Downloads page, it remains on disk. On remote devices, there is no such connection, and the file’s presence is confusing:

    • If the Downloads page is never cleared, the file remains perpetually.

    • If a machine is decommissioned without ever clearing its Downloads page, the file likewise remains on the remote devices.

Block reuse caveats

  1. Syncthing needs to scan the whole file first to be aware of it (and send it remotely). If the file changes while scanning, this process is aborted, until the scan can complete the next time. This tug of war may depend on the following:

    • If the download is paused / speed is 0 / speed is slow, and/or the disk read speed is high, Syncthing has a higher chance of winning (completing scan first).

    • And the inverse is true as well

  2. Once a file is completed, Syncthing will see delete .crdownload and create actual file.³ Depending on the order of these operations, either:

    • the block is reused. The download is available sooner on the remote device rather than t = 2n.

    • the .crdownload is deleted remotely and cannot be reused.

  3. .crdownload files could be reused remotely after remote deletion if:

    • a currently-for-block-reuse file is blocked from remote deletion until its blocks are completely reused. Seems unlikely. No idea without knowing Syncthing internals.

    • Files are moved to .stversions folder AND this folder is added as another folder in Syncthing ⁴ ⁵

Conclusion

The benefits are real but seemingly marginal and the caveats are numerous.

For a partial download to be reusable remotely and compatible with Syncthing, it should maintain the original filename throughout, such as when created using wget or curl. Otherwise, it’s probably a good idea to exclude .crdownload files.
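For anyone who wants to apply this, the ignore pattern itself is a single line in the folder's .stignore. Below is a minimal sketch of a helper that adds it if it is missing. The helper name is made up, and the optional `(?d)` prefix is included on the assumption that you want Syncthing to be allowed to delete a leftover .crdownload when it would otherwise block removal of its directory; drop the prefix if not.

```python
from pathlib import Path

# Matches .crdownload files anywhere in the folder tree.
PATTERN = "(?d)*.crdownload"


def ensure_ignored(folder: Path, pattern: str = PATTERN) -> bool:
    """Append pattern to folder/.stignore unless it is already listed.

    Returns True if the file was changed, False if the pattern was present.
    """
    stignore = folder / ".stignore"
    lines = stignore.read_text(encoding="utf-8").splitlines() if stignore.exists() else []
    if pattern in (line.strip() for line in lines):
        return False
    lines.append(pattern)
    stignore.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return True
```

Usage would be something like `ensure_ignored(Path.home() / "Downloads")`, where the path is whatever folder is shared in Syncthing. Note that .stignore itself is not synced, so this has to be done on each device.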

Sources

  1. browser - How can I resume a download started in Chromium? - Ask Ubuntu
  2. windows 7 - Resume interrupted downloads in Chrome - Super User
  3. Reuse of files between different folders - Support - Syncthing Community Forum
  4. Does Syncthing reuse blocks from ignored files? - Development - Syncthing Community Forum
  5. Re-use as many existing file pieces as possible · Issue #1909 · syncthing/syncthing

This is very well reasoned and written!

It seems like a configuration edge case to me, rather than necessarily a feature to be developed. For a .crdownload file to be synced:

  1. A user must use Chrome to download files into a synced directory
  2. While the filesystem watcher is enabled by default, the .crdownload file would have to accumulate no changes for 10 seconds before it’s hashed

On the plus side, block re-use is likely in this scenario, as deletions are delayed an extra minute by default.
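As a rough illustration of the 10-second condition, here is a deliberately simplified model of "wait until the file has been quiet for N seconds". It is not Syncthing's watcher logic, which is more involved; the function name and the event timestamps are hypothetical.

```python
from typing import Optional


def first_hash_time(change_times: list[float], quiet_s: float = 10.0) -> Optional[float]:
    """Return when a file would first be hashed under a simple quiet-period rule.

    change_times are the seconds at which the file was modified. The file is
    hashed once quiet_s seconds pass without a further change.
    """
    times = sorted(change_times)
    for current, following in zip(times, times[1:]):
        if following - current >= quiet_s:
            return current + quiet_s  # a long enough pause mid-download
    # No qualifying pause: hashing waits until after the last change.
    return times[-1] + quiet_s if times else None


# A download writing every second for 30 s is only hashed after it stops.
print(first_hash_time([float(t) for t in range(31)]))
# A download that stalls between t=2 and t=20 gets hashed during the stall.
print(first_hash_time([0.0, 1.0, 2.0, 20.0, 21.0]))
```

Under this model, a .crdownload that is written continuously never gets hashed while it exists, which matches the point that only paused or stalled downloads are likely to be synced.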

Thanks for the feedback!

TIL, on the “accumulating changes” behaviour. So if the watcher is enabled, changed files are allowed to “rest” for 10s before the file is hashed. That makes sense. And the docs are crystal clear, as they always are.

And if watching is not enabled, then as you seem to be implying, block re-use would not work, because either:

  • the download is in progress, so the periodic scan fails on it, and it does not get hashed until a later scan finds it unchanged (which is rare; I can only think that this will happen if it’s paused)
  • the download is complete, in which case the .crdownload file doesn’t exist anymore
    • and no way of knowing if the rescan will notice the creation or the deletion first

Perhaps I did not make it clear while banging it out that I did not mean to imply that Syncthing needs any change. Rather, this was me “thinking out loud” about the strategy for ignoring files (specifically .crdownload).

The context is that, thanks to many helpful users on here and some personal experience, I have accumulated a large list of .stignore patterns, mainly for OS-generated files, along with generally useful ways to use ignore patterns. But how or why they are used is probably indecipherable to a new user.

So I am starting a project to make this process easier. The goal is to give something back to the Syncthing community once it’s done.

Anyway, this was a way to think out loud about why I saw someone ignore .crdownload years ago but chose not to do so myself (thinking of block re-use), then realizing that I didn’t need to manually clear out these files littered over the syncs for the users I manage…

Awesome.