Requesting an option in the “Advanced” area similar to:
“Purge deleted items in database”
With a default of zero (never), but with the option to enable it by changing zero to a number of days. This would remove the entries for deleted files from a given database after x days, so that until then files could still be “brought back.” For data minimalists, however, purging may be desirable after 365 days or any other chosen number. Please note, however, that the space taken in the database by deleted items is negligible.
If this would be easy to implement, that would be great. If not, not a big deal. For a small cluster, recreating a folder each time you get a new piece of hardware (or something similar) will have the same effect in keeping the database clean.
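As a rough illustration of the requested behaviour (not Syncthing code), the option could act like the sketch below. The `Entry` record, the `purge_deleted` function, and the `purge_after_days` setting are hypothetical names invented for this example; zero keeps the current behaviour of never purging.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class Entry:
    """Hypothetical database record; not Syncthing's real schema."""
    name: str
    deleted: bool      # True for a tombstone (record of a deleted file)
    changed: datetime  # for a tombstone, when the deletion was recorded


def purge_deleted(entries, purge_after_days, now):
    """Drop tombstones older than purge_after_days; 0 means never purge."""
    if purge_after_days <= 0:
        return list(entries)
    cutoff = now - timedelta(days=purge_after_days)
    # Live files are always kept; only sufficiently old tombstones are dropped.
    return [e for e in entries if not (e.deleted and e.changed < cutoff)]


now = datetime(2024, 1, 1, tzinfo=timezone.utc)
entries = [
    Entry("notes.txt", False, now - timedelta(days=900)),
    Entry("old.tmp", True, now - timedelta(days=400)),
    Entry("recent.tmp", True, now - timedelta(days=10)),
]

print([e.name for e in purge_deleted(entries, 0, now)])    # default: nothing purged
print([e.name for e in purge_deleted(entries, 365, now)])  # one-year expiry
```

This only covers the local bookkeeping on a single device; it says nothing about what happens when another device still holds the old entry.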
I’d like the feature as well. While the solution of @AudriusButkevicius is what I do at the moment, I do think it would be great to have an expiry option for tombstones in the database.
I also understand that it may not be so easy to implement: one side must not remove the tombstone only to have it synced back from another instance because the clocks are not entirely in sync. So it may be more effort than it looks at first.
There are lots of corner cases that will pop up if you do this. There be dragons, and the gain is not huge for most people. If the entries cause practical problems in some setup we should work to fix or optimize that instead.
Would there be fewer dragons if Syncthing could at least obfuscate or replace the filename in the database? My concern is not so much DB size, but that filenames are sometimes sensitive information in themselves, which I’d like to be “lost” at some point.
We need the file name. But I’m sure it’s feasible to, for example, encrypt the database using the device key, and encrypt the device key using a passphrase.
Adding to this: if one edits many files that create a temporary .lock file (most office documents), the .lock files will continuously be added to the database. The present solution is to add these to the ignore patterns if you do not want this.
If you support an option to keep the Syncthing database clean for data minimalism or privacy, please post. This project is excellent at responding to user input if there is demand.
In terms of privacy I don’t understand the issue: if you are concerned about your metadata becoming public, encrypt your home partition/disk/… (or wherever you keep your db).
That’s not how it works. If there is user demand or developer interest, and it fits into the project and is doable with a reasonable investment of resources - it happens. So talking about use cases or potential ways to implement stuff is useful, but lots of “I want this too” posts won’t help at all.
If there is enough well-reasoned interest, I have no doubt any given feature would be implemented.
But I do not wish to venture too far from the feature request in what will be the last thing a new user reads in this thread.
Bottom line: a database will continue to grow (because it has to, to prevent old files from being resurrected when a long-offline node comes online). Allowing the option to automatically purge old (“old” as defined by the user) information from a database, based only on input from one node, would be a step in the right direction to account for sustained long-term (again, as defined by the user) usage of a shared folder. For many use cases a node is not going to come online after a year. Of course, a feature to purge the database would never be a default, because without understanding what you are doing you could really mess up a shared folder if an old node does come online, bringing hundreds if not thousands of old files with it. But if you understand the consequences, this feature would be of great use to, again, sustain long-term usage without eventually worrying about database maintenance.
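The resurrection risk described above can be shown with a toy model. This is a deliberately simplified sketch, not Syncthing's actual index exchange: it assumes each entry carries a single integer version where the higher one wins, and the `sync` function and the file name are made up for illustration.

```python
def sync(local, remote):
    """Merge a remote index into a local one.

    Each index maps file name -> (version, deleted). The higher version
    wins, and names the local side has never heard of are simply adopted.
    """
    merged = dict(local)
    for name, (version, deleted) in remote.items():
        if name not in merged or version > merged[name][0]:
            merged[name] = (version, deleted)
    return merged


# A node that was offline for a long time still has the old, live file.
stale_node = {"report.doc": (1, False)}

# Case 1: the active node kept its tombstone (newer version, deleted=True).
with_tombstone = {"report.doc": (2, True)}
kept = sync(with_tombstone, stale_node)

# Case 2: the active node purged the tombstone, so it has no record at all.
after_purge = {}
resurrected = sync(after_purge, stale_node)

print(kept)         # the deletion still wins
print(resurrected)  # the old file comes back as if it were new
```

In the second case the active node cannot tell an old file it deleted from a genuinely new one, which is why the entry has to stay for as long as a stale node might still return.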
I am not in the privacy group on this issue, but I can see why database privacy could matter, as taking the database file is trivial. Taking files, though, is also trivial if someone gains access to a machine. Privacy-wise, I guess it is mostly relevant if you temporarily or accidentally add a file to a share but later remove it. Right now, since the delete exists forever in the database, some might have to create a new share simply because they put a given file into a database accidentally (the change will be picked up instantly if you use the file system watch feature).
I see sustained long-term usage as the most compelling reason to allow automatic purging of old database entries.
Developers do care about user input and needs (although I understand what was meant). If not, there wouldn’t be a response from three of them for a very minor issue.
That being said, the developers are wrong in this particular instance. However, the key words are “minor issue,” and I understand the need to triage time spent working on the project.
We care about this case too; there is a bug ticket, and as noted the entries should ideally be removed at some point. It’s just a bit difficult to determine when.
I’ll still take the liberty of doing so: I have also looked into it, and the main problem is
(and just to state the obvious: “a bit difficult” is an understatement).
In https://github.com/syncthing/syncthing/issues/863 there’s a proposal of how to do it, and then a problem that can lead to data loss has been pointed out - no remedy so far.
If you or anyone can state an approach of how to prune deleted files from the index without potential for data loss, please do so.
My only suggestion for implementing this would be a table that lists all devices in the cluster for a given folder. Once every device in that table has received a given deletion, then and only then can an entry older than X be purged.
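A minimal sketch of that table-based check might look like the following. Everything here is hypothetical: the `purgeable` function, the per-tombstone `acked_by` sets, and the device names are invented for the example, and it assumes each device can somehow learn which other devices have seen a deletion.

```python
from datetime import datetime, timedelta, timezone


def purgeable(tombstones, acked_by, folder_devices, now, min_age_days):
    """Return tombstones that are old enough AND seen by every listed device.

    tombstones:     file name -> time the deletion was recorded
    acked_by:       file name -> set of devices known to have the deletion
    folder_devices: the table of all devices sharing the folder
    """
    cutoff = now - timedelta(days=min_age_days)
    return sorted(
        name
        for name, deleted_at in tombstones.items()
        if deleted_at < cutoff and folder_devices <= acked_by.get(name, set())
    )


now = datetime(2024, 1, 1, tzinfo=timezone.utc)
folder_devices = {"laptop", "nas", "phone"}
tombstones = {
    "a.tmp": now - timedelta(days=400),  # old, seen by everyone
    "b.tmp": now - timedelta(days=400),  # old, but the phone never saw it
    "c.tmp": now - timedelta(days=5),    # seen by everyone, but too recent
}
acked_by = {
    "a.tmp": {"laptop", "nas", "phone"},
    "b.tmp": {"laptop", "nas"},
    "c.tmp": {"laptop", "nas", "phone"},
}

print(purgeable(tombstones, acked_by, folder_devices, now, min_age_days=365))
```

The check is only as good as the device table: it protects against devices that appear in the table, not against devices that the local node does not know about.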
It can never be done 100% safely in a systematic manner, because clusters do not need to be fully connected. That is, you can have a cluster A <-> B <-> C where A and C never know about each other. They won’t ever be able to draw any conclusions about when the entire cluster has seen any given file version.
In practice this doesn’t matter much. We could clean it out after a year. If things haven’t been in sync for that long there are worse problems. But it’s also not a huge problem to just leave the entries in there, so …
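The A <-> B <-> C case above can be made concrete with a small sketch. The `links` map and the `reachable` helper are illustrative only; the sketch assumes, as a simplification, that a device knows just the devices it is directly connected to.

```python
from collections import deque


def reachable(links, start):
    """Breadth-first search: every device connected to start, directly or not."""
    seen = {start}
    queue = deque([start])
    while queue:
        device = queue.popleft()
        for peer in links[device]:
            if peer not in seen:
                seen.add(peer)
                queue.append(peer)
    return seen


# A and C both talk to B, but never to each other.
links = {"A": {"B"}, "B": {"A", "C"}, "C": {"B"}}

visible_to_a = {"A"} | links["A"]      # what A's own device table can contain
whole_cluster = reachable(links, "A")  # what a global observer would see
missing = whole_cluster - visible_to_a

print(sorted(visible_to_a))
print(sorted(whole_cluster))
print(sorted(missing))  # devices A cannot account for when deciding to purge
```

Any "all devices have seen it" rule evaluated on A would be satisfied without C ever being consulted, which is the reason a purge can be made unlikely to cause harm (for example by waiting a year) but not provably safe.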