I agree that ‘storing files at non-trusted servers’ is a great feature for Syncthing.
But if ST has no interest in implementing it, what are the other options? Is there an efficient solution for this use-case?
Check out https://cryptomator.org/
Could this also be done with git hooks?
Cryptomator for Android is neither free nor open source.
Thumbs down for Cryptomator from me as well. I had looked at it for a long time and it looked nice, but when I downloaded it to my Linux box and tried it, it turned out to be a humongous, bloated Java application with its own ideas about how it should be installed, which incidentally makes it useless with just a window manager: it needs one of the ‘desktop’ systems. Huge, heavy stuff. I could almost hear the sigh of relief from my server when I purged the junk from my system. Fortunately there are alternative options out there.
Has there been any progress in the last year or did someone find an alternative?
While it is not a real solution, I built a Docker image that should make it more difficult to access your Syncthing filesystem on an untrusted server: https://github.com/PhracturedBlue/syncthing-docker-encrypt
The idea is to encrypt the filesystem with gocryptfs and then make it difficult to either (a) get access to the running image (by using a distroless base image and a multi-stage build) or (b) get access to the gocryptfs key/password.
It is likely that (with some effort) both of these can be subverted by someone with root access to your host while the image is running, so don’t consider this real security, but it does raise the bar on how easy it is to do so.
I still hope for a proper client-side solution to encryption someday so that hacks like this aren’t needed.
Rclone has a crypt remotes feature, and Rclone is also written in Go. Could that help implement encrypted repositories for Syncthing?
Hey folks, I see this feature has been highly requested for the last 5 years, and there is even a $2K bounty for it right now. I am also dreaming about optionally encrypted remotes, since this would allow me to replace Google Drive with Syncthing.
My guess is that some effort has already been made in this direction, but that architectural problems came up. Am I right about this, or is there just no clear way to provide real security with such a solution?
Right now I am thinking about:
- investigating the Syncthing code to see whether I could create a PR with this feature myself, but I don’t know Go, so this could take a lot of time;
- building my own solution from scratch with security in mind, but this would take even longer.
The whole project is pretty awesome, especially the fact that its source is open.
P.S. Thanks @PhracturedBlue for sharing this Docker-based solution, I’m going to give it a shot.
Resilio has a reasonably good solution to this problem. Basically, you choose whether to trust a server. A trusted server has the encryption key and stores files decrypted on disk. A non-trusted server keeps them encrypted. Network sync is always done using the encrypted file. Of course it isn’t open source, and it uses the BT protocol…
I also want to make it crystal clear that my work-around is NOT a solution. Someone with root access to the server can absolutely get access to your files. It just makes it a bit harder to do.
A good place to start is probably reading through the draft here: https://github.com/syncthing/syncthing/pull/4331
I was thinking about encrypting blocks for deduplication.
One thing that came to mind was to use the hash of the unencrypted block. Encrypting this hash (e.g. salt + symmetric key) to form the block ID should make sure it doesn’t leak the hash of the plaintext data. This way, two blocks with the same plaintext result in the same ID and can be deduplicated. A commonly used binary format would be IV+ciphertext. When the IV is random, two clients can have different binary representations of the same block, for example when they independently introduce the same file at the same time; when they sync and store the blocks, the ID stays the same. A malicious attacker could then request the same block from multiple clients and would receive multiple ciphertexts for the same plaintext (still fairly limited, given the number of clients in a usual setup). Would this be an acceptable risk?
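To make the block ID idea above concrete, here is a minimal sketch. It is not Syncthing code: the function name `block_id` and the key are made up for illustration, and it swaps in HMAC-SHA256 as the keyed function instead of literally encrypting the hash, which is an assumption on my part.

```python
import hashlib
import hmac


def block_id(shared_key: bytes, plaintext_block: bytes) -> bytes:
    """Hypothetical block ID: keyed HMAC over the plaintext block hash.

    Without the key, the ID does not reveal the plain SHA-256 of the block.
    """
    plain_hash = hashlib.sha256(plaintext_block).digest()
    return hmac.new(shared_key, plain_hash, hashlib.sha256).digest()


# Illustrative key only; a real key would come from a proper key setup.
key = b"\x01" * 32

id_a = block_id(key, b"same block contents")
id_b = block_id(key, b"same block contents")
id_c = block_id(key, b"different block contents")

# Same plaintext -> same ID (dedup is possible); different plaintext -> different ID.
print(id_a == id_b, id_a == id_c)
```

The same property that enables deduplication (equal plaintext gives equal ID under one key) is also what tells an observer that two blocks are identical.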
But it could be solved with deterministic encryption (the same input always creates the same output for one key). If the same plaintext always produces the same ciphertext, the untrusted nodes can compare the hashes of ciphertext blocks, so they don’t store files multiple times if those were added on different trusted machines while offline. And trusted hosts can compare a list of hashes of encrypted blocks to their own list of hashes of encrypted blocks, which means they don’t waste traffic on files they already have. (Note: I used the term deterministic encryption somewhat misleadingly here. AES, for example, is deterministic, but is made non-deterministic by using different IVs/nonces.)
Which algorithm would you use for this?
- AES without IV/nonce isn’t very secure, is it?
- Suggestion 3. below?
3. Nearly everything is the same as in 2. but instead of a completely random nonce we use the (first 96 bits of the) hash of the unencrypted block (plus a shared secret to protect against file confirmation and similar attacks) as the nonce. This way the ciphertext is always the same for identical plaintext blocks, but it leads us to the barren lands of not well researched crypto and doesn’t sound like a good idea: https://crypto.stackexchange.com/questions/3754/is-it-safe-to-use-files-hash-as-iv
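As a rough illustration of suggestion 3, the nonce derivation could look like the sketch below. The name `derive_nonce` is hypothetical, and using HMAC-SHA256 to combine the shared secret with the block is my assumption about what "hash plus a shared secret" would mean in practice. It only shows the derivation, not the encryption step, and it inherits all the caveats from the linked discussion.

```python
import hashlib
import hmac

NONCE_LEN = 12  # 96 bits, as in the suggestion above


def derive_nonce(shared_secret: bytes, plaintext_block: bytes) -> bytes:
    """Hypothetical deterministic nonce: first 96 bits of a keyed hash of the block."""
    digest = hmac.new(shared_secret, plaintext_block, hashlib.sha256).digest()
    return digest[:NONCE_LEN]


secret = b"illustrative shared secret"
nonce_1 = derive_nonce(secret, b"block contents")
nonce_2 = derive_nonce(secret, b"block contents")
nonce_other_secret = derive_nonce(b"another secret", b"block contents")

# Identical blocks under the same secret always get the same nonce,
# so the resulting ciphertext would be identical as well.
print(nonce_1 == nonce_2, nonce_1 == nonce_other_secret)
```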
Making sure identical blocks from different files encrypt the same only saves a little in transfer costs. On the other hand it increases complexity and reduces safety by leaking information to an attacker. I don’t think it’s worth that tradeoff.
I think it’s even an attack vector to some extent: if you are able to put plaintext into a folder and have it encrypted, and the IV/nonce is somehow deterministic (which it has to be for the purpose of block reuse), you could recover the key over time.
So things like reusing content from old files, reusing content from other files, rename detection, etc. are all not possible in encrypted folders.
@calmh I only care about deduplication to solve the efficient move/rename feature. With deduplication in place it is solved implicitly.
When storing the full path (or a unique parent folder ID) in addition to the file name, the metadata for two unknown files with identical content differs within the repository. The file data for two identical files is stored only once anyway.
@AudriusButkevicius Your doubts about deterministic IVs are reflected in the quoted link for 3:
You obviously lose semantic security when you use deterministic encryption. This means an attacker can tell if two files are identical.
So when an attacker knows that a known file does exist in the repository, he can generate the deterministic IV/hash/ciphertext with a guessed password and perform dictionary/brute-force attacks for the secret. When using a salt, the effective secret contains a lot of entropy.
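A toy sketch of that confirmation attack, under loud assumptions: the secret is derived from a password with a single unsalted hash (deliberately weak), the deterministic token is an HMAC over a block the attacker also has, and all names are made up for illustration. With a salted or otherwise high-entropy secret, the candidate list would not help.

```python
import hashlib
import hmac


def secret_from_password(password: str) -> bytes:
    # Hypothetical and deliberately weak: unsalted, single hash of the password.
    return hashlib.sha256(password.encode("utf-8")).digest()


def token(secret: bytes, known_block: bytes) -> bytes:
    # Stand-in for any deterministic IV/hash/ciphertext derived from secret + block.
    return hmac.new(secret, known_block, hashlib.sha256).digest()


def guess_password(observed: bytes, known_block: bytes, candidates):
    """Return the first candidate password that reproduces the observed token."""
    for candidate in candidates:
        guess = token(secret_from_password(candidate), known_block)
        if hmac.compare_digest(guess, observed):
            return candidate
    return None


known_block = b"contents of a file the attacker also has"
observed = token(secret_from_password("hunter2"), known_block)

found = guess_password(observed, known_block, ["123456", "password", "hunter2"])
print(found)
```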
Tradeoffs always have to be made, which is why a threat model is so important! So I was wondering which, if either, of the two outlined solutions (multiple different ciphertexts being available for the same block ID; deterministic encryption) is acceptable with your threat model in mind.
Another approach to rename/move detection (implicit when content equals) would be to not detect it at all. When using FUSE, edit/move actions are explicit. Synchronization would then exchange some sort of journal since the last successful sync. In a distributed context, it might be difficult to identify and refer to a necessary common base state (similar to git).
There will be no efficient move/rename with encrypted files. Files have individual encryption keys. This is a feature as it prevents tracking data between files and correlating copies/moves that might have happened.
@calmh What is the problem with my proposed solution? If I am not mistaken, it prevents tracking and correlation.
It’s not at all clear what your suggestion means in the context of Syncthing and its protocol.
Rename detection and deduplication can still happen for peers that have the decryption key, but I don’t think it can happen on encrypted peers.
Yeah it’s unclear to me what you mean. Deduplication of blocks happens on the receiving end based on the block hash. In the encrypted case the block “hash” is in fact just an opaque token (regular block hash encrypted with the file key). Being able to dedup on the receiving side means having identical block hashes / tokens for identical blocks in different files, which lets the attacker draw conclusions about your data that they shouldn’t be able to.
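A small sketch of why per-file keys rule out receiver-side dedup. This is an illustration, not the actual Syncthing construction: `block_token` is a hypothetical name, and a keyed HMAC over the block hash stands in for "regular block hash encrypted with the file key".

```python
import hashlib
import hmac
import os


def block_token(file_key: bytes, block: bytes) -> bytes:
    """Hypothetical opaque token: keyed function of the regular block hash."""
    block_hash = hashlib.sha256(block).digest()
    return hmac.new(file_key, block_hash, hashlib.sha256).digest()


block = b"identical block stored in two different files"
key_file_a = os.urandom(32)  # each file gets its own random key
key_file_b = os.urandom(32)

token_a = block_token(key_file_a, block)
token_b = block_token(key_file_b, block)

# The untrusted receiver indexes blocks by token only.
store = {}
for tok in (token_a, token_b):
    store[tok] = "ciphertext placeholder"

# Two entries: the receiver cannot tell that both tokens refer to the same plaintext.
print(len(store))
```

Making the two tokens equal would require a key shared across files, which is exactly what would let an attacker correlate identical blocks.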