Latest encrypted devices proposal, what & how

Performance and slight worry of collisions on the shorter random nonces in AES-GCM. I’ve gotten some feedback on the overall design from a certain very knowledgeable cryptogopher that I’m acting on, and I’m writing up some better specs on how the whole thing works for another round of review. Hopefully this thing will have some sort of theoretical seal of approval on launch day, so we only need to worry about actual implementation bugs and not me accidentally holding the algorithm upside down.
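To put a rough number on the nonce worry, here is a back-of-the-envelope birthday estimate. It assumes nonces are drawn uniformly at random and that every encrypted block uses one nonce under the same key. 96 bits is the standard AES-GCM nonce length and 192 bits is the XChaCha20 nonce length; the block counts are made-up illustration values, not measurements.

```python
from fractions import Fraction

def collision_probability(messages: int, nonce_bits: int) -> Fraction:
    """Birthday-bound approximation: p ~ n*(n-1) / (2 * 2**bits)."""
    return Fraction(messages * (messages - 1), 2 * 2**nonce_bits)

for bits in (96, 192):
    for messages in (2**32, 2**48):
        p = collision_probability(messages, bits)
        exponent = messages.bit_length() - 1
        print(f"{bits}-bit nonce, 2^{exponent} blocks: p ~ {float(p):.3e}")
```

The estimate grows with the square of the number of encrypted blocks, which is why the shorter nonce is the one to keep an eye on.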

There are some spec notes here: https://docs.syncthing.net/branch/untrusted/html/specs/untrusted.html

I know a little bit about cryptography and have taken a look at the spec. It does look decent, but I have some questions about the integrity between blocks and version rollbacks.

  • Is the order of the encrypted blocks authenticated? It shouldn’t be possible to reorder the encrypted blocks, even when the contents cannot be changed.

  • Is there anything that prevents an untrusted device from replacing a file with a previous version (or even deleting it) while claiming it is a new update? I think every file needs some kind of version number that increases and is authenticated, so the untrusted device cannot do a rollback. The trusted devices should also check that it does in fact increase.

  • How does syncing between untrusted devices work in case of a conflict? Trusted devices create a new file for the conflict, but untrusted devices cannot do that.

Thanks for looking and thinking!

The block order could be tampered with on the untrusted device to the extent that it can modify the fake file metadata and block order on disk. However, this wouldn’t matter to the trusted device, or rather it would be detected and rejected. A trusted device doesn’t look at the fake metadata, but decrypts and uses the attached encrypted metadata (or uses its own original copy of that metadata). When requesting a block that has been changed on disk on the untrusted side it would then get data which might decrypt properly but will fail the hash check after decryption, much like bad data on a normal trusted device.
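A minimal sketch of why reordering gets caught, assuming the trusted side holds the expected plaintext block hashes from the real metadata. Decryption is left out entirely; `block_hashes`, `verify_block` and the sample blocks are hypothetical and only illustrate the hash check, with SHA-256 as the block hash.

```python
import hashlib

def block_hashes(blocks):
    """Expected SHA-256 hash per block index, as recorded in the real metadata."""
    return [hashlib.sha256(b).digest() for b in blocks]

def verify_block(expected_hashes, index, data):
    """Accept a (decrypted) block only if it matches the hash expected at that index."""
    return hashlib.sha256(data).digest() == expected_hashes[index]

original = [b"block-0", b"block-1", b"block-2"]
expected = block_hashes(original)

# The untrusted device swaps blocks 0 and 2 on disk.
tampered = [original[2], original[1], original[0]]

results = [verify_block(expected, i, blk) for i, blk in enumerate(tampered)]
print(results)  # [False, True, False]
```

The swapped blocks are individually valid data, but they fail the check at the positions where they are now served.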

Rolling back is prevented by the same metadata wrapping mechanism – it doesn’t matter how new or improved the untrusted device claims the file is, the original attached metadata is what’s used, and that metadata will contain the truth. The untrusted device will just look out of date.

This also affects the conflict handling. The untrusted device doesn’t see or understand the real history of a file given that it can’t decrypt the real metadata. It just passes on whatever is the most recent metadata from another trusted device, much like a dumb wire. Conflict resolution will then happen on some trusted device.

Thanks for your reaction! Encrypting the metadata as a whole indeed works well, and the versioning already done by Syncthing prevents rollbacks.

I understand that conflict resolution normally happens on the trusted device, but my question was about syncing between untrusted ones (maybe it was not clear). Consider the following setup that you wanted to support:

A <-encrypted-> B <-normal-> C <-encrypted-> D

(A and D trusted, B and C not.) Let’s say they all share a file, but A and D make concurrent modifications to it, and they share these to B and C. Now B and C have to handle the conflict themselves. How do they do it? If they use the normal Syncthing algorithm they will create .sync-conflict files, but their filenames will not be of the proper encrypted form. Maybe it is enough to allow these files as well, to allow the resolution to happen at the trusted devices? But it should be documented explicitly.

I also think that an untrusted device is able to prevent/reject specific updates. Of course it can refuse to pass on any updates at all, but suppressing only some of them may be undesirable if trusted devices don’t connect directly. Solving this is complicated I think, so you may as well say that availability attacks are not in the scope of the proposal.

Normal conflict detection is based on the version vector of the file, where we can detect concurrent modification. The untrusted variants of files don’t get the real version vector; they get a fake one that always has just one member with a version that is the send time in Unix nanos. This means that a wall-clock-newer untrusted file will always look like a linear descendant of all other versions of the same untrusted file (assuming correct clocks). So it will be accepted as the newest without conflict by any untrusted device, and in the end passed on to a trusted device, which might see the real conflict and act on it. But there will never be a conflict between untrusted devices. (Unless one is intentionally introduced, but that will then be a file without encrypted metadata and it will be rejected by normal devices.)
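A small sketch of the difference, assuming version vectors are simple maps from device ID to counter. The `compare` and `fake_vector` helpers and the member name `"fake"` are hypothetical; the real implementation may represent things differently.

```python
import time

def compare(a: dict, b: dict) -> str:
    """Compare two version vectors (device ID -> counter)."""
    devices = set(a) | set(b)
    a_ahead = any(a.get(d, 0) > b.get(d, 0) for d in devices)
    b_ahead = any(b.get(d, 0) > a.get(d, 0) for d in devices)
    if a_ahead and b_ahead:
        return "concurrent"
    if a_ahead:
        return "newer"
    if b_ahead:
        return "older"
    return "equal"

def fake_vector(send_time_ns: int) -> dict:
    """Single-member vector whose counter is the send time in Unix nanoseconds."""
    return {"fake": send_time_ns}

# Real vectors: A and D both modified the file independently.
real_a = {"A": 2, "D": 1}
real_d = {"A": 1, "D": 2}
print(compare(real_a, real_d))  # concurrent

# Fake vectors as seen by untrusted devices: always linearly ordered.
first = fake_vector(time.time_ns())
second = fake_vector(first["fake"] + 1)
print(compare(second, first))  # newer
```

With only one member in the vector, the "concurrent" outcome cannot occur, so the untrusted devices never have a conflict to resolve.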

You’re entirely correct on the second part. An untrusted device can elect to ignore any and all updates and not pass them on. It could even be devious and announce them back so they look in-sync to the originator, but then not tell anyone else.

Semi OT -> Can this forum link or the current proposed PR be posted to the (to my filter bubble) first google result for syncthing untrusted devices github -> Support for file encryption (e.g. non-trusted servers) ?

I would post it myself (can’t; collaborator lock) since I just found myself taking the scenic route through several dead-ended hyperlinks before I gave up and found this link in my email.

I’d like to talk about UX and the PR is going to be long/cluttered, so let’s do it here (we can split it into a separate topic later on if necessary).

The UX consideration prompted one potential protocol/crypto extension - please jump right down to that at the end if you want to weigh in on it but not UX.

On terminology:
I will use encrypted and unencrypted for now, not meaning to preempt any decision/discussion on what it will be in the UI in the end (e.g. trusted).

UX

I see the following points that will affect how the UI will look (some came up before, others I may have missed - please let me know of omissions). For each I’ll first briefly state the topic, then add some considerations, and in the end give my proposed way forward (i.e. skip the middle part for speed).

  1. Should a device have only encrypted or only un-encrypted folders, or do we allow mixing?
    Making it exclusive makes the UI simpler: You don’t need to set a device as encrypted in the folder, just set a password. It also makes it safer, in that you can’t forget to set a device as encrypted when sharing.
    What would be use-cases for mixing? I can imagine having a folder with insensitive content and not wanting the encryption overhead. However given we already do expensive hashing per blocks, I think that’s not much of an issue (as in small relative slowdown).
    I propose to make encryption a per-device setting (of course still with per-folder passwords).

  2. Indicating encrypted device-folder-combis in cluster
    Adding a folder that already exists on other devices needs to infer encryption status. Otherwise chaos (or at least an error) will ensue when you share a folder with a device unencrypted that already has that folder, but encrypted. Sure it’s a user error, but we shouldn’t let the user make the error if it’s avoidable. And there’s also auto-accept to consider where it wouldn’t be a user error anymore. This should be solvable by adding an encrypted field to the device message in the cluster config and acting on that when adding/sharing the folder with said device (PR “Store pending devices and folders in database” might come in handy). Acting on it would mean requiring the user to add the PW before sharing with the “encrypted device”.

  3. How to handle password consistency between devices
    Ideally we can ensure password consistency between devices sharing a folder un-encrypted. That’s obviously not just UX but protocol/crypto, so I separated it (see at the end). If that’s not possible I think the encrypted device will store everything twice for the different PWs, but afaik that shouldn’t cause any problems beside the multiples of storage used and might thus even be tolerable.

  4. Change between un- and encrypted, PW change/data wipe
    The simplest solution I see would be to allow all of this without restrictions and then retransfer the index to the affected devices. This will trigger a full retransfer of everything to the encrypted device, which is desired in the PW change/data wipe case e.g. because the PW was potentially compromised. In case the device is now considered trusted it would be nicer to decrypt the existing data, but I believe that’s niche enough to not automate it (i.e. user needs to run the decrypt tool on the encrypted device and reconfigure the folder) - at least in a first step.

Once there’s consensus on this, I’ll propose (and/or invite others to propose) concrete options to add the required controls to the UI.

Protocol / Crypto

Related to 3. above I’d like to get input from people knowledgeable on crypto (anyone else of course, but them especially :wink: ) on how (in-) sane the following is and possible alternative approaches:
It would be good to be able to check that passwords are identical between un-encrypted devices. The naive option is to send the pw properly hashed in the cluster config. That obviously adds a prominent attack vector, but as far as I understand (I am not a cryptographer) that’s the most basic requirement any auth scheme needs to solve and is secured by making it sufficiently computationally expensive to compute the hash (if that argument is flawed I am now quite scared about my pw-manager database :slight_smile: ).
We already generate a key like this that’s used in encryption, and right now I can’t come up with a reason why it wouldn’t be safe to share that. However that’s obviously insufficient reasoning. So if I had to decide myself now I’d share a separate scrypt key with a different salt, e.g. scrypt(password, "verification" + folderID), or the encrypted password, i.e. using the existing encryption key (key = scrypt(password, "syncthing" + folderID)) to compute e.g. XChaCha20-Poly1305.Seal(password, "verification", key).
As stated at the beginning I am likely off-mark, I mostly stated a concrete plan to make it plain what I want to achieve - I am looking forward to considerations backed by more experience :slight_smile:
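A rough sketch of the first option (a separate scrypt output with a different salt), using Python’s `hashlib.scrypt`. The cost parameters, the folder ID and the passwords are assumptions for illustration only, and this is not a statement about what the actual implementation does.

```python
import hashlib

def derive(password: str, purpose: str, folder_id: str) -> bytes:
    """scrypt(password, purpose + folderID) -> 32 bytes. Cost parameters are assumptions."""
    return hashlib.scrypt(
        password.encode(),
        salt=(purpose + folder_id).encode(),
        n=2**14, r=8, p=1,
        maxmem=64 * 1024 * 1024,
        dklen=32,
    )

folder_id = "abcd-1234"  # hypothetical folder ID
encryption_key = derive("correct horse", "syncthing", folder_id)
verification = derive("correct horse", "verification", folder_id)

# The same password on another device yields the same verification value...
same_password = derive("correct horse", "verification", folder_id)
# ...while a different password does not.
other_password = derive("wrong horse", "verification", folder_id)

print(verification == same_password)   # True
print(verification == other_password)  # False
print(verification == encryption_key)  # False
```

The shared value is distinct from the encryption key, but anyone who sees it can still run an offline guessing attack against the password, so the scrypt cost is what protects weak passwords here.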

  1. I’d say no, I think people will want to opt in for that overhead for some folders only.
  2. New things in the protocol are fine. I suspect this will be a 1.6 as minimum anyway.
  3. We probably need to solve the crypto problem, but we should definitely not just merge content from two devices with different passwords, as neither of them will be able to decrypt the part they own.
  4. Redownloading or user having to run a manual step seems fine.

I think we definitely need to allow the mix. If nothing else there’s the “two buddies, backing up their secret stuff to each other” scenario. My normal folder is encrypted on your end, and vice versa.

Yeah… I was kind of hoping that the encrypted side could be a normal unaware device, but maybe it actually needs to be an explicit encrypted folder type there. That also avoids the whole thing about whether the folder needs to be set to receive-only or not, etc. It still opens up for misconfigurations, but at least the errors can be explicit (“you need to change the folder type to encrypted, or disable encryption on the other side”)…

We will always have the topology N1 <-> E <-> N2 where only E is encrypted, N1 and N2 need to have the same password, yet will never communicate. I think it might be a good idea to have something like a magic token in the cluster config be dependent on the password, so that when a new device joins it can easily determine that it has the right password or not. If it doesn’t, it won’t be able to handle any info from that folder and also it shouldn’t pass any new info. If the magic token is blank (newly created encrypted folder) it’s alright to “set” it and push out encrypted data. (Only the normal/trusted side devices would be able to draw any conclusions from the token, so it’s not for the encrypted side device to reject connections with a bad password.)
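One way the magic token idea could look, as a sketch only. It assumes the token is an HMAC over a fixed label keyed with the folder key; the label, the helper names and the stand-in keys are all hypothetical.

```python
import hashlib
import hmac

def make_token(folder_key: bytes) -> bytes:
    """Password-dependent token; HMAC over a fixed label is an assumption."""
    return hmac.new(folder_key, b"folder-token", hashlib.sha256).digest()

def check_token(folder_key: bytes, announced: bytes) -> str:
    """What a trusted device might conclude from the token relayed by E."""
    if not announced:
        return "set"       # newly created encrypted folder: set the token
    if hmac.compare_digest(make_token(folder_key), announced):
        return "ok"        # same password as whoever set the token
    return "mismatch"      # wrong password: neither accept nor send folder data

# Stand-ins for keys derived from two different passwords.
key_n1 = hashlib.sha256(b"stand-in for the key derived on N1").digest()
key_n2_wrong = hashlib.sha256(b"stand-in for a key from another password").digest()

token = make_token(key_n1)               # N1 sets it, E stores and relays it
print(check_token(key_n1, b""))          # set
print(check_token(key_n1, token))        # ok
print(check_token(key_n2_wrong, token))  # mismatch
```

E never evaluates the token; it only stores and relays it, which matches the point that only the trusted side can draw conclusions from it.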

How do we change passwords? The data at E becomes invalid, so I guess the clean option is to remove the folder, delete the data, then re-add it.

I envision being able to stop syncthing on the encrypted side and say stdecrypt /path/to/folder mySecretPassword to decrypt the data offline. Then the folder could be removed and re-added, the usual caveats applying.

I don’t think there’s a need to be able to offline-convert to encrypted storage…

In my mind, the following setup makes the most sense from both a crypto and a UI perspective.

Every folder has its own password, with the key derived from it. There is no need to communicate keys, passwords or derivatives of them. You probably have a keyfile where the key is stored, but encrypted and authenticated with a key derived from the password (and also the folder id and maybe more). This way, you can verify that the password is correct on other devices.

When you create a new folder, you have the option to make it an encrypted folder, in which case you have to give a password to encrypt the folder with. The folder is also explicitly marked as encrypted. This is for the UI only, if an adversary removes it and shares it, the new device just thinks that the folder shares encrypted files.

When a device receives an encrypted folder, it initially just works as a relay to help synchronizing the encrypted folder. If you also want to see the plain files, you have to enter the correct password. As it also contains an authenticated keyfile, you can verify whether the password is correct or not.

When you share a normal (not encrypted) folder to an ‘untrusted’ device (either it is marked as one, or it already has an encrypted folder, or …?), you get a warning and have the ability to convert the folder to an encrypted one. Maybe you should even have the possibility to mark a device as ‘always untrusted’, in which case it becomes impossible to share a normal folder with it. When converting to an encrypted folder, other devices should of course be notified and the user has to enter the correct password on them in order to receive updates.

I think that you cannot really have a simpler setup than this, as every folder should have its own key and you should not have to communicate the keys, because you also want to be able to decrypt it from just the encrypted files and the password. This setup is also very flexible in mixing etc. and you could add more conveniences using the keyfile. For example, if there is some shared secret between two trusted devices, you may also add a line to the keyfile containing the folder key encrypted by this shared secret. That way, the password is not needed when sharing between ‘very trusted’ devices. There are still some details left for these additions, but I think there are some possibilities for extra conveniences, if needed.

  1. Mixing encrypted and unencrypted folders with a device it is then.
    I’d still propose to have an option on remote devices to enforce encryption (or at least enable it by default when sharing).

I imagine the encrypted folder type (or whatever) will be auto-enabled - so while it’s aware of its status, there’s no need to do anything. I think there will be enough behavioural differences (no scanning, automatically mending broken/altered local stuff when discovered on request, …) that it will be cleaner to have a separate folder type than lots of conditions within the existing ones.

I keep ignoring that that’s the requirement. So a verification mechanism over the protocol won’t work indeed - magic token it is then. Probably a file like .stfolder/encryption-token with custom handling.

The encrypted device could wipe all data and index when the encryption token changes. I am pretty sure there’s gotchas with that, but can’t think of them right now :slight_smile:

@39aldo39 Most of that sounds in line with the plan as described before my post, and that’s still valid. Automatically switching from encrypted to unencrypted will probably not be there, at least not initially, but that seems minor.

You lost me here: You say there’s no need to communicate keys or derivatives, but then that the keyfile storing the pw allows verifying the correctness of the pw on other devices - so we do communicate/transfer that keyfile, right?

My question in the previous post was likely related to such a keyfile, as in what exactly it is and how it’s encrypted. As when we share it to verify the password, it will be available on the untrusted device (as that’s the only intermediary). Password obviously still needs to be communicated out of band.

It is indeed mostly in line with your plan, but I also thought a little more about the details that the crypto needs in order for it to work.

You are right that we still need to communicate the keyfile. However, this file can be public and an untrusted device is allowed to see it, which means that we can use the normal file synchronization to send it. The device is even allowed to modify it, but that will be detected as an error later. So, no custom protocol is needed.

The keyfile contains a ‘folder-key’ that is used for encrypting the shared folder. However, this key is encrypted with another key, the ‘password-key’, which is derived from the password and folder-id using scrypt or similar. If we want to know the ‘folder-key’, we have to know the keyfile and the ‘password-key’ (which we get from the password). If either one is missing, we cannot know the ‘folder-key’. It also verifies the password, as decryption will fail for an invalid one. It is similar to your idea, but it is more standard and allows for more flexibility when changing passwords, adding other authentication options etc.
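A toy sketch of the keyfile idea. The Python standard library has no authenticated cipher, so in place of a real AEAD (such as XChaCha20-Poly1305) this uses a one-time pad derived with HMAC-SHA256 plus an HMAC tag. That construction, the salt prefix, the scrypt parameters and all names are assumptions for illustration, not a proposal for the actual format.

```python
import hashlib
import hmac
import os

def password_key(password: str, folder_id: str) -> bytes:
    """Derive the 'password-key' from password and folder ID (parameters assumed)."""
    return hashlib.scrypt(password.encode(), salt=("keyfile" + folder_id).encode(),
                          n=2**14, r=8, p=1, maxmem=64 * 1024 * 1024, dklen=32)

def _subkey(key: bytes, label: bytes, data: bytes) -> bytes:
    return hmac.new(key, label + data, hashlib.sha256).digest()

def wrap(pw_key: bytes, folder_key: bytes) -> bytes:
    """Keyfile contents: nonce || ciphertext || tag (toy encrypt-then-MAC)."""
    assert len(folder_key) == 32
    nonce = os.urandom(16)
    pad = _subkey(pw_key, b"enc", nonce)
    ciphertext = bytes(a ^ b for a, b in zip(folder_key, pad))
    tag = _subkey(pw_key, b"mac", nonce + ciphertext)
    return nonce + ciphertext + tag

def unwrap(pw_key: bytes, keyfile: bytes):
    """Return the folder key, or None if the password is wrong or the file was modified."""
    nonce, ciphertext, tag = keyfile[:16], keyfile[16:48], keyfile[48:]
    if not hmac.compare_digest(tag, _subkey(pw_key, b"mac", nonce + ciphertext)):
        return None
    pad = _subkey(pw_key, b"enc", nonce)
    return bytes(a ^ b for a, b in zip(ciphertext, pad))

folder_id = "abcd-1234"      # hypothetical folder ID
folder_key = os.urandom(32)  # random 'folder-key', not derived from the password
keyfile = wrap(password_key("correct horse", folder_id), folder_key)

right = unwrap(password_key("correct horse", folder_id), keyfile)
wrong = unwrap(password_key("wrong horse", folder_id), keyfile)
modified = unwrap(password_key("correct horse", folder_id),
                  keyfile[:-1] + bytes([keyfile[-1] ^ 1]))
print(right == folder_key, wrong, modified)  # True None None
```

Because the keyfile is authenticated, it can travel through the untrusted device like any other file: a wrong password and a modified file both show up as a failed unwrap.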

Referring to the draft spec from https://github.com/syncthing/docs/pull/480 the key used for encryption is folderKey = scrypt.Key(password, "syncthing" + folderID). There’s no intermediate key planned, i.e. folderKey takes the function of both your folder-key and password-key.

If I understand you correctly you propose to have an “arbitrary” (as in not derived from a password) encryption key. Then use a key like the folderKey above, that due to the different use you name password-key, to encrypt that encryption key and share the encrypted output. Is that correct?
That has a timing problem if two trusted devices set up a folder with an encrypted one at the same time: Even if they use the same password, they will use different encryption keys, which breaks the scheme. With the current encryption key, derived from the password (and commonly known stuff like folder ID), that can be avoided. The shared token would then really only be there to detect misconfiguration (different passwords used for the same folder and device).

(In addition a file key is derived from the folder key, but that doesn’t change the above.)

We could derive a metadata key from the folder key and use that to encrypt something that gets sent in the cluster config. The encrypted device would stash this somewhere and pass it on to others. If there is something useful to convey (folder label?) we could use that as the encrypted data, otherwise it really doesn’t matter if it’s just a few random bytes - being able to decrypt it and verify the message integrity indicates that we agree on the key.

What is the purpose of having separate passwords for devices on a single folder? It would be a simpler UX to have a single password per folder (obviously still possible to have both encrypted and unencrypted devices on a folder). I am unable to come up with a use-case for separate passwords. If you e.g. go for multiple encrypted devices, which also share between each other, you need to set the same password on all of them anyway (e.g. T1 connected to both of E1 and E2, E1 connected to E2 and T2 connected to both E1 and E2).
Does anyone know a use-case for different passwords on a single folder?

Well you can build something like

                 ┌────┐       ┌────┐
           ┌────▶│ E1 │──────▶│ T2 │
┌────┐     │     └────┘       └────┘
│ T1 │─────┤
└────┘     │     ┌────┐       ┌────┐
           └────▶│ E2 │──────▶│ T3 │
                 └────┘       └────┘

where E1/T2 and E2/T3 have different passwords for the same folder… But I’m not sure what the gain is.

There’s also

┌────┐                  ┌────┐
│ T1 │◀───Encrypted────▶│ T2 │
└────┘                  └────┘

where each device thinks that the other is encrypted, but both know the password… (I didn’t actually try this but it ought to work.) So to get a folder you need to both be added and shared-with in the usual manner, and then also know the password…

That’s the crux - I could come up with a topology for different PWs, but not with a reason for such a topology.

Thanks for bringing this up. I didn’t consider that it would just work and labelled it as an “impossible situation” in my plan (aka misconfiguration).

2 things:

  1. Thanks for continuing to look into this. I’m avidly watching this and glad y’all are sticking it out.

  2. I can live with either of the two options @calmh showed above but for at least my uses, a single password is good enough and prevents the confusing scenario where the UI tells you a folder is encrypted on another device, so you associate that device with not knowing the contents of that folder, but then it turns out due to it having a different password, the contents are known after all. With the single password, it becomes a binary choice, either the contents are known to that device or not. Much easier on the wetware.

I noticed in the encryption documentation that it says “which identical blocks are reused within any given file” is something the untrusted device will be able to observe. This seems like it would open the door to some pretty significant fingerprinting concerns, and encryption mechanisms usually go to lengths to avoid this (e.g., by including the offset as part of the initialization vector). The document also says that this is “required by the syncing mechanism, in order to avoid transferring all unchanged file data when a file block changes.”

While the untrusted device being able to see “which parts of files are changed by the other devices and when” is clearly needed for efficient syncing of changes, the bandwidth saved by allowing the untrusted device to occasionally reuse a block from a different part of the file (only on initial sync, or when a block is modified to exactly match another existing block in the file) doesn’t seem at all worth leaking the additional information to me.
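To illustrate the trade-off, here is a sketch of what an observer could count under two hypothetical schemes. HMAC stands in for whatever deterministic per-block identifier the untrusted device gets to see; this is not how Syncthing derives its identifiers, and `visible_ids` and the sample data are made up.

```python
import hashlib
import hmac

def visible_ids(key: bytes, blocks, bind_offset: bool):
    """Identifiers an untrusted device would see for each block of one file."""
    ids = []
    for offset, block in enumerate(blocks):
        message = hashlib.sha256(block).digest()
        if bind_offset:
            # Mixing in the position makes identical blocks look different.
            message = offset.to_bytes(8, "big") + message
        ids.append(hmac.new(key, message, hashlib.sha256).hexdigest())
    return ids

key = b"\x01" * 32
blocks = [b"\x00" * 16, b"data", b"\x00" * 16, b"\x00" * 16]  # three identical blocks

content_only = visible_ids(key, blocks, bind_offset=False)
with_offset = visible_ids(key, blocks, bind_offset=True)
print(len(set(content_only)), len(set(with_offset)))  # 2 4
```

Without the offset the observer learns that blocks 0, 2 and 3 are identical. With it, every position looks unique, at the cost of no longer being able to reuse a block from elsewhere in the file.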