Security of hashed file blocks

schnappi · January 4, 2018, 9:53pm

As I understand Syncthing (from the docs), each file is sliced into a number of blocks, and the SHA256 hash of each block is computed. Say that I use Syncthing to sync a folder on an external drive that is encrypted with LUKS or Veracrypt or Bitlocker or whatnot. This means that the Syncthing file hashes are stored separately from the location of the actual synced files. How dangerous is it if an adversary recovers the SHA256 hashes of synced files by gaining control of a Syncthing node that has since been disconnected from a cluster (this is theoretical only)? Basically, how secure is the SHA256 hash of the file blocks?
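As a rough illustration of the block hashing described above, the following Python sketch slices a byte string into fixed-size blocks and computes the SHA-256 of each. The 128 KiB block size and the name `block_hashes` are assumptions made for illustration; this is not Syncthing's actual code.

```python
import hashlib

BLOCK_SIZE = 128 * 1024  # assumed block size, for illustration only


def block_hashes(data: bytes, block_size: int = BLOCK_SIZE) -> list:
    """Return the hex SHA-256 digest of each fixed-size block of data."""
    return [
        hashlib.sha256(data[offset:offset + block_size]).hexdigest()
        for offset in range(0, len(data), block_size)
    ]


if __name__ == "__main__":
    sample = b"a" * (BLOCK_SIZE + 10)  # one full block plus a short tail
    for index, digest in enumerate(block_hashes(sample)):
        print(index, digest)
```

A list like this, one digest per block, is the kind of data the question is about: it lives in the index, apart from the file contents themselves.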

I noticed that there is an item when starting up Syncthing called “weak-hash enabled.” Does “weak-hash” being enabled cause the hashes of file blocks to be less secure?

calmh · January 4, 2018, 10:02pm

What is the threat you’re worried about? That is, what do you envision the attacker doing with the hashes, weak or strong?

schnappi · January 4, 2018, 10:07pm

For example, if root control is gained over a server running Syncthing, the Syncthing data folder (and the corresponding file block hashes) would be in the hands of an adversary, while synced files on an unmounted encrypted partition or USB drive would be secure in themselves.

calmh · January 4, 2018, 10:09pm

I’m not sure what you’re getting at. The index contains the block hashes, and other metadata like the actual file names. If those are in the hands of an attacker, they know your file names, modification times and block hashes.

Having the hashes does not help in reconstructing the data, other than if it’s publicly available such as an mp3 they could download based on the name of the file.
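To make that exception concrete: the hashes cannot be inverted, but they would let someone who already holds a candidate copy (for instance a publicly available file with the same name) check whether it matches. A minimal sketch, in which `matches_leaked_hashes`, the tiny block size and the sample data are all hypothetical:

```python
import hashlib


def matches_leaked_hashes(candidate: bytes, leaked_hashes: list, block_size: int) -> bool:
    """Return True if candidate's per-block SHA-256 digests equal leaked_hashes."""
    digests = [
        hashlib.sha256(candidate[i:i + block_size]).hexdigest()
        for i in range(0, len(candidate), block_size)
    ]
    return digests == leaked_hashes


if __name__ == "__main__":
    block_size = 4  # tiny block size so the demo stays readable
    public_file = b"some public song"
    # Stand-in for block hashes read from a stolen index.
    leaked = [
        hashlib.sha256(public_file[i:i + block_size]).hexdigest()
        for i in range(0, len(public_file), block_size)
    ]
    print(matches_leaked_hashes(public_file, leaked, block_size))
    print(matches_leaked_hashes(b"some other bytes", leaked, block_size))
```

This only confirms a guess the attacker already has in hand; it does not reconstruct unknown data.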

schnappi · January 4, 2018, 10:15pm

That pretty much answers it. If a Syncthing node is compromised, the file names, modification times (I assume folder trees too) and hashes for files synced on an unmounted external drive or partition will be known to an adversary. I did not, and do not, have the greatest understanding of cryptography. Thanks for answering the question.

As dangerous as those European bus drivers are, don’t get hit by one!

generalmanager · January 13, 2018, 3:15pm

If your node is compromised you have much bigger problems than someone knowing your file names and hashes. If your LUKS, Veracrypt or Bitlocker encrypted block devices are mounted, they can obviously also access the files. If not, an attacker could always inject files with malware and sync those over to the other connected devices. Or make the synced computers delete all files in the sync folder. The same goes for sending infected emails to all your email contacts etc.

Then it always depends on how quickly you would notice such an attack. If you use your device normally (because it appears to behave normally) for maybe a week, they’ll probably have your login details to everything in your life. If you store those in the browser or a password manager, those are gone instantly (or when you first have to unlock the password storage, if it’s encrypted).

But basically all of this is not a Syncthing issue, but a computer security issue. Most importantly: keep in mind that Syncthing is NOT a replacement for multiple securely encrypted and authenticated offsite backups!

schnappi · January 15, 2018, 6:44pm

I respectfully disagree with @generalmanager’s premise as it relates to the question. One may wish, from time to time, to temporarily sync a USB or external drive with, say, a relative’s or a company computer that one does not control, or does not fully control.

Knowing what information is leaked if any of the above machines is compromised, or is simply outside of one’s control, is an important consideration when deciding whether or not to sync an external or USB drive in a situation like the above.

generalmanager · January 15, 2018, 7:50pm

What do you disagree with? You didn’t mention that this is your use case.

The short answer is that SHA256 is considered one of the strongest cryptographic hash functions and you don’t have to worry that it’s possible to calculate file contents from the hashes. There isn’t anything meaningful anybody could do with the hashes besides infecting the files in the drive. The biggest danger is always plugging your USB drives into untrusted and possibly infected computers and plugging it back into yours.

In one post you stated explicitly:

For example, if root control is gained over a server running Syncthing, the Syncthing data folder (and the corresponding file block hashes) would be in the hands of an adversary, while synced files on an unmounted encrypted partition or USB drive would be secure in themselves.

However, this doesn’t make much of a difference from a security perspective. Once the host computer you plugged the USB drive into is infected, you have a problem, at least if you ever plan to re-use the USB drive, as it may very well be infected with malware which you can’t remove by deleting the files or even by replacing the HDD inside the USB enclosure. If you want to know more about such threats, you can search for BadUSB.

A similar threat is that one of the computers you are syncing files with is or was infected with malware at any point in time before or while you are syncing with it. The malware could inject exploit code into one or more of the files you are syncing with the infected device. So the next time your operating system scans your filesystem to update its search cache, or once you open the file, your computer can become infected as well.

I am not saying that this is incredibly likely (it isn’t, because Syncthing isn’t very well known and popular). But that’s what’s technically possible. From a security perspective, the danger that existing weaknesses in your operating system and other software are exploited is much bigger than the danger that somebody will be able and willing to spend the time and money to try and attack the pretty strong authentication and encryption Syncthing uses. Additionally, Syncthing is written in Go, which is a memory-safe language. This largely rules out the kind of memory-handling mistakes which most commonly lead to software being vulnerable to exploits.

If I didn’t think that Syncthing was the safest application to sync files, I certainly wouldn’t use it myself.

Regarding your question: If your relatives can’t decrypt the LUKS/Truecrypt/Veracrypt container (which also needs to have the syncthing config directory) then only the scenario of malware in the USB controller and in files on the unencrypted partitions applies to you.

If your relatives can decrypt it, they can also delete files from your directory so that the deletes will propagate to your syncthing devices once you plug it into your own computer. Maybe they could also modify the hashes, but as I’m not sure in which situations syncthing recalculates them, I can’t exactly tell you if this would have any impact at all. My guess would be that the receiving computer notices the hash mismatch and at worst you will get some unsyncable files. If ST acts smart and recalculates them on the sending side, nothing happens at all.

schnappi · January 15, 2018, 8:09pm

Great additional information. For other interested parties, and to conclude: if a Syncthing node is compromised in any type of situation but synced files are on an unmounted device or partition, the file names, file hashes (and probably file sizes) in addition to the folder structure will be visible. It is extremely unlikely that file contents could be derived from the file hashes.

A partial solution could be putting Syncthing config files in a custom location on an encrypted partition but until the Syncthing node is restarted the mounted volume with the Syncthing configuration files will still be visible to anyone with local control of a machine.

Thanks to everyone who helped with this issue with detailed responses.

generalmanager · January 15, 2018, 8:31pm

If a Syncthing node is compromised in any type of situation but synced files are on an unmounted device or partition, the file names, file hashes (and probably file sizes) in addition to the folder structure will be visible.

Of course there are things like the operating system’s login screen and file system access rights which protect against just anybody walking up to the machine and having a snoop.

But in the end, anybody with enough knowledge, preparation and money can get full control over a device if they have physical access. This is one of the most basic rules in IT security. The local attacker (google evil maid) is one of the strongest attack scenarios (threat models) in existence. Any software (and its data) which isn’t run inside a TPM or HSM is free for the taking.

This isn’t something specific to syncthing but applies to every software on any normal computer.

If you want to get serious about security, you either need to spend some real time (read: several months to years) protecting yourself against such sophisticated attacks, or you can buy something from the same companies selling government-approved hardware and software. But then you’d always have to ask yourself: are they really motivated to protect you, or will they change their mind if their biggest customer comes asking them to put something special into your machine?

You can hopefully see that this is a rather deep rabbit hole to crawl into.

If you want to get serious about it, you can start by reading about TRESOR, a kernel patch that keeps your encryption keys inside the CPU debug registers, and about kernel hardening (see linux-hardened; grsecurity and PaX are good keywords too).

schnappi · January 15, 2018, 8:37pm

You do not understand what the whole question above pertains to. This question pertains to what happens when one loses control of a Syncthing node, whether that be in a data center, in a relative’s home, or in an office.

All your points about local security are great and correct. But this question pertains to what to expect when one knows local security has been breached on a Syncthing node with files being synced from an unmounted partition.

calmh · January 15, 2018, 8:48pm

Your first point to worry about, then, is that the certificate present on that compromised machine allows it access to download all files from any of the other devices, until you revoke access by removing the compromised device from their lists of accepted devices.

generalmanager · January 15, 2018, 8:53pm

I do understand (at least since you specified it in more detail). But the answer to that is basically the metadata, which calmh covered and which you summarized correctly in your last post. I assumed that you’d have removed the now untrusted machine already. The problem is that this can only be done once you know something fishy has happened. At this point it’s usually too late.

After I had already written so much about the different possibilities with local access I just wanted to take the opportunity to put some more info in there which is closely connected (the whole “once someone might have had the chance to touch it, it should be thrown out” routine. Of course this depends on the threat model.) and might also be interesting to you or anyone reading this at a later point in time.

schnappi · January 15, 2018, 9:00pm

Thank you very much, @generalmanager, for your thoroughness and for pointing out that anyone with local access can compromise a machine that is on. This is all only theoretical. Revoking a compromised node is important, but if an unmounted partition or device is the only folder being synced, the Syncthing config file would need to be modified first to change the sync folder from the unmounted partition to an existing folder before files can be sent from a compromised node.

calmh · January 15, 2018, 9:02pm

I’d just copy out the configuration and keys to my own machine and sync the files to there.

vsespb · January 27, 2018, 7:26am

you don’t have to worry that it’s possible to calculate file contents from the hashes. There isn’t anything meaningful anybody could do with the hashes

Consider the following scenario:

The user has a file .netrc (example Unix -> .netrc; Git uses .netrc for auth, btw) with the following content:

machine myownmachine
	login myusername
	password mypassword

An attacker might know that the user has the login myusername on myownmachine and brute-force the file content to find the password. This would be an offline brute force (i.e. the attacker is limited only by CPU power), compared to an online brute force (trying to log in to myownmachine, which could be slow due to network delays or impossible due to an attempt limit on the server side).

And it is not the user’s fault if, in this case, he uses a password which is strong enough against online brute force but too weak against offline brute force.

This is just an example. There can be other files where all content except one single string is predictable, and that single string can be brute-forced. Including passwords.
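The scenario above can be sketched in a few lines of Python. It assumes the file is small enough to fit in a single block, so that one leaked block hash covers the whole file, and that the attacker guesses the exact layout, including a trailing newline. The names `TEMPLATE` and `recover_password` and the word list are hypothetical.

```python
import hashlib
from typing import Iterable, Optional

# Assumed file layout: everything is known except the password.
TEMPLATE = "machine myownmachine\n\tlogin myusername\n\tpassword {password}\n"


def recover_password(target_hex: str, candidates: Iterable[str]) -> Optional[str]:
    """Try each candidate password offline against the leaked block hash."""
    for password in candidates:
        content = TEMPLATE.format(password=password).encode("utf-8")
        if hashlib.sha256(content).hexdigest() == target_hex:
            return password
    return None


if __name__ == "__main__":
    # Stand-in for the block hash found in a stolen index.
    leaked = hashlib.sha256(
        TEMPLATE.format(password="mypassword").encode("utf-8")
    ).hexdigest()
    wordlist = ["123456", "letmein", "mypassword", "hunter2"]
    print(recover_password(leaked, wordlist))
```

No weakness in SHA-256 is involved: each guess is simply hashed and compared, so the cost depends only on how guessable the unknown part of the file is.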

The short answer is that SHA256 is considered one of the strongest cryptographic hash functions

How strong the hash function is, is irrelevant here.

calmh · January 27, 2018, 7:33am

That’s a quite far fetched vulnerability (depending on knowing the exact file layout, new lines, comments, tab width used, etc), and again you should worry more about the keys and config that are stored beside the index. With those, the attacker can just download the netrc from another device.

vsespb · January 27, 2018, 9:36am

depending on knowing the exact file layout, new lines, comments, tab width used

Tabs vs. spaces is something that can just be assumed, and so can the fact that there is probably no special layout at all. As an attacker, I can just assume that .netrc consists of only three lines: machine, login, password. Nothing else. Tabs or 0-4 spaces used. No extra newlines. That assumption may simply be true for most setups. And there can be other software with similar config files.
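To show how small that layout space is, here is a hedged Python sketch that enumerates the indentation choices listed above (a tab, or 0-4 spaces). Trying the file both with and without a final newline is an extra assumption of this sketch, and `layout_variants` is a hypothetical helper name.

```python
from itertools import product
from typing import Iterator

INDENTS = ["\t"] + [" " * n for n in range(5)]  # a tab, or 0-4 spaces
TRAILING = ["", "\n"]  # assumption: file may or may not end with a newline


def layout_variants(machine: str, login: str, password: str) -> Iterator[bytes]:
    """Yield every candidate byte layout of a three-line .netrc file."""
    for indent, tail in product(INDENTS, TRAILING):
        text = (
            f"machine {machine}\n"
            f"{indent}login {login}\n"
            f"{indent}password {password}{tail}"
        )
        yield text.encode("utf-8")


if __name__ == "__main__":
    variants = list(layout_variants("myownmachine", "myusername", "mypassword"))
    print(len(variants))
```

Under these assumptions there are only 12 layouts per password guess, so uncertainty about whitespace multiplies the brute-force work by a small constant rather than making it infeasible.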

you should worry more about the keys and config that are stored beside the index. With those, the attacker can just download the netrc from another device

I think that if we talk about security, we should not assume possible “positive” scenarios, only negative ones. By “positive” I mean scenarios in which there is no security issue in Syncthing and nothing to worry about.

You propose a “positive” scenario: “The system is compromised, and the attacker stole the Syncthing key and downloaded all the data, so the fact that Syncthing stores sensitive data on another medium did not disclose anything new to the attacker.” In that case there is no security issue in Syncthing.

I propose negative scenarios:

“The system is compromised, but the user revoked all security keys used by the system before the attacker was able to use them. However, the attacker got some sensitive data.”
“The user’s hard drive starts failing. The user was not aware that there is sensitive Syncthing data on the medium besides the security keys. The user revoked the security keys and sent the device for RMA.”
“The user’s notebook is lost or stolen. The user revoked all security keys.”

If negative scenarios exist, then there is a security issue, no matter how many positive ones exist. IMHO that is how security as a whole works.

The problem is that some sensitive information about user data is stored on another device, and that is poorly documented. That’s it.

imsodin · January 27, 2018, 10:01am

This is somewhat covered in point 2 under “Protecting your Syncthing keys and identity” in our “Security Principles”:

If you’re syncing confidential data on an encrypted disk to guard against device theft, put the Syncthing config folder on the same encrypted disk to avoid leaking keys and metadata. Or, use whole disk encryption.

I thought we also document what information is stored in the db, but I couldn’t find it right now. If this really does not exist, I agree it would be useful to have a list of what information can be obtained through that. You are welcome to propose something in the docs repo.

That also may include potential attack vectors like the offline brute force attack against known files you propose, but (if at all) only with a very clear statement about the relevance of this scenario, which is indistinguishable from 0 for the average user. Personally I think that this isn’t needed, because for any user who cares this is already clear from knowing what information is in the db, and for users who don’t care about security, there are much lower hanging fruit/pitfalls (e.g. no encryption of the data itself).

calmh · January 27, 2018, 10:22am

I have no idea what the actionable outcome of this discussion is supposed to be.