Website with public discosrv database

generalmanager · September 20, 2015, 4:39pm

While I’m an avid advocate of privacy and anonymity, I don’t think Syncthing is or should be designed to be used as a tool to sync files in perfect anonymity.

That is because it’s incredibly difficult to reach even acceptable anonymity, as you can see from all the bugs the TOR project has to fix. Most of those problems are with hidden services, because those have to be accessible from a fixed address/ID over a long time, which is basically the same requirement we have when we want to find other ST devices. If even a project with several million dollars in funding and many highly skilled coders with experience in anonymity projects can’t reliably make this work, it’s impossible for us.

Oh, and if your threat model includes an adversary with global passive listening skills (as it should for perfect anonymity, but even the TOR Project doesn’t do this), you have to protect against traffic correlation attacks, as Adam Langley’s Pond does. This might be kinda usable for the messaging use case Pond has. But you’ll never be able to get any usable transfer speeds for file sync. At least not if you want normal people to use it, because Dropbox will always be several orders of magnitude faster than any truly anonymous file sync/sharing tool.

Because of all those problems, the use case of anonymous file sync can only ever be appropriately addressed by tools based on anonymity networks like I2P, GnuNet and partly TOR.

That’s why a DHT or a blockchain-based discovery would be nice in the long run. But building either with the privacy and usability on mobile devices provided by the current and especially the upcoming 0.12 release is next to impossible.

DHTs aren’t easily encrypted because of the way they work. Blockchain-based technology could fix the problem that the social graph of our devices is exposed, because everybody downloads EVERY IP-ID pairing and just uses the ones it’s looking for. It can’t fix the fact that hash-IP pairs are publicly accessible.
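To make the difference concrete, here is a minimal Python sketch. It is not Syncthing code: the class names, device IDs and addresses are all made up for illustration. A central server sees every lookup and can reconstruct who is looking for whom, while a client that downloads the full table and filters locally reveals nothing about its interests. The table itself stays public either way.

```python
from collections import defaultdict


class CentralDiscovery:
    """Hypothetical central server: answers lookups and can log them."""

    def __init__(self, table):
        self.table = dict(table)
        self.seen = defaultdict(set)  # requester ID -> IDs it asked about

    def lookup(self, requester_id, target_id):
        self.seen[requester_id].add(target_id)  # the social graph leaks here
        return self.table.get(target_id)


class PublicLedger:
    """Hypothetical public ledger: clients fetch everything, filter locally."""

    def __init__(self, table):
        self.table = dict(table)
        self.downloads = 0  # all the ledger observes is "someone downloaded"

    def download_all(self):
        self.downloads += 1
        return dict(self.table)


# Made-up device IDs and documentation-range addresses.
TABLE = {"DEV-A": "192.0.2.10", "DEV-B": "198.51.100.7", "DEV-C": "203.0.113.5"}

central = CentralDiscovery(TABLE)
central.lookup("DEV-A", "DEV-B")
central.lookup("DEV-A", "DEV-C")

ledger = PublicLedger(TABLE)
local_copy = ledger.download_all()
# The selection happens on the client, so nobody learns which IDs were wanted.
wanted = {dev: local_copy[dev] for dev in ("DEV-B", "DEV-C")}
```

After the two lookups, `central.seen` holds DEV-A's peer list, whereas the ledger only knows that one full download took place.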

But downloading a huge blockchain and staying connected to a swarm is not viable on mobile devices because their storage and energy resources are rather limited.

There are things like “light” clients but they generally just trust the answer of a node with the whole blockchain. Light clients using snapshots of trusted points of time in a blockchain could also work but are still experimental.

If you are interested in those blockchain-based solutions, you should take a look at DNSChain by okTurtles.

But to be honest, I think Syncthing has bigger fish to fry at this point like selective sync and diff-based index exchanges.

As soon as the cost of hosting the discovery server(s) gets too high or ST gets popular enough to be attacked via DDOSing them, we can take another look at those options.

As for the anonymity requirement, this is basically impossible to do on our budget of financial, community and developer resources. And because the government can easily surveil everything connected to your official identity you would have to use a near-perfect anonymity network. Which brings us back to the first point.

The blockchain-based solution I described above would fix this, but it’s A LOT of work and I’d consider this technology to still be experimental.

calmh · September 20, 2015, 7:15pm

For the TLS discovery in v0.12, I’ve set up three geographically redundant discovery servers - Sweden, the US and Singapore. Each is reachable on IPv4 and IPv6. So to take out global discovery, you now need to DDoS three separate servers on three continents, which is at least a little more work. Things will keep working as long as at least one of them is not dead.
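As a rough illustration of why one surviving server is enough, here is a small Python sketch of client-side failover. It only shows the idea and is an assumption about how a client could behave, not a description of what Syncthing actually does: the host names are placeholders, and `query` is a hypothetical caller-supplied function standing in for the real network request.

```python
def discover(device_id, servers, query):
    """Return the first successful answer, trying each server in order.

    `query(server, device_id)` is a hypothetical caller-supplied function that
    returns a list of addresses, or raises OSError when the server is down.
    """
    errors = {}
    for server in servers:
        try:
            return query(server, device_id)
        except OSError as exc:
            errors[server] = exc  # remember the failure and try the next one
    raise ConnectionError(f"all discovery servers failed: {sorted(errors)}")


# Placeholder host names, not the real servers.
SERVERS = ["discovery-se.example", "discovery-us.example", "discovery-sg.example"]
DOWN = {"discovery-se.example", "discovery-us.example"}  # simulate two dead servers


def fake_query(server, device_id):
    """Stand-in for a network lookup; fails for servers listed in DOWN."""
    if server in DOWN:
        raise OSError(f"{server} unreachable")
    return ["203.0.113.5:22000"]  # made-up address


addresses = discover("DEV-B", SERVERS, fake_query)
```

With two of the three servers marked as down, the lookup still succeeds via the third one. Only when every server is in `DOWN` does `discover` raise.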

Eddy2909 · September 20, 2015, 8:00pm

Great news for a Sunday. I also spoke to a friend of mine who is working at a German hosting company… maybe they will sponsor a disco server for the community.

calmh · September 20, 2015, 8:03pm

No need, to be honest, as they are very small and simple to host. My instances above are the cheapest $5 instances from Digital Ocean. However, if they want to help host a relay server, that would be awesome.

(It supports total and per-session rate limits, so can be made to behave. I’ll have a document on this up by the time 0.12 goes live.)
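For anyone wondering how a total limit and a per-session limit can work together, here is a minimal Python sketch using token buckets. This is only one common way to do it and an assumption on my part; the thread does not say how the relay server implements its limits, and all names and numbers below are made up.

```python
class TokenBucket:
    """Allows `rate` bytes per second on average, with bursts up to `burst`."""

    def __init__(self, rate, burst):
        self.rate = rate
        self.burst = burst
        self.tokens = burst
        self.last = 0.0

    def refill(self, now):
        # Add tokens for the elapsed time, capped at the burst size.
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now


class Session:
    """A relayed session must fit under both its own limit and the shared one."""

    def __init__(self, shared_bucket, per_session_rate):
        self.shared = shared_bucket
        self.own = TokenBucket(per_session_rate, per_session_rate)

    def may_forward(self, nbytes, now):
        buckets = (self.own, self.shared)
        for bucket in buckets:
            bucket.refill(now)
        # Check both buckets before spending from either one.
        if any(bucket.tokens < nbytes for bucket in buckets):
            return False
        for bucket in buckets:
            bucket.tokens -= nbytes
        return True


shared = TokenBucket(rate=1000, burst=1000)   # total: 1000 bytes/s
a = Session(shared, per_session_rate=600)     # per session: 600 bytes/s
b = Session(shared, per_session_rate=600)

results = [
    a.may_forward(600, now=0.0),  # fits both limits
    a.may_forward(1, now=0.0),    # session a is out of tokens
    b.may_forward(600, now=0.0),  # only 400 left in the shared bucket
    b.may_forward(400, now=0.0),  # fits what is left
    a.may_forward(500, now=1.0),  # one second later, both buckets have refilled
]
```

The time is passed in explicitly to keep the example deterministic; a real server would read a monotonic clock instead.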

Eddy2909 · September 20, 2015, 8:24pm

I could ask them for that too when 0.12 has been released.

Eddy2909 · October 4, 2015, 8:35am

Btw: I would like to tell them the specifications needed to run a relay smoothly. Does anyone have experience with this?

rumpelsepp · October 4, 2015, 9:39am

I recently created a PKGBUILD for Arch Linux. It is almost ready, I just have to

add an install script which creates the appropriate user
finish the systemd service files.

But you can run it now in, for instance, tmux under your user of choice.

Eddy2909 · October 4, 2015, 9:57am

I mean CPU, RAM etc…

AudriusButkevicius · October 4, 2015, 3:33pm

It doesn’t need any of that, it’s just reading packets and sending packets, so anything should work

klausenbusk · October 9, 2015, 9:00am

Wondering if Digital Ocean may want to sponsor a few instances.

kahun · October 10, 2015, 3:28am

Well that is false.

Knowing which IPs connect to which IPs is quite a bit of knowledge by itself.

canton7 · October 10, 2015, 8:59am

This has already been discussed at length earlier in the thread, including this exact point.

The conclusion was that we can’t avoid discovery servers gaining this knowledge without going to a blockchain-like setup (where everyone can see ID<->IP mappings for everyone else, so hiding who’s looking for who), which has significant complications and other downsides. The conclusion being: if you’re concerned about this, don’t use a discovery server run by someone you don’t trust (i.e. don’t use the ones hosted by @Eddy2909).

Eddy2909 · October 10, 2015, 9:59am

@canton7 such a post is not very qualified, and it is sad.

Even if this will bring up a fundamental discussion:

Why should I be less trustworthy than anyone else? (And btw: your IDs & IPs don’t interest me at all.)

canton7 · October 10, 2015, 10:05am

The “if you’re concerned about this” bit is there for a reason. This only applies if you’re worried about someone figuring out the IPs of the devices in your network.

You’re not less trustworthy than someone else. You’re of unknown trust, just like everyone else here. Neither I nor anyone else has any reason to either trust or distrust you.

If someone has decided to trust the Syncthing devs enough to run their software, then by extension they also trust the discovery servers hosted by those same devs. Therefore the official discovery servers are always more trustworthy than any 3rd-party ones.

Eddy2909 · October 10, 2015, 10:09am

does this apply to the hosting company such as digital ocean too?

and why should then anybody be able to run a relay server for the users? compared to

spoofing all traffic via MITM would be really easy, or is the relay server MITM-secure?

in the end it will always be a human that administrates a server…

canton7 · October 10, 2015, 10:15am

Sure: you would only put stuff on a Digital Ocean server if you trusted Digital Ocean. But we’re not talking about that: Digital Ocean might be entirely trustworthy, but someone malicious can host something on a Digital Ocean server.

Again, you’re missing the words I keep repeating. This only applies if you’re worried about someone figuring out the IPs of the devices in your network. Anyone can run a discovery server, and anyone can run a relay server. If you are worried about someone gaining information about your network, then don’t use third-party discovery or relay servers. Most people are not worried about this, and so using third-party discovery or relay servers is fine.

This is what I said originally. Please re-read it, and see that it starts with the phrase “if you’re concerned about this”.

Interestingly, I believe a relay server has far less insight into your network than a discovery server. A relay server only knows about the IPs of 2 devices in your network, whereas a discovery server can potentially map out the whole thing.

Nope. Everything’s encrypted using keys which the two devices know, but the relay server does not know. The relay server cannot inspect traffic, nor can it insert its own traffic.

calmh · October 10, 2015, 11:58am

This deserves emphasis. It knows device IDs, by necessity because it helps them talk to each other, but that’s all.

AudriusButkevicius · October 10, 2015, 12:34pm

It can insert traffic, but it would cause a disconnect, as the stream is encrypted, so anything injected would probably cause a protocol level error.
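Here is a small Python sketch of that effect. It is not the real protocol: it leaves out the encryption itself and uses a plain HMAC from the standard library, only to show why bytes altered or injected by a party without the key get rejected by the receiver. The frame layout and function names are made up for the example.

```python
import hashlib
import hmac
import os

TAG_LEN = 32  # size of a SHA-256 digest in bytes


def seal(key, payload):
    """Append an authentication tag computed with the shared key."""
    return payload + hmac.new(key, payload, hashlib.sha256).digest()


def open_frame(key, frame):
    """Return the payload, or raise ValueError if the tag does not match."""
    payload, tag = frame[:-TAG_LEN], frame[-TAG_LEN:]
    expected = hmac.new(key, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("authentication failed, dropping connection")
    return payload


key = os.urandom(32)  # known to the two devices, not to the relay
frame = seal(key, b"index update")

# A relay without the key can still alter or inject bytes...
tampered = bytes([frame[0] ^ 0x01]) + frame[1:]
injected = b"bogus payload" + os.urandom(TAG_LEN)

# ...but the receiving device notices and refuses the frame.
for bad in (tampered, injected):
    try:
        open_frame(key, bad)
    except ValueError as exc:
        print("rejected:", exc)
```

The untouched `frame` opens fine with the shared key, while both modified frames are refused, which corresponds to the disconnect described above.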

canton7 · October 10, 2015, 5:40pm

That’s pretty much what I assumed

NickPyz · October 10, 2015, 7:26pm

This has been an interesting (and at times, passionate) discussion.

In business use cases where privacy REALLY matters, it’s great that St encrypts data in-transit. For some projects, we additionally encrypt data inside TrueCrypt containers. However, we also make efforts to minimize the meta-data leakage about our cluster.

In these use cases, we use a private discosrv. And taking our paranoia one step further, we don’t download version updates from any of the devices in the private cluster. The objective is to contain all the data and meta-data within our cluster.

For regular use, when syncing files and photos between my own devices, or with friends - I am unconcerned about meta-data leaks, and have no problem using the official public discovery and relay servers - and updating versions etc.

Finally, since the Syncthing project already provides these servers - I am struggling to understand the benefit of having private individuals make their servers public. It’s not that I fear they are evil or planning some kind of meta-data forensics, it’s just that this seems redundant, adds complexity, and won’t solve any capacity issues until the official servers are swamped - which seems to be far into the future.