Great comments. I really like your candor, LinuxBaby. I agree with what you’ve expressed. Honesty and frankness are what enable Syncthing to grow and improve, and (speaking for myself) they improve my own talent, expertise, and understanding.
There is a lot of potential in competing with BitSync, and I think Syncthing may displace some of its user base without even trying, but there is also the issue of coordination and prioritizing what we are all working on. I think that will all work itself out, though, as there are some good minds here.
But managing the features and workload aside (I have no idea how the core team did this in a year; it’s astonishing)… In regards to the problem of “bypassing NAT” and other features, enhancements, etc., I think Audrius and the other main devs are well aware of these issues, including the benefits and drawbacks. It’s a balancing act.
I know Audrius and I have discussed the idea of using UDP at length… Actually, last weekend I tested the throughput of three UDP transfer protocols: (A) UDT, (B) uTP, and (C) Tsunami UDP. All gave throughput increases of approximately 10% to 40% over TCP. UDP has the benefit of being both faster and better at getting through firewalls… UDP is also easier to ‘rendezvous’ without using a relay because it’s stateless. But UDP is harder to code and debug, and it is also sometimes filtered more harshly than TCP.
So Audrius and I have gone back and forth on the transport protocol (TCP vs. UDP) and other issues, because any major change to the client or protocol could turn into a huge project. Yeah, my experiments show UDT is a bit faster under experimental conditions, but if it takes a year to implement, is it worth the trouble? Tough questions.
One advantage of UDP, and one reason it’s so commonly used in P2P networks, is that it is… (A) more scalable than TCP for large networks, (B) capable of that ‘firewall punching’ trick, (C) less vulnerable to TCP congestion problems, (D) lighter on server memory, (E) easier to multithread, and (F) able to punch through common NAT configurations found on Linksys home routers, especially with the help of a rendezvous server.
Anyway, everything in Syncthing is TCP now for a few reasons… I agree with Audrius’s inclination to stick with TCP in the short term…
Benefits of TCP for the short term:
- Way easier to debug than UDP
- Ensures ‘built-in’ packet ordering, session management, congestion avoidance, ACKs of data, etc.
- TCP and TLS work great together, which means we don’t have to deal with rolling our own crypto or TLS over UDP
- TCP will scale well enough for small networks (say, up to a dozen or two dozen nodes)
- So TCP will be fine to get the concept perfected.
The big disadvantage for TCP, however, is exactly what you’ve described in the original post… If I have one Syncthing endpoint behind NAT, and a second Syncthing endpoint in another city behind NAT, how can they establish a TCP connection directly to one another?
Peer A ←→ Gateway A (NAT-a) ← … Network … → Gateway B (NAT-b) ←→ Peer B
I might be wrong or misunderstanding the issue, but I don’t think they can directly connect to one another in the current implementation. At least not using standard means (they can’t just open a TCP socket to one another).
This brings us to the topic you’ve brought up … the concept of bypassing the two-sided NAT … What are the ways to do this?
Audrius pretty much summed up the options. Here they are in detail…
(1) Relay Server.
Both NAT’d hosts connect to a relay server (usually via TCP), and the relay acts as an intermediary. All traffic flows through the relay.
Peer A ←→ Gateway A (NAT-a) ←→ Interwebz ←→ Relay ←→ Interwebz ←→ Gateway B (NAT-b) ←→ Peer B
The downside is that someone has to pay for the relay server, and bandwidth can be expensive. There is no way we could afford to host this on Amazon; it’d cost a fortune. Even a VPS or dedicated server might be too pricey.
This IS a possible solution for how Syncthing could afford a relay server for all users, but we’d have to find a low-priced colocation with unlimited inbound and outbound traffic at a reasonable price. Much like the Syncthing discovery server that’s already running, this colocation machine would act as the ‘Official Syncthing Relay’, a public service for anyone behind NAT.
(2) TCP Hole Punching.
This is hard to describe, but it is essentially a complex technique of tricking the NAT/firewall into thinking both endpoints are establishing outbound TCP connections, which opens a temporary ‘gap’ in the firewall so that the two peers may rendezvous and establish a TCP session.
This involves port prediction and/or tricky manipulation of raw sockets (manipulating the TCP handshake to trick the firewall with SYN flags). There’s usually a limitation here, because to manipulate raw sockets (send crafty firewall-busting TCP packets that deviate from the RFCs) you need admin/superuser privileges.
This technique is difficult and not 100% reliable. I think it’s used in BitTorrent. It might need root/admin depending on the method, but it could still be an option. Perhaps others can weigh in on TCP hole punching.
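On the privilege point: one concrete reason this often needs root is that crafting the non-standard SYN packets used for port prediction requires a raw socket, which most operating systems restrict to the superuser. A quick illustrative check (behavior varies by OS; this is just a sketch, not anything from the Syncthing codebase):

```python
import socket

# Crafting non-standard TCP packets (e.g. bare SYNs for port prediction)
# requires a raw socket, which most OSes restrict to the superuser.
try:
    s = socket.socket(socket.AF_INET, socket.SOCK_RAW, socket.IPPROTO_TCP)
    s.close()
    privileged = True   # we could craft our own TCP packets
except PermissionError:
    privileged = False  # ordinary user: raw sockets denied

print("raw sockets allowed:", privileged)
```

Run as a normal user this prints `False` on typical Linux/BSD boxes, which is exactly the deployment problem: Syncthing shouldn’t have to run as root.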
(3) UDP Hole Punching.
Similar idea to TCP hole punching, and there are several different methods of doing it. If I recall, most involve a third-party rendezvous server that helps the NAT’d parties agree on the details of their UDP packets, mainly which ports the endpoints are going to send and receive on. The rendezvous server also helps the NAT’d boxes open up the holes in the firewall.
The nice part about UDP hole punching is that traffic flows directly from client to client once the rendezvous server has helped them punch the holes. It doesn’t require all traffic to flow through a relay; only the first few packets need the third-party server. This technique is WIDELY deployed in P2P protocols like BitTorrent.
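To show the core move, here’s a minimal Python sketch of the punch itself, run on loopback. This is only a simulation: in the real technique each peer would learn the other’s public (NAT-mapped) address from the rendezvous server, and the first outbound datagram is what opens the mapping in that peer’s own NAT.

```python
import socket

# Two peers, each bound to a local UDP port. In a real hole punch, a
# rendezvous server would have told each peer the other's public
# (NAT-mapped) address; here we just use the loopback addresses.
peer_a = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
peer_a.bind(("127.0.0.1", 0))
peer_b = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
peer_b.bind(("127.0.0.1", 0))
addr_a = peer_a.getsockname()
addr_b = peer_b.getsockname()

# Each side sends first: on a real network, this outbound packet is what
# punches the hole in that side's NAT. Then each side receives directly.
peer_a.sendto(b"hello from A", addr_b)
peer_b.sendto(b"hello from B", addr_a)

peer_a.settimeout(2)
peer_b.settimeout(2)
msg_at_a, _ = peer_a.recvfrom(1024)   # b"hello from B"
msg_at_b, _ = peer_b.recvfrom(1024)   # b"hello from A"
print(msg_at_a, msg_at_b)
peer_a.close()
peer_b.close()
```

Note that UDP being stateless is what makes this so simple: there is no handshake to trick, just datagrams crossing in flight.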
Now remember, UDP is ruled out for the moment. I think Audrius mentioned a developer was working on adding this as a feature, but it could take a while to get finished, tested, integrated, etc.
So that leaves us two options to connect two peers that are both behind NAT in the short term (<12 weeks)… We can
(1) Set up our own Syncthing relay server .
Not a VPS, but a real 1U colo in a high-speed datacenter somewhere in the U.S. This would be the public relay for any connections. Peer 1 connects to the relay on a control port and is given a random port, like 31000, then reconnects on the designated port. Peer 2 does the same and is issued a different random port, like 31001.
The server uses iptables, netcat, a reverse proxy, custom code, whatever, and bridges the traffic between relay:31000 and relay:31001.
We’d just need to make sure the relay server only allows Syncthing traffic, and not other random traffic. Then NAT is no longer a problem.
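For illustration, here’s a minimal Python sketch of the bridging step: a relay that accepts two inbound connections and pipes bytes between them. It’s deliberately simplified (one listening port instead of the per-peer 31000/31001 scheme described above, and no validation), just to show the plumbing.

```python
import socket
import threading

def pipe(src, dst):
    # Copy bytes one way until the source side closes.
    while True:
        data = src.recv(4096)
        if not data:
            try:
                dst.shutdown(socket.SHUT_WR)
            except OSError:
                pass
            return
        dst.sendall(data)

def relay(listener):
    # Accept the two NAT'd peers in turn, then bridge their streams
    # with one pipe thread per direction.
    conn1, _ = listener.accept()
    conn2, _ = listener.accept()
    threads = [
        threading.Thread(target=pipe, args=(conn1, conn2)),
        threading.Thread(target=pipe, args=(conn2, conn1)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    conn1.close()
    conn2.close()

listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))   # a real relay would use fixed public ports
listener.listen(2)
port = listener.getsockname()[1]
server = threading.Thread(target=relay, args=(listener,))
server.start()

# Two "peers" dial out to the relay; behind NAT, outbound connections
# like these are exactly what each peer is still allowed to make.
a = socket.create_connection(("127.0.0.1", port))
b = socket.create_connection(("127.0.0.1", port))
a.sendall(b"index exchange from A")
received = b.recv(4096)
print(received)
a.close()
b.close()
server.join()
listener.close()
```

The catch, of course, is that every byte between the peers crosses the relay twice (in and out), which is why the bandwidth bill is the real constraint here.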
For the NAT-busting features discussed here, I’d say catch up with the main devs and see what they’ve got going.
What I’ve written here may already be possible. I’m not clear on what parts of this are completed, planned, or on hold.
But if I were operating ‘from scratch’, I’d say let’s make a relay server. I know I have a lot of ideas, but this one is actually really easy. I already have two colos, a half dozen VPSes, and at least a dozen EC2 instances.
So here’s how it’s done… We get a 1U server, something used. Maybe an AMD multi-core with a ton of RAM. We either set up RAID 0 on a couple of small SCSI drives (40 GB tops), or we put in a couple of name-brand SSDs.
We install either OpenBSD or a solid Debian-derived Linux distro on this 1U rack server. We set it up with dm-crypt and cryptsetup for full-disk encryption (probably aes-xts-plain64 or aes-cbc-essiv:sha256), with a ramdisk for /tmp and an encrypted swap partition. We can make it extra-hardened with the grsecurity kernel patch.
This relay server would be the rendezvous point for any Syncthing client that can’t find a peer because of NAT: cases where a route was registered with announce.syncthing.net, but the two peers attempted to talk (and failed) because they are behind NAT/firewall. The relay would accept the two incoming connections, do some sort of validation, and then simply bridge them (relaying traffic between them).
This could easily be done on the relay server via ssh, iptables, a custom solution, or some combination. Technically the ‘relay’ is a bit like a reverse proxy. Here are some good details on how it’s done…
Bridge between two ports on the same network interface on a Linux server
But how do we afford to pay for a dedicated colo? Easy. Cheaper than you think.
The cost of hosting a 1U colo would be around $60 a month for unlimited bandwidth at a Chicago datacenter.
–^ example of a $60 a month colo for a 1U rack with unlimited bandwidth… the peering point is near Chicago, which is a pretty good place to host a colo for the continental U.S.
OR… we can…
(2) Implement TCP Hole Punching into the codebase.
This sounds like a major headache… mainly because I don’t know how to do it. But maybe someone here does, or maybe there are libraries we can use to accomplish this.
Those are my thoughts.
OH, one last thing: the network should absolutely be fully meshed at small scales, unless there are a large number of peers. Up to, say, 6 to 18 nodes, I strongly believe the network should be fully meshed (or close to it).
For larger networks it’s not going to be possible to go fully meshed because of the overhead of TCP, especially if things get chatty. You might be able to get up to 40 machines mostly fully meshed if you switch over to UDP, but that ain’t gonna be pretty. And UDP is not feasible in the short term.
For larger networks you’d have to adopt a sort of partially meshed topology with distributed structure. There’s a reason BitTorrent isn’t fully meshed, haha… It’s not possible. All the traffic would grind to a halt.
That’s because the number of connections (i.e., total TCP sockets) between nodes in a fully meshed network grows quadratically with the number of machines in the network. In Big O notation, the number of TCP connections grows relative to the number of machines as O(n^2).
Let c be the number of total connections in the network (graph edges)
Let n be the number of nodes in the network (machines in a fully meshed net)
Then c = n(n-1)/2, which is O(n^2).
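To make that concrete: every pair of nodes holds exactly one connection, so a fully meshed network of n nodes has c = n(n-1)/2 connections. A quick Python check:

```python
def mesh_edges(n):
    # A full mesh has one connection per unordered pair of nodes: n choose 2.
    return n * (n - 1) // 2

for n in (6, 18, 40, 100):
    print(f"{n} nodes -> {mesh_edges(n)} connections")
# 6 nodes -> 15, 18 -> 153, 40 -> 780, 100 -> 4950
```

So the 6-to-18-node range I mentioned means 15 to 153 sockets total across the net, which TCP handles fine; by 100 nodes you’re near 5,000 sockets, and full mesh stops being reasonable.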
So fully meshed is really convenient and nice for small nets, but it’s awful at scaling.