TCP SYN flood attacks in the wild?

I am running a Syncthing instance (for my personal use), and also a relay (my way of giving back to the project) on a small dedicated server I rent.

The server is reachable on a public IP and I run other services on it.

Since I started running Syncthing and opened TCP ports 21027 and 22067, I quickly started receiving SYN flood attacks on those ports.

This has very few consequences, because Linux automatically enables SYN cookies, and the server goes on as usual. Sometimes, though, the volume of incoming traffic triggers my host’s automatic hardware DDoS mitigation, which adds latency/packet loss.
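For reference, checking whether SYN cookies are enabled (and seeing when the kernel falls back to them) only takes a sysctl and a kernel log grep; nothing below is specific to my setup:

$ sysctl net.ipv4.tcp_syncookies        # 1 = enabled, the default on most distributions
$ journalctl -k | grep "SYN flooding"   # the kernel logs a warning whenever it actually starts sending cookies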

Does anyone have an idea who would run such attacks, and for what benefit?

I’d look at where it’s coming from and try to backtrack from there.

Yeah, I thought about that; the problem is that the source IPs are usually forged in such attacks.

Otherwise I would quickly write a Fail2ban filter to ban the offenders, and be done with it.

Sure. If you had a relay on that port I would understand that. Now, I have no clue how someone would get your IP address and port.

I run both a relay and a normal (private) Syncthing instance on the server. So maybe they grab the IPs from https://relays.syncthing.net/? Or they run automated port scans on IP ranges?

Still, I understand the incentive to probe for badly configured SSH servers to hack, or crappy PHP websites to infect, but why the heck would anyone try to DDoS Syncthing services?

When Syncthing starts up, it does a TCP connect to a handful of relays to measure the RTT and select the closest one. I suspect this is the flood you’re seeing.

Also, vice versa: if you run a relay, clients connect to you to probe latency.
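If you want to confirm that on your side, a capture on the relay port should show short SYN / SYN-ACK / RST exchanges from many different clients rather than a one-sided stream of SYNs. Roughly something like this (replace eth0 with your actual interface):

$ tcpdump -ni eth0 'tcp port 22067 and tcp[tcpflags] & (tcp-syn|tcp-rst) != 0'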

Interesting, thanks, so this may not be an attack after all.

I had suspected something like this, but thought it was unlikely that valid connections would trigger this.

I tried increasing net.ipv4.tcp_max_syn_backlog and will see if the SYN cookies protection still triggers.
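For reference, this is roughly how I raised it; 1024 is just the value I picked, and note that on newer kernels the listener’s backlog and net.core.somaxconn also cap the SYN queue, so this knob alone may not be the whole story:

$ sysctl -w net.ipv4.tcp_max_syn_backlog=1024                                          # runtime change
$ echo 'net.ipv4.tcp_max_syn_backlog = 1024' > /etc/sysctl.d/90-tcp-syn-backlog.conf   # persist across reboots (file name is arbitrary)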

I don’t understand, though: normally SYN cookies are enabled when the number of pending incoming connections exceeds the tcp_max_syn_backlog value, but if the client is legitimate and responds with an ACK, the connection is established and is no longer counted as part of a potential SYN flood.

Or does Syncthing use very low TCP timeouts, or something else that is unusual?

Anyway, if the cause is confirmed to be normal Syncthing behavior, I think this deserves a word in the documentation.

The latency check literally times the connect syscall and then shuts down and closes the socket immediately afterwards, so based on this https://lwn.net/Articles/508865/ it’s just SYN and SYN-ACK.

This is only relevant to the relay, which on its own is a fairly nasty thing to run, as it can be abused, flooded and what not, so I don’t think we can define all the possible things that can happen with it.

> The latency check literally times the connect syscall and then shuts down and closes the socket immediately afterwards, so based on this https://lwn.net/Articles/508865/ it’s just SYN and SYN-ACK.

If the connect syscall completes successfully, then the connection is open, so that should not cause what I see (lots of half-open connections). Unless, of course, connect has a very short timeout and returns before it completes.

> This is only relevant to the relay, which on its own is a fairly nasty thing to run, as it can be abused, flooded and what not, so I don’t think we can define all the possible things that can happen with it.

Yes, I agree, the user is responsible for configuring/securing what they run. However, if the expected Syncthing behavior can lead to receiving bursts of hundreds of TCP SYNs in a normal situation (not an attack), that is rather unusual/surprising and may deserve a mention.

I am not sure that’s true; judging from the LWN article, it looks like connect returns after 2 of the 3 handshake packets are exchanged, so perhaps the socket gets a RST before it sends the ACK for the SYN-ACK. Anyway, this works exactly the way we expect it to, namely it measures the latency, which is the purpose of this. The timeout for the connect is 1s, but I am not sure why that matters. We check a few hundred relays once on startup and then pretty much never again.

Again, I don’t think it’s Syncthing, it’s the relay. Syncthing shuts down and closes the connections, sure, and they end up in TIME_WAIT in the kernel, but I don’t see how this relates to a SYN flood.

Sorry, when I wrote “Syncthing” I meant the whole project’s code, which includes the relay and the “normal instance”.

About the TCP exchanges, it is possible that the last ACK is lost, but that would be statistically unlikely.

The timeout does matter, because in a high-latency situation connect will send the SYN, consider the remote unreachable after 1s, and just give up. The result is that the server will maintain a half-open connection for a while (I’m not sure, but I think the Linux default is about one minute). If that happens a lot, this of course becomes a problem, and the SYN flood protection kicks in to avoid keeping too many half-open connections in memory.

So that is currently the most plausible explanation: the short timeout combined with high latency will cause the server to receive the SYN, and the client to give up before receiving the SYN-ACK.
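If that is the mechanism, the half-open connections should be directly visible on the server. Assuming iproute2’s ss and the default relay port 22067:

$ ss -tn state syn-recv '( sport = :22067 )'   # half-open entries currently queued on the relay port
$ sysctl net.ipv4.tcp_synack_retries           # how many times the kernel retransmits the SYN-ACK before dropping them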

I don’t think 500ms latencies are that common; if anything, you’d see that only with a handful of relays.

Mobile phone connections maybe? Or cross-continent latency?

Even with 50ms latency, a single lost SYN-ACK is enough to hit the 1s timeout, because the initial RTO is 1s.

I still get SYN flooding (on port 21207 only), even after increasing net.ipv4.tcp_max_syn_backlog to 1024.

That does not look like a standard port.

Sorry, that is indeed another service unrelated to Syncthing.

So nothing unusual on 21027 and 22067.

Right, so the SYN flood is from somewhere else?

All I can tell is that before, I had SYN cookies automatically enabled by Linux on port 22067 because of excessive SYN packets received. The warning Linux logs is: “TCP: request_sock_TCP: Possible SYN flooding on port XXXXX. Sending cookies. Check SNMP counters.”

Now, after increasing tcp_max_syn_backlog to 1024, SYN cookies are no longer enabled.

So that only means I don’t have more than 1024 half-open connections at the same time, not that the SYN flooding is gone.
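As the kernel message suggests, the SNMP counters show whether cookies were actually sent and validated. With iproute2’s nstat (the counters come from /proc/net/netstat):

$ nstat -az TcpExtSyncookiesSent TcpExtSyncookiesRecv TcpExtSyncookiesFailed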

I have enabled the gathering of TCP counters in Sysstat, and will use that to plot some graphs and investigate once I have a few days’ worth of data.
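For anyone who wants the same data: the SNMP-based counters are optional in sadc, so collection has to be enabled first, and sar then displays them. Config paths differ between distributions, so treat the ones below as examples:

# enable collection: add "-S SNMP" to SADC_OPTIONS in /etc/sysconfig/sysstat or /etc/sysstat/sysstat
$ sar -n TCP,ETCP 1                      # live view: active/s, passive/s plus retransmit/error counters
$ sar -n TCP -f /var/log/sysstat/sa17    # or read back a recorded daily file (path and name vary by distro and day)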

EDIT: To clarify, I never had SYN cookies enabled for port 21027; I confused the value with another port, so only the relay is affected, which is more logical I guess.

I now have some TCP stats for the previous days, and I see a fairly constant number of open sockets, with a very low rate (<10) of “passive/s” TCP socket transitions.

According to the sysstat documentation:

passive/s
The number of times TCP connections have made a direct transition to the SYN-RCVD state from the LISTEN state per second [tcpPassiveOpens].

So that should rise during SYN floods, but it does not, even though I still get a few:

$ for boot in {-10..0}; do LANG=C journalctl -k -b $boot | grep SYN | grep -E '(22000|22067|22070)'; done
Jan 17 02:19:39 hostname kernel: TCP: request_sock_TCP: Possible SYN flooding on port 22067. Sending cookies.  Check SNMP counters.
Jan 19 08:15:34 hostname kernel: TCP: request_sock_TCP: Possible SYN flooding on port 22067. Sending cookies.  Check SNMP counters.
Jan 31 03:54:40 hostname kernel: TCP: request_sock_TCP: Possible SYN flooding on port 22067. Sending cookies.  Check SNMP counters.
Feb 12 04:43:33 hostname kernel: TCP: request_sock_TCP: Possible SYN flooding on port 22067. Sending cookies.  Check SNMP counters.
Feb 14 04:45:01 hostname kernel: TCP: request_sock_TCP: Possible SYN flooding on port 22067. Sending cookies.  Check SNMP counters.

Anyway, I tried moving some services to other ports and quickly received SYN floods on those too, so this is definitely not related to Syncthing. Sorry for the noise.