Errors when trying to run strelaysrv for the first time

There are no active relays in my country, so I thought that it might be helpful if I run one.

This is the very first time that I’m trying to run strelaysrv though. I have been following the instructions from the Docs. The binary is a self-compiled v1.16.0 for Android/ARM running on an old Android phone.

I have set up port forwarding on the router, and the server actually connected fine initially. However, a few minutes later I shut it down, and since then it has kept failing to connect.

The log is basically filled with these:

# ./strelaysrv
2021/05/17 16:12:10 main.go:140: strelaysrv unknown-dev "Fermium Flea" (go1.16.4 android-arm) unknown@unknown 1970-01-01 00:00:00 UTC
2021/05/17 16:12:10 main.go:146: Connection limit 3276
2021/05/17 16:12:10 main.go:239: URI: relay://0.0.0.0:22067/?id=ZBNADY6-RDASRYT-77JCADK-OK4QWZQ-7ROJOFV-XIGD4BR-J3LJTWL-ADD5WQV&pingInterval=1m0s&networkTimeout=2m0s&sessionLimitBps=0&globalLimitBps=0&statusAddr=:22070&providedBy=
2021/05/17 16:12:10 main.go:242: !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
2021/05/17 16:12:10 main.go:243: !!  Joining default relay pools, this relay will be available for public use. !!
2021/05/17 16:12:10 main.go:244: !!      Use the -pools="" command line option to make the relay private.      !!
2021/05/17 16:12:10 main.go:245: !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
2021/05/17 16:12:14 pool.go:88: Failed to join https://relays.syncthing.net/endpoint: request or check error
2021/05/17 16:12:14 pool.go:89: Response data: getting invitation: read tcp 82.196.13.137:38818->125.190.151.7:22067: i/o timeout
2021/05/17 16:13:18 pool.go:88: Failed to join https://relays.syncthing.net/endpoint: request or check error
2021/05/17 16:13:18 pool.go:89: Response data: getting invitation: read tcp 82.196.13.137:42082->125.190.151.7:22067: i/o timeout
2021/05/17 16:14:21 pool.go:88: Failed to join https://relays.syncthing.net/endpoint: request or check error
2021/05/17 16:14:21 pool.go:89: Response data: getting invitation: read tcp 82.196.13.137:45384->125.190.151.7:22067: i/o timeout
2021/05/17 16:15:25 pool.go:88: Failed to join https://relays.syncthing.net/endpoint: request or check error
2021/05/17 16:15:25 pool.go:89: Response data: getting invitation: read tcp 82.196.13.137:48684->125.190.151.7:22067: i/o timeout
2021/05/17 16:16:28 pool.go:88: Failed to join https://relays.syncthing.net/endpoint: request or check error
2021/05/17 16:16:28 pool.go:89: Response data: getting invitation: read tcp 82.196.13.137:51984->125.190.151.7:22067: i/o timeout
2021/05/17 16:17:33 pool.go:88: Failed to join https://relays.syncthing.net/endpoint: request or check error
2021/05/17 16:17:33 pool.go:89: Response data: getting invitation: read tcp 82.196.13.137:55294->125.190.151.7:22067: i/o timeout
2021/05/17 16:18:37 pool.go:88: Failed to join https://relays.syncthing.net/endpoint: request or check error
2021/05/17 16:18:37 pool.go:89: Response data: getting invitation: read tcp 82.196.13.137:59886->125.190.151.7:22067: i/o timeout
2021/05/17 16:19:40 pool.go:88: Failed to join https://relays.syncthing.net/endpoint: request or check error
2021/05/17 16:19:40 pool.go:89: Response data: getting invitation: read tcp 82.196.13.137:34954->125.190.151.7:22067: i/o timeout
2021/05/17 16:20:44 pool.go:88: Failed to join https://relays.syncthing.net/endpoint: request or check error
2021/05/17 16:20:44 pool.go:89: Response data: getting invitation: read tcp 82.196.13.137:38258->125.190.151.7:22067: i/o timeout
2021/05/17 16:24:54 pool.go:54: Error joining pool https://relays.syncthing.net/endpoint: HTTP request: Post "https://relays.syncthing.net/endpoint": dial tcp 82.196.13.137:443: connect: invalid argument
2021/05/17 16:25:54 pool.go:54: Error joining pool https://relays.syncthing.net/endpoint: HTTP request: Post "https://relays.syncthing.net/endpoint": dial tcp: lookup relays.syncthing.net: No address associated with hostname
2021/05/17 16:26:54 pool.go:54: Error joining pool https://relays.syncthing.net/endpoint: HTTP request: Post "https://relays.syncthing.net/endpoint": dial tcp: lookup relays.syncthing.net: No address associated with hostname
2021/05/17 16:27:54 pool.go:54: Error joining pool https://relays.syncthing.net/endpoint: HTTP request: Post "https://relays.syncthing.net/endpoint": dial tcp: lookup relays.syncthing.net: No address associated with hostname
2021/05/17 16:28:54 pool.go:54: Error joining pool https://relays.syncthing.net/endpoint: HTTP request: Post "https://relays.syncthing.net/endpoint": dial tcp: lookup relays.syncthing.net: No address associated with hostname
2021/05/17 16:29:54 pool.go:54: Error joining pool https://relays.syncthing.net/endpoint: HTTP request: Post "https://relays.syncthing.net/endpoint": dial tcp: lookup relays.syncthing.net: No address associated with hostname
2021/05/17 16:30:54 pool.go:54: Error joining pool https://relays.syncthing.net/endpoint: HTTP request: Post "https://relays.syncthing.net/endpoint": dial tcp: lookup relays.syncthing.net: No address associated with hostname

The relay itself is still visible on https://relays.syncthing.net, even though it has not been running for at least an hour or so.

I mean, I’d rather you didn’t run it at all, if you are planning to run it on the phone.

The error message is saying the relay is not contactable from the server side.

It could be that your outgoing routed IP does not match the IP on which you expect connections to come in, so the check gets routed somewhere weird.

You can compare the IPs in the log to validate that assumption.
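To make that comparison easier, here is a minimal sketch that pulls the two addresses out of a "Response data" line like the ones in the log above. The function name and the regular expression are my own, not part of strelaysrv. My reading is that the left-hand address is the pool server doing the check, and the right-hand one is the address and port it tried to reach, which should be your public IP and the forwarded relay port.

```python
import re

# Matches the "read tcp A:port->B:port" part of a pool.go "Response data" line.
ADDR_PAIR = re.compile(
    r"read tcp (?P<src_ip>[\d.]+):(?P<src_port>\d+)"
    r"->(?P<dst_ip>[\d.]+):(?P<dst_port>\d+)"
)


def extract_check_target(line):
    """Return (checker_ip, target_ip, target_port) for a failed check, or None."""
    m = ADDR_PAIR.search(line)
    if m is None:
        return None
    return m["src_ip"], m["dst_ip"], int(m["dst_port"])


# One of the lines from the log above.
line = (
    "2021/05/17 16:12:14 pool.go:89: Response data: getting invitation: "
    "read tcp 82.196.13.137:38818->125.190.151.7:22067: i/o timeout"
)
print(extract_check_target(line))
```

If the target IP printed here is not the public IP of your connection, the check is going to the wrong place.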


Also, idk, fail2ban or something like that built into android or something?

I mean, I really don’t want to waste time trying to understand why you can’t run a relay on a phone, as I don’t trust it will be a sensible experience.


Thank you for the information, but the culprit was that Android dropped the Internet connection for some reason, even though it was specifically set to keep it alive when idle. In case Android’s Doze was responsible for that, I have now disabled it with dumpsys deviceidle disable.

Why do you say so? It is indeed a phone, but I plan to just keep it always on and connected to the WLAN at home. Just for the record, the specs are dual-core Samsung Exynos 4210 with 1GB RAM, so I would assume that it should be enough to run a relay…

The device itself is not going to be used for anything else. It also has no mobile connection, only WiFi. Is it risky to run a relay like this?

I mean, exactly because of the reasons above: the network switching to mobile or disconnecting, Doze, random OOM app killing and all the other crap that Android ships with that makes it suitable as a mobile device, but unsuitable as an always-on device.

I know you can do it. But I genuinely think this is a “bear’s favour”.

Well, I’m going to give it a few days and see what the situation looks like then.

The network seems to have stabilised, and Doze has been disabled. As for the OOM app killing, I don’t think that it applies to processes run from the shell, so unless the device runs out of memory and crashes, the strelaysrv process should stay up and running. At the moment, it is consuming roughly 80 MB of RAM.

Also, this is a completely fresh installation of LineageOS, so there is nothing really running in the background or such that could possibly interfere.

Contrary to the worries, the phone and the server on it seem to be running well (no dropped connections, etc.). I did have to change one additional setting on Android to prevent the device from sleeping, but I will come back to it later.

However, I have had a weird problem with being unable to access https://relays.syncthing.net while the server is running. Just a disclaimer: my knowledge about networking is quite limited, so please let me know if I’m missing something obvious.

Basically, I have tried the following.

  1. Run strelaysrv -nat with no port forwarding. It connects and works fine with no additional configuration, but the outside port is randomly chosen by the router.
  2. Run just strelaysrv and forward the port 22067 on the router. It also works OK.
  3. Run strelaysrv -ext-address=:443 and forward the port 443 to 22067 on the router. Works as well.
  4. Run strelaysrv -listen=:443 as root and forward the port 443 on the router. Also works.

The problem is that whatever I do, once the server goes online, then I seem to be unable to open https://relays.syncthing.net from other devices on the same network.

What is weird is that only the Windows computers seem to be affected, as I can still open the site from a different Android device. I have tried using three different browsers, and also disabling Windows Firewall, but to no avail. I thought that this might have been a problem with my Windows settings, but one of the affected devices runs freshly installed Windows with basically stock configuration.

Am I missing something obvious, or is there more that I need to configure in order to get everything to work properly?

I don’t think running a relay would have any effect on being able to access the website. I’d make sure it’s not an http vs https issue, but otherwise, call an exorcist.

If you do not do the port 443 forwarding, can you access the website then?

If that’s the case you have a dodgy router that does port mappings in a weird way.

PS: Nevermind, I just read your post again and saw that you’re talking about other devices on the network. That doesn’t match, as port forwardings are per-device.

Yeah, I know that this sounds very strange. For the record, this is how the port forwarding looks on the router’s side. 10.0.0.4 is the IP address of the phone running the server.

I also can say for sure that this is not the https://relays.syncthing.net website issue, as I can still access it if I use Tor Browser on the very same Windows devices.

I also thought that this might be a DNS-related problem instead, but why would the site still open on the Android devices then? I’m saying this because now, after several hours, the website has begun to work again on one Windows device (but then stopped working again after a few minutes). However, it still refuses to load at all on the two others.

The strelaysrv itself seems to be running fine though, so I will probably leave it like this regardless, especially since I don’t see anything else in the router’s configuration that could have an impact here…

I have been trying again today to resolve the issue of not being able to open https://relays.syncthing.net.

I can confirm that indeed enabling and disabling port forwarding corresponds to the website being accessible or not. However, it feels almost as if once the port forwarding is up and the relay server connects, then the site stops opening too. Then, if I disable the port forwarding, the server disconnects, and the site starts working again.

Just for the record, everything here, including the Windows computers and the phone running strelaysrv, is using the same public IP address. There is a cable modem provided by the Internet service company, which is locked and password protected from any user access, and then my router is connected to it. The port forwarding and/or any other configuration is done on the router.

I have another update to the issue.

Syncthing logs on all devices on the same network are now filled with

2021-05-21 13:53:19 c.S.listenerSupervisor: Exiting backoff state.
2021-05-21 13:53:19 Relay listener (dynamic+https://relays.syncthing.net/endpoint) starting
2021-05-21 13:53:40 Listen (BEP/relay): Get "https://relays.syncthing.net/endpoint": dial tcp 82.196.13.137:443: connectex: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond.
2021-05-21 13:53:40 Relay listener (dynamic+https://relays.syncthing.net/endpoint) shutting down
2021-05-21 13:53:40 c.S.listenerSupervisor: Failed service 'dynamic+https://relays.syncthing.net/endpoint' (1.000000 failures of 2.000000), restarting: true, error: %!s(<nil>)
2021-05-21 13:53:40 Relay listener (dynamic+https://relays.syncthing.net/endpoint) starting

which would mean that in addition to the actual https://relays.syncthing.net website being inaccessible in a browser, the devices themselves cannot connect to the address either. Everything goes back to normal if I disable the port forwarding for the relay server, but then the server itself becomes dysfunctional.

I’d try to check this using Wireshark. Sometimes it’s easier to just look at the raw network traffic.


I have just tried doing so. This is the very first time I have used the software though, so I have little idea how to operate it properly.

Anyhow, I first closed all other software, leaving only Palemoon open, and then tried to go to the relays website. Then, I filtered the Wireshark results by the IP which gets listed in the destination field and corresponds to web.syncthing.net.

This is the Wireshark output:

Is this of any use, or do I need to do something different to actually get into this?

From what you showed us, you actually have two port forwardings: port 443 and the default port.

What happens if you do port forwarding, but not with port 443? Does it work then, or does any forwarding break it?

This shows lots of TCP SYNs that do not get any answers, i.e. no reply at all on the TCP level to outgoing connection requests. This is in line with the other error messages posted above, i.e. it’s equivalent to “connectex: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond.”
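If Wireshark feels like too much, the same symptom can be reproduced with a few lines of Python run on an affected machine. This is only a sketch with a made-up helper name (probe_tcp). It tries one TCP connection and reports whether the handshake completed, was actively refused, or got no answer at all. The last case is what unanswered SYNs look like from the application side.

```python
import socket


def probe_tcp(host, port, timeout=5.0):
    """Try one TCP connection and classify the outcome.

    Returns "open" (handshake completed), "refused" (the peer answered
    with a reset), "timeout" (no answer within `timeout` seconds) or
    "error: ..." for anything else, such as a DNS lookup failure.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        return "refused"
    except (socket.timeout, TimeoutError):
        return "timeout"
    except OSError as exc:
        return f"error: {exc}"
```

Calling probe_tcp("relays.syncthing.net", 443) once with the port forwarding enabled and once with it disabled should show "timeout" versus "open" if the behaviour matches what was described above.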

No reply to any TCP SYN (even after many retries) to the same host (that is not down) usually means that one of the following has happened [other causes, e.g. a firewall, are possible too, but do not match the symptoms]:

  • The outgoing packet was dropped somewhere along the path, never reaching the destination, or was routed into the void
  • The response (= incoming packet) was dropped somewhere along the path, never reaching the source, or was routed into the void (e.g. to a different device)

I could imagine that a system with completely broken port mappings incorrectly routes the response packets, sending every related packet to your phone’s OS, where they simply get dropped as the phone cannot match them to a half-open connection. This is just a guess though, we don’t have enough data to show this.
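To illustrate what I mean by that guess, here is a toy model. It is purely hypothetical: ToyNat, the 10.0.0.5 address for the PC and the public port numbers are invented, and I am not claiming any real router works like this. A correct NAT tracks each flow by the remote endpoint plus the public port it allocated. The imagined broken one tracks only the remote endpoint, so whoever connected to that endpoint first receives everyone's replies.

```python
class ToyNat:
    """Toy model of a NAT router's connection tracking (hypothetical)."""

    def __init__(self, buggy=False):
        self.buggy = buggy
        self.table = {}  # flow key -> LAN address that owns the flow

    def _key(self, remote_ip, remote_port, public_port):
        # A correct NAT keys each flow on the full tuple. The imagined
        # buggy one forgets the public port, so flows to the same remote
        # endpoint collide.
        if self.buggy:
            return (remote_ip, remote_port)
        return (remote_ip, remote_port, public_port)

    def outgoing(self, lan_ip, remote_ip, remote_port, public_port):
        """Record an outgoing connection; the first owner of a key wins."""
        key = self._key(remote_ip, remote_port, public_port)
        self.table.setdefault(key, lan_ip)

    def incoming(self, remote_ip, remote_port, public_port):
        """Return the LAN host that would receive a reply packet, if any."""
        return self.table.get(self._key(remote_ip, remote_port, public_port))


PHONE = "10.0.0.4"         # the relay, as in the thread
PC = "10.0.0.5"            # hypothetical Windows machine
POOL = ("82.196.13.137", 443)

for buggy in (False, True):
    nat = ToyNat(buggy=buggy)
    nat.outgoing(PHONE, *POOL, public_port=40001)  # relay joins the pool
    nat.outgoing(PC, *POOL, public_port=40002)     # browser opens the site
    print("buggy" if buggy else "correct", "->",
          nat.incoming(*POOL, public_port=40002))
```

In the correct case the reply on public port 40002 goes to the PC. In the buggy case it lands on the phone, which has no matching connection and drops it, so the PC only ever sees its own unanswered SYNs.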

A packet trace made directly on the router could be interesting, as that could show everything in/out through the WAN/LAN interface. Not all routers have this functionality though.

All of this really sounds like there’s a router involved with horrible bugs/broken implementations. Are you behind Carrier-Grade NAT (DS-Lite or some other weird thing)? What’s the make and model of the router(s) involved?

Thank you very much for such a detailed answer.

I can say that this seems to happen regardless of the port. I listed all the options that I had tried so far in https://forum.syncthing.net/t/errors-when-trying-to-run-strelaysrv-for-the-first-time/16818/7, but basically regardless of whether the forwarded port is 22067, 443, or even when just using strelaysrv -nat and letting the router select the ports automatically, shortly after the server goes up, the relays website becomes inaccessible.

The router is ipTIME A3003NS (local manufacturer, but produced in China anyway). The specs are available at http://iptime.com/iptime/?page_id=11&pf=15&page=&pt=331&pd=3.

These are all the options that are available in the configuration.

[image: router configuration options]

I can also add that the router runs the newest available firmware. I may try to reset the whole thing and see whether there is any difference. Unfortunately, I don’t have access to any other router right now that I could swap and test instead of this one.

I just wanted to say that regardless of the website access problem, the relay server itself has been running flawlessly on the phone. It has been up for about 7 days now, and there have been no issues regarding maintaining the connection and such. The phone has also remained connected to the WiFi network for all that time with no drops.

I use the Terminal Emulator from F-Droid to run the server directly on the device. The binary can be executed from /data/local/tmp/, although root is required for this. A different way to run it is by using ADB, where root access is not needed.

This is what the current state looks like.

[image: relay status]

{
    "buildDate": "2021-05-05T07:37:23Z",
    "buildHost": "tomasz86",
    "buildUser": "tomasz86",
    "bytesProxied": 9521768515,
    "goArch": "arm",
    "goMaxProcs": 2,
    "goNumRoutine": 334,
    "goOS": "android",
    "goVersion": "go1.16.4",
    "kbps10s1m5m15m30m60m": [
        0,
        0,
        448,
        149,
        75,
        38
    ],
    "numActiveSessions": 16,
    "numConnections": 137,
    "numPendingSessionKeys": 0,
    "numProxies": 32,
    "options": {
        "global-rate": 0,
        "message-timeout": 60,
        "network-timeout": 120,
        "per-session-rate": 0,
        "ping-interval": 60,
        "pools": [
            "https://relays.syncthing.net/endpoint"
        ],
        "provided-by": "tomasz86"
    },
    "startTime": "2021-05-18T17:42:30.047684049Z",
    "uptimeSeconds": 530215,
    "version": "v1.16.1"
}

Considering that Syncthing is kind of niche where I live, this seems to be a considerable amount of data transferred. I can also confirm that an old Android phone is indeed capable of running the server with no real issues.

However, I also need to add that the key was to set

[image: Android setting]

because otherwise the Internet connection would stall when idle. Of course, this means that the screen needs to stay on all the time, but I don’t think that this is really a big drawback, since a) the phone uses very little power anyway, and b) I have turned on the built-in clock screen saver, so it also doubles as a desk clock now :smirk:.

I have also set a script to scrape the status page and send it to myself by e-mail every hour or so, so that I have detailed logs about the server (and thus can quickly know in case something has been acting up).
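For anyone who wants to do something similar, this is roughly the idea, as a minimal sketch rather than my exact script. It assumes the status document is served at /status on the statusAddr port from the startup log (22070) and that it has the fields shown in the JSON above. The function names are made up, and the e-mail part is left out.

```python
import json
import urllib.request


def fetch_status(url="http://127.0.0.1:22070/status", timeout=10):
    """Download and decode the relay's JSON status document.

    The URL is an assumption based on statusAddr=:22070 in the log.
    """
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def summarize_status(status):
    """Condense a status document into a one-line report."""
    uptime = max(status["uptimeSeconds"], 1)  # avoid dividing by zero
    gib = status["bytesProxied"] / 2**30
    days = uptime / 86400
    avg_kbps = status["bytesProxied"] * 8 / uptime / 1000
    return (
        f"{status['version']}: up {days:.1f} days, "
        f"{gib:.2f} GiB proxied (avg {avg_kbps:.0f} kbit/s), "
        f"{status['numActiveSessions']} sessions, "
        f"{status['numConnections']} connections"
    )
```

Running print(summarize_status(fetch_status())) from cron, or piping the result into whatever mail tool is at hand, gives an hourly one-liner that is easy to skim for anything unusual.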

