Hi,
One of the devices I help administer in a cluster of ~10 devices sharing one big (1Tb+) folder seems unable to connect reliably to other devices. A lot of the logs report failures to set up connections to other devices. A common pattern is (after a fresh restart) :
- All devices appear disconnected (see log for failures)
- Suddenly one device connects
- They start exchanging indexes (up to ~5%)
- Connection then drops
- All devices appear disconnected for some time again (5min to 1hour)
What has been tested :
- Manually port mapping 22000/tcp to the device (simple home router)
- device set into DMZ for full exposure and router firewall bypass
- Network and internet connection seems stable (ping’d some other devices without noticing anything strange)
Here are the redacted logs
2020-09-03 21:58:48 My ID: _MY_ID_
2020-09-03 21:58:50 Single thread SHA256 performance is 8.9 MB/s using minio/sha256-simd (7.6 MB/s using crypto/sha256).
2020-09-03 21:58:52 Hashing performance is 8.26 MB/s
2020-09-03 21:58:53 Overall send rate is unlimited, receive rate is unlimited
2020-09-03 21:58:53 Using discovery server https://discovery.syncthing.net/v2/?noannounce&id=_ANNOUNCE_ID_
2020-09-03 21:58:53 Using discovery server https://discovery-v4.syncthing.net/v2/?nolookup&id=_ANNOUNCE_ID_
2020-09-03 21:58:53 Using discovery server https://discovery-v6.syncthing.net/v2/?nolookup&id=_ANNOUNCE_ID_
2020-09-03 21:58:53 TCP listener ([::]:22000) starting
2020-09-03 21:58:53 Relay listener (dynamic+https://relays.syncthing.net/endpoint) starting
2020-09-03 21:58:53 QUIC listener ([::]:22000) starting
2020-09-03 21:58:53 ...
2020-09-03 21:58:53 GUI and API listening on [::]:7070
2020-09-03 21:58:53 Access the GUI via the following URL: http://127.0.0.1:7070/
2020-09-03 21:58:53 My name is "_MY_NAME_"
2020-09-03 21:58:53 Device BOAL7RP is "somename" at [dynamic]
2020-09-03 21:58:53 Device FI2QNC7 is "somename" at [dynamic]
2020-09-03 21:58:53 Device KMIBINI is "somename" at [dynamic]
2020-09-03 21:58:53 Device UP56BS7 is "somename" at [dynamic]
2020-09-03 21:58:53 Device UYGS6US is "somename" at [dynamic]
2020-09-03 21:58:53 Device YWHOXL6 is "somename" at [dynamic]
2020-09-03 21:58:53 Device FEUN355 is "somename" at [dynamic]
2020-09-03 21:58:53 Device SHJMJ7H is "somename" at [dynamic]
2020-09-03 21:58:53 Device SRQKFGJ is "somename" at [dynamic]
2020-09-03 21:58:53 Device V4ATM5R is "somename" at [dynamic]
2020-09-03 21:58:53 Device 4GM7OLJ is "somename" at [dynamic]
2020-09-03 21:58:54 Ready to synchronize _MY_FOLDER_ (sendreceive)
2020-09-03 21:59:03 Listen (BEP/relay): Get "https://relays.syncthing.net/endpoint": net/http: TLS handshake timeout
2020-09-03 21:59:03 Relay listener (dynamic+https://relays.syncthing.net/endpoint) shutting down
2020-09-03 21:59:03 c.S.listenerSupervisor: Failed service 'dynamic+https://relays.syncthing.net/endpoint' (1.000000 failures of 2.000000), restarting: true, error: "{dynamic+https://relays.syncthing.net/endpoint dynamic+https://relays.syncthing.net/endpoint} returned unexpectedly", stacktrace: [unknown stack trace]
2020-09-03 21:59:03 Relay listener (dynamic+https://relays.syncthing.net/endpoint) starting
2020-09-03 21:59:06 New NAT port mapping: external TCP address _IPV4_1_:57391 to local address 0.0.0.0:22000.
2020-09-03 21:59:15 quic://0.0.0.0:22000 detected NAT type: Port restricted NAT
2020-09-03 21:59:15 quic://0.0.0.0:22000 resolved external address quic://_IPV4_1_:22000 (via stun.syncthing.net:3478)
2020-09-03 21:59:15 Listen (BEP/relay): Get "https://relays.syncthing.net/endpoint": net/http: TLS handshake timeout
2020-09-03 21:59:15 Relay listener (dynamic+https://relays.syncthing.net/endpoint) shutting down
2020-09-03 21:59:15 c.S.listenerSupervisor: Failed service 'dynamic+https://relays.syncthing.net/endpoint' (1.759434 failures of 2.000000), restarting: true, error: "{dynamic+https://relays.syncthing.net/endpoint dynamic+https://relays.syncthing.net/endpoint} returned unexpectedly", stacktrace: [unknown stack trace]
2020-09-03 21:59:15 Relay listener (dynamic+https://relays.syncthing.net/endpoint) starting
2020-09-03 21:59:17 Detected 3 NAT services
2020-09-03 21:59:26 Listen (BEP/relay): Get "https://relays.syncthing.net/endpoint": net/http: TLS handshake timeout
2020-09-03 21:59:26 Relay listener (dynamic+https://relays.syncthing.net/endpoint) shutting down
2020-09-03 21:59:26 Entering the backoff state.
2020-09-03 21:59:26 c.S.listenerSupervisor: Failed service 'dynamic+https://relays.syncthing.net/endpoint' (2.367933 failures of 2.000000), restarting: false, error: "{dynamic+https://relays.syncthing.net/endpoint dynamic+https://relays.syncthing.net/endpoint} returned unexpectedly", stacktrace: [unknown stack trace]
2020-09-03 21:59:51 Established secure connection to UYGS6US at [::]:22000-_IPV4_2_:22000/quic-client/TLS1.3-TLS_CHACHA20_POLY1305_SHA256
2020-09-03 21:59:51 Device UYGS6US client is "syncthing v1.8.0" named "somename" at [::]:22000-_IPV4_2_:22000/quic-client/TLS1.3-TLS_CHACHA20_POLY1305_SHA256
2020-09-03 21:59:51 Device UYGS6US folder _MY_FOLDER_ has a new index ID (0xDACE31D7AF38291E)
2020-09-03 22:00:20 Failed to exchange Hello messages with V4ATM5R at 192.168.1.100:55863-62.171.170.70:22067/relay-client/TLS1.3-TLS_CHACHA20_POLY1305_SHA256: EOF
2020-09-03 22:08:17 Failed initial scan of sendreceive folder _MY_FOLDER_
2020-09-03 22:08:19 Connection to UYGS6US at [::]:22000-_IPV4_2_:22000/quic-client/TLS1.3-TLS_CHACHA20_POLY1305_SHA256 closed: restarting folder _MY_FOLDER_
2020-09-03 22:10:00 Established secure connection to 4GM7OLJ at 192.168.1.100:49906-_IPV4_3_:22067/relay-client/TLS1.3-TLS_CHACHA20_POLY1305_SHA256
2020-09-03 22:10:00 Device 4GM7OLJ client is "syncthing v1.8.0" named "Boursny" at 192.168.1.100:49906-_IPV4_3_:22067/relay-client/TLS1.3-TLS_CHACHA20_POLY1305_SHA256
2020-09-03 22:13:44 Exiting backoff state.
2020-09-03 22:13:45 Relay listener (dynamic+https://relays.syncthing.net/endpoint) starting
2020-09-03 22:13:50 Ready to synchronize _MY_FOLDER_ (sendreceive)
2020-09-03 22:13:50 Restarted folder _MY_FOLDER_ (sendreceive)
2020-09-03 22:14:16 Connection to 4GM7OLJ at 192.168.1.100:49906-_IPV4_3_:22067/relay-client/TLS1.3-TLS_CHACHA20_POLY1305_SHA256 closed: reading length: read tcp 192.168.1.100:49906->_IPV4_3_:22067: read: connection reset by peer