I’m having a recurring issue, well - more of a naggle than an issue.
11 devices have a send-only relationship with 1 “central” device.
They back up a folder to a NAS across the WAN.
The NAS is Openmediavault with the Syncthing plugin, the 11 remote devices are Windows boxes.
This setup has been absolutely rock solid for nearly a year, digesting 1.5 million files, with versioning. Great job, devs!
Infrequently, and irregularly, a random device will show as Disconnected in the NAS’s webif, and the NAS will also show as Disconnected on the client side.
That device will remain Disconnected until I restart Syncthing on the device or on the NAS; sometimes I need to do both to get the connection back up.
The question is not why does a disconnect happen, as that will most likely be a connectivity issue (ISP disconnect…).
The naggle is: why does ST not reconnect in some rare cases?
What I would like to do at this point is make sure I’m gathering the right log data to troubleshoot this further.
Which STTRACE should I be running, if any? On which device(s)?
This is what happened at the time the “client” was last seen by the central Syncthing Host:
(logged at client side)
[JDF5B] 08:21:24 INFO: Connection to JKZ4TOI-VTZPV4T-7FBFGD4-BZ2KPWW-3XCSH23-YNCRCTO-N44WZIZ-DFM6BQ4 at 10.4.7.11:49349-83.166.144.57:22067/relay-server closed: read timeout
[JDF5B] 10:32:39 INFO: Restarting
On the NAS side, it turns out I had sttrace=connection active, but all it shows is a reconnect loop until after I did the restart on the client side, as seen in the snippet above.
Here is the output of cat syslog | grep JDF5B > JDF5B.log: https://www.hastebin.com/upavavijef.sql
(The clocks on the two machines are within 5 seconds of each other.)
So it seems it’s failing to connect directly and tries to connect over a relay.
You should probably try to set up port forwarding on the routers for better connectivity.
From the log, it seems to try to connect over a relay at some point, but it takes a while.
Having an IP is not enough. For A and B to establish a direct connection, one of them has to be reachable from the internet (not behind a NAT) or have a port forwarded through the NAT.
If you don’t have that, you are at the mercy of relay connections, which can break, fail to work, etc.
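A quick way to check whether a forwarded port is actually reachable is a plain TCP connect test, run from a machine outside the network in question. This is only a rough sketch: the function name is made up for illustration, and it assumes Syncthing is listening on its default TCP port 22000 (adjust if the listen address was changed).

```python
import socket


def tcp_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    Hypothetical helper for illustration; it only proves that something
    accepts TCP connections on that port, not that it is Syncthing.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False
```

Calling tcp_port_open with the public address of the NAS and port 22000 from one of the remote sites should return True once the forward is in place. If it returns False, the devices will most likely keep falling back to relays.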
Your log does have something suspicious: it disconnects from the relay and then takes a long while before it tries to connect via the relay again, but looking at the code I can’t see why this would happen. I guess your log is filtered, so it’s missing the messages that would explain it.
Feb 19 10:48:37 backup-nas syncthing[888]: [JKZ4T] DEBUG: connected to JDF5BY7-BSGD3RV-VT26EB2-XC53DF3-YGIVJRY-UYUSZCU-MNTFTMD-UJNMIAH 200 using 10.5.1.4:43794-213.239.205.247:22067/relay-client 200
Feb 19 10:48:37 backup-nas syncthing[888]: [JKZ4T] DEBUG: discarding 0 connections while connecting to JDF5BY7-BSGD3RV-VT26EB2-XC53DF3-YGIVJRY-UYUSZCU-MNTFTMD-UJNMIAH 200
Feb 19 10:48:37 backup-nas syncthing[888]: [JKZ4T] INFO: Established secure connection to JDF5BY7-BSGD3RV-VT26EB2-XC53DF3-YGIVJRY-UYUSZCU-MNTFTMD-UJNMIAH at 10.5.1.4:43794-213.239.205.247:22067/relay-client (TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305)
Feb 19 10:48:37 backup-nas syncthing[888]: [JKZ4T] INFO: Device JDF5BY7-BSGD3RV-VT26EB2-XC53DF3-YGIVJRY-UYUSZCU-MNTFTMD-UJNMIAH client is "syncthing v0.14.44" named "KATE-SERVER-ANTW" at 10.5.1.4:43794-213.239.205.247:22067/relay-client
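To put a number on the delay between a connection closing and the next successful connection, something along these lines could be run over the full syslog. It is a sketch, not a tested tool: the patterns are based only on the "Connection to ... closed" and "Established secure connection to ..." INFO lines quoted in this thread, other versions may word them differently, and syslog lines carry no year, so it has to be passed in.

```python
import re
from datetime import datetime

# Syslog prefix as in the excerpt above, e.g.
# "Feb 19 10:48:37 backup-nas syncthing[888]: [JKZ4T] INFO: ..."
LINE_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2}) \S+ syncthing\[\d+\]: "
    r"\[\w+\] INFO: (?P<msg>.*)$"
)
CLOSED_RE = re.compile(r"^Connection to (?P<dev>\S+) at .* closed: ")
OPENED_RE = re.compile(r"^Established secure connection to (?P<dev>\S+) at ")


def reconnect_gaps(lines, year):
    """Yield (device_id, closed_at, reopened_at, gap) per close/reopen pair.

    Hypothetical helper: only INFO lines are considered, and the year
    must be supplied because syslog timestamps do not include one.
    """
    closed = {}  # device ID -> time of the most recent close
    for line in lines:
        m = LINE_RE.match(line.rstrip("\n"))
        if not m:
            continue
        ts = datetime.strptime(f"{year} {m['ts']}", "%Y %b %d %H:%M:%S")
        msg = m["msg"]
        c = CLOSED_RE.match(msg)
        if c:
            closed[c["dev"]] = ts
            continue
        o = OPENED_RE.match(msg)
        if o and o["dev"] in closed:
            start = closed.pop(o["dev"])
            yield o["dev"], start, ts, ts - start
```

Feeding it an open file handle for the syslog and printing each tuple gives one line per outage, which makes unusually long gaps easy to spot before digging into the DEBUG lines around them.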
Anyways, you should run both sides with STTRACE=connections and provide full logs; otherwise it’s a pointless witch hunt. We’re big boys and can filter out the stuff we need ourselves.
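Since one side is Linux and the other Windows, a small wrapper like the one below is one way to start Syncthing with the trace variable set and capture everything it prints, unfiltered. Treat it as a sketch under assumptions: the helper names and the log file name are invented for illustration, it assumes a syncthing binary on the PATH, and it does not apply as-is where Syncthing runs as a service (as with the OMV plugin), where the variable would have to be set in the service's environment instead.

```python
import os
import subprocess


def syncthing_env(facilities, base_env=None):
    """Return a copy of the environment with STTRACE set.

    STTRACE takes a comma-separated list of facilities to trace.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["STTRACE"] = ",".join(facilities)
    return env


def run_with_trace(binary="syncthing", facilities=("connections",),
                   log_path="syncthing-trace.log"):
    """Run Syncthing in the foreground, appending all output to log_path.

    Hypothetical helper: binary and log_path are placeholders.
    Returns the process exit code once Syncthing stops.
    """
    with open(log_path, "ab") as log:
        return subprocess.call(
            [binary],
            env=syncthing_env(facilities),
            stdout=log,
            stderr=subprocess.STDOUT,
        )
```

Running run_with_trace() on both ends and posting the resulting files whole, rather than a grep of them, gives a log that can actually be followed from disconnect to reconnect.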