Remote Device sporadically connecting

I have a private infrastructure with Relay / Discovery where I am trying to set up a connection between two remote peers, but the connection fails most of the time. Right now I have no clue what could be wrong, aside from these logs:

From one of the peers:

[5BIAI] 2020/02/20 13:25:41.222801 service.go:325: DEBUG: Reconnect loop
[5BIAI] 2020/02/20 13:25:41.222801 cache.go:83: DEBUG: cached discovery entry for PASWHCC-B7AI776-W7VZ5V7-IHO4NJK-7QAG6AF-H3GHCU5-N5ABZGS-NDZ3VAJ at global@https://10.22.66.39:8443/
[5BIAI] 2020/02/20 13:25:41.222801 cache.go:84: DEBUG:   cache: {[relay://10.22.66.39:22067?id=ONNSXHC-I5GNATG-XMFZJVO-JO6X6J5-MXNCFJ3-AI7WZ3X-5R2SXZR-XT7DXQD] {13802348182749212580 2111062101 0x14734a0} true {0 0 <nil>} 0}
[5BIAI] 2020/02/20 13:25:41.222801 cache.go:124: DEBUG: lookup results for PASWHCC-B7AI776-W7VZ5V7-IHO4NJK-7QAG6AF-H3GHCU5-N5ABZGS-NDZ3VAJ
[5BIAI] 2020/02/20 13:25:41.222801 cache.go:125: DEBUG:   addresses:  [relay://10.22.66.39:22067?id=ONNSXHC-I5GNATG-XMFZJVO-JO6X6J5-MXNCFJ3-AI7WZ3X-5R2SXZR-XT7DXQD]
[5BIAI] 2020/02/20 13:25:41.222801 service.go:362: DEBUG: Reconnect loop for PASWHCC-B7AI776-W7VZ5V7-IHO4NJK-7QAG6AF-H3GHCU5-N5ABZGS-NDZ3VAJ [relay://10.22.66.39:22067?id=ONNSXHC-I5GNATG-XMFZJVO-JO6X6J5-MXNCFJ3-AI7WZ3X-5R2SXZR-XT7DXQD]
[5BIAI] 2020/02/20 13:25:41.222801 service.go:372: DEBUG: Not dialing PASWHCC-B7AI776-W7VZ5V7-IHO4NJK-7QAG6AF-H3GHCU5-N5ABZGS-NDZ3VAJ via relay://10.22.66.39:22067?id=ONNSXHC-I5GNATG-XMFZJVO-JO6X6J5-MXNCFJ3-AI7WZ3X-5R2SXZR-XT7DXQD as sleep is 1m0s, next dial is at 2020-02-20 13:34:58.3046201 -0300 -03 m=+647.096493301 and current time is 2020-02-20 13:25:41.2228017 -0300 -03 m=+90.014674901
[5BIAI] 2020/02/20 13:25:41.222801 service.go:447: DEBUG: sleep until next dial 1m0s

From the relay server:

2020/02/20 16:39:51 listener.go:164: 5BIAI3P-DCPC7VP-TC47OIK-ZJMHS5J-NCWNCHT-EJYR6WW-HKJ6YAL-444EDQU is looking for PASWHCC-B7AI776-W7VZ5V7-IHO4NJK-7QAG6AF-H3GHCU5-N5ABZGS-NDZ3VAJ which does not exist
2020/02/20 16:39:51 listener.go:221: Closing connection 5BIAI3P-DCPC7VP-TC47OIK-ZJMHS5J-NCWNCHT-EJYR6WW-HKJ6YAL-444EDQU: read tcp 10.22.123.183:22067->10.22.67.175:63706: use of closed network connection

Any idea?

I have an autoscaling relay server cluster, and I’ve just noticed that it works perfectly when there is only one instance. We are using the same cert and the same device ID for all of the instances. Could that be the issue? If so, how do I set up a relay server cluster?

Certainly it’s an issue. A device will connect to a relay and ask to meet with another device that is supposedly waiting there. If that device is in reality waiting on a different instance, that can’t happen.
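
To make that failure mode concrete, here is a minimal Python sketch of the situation. It is a toy model, not the relay server's actual code: it assumes each instance keeps its own in-memory table of waiting devices and that a load balancer picks an instance per connection. The names `RelayInstance`, `join` and `connect` are hypothetical, and the device IDs are shortened for readability.

```python
class RelayInstance:
    """Toy model of one relay process; its state is local to the instance."""

    def __init__(self, name):
        self.name = name
        self.waiting = set()  # device IDs that have joined this instance

    def join(self, device_id):
        # A device registers here and waits for incoming session requests.
        self.waiting.add(device_id)

    def connect(self, requester, target):
        # A session can only be set up if the target waits on *this* instance.
        if target not in self.waiting:
            return f"{requester} is looking for {target} which does not exist"
        return f"session {requester} <-> {target} via {self.name}"


# Two instances behind one address, sharing a cert and device ID (assumed setup).
a, b = RelayInstance("relay-a"), RelayInstance("relay-b")

# The load balancer happens to send the waiting peer to instance A...
a.join("PASWHCC")
# ...and the dialing peer to instance B, which has never seen PASWHCC.
miss = b.connect("5BIAI3P", "PASWHCC")
# Only when both peers land on the same instance does the lookup succeed.
hit = a.connect("5BIAI3P", "PASWHCC")

print(miss)
print(hit)
```

With a single instance both peers always meet in the same table, which would match the observation that one instance works and several do not.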

I think if you need a relay server cluster you have an architecture problem. It would be better to enable your devices to talk to each other directly, or to some well-placed coordinator devices.

@calmh any suggestion on how to set up the relay server autoscaling cluster? Connecting peers directly is not an option.

You can set up multiple standard relays.
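
As a rough illustration of that suggestion, the sketch below assumes each relay runs independently with its own certificate, and therefore its own device ID, and that devices are pointed at every relay explicitly using the same `relay://host:port?id=...` address form seen in the logs above. The host names and relay IDs are placeholders, and `relay_address` is a hypothetical helper, not part of Syncthing.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def relay_address(host, port, relay_id):
    """Build a relay:// address that pins the relay's own device ID."""
    return f"relay://{host}:{port}/?{urlencode({'id': relay_id})}"


# Placeholder relays: distinct hosts and distinct IDs (one cert per relay).
relays = [
    ("relay1.example.com", 22067, "RELAY-ID-ONE"),
    ("relay2.example.com", 22067, "RELAY-ID-TWO"),
]
addresses = [relay_address(host, port, rid) for host, port, rid in relays]

# Sanity check: no two relays may share an ID, unlike the cloned cluster.
ids = [parse_qs(urlsplit(addr).query)["id"][0] for addr in addresses]
assert len(set(ids)) == len(ids), "each relay needs its own device ID"

for addr in addresses:
    print(addr)
```

Because every address names one specific relay, a dialing peer asks for its partner on the same relay that partner joined, rather than on whichever instance a load balancer picks.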

Do we have any gossip router for relay servers? I mean, something like dynamic+https://relays.syncthing.net/endpoint ?

I don’t know what that would be. I’ll just reiterate that a design that depends on lots of load-balanced relays is almost certainly the wrong solution.
