v0.14.40 fails to connect

[JWK46] 2017/11/19 10:46:58.375836 structs.go:191: DEBUG: dialing AGR6NY2
 tcp://85.216.59.171:22000 outcome %!v(PANIC=runtime error: invalid memory
 address or nil pointer dereference) dial tcp 85.216.59.171:22000: i/o
 timeout

WTF :astonished:


Eh, ok, bad log message but not that panicky really. I guess that’s the expected WAN address failure. The other stuff you highlighted is odd though…


OK, I see the problem with that one.

@AudriusButkevicius I don’t understand this log segment (and a bunch of similar ones):

[JWK46] 2017/11/19 10:53:14.480808 service.go:343: DEBUG: Reconnect loop for AGR6NY2
[JWK46] 2017/11/19 10:53:14.491870 service.go:365: DEBUG: Not dialing relay://163.172.179.61:8443/?id=FU55S7D-DO2JQEP-DYE6SLV-UKU4FS2-BMDTHUE-L5633N2-CIFX32Y-QQUVQAC&pingInterval=1m0s&networkTimeout=2m0s&sessionLimitBps=0&globalLimitBps=0&statusAddr=:22070&providedBy=harry.lu - Paris2, FR as sleep is 1m0s, next dial is at 2017-11-19 10:59:28.23293814 +0100 CET m=+830.558230924 and current time is 2017-11-19 10:53:14.480721018 +0100 CET m=+456.806013872
[JWK46] 2017/11/19 10:53:14.492070 service.go:365: DEBUG: Not dialing tcp://192.168.1.10:22000 as sleep is 1m0s, next dial is at 2017-11-19 10:54:14.480721018 +0100 CET m=+516.806013872 and current time is 2017-11-19 10:53:14.480721018 +0100 CET m=+456.806013872
[JWK46] 2017/11/19 10:53:14.492954 structs.go:189: DEBUG: dialing AGR6NY2 tcp://85.216.59.171:22000 prio 10
[JWK46] 2017/11/19 10:53:14.493904 structs.go:189: DEBUG: dialing AGR6NY2 tcp://85.216.59.171:22000 prio 10
[JWK46] 2017/11/19 10:53:24.493693 structs.go:191: DEBUG: dialing AGR6NY2 tcp://85.216.59.171:22000 outcome %!v(PANIC=runtime error: invalid memory address or nil pointer dereference) dial tcp 85.216.59.171:22000: i/o timeout
[JWK46] 2017/11/19 10:53:24.494959 structs.go:191: DEBUG: dialing AGR6NY2 tcp://85.216.59.171:22000 outcome %!v(PANIC=runtime error: invalid memory address or nil pointer dereference) dial tcp 85.216.59.171:22000: i/o timeout
[JWK46] 2017/11/19 10:53:24.495831 service.go:781: DEBUG: failed to connect to AGR6NY2 10
[JWK46] 2017/11/19 10:53:24.496601 service.go:343: DEBUG: Reconnect loop for ESBYS6E
[JWK46] 2017/11/19 10:53:24.531000 service.go:365: DEBUG: Not dialing relay://144.76.71.91:22067/?id=EWO4HFG-YCA3UYC-ATA4RR7-XB3UK4C-D24VKXD-5MUL42X-U3C32BV-PH6IHAT&pingInterval=1m0s&networkTimeout=2m0s&sessionLimitBps=0&globalLimitBps=0&statusAddr=:22070&providedBy=tuxli.ch - hetzner.de as sleep is 1m0s, next dial is at 2017-11-19 10:59:28.23293814 +0100 CET m=+830.558230924 and current time is 2017-11-19 10:53:14.480721018 +0100 CET m=+456.806013872
[JWK46] 2017/11/19 10:53:24.531419 structs.go:189: DEBUG: dialing ESBYS6E tcp://79.231.96.156:22002 prio 10
[JWK46] 2017/11/19 10:53:34.532129 structs.go:191: DEBUG: dialing ESBYS6E tcp://79.231.96.156:22002 outcome %!v(PANIC=runtime error: invalid memory address or nil pointer dereference) dial tcp 79.231.96.156:22002: i/o timeout
[JWK46] 2017/11/19 10:53:34.532851 service.go:781: DEBUG: failed to connect to ESBYS6E 10
[JWK46] 2017/11/19 10:53:34.533829 service.go:343: DEBUG: Reconnect loop for TLRPHPS
[JWK46] 2017/11/19 10:53:34.542148 service.go:398: DEBUG: Not dialing using KCP Dialer as priority is less than current connection (50 >= 50)
[JWK46] 2017/11/19 10:53:34.542814 structs.go:189: DEBUG: dialing TLRPHPS tcp://79.231.96.156:22000 prio 10
[JWK46] 2017/11/19 10:53:34.579731 tcp_dial.go:46: DEBUG: Dial (BEP/tcp): setting traffic class: operation not supported
[JWK46] 2017/11/19 10:53:35.591287 structs.go:191: DEBUG: dialing TLRPHPS tcp://79.231.96.156:22000 outcome 192.168.1.11:56850-79.231.96.156:22000/tcp-client <nil>
[JWK46] 2017/11/19 10:53:35.592034 service.go:771: DEBUG: connected to TLRPHPS 10 using 192.168.1.11:56850-79.231.96.156:22000/tcp-client 10
[JWK46] 2017/11/19 10:53:35.592292 service.go:343: DEBUG: Reconnect loop for UNIFKNA
[JWK46] 2017/11/19 10:53:35.594507 service.go:774: DEBUG: discarding 0 connections while connecting to TLRPHPS 10
[JWK46] 2017/11/19 10:53:35.603017 service.go:343: DEBUG: Reconnect loop for X2KPZT7
[JWK46] 2017/11/19 10:53:35.621231 service.go:247: INFO: Connected to already connected device (WHDJVNU)
[JWK46] 2017/11/19 10:53:35.832203 service.go:426: DEBUG: sleep until next dial 1m0s
[JWK46] 2017/11/19 10:54:35.832874 service.go:317: DEBUG: Reconnect loop
[JWK46] 2017/11/19 10:54:35.833080 service.go:343: DEBUG: Reconnect loop for AGR6NY2
[JWK46] 2017/11/19 10:54:35.842017 service.go:365: DEBUG: Not dialing relay://163.172.179.61:8443/?id=FU55S7D-DO2JQEP-DYE6SLV-UKU4FS2-BMDTHUE-L5633N2-CIFX32Y-QQUVQAC&pingInterval=1m0s&networkTimeout=2m0s&sessionLimitBps=0&globalLimitBps=0&statusAddr=:22070&providedBy=harry.lu - Paris2, FR as sleep is 1m0s, next dial is at 2017-11-19 10:59:28.23293814 +0100 CET m=+830.558230924 and current time is 2017-11-19 10:54:35.832998468 +0100 CET m=+538.158291323
[JWK46] 2017/11/19 10:54:35.842222 service.go:365: DEBUG: Not dialing tcp://192.168.1.10:22000 as sleep is 1m0s, next dial is at 2017-11-19 10:55:35.832998468 +0100 CET m=+598.158291323 and current time is 2017-11-19 10:54:35.832998468 +0100 CET m=+538.158291323
[JWK46] 2017/11/19 10:54:35.842671 structs.go:189: DEBUG: dialing AGR6NY2 tcp://85.216.59.171:22000 prio 10
[JWK46] 2017/11/19 10:54:35.842978 structs.go:189: DEBUG: dialing AGR6NY2 tcp://85.216.59.171:22000 prio 10
[JWK46] 2017/11/19 10:54:45.843332 structs.go:191: DEBUG: dialing AGR6NY2 tcp://85.216.59.171:22000 outcome %!v(PANIC=runtime error: invalid memory address or nil pointer dereference) dial tcp 85.216.59.171:22000: i/o timeout
[JWK46] 2017/11/19 10:54:45.844019 structs.go:191: DEBUG: dialing AGR6NY2 tcp://85.216.59.171:22000 outcome %!v(PANIC=runtime error: invalid memory address or nil pointer dereference) dial tcp 85.216.59.171:22000: i/o timeout
[JWK46] 2017/11/19 10:54:45.844368 service.go:781: DEBUG: failed to connect to AGR6NY2 10

Looking at the dials for AGR6NY2.

First pass we see DEBUG: Not dialing tcp://192.168.1.10:22000 as sleep is 1m0s, next dial is at 2017-11-19 10:54:14.480721018 which is fine, then we see a couple of dials (one being duplicated due to the bug above that we just fixed so we’re not really sue what it wanted to dial), presumably none of the dials were for tcp://192.168.1.10:22000 as it said it skipped it.

Next pass is at 10:54:35 and it says Not dialing tcp://192.168.1.10:22000 as sleep is 1m0s, next dial is at 2017-11-19 10:55:35.832998468 despite as far as I can tell never having tried that address at all…

@Alex Please try v0.14.41-rc.2-2-g72d6458 from https://build.syncthing.net/latest/ and we’ll have one less bug to confuse the issue…

@calmh v0.14.41-rc.2-2-g72d6458 connects instantly, tried multiple restarts to confirm it, always connects. Reverting to 14.41-rc.2 breaks it again so I guess it is fixed.

1 Like

Cool, that’s one thing. Different from everything else in this thread, but at least something.

@calmh so looking at your comments I can’t explain it. We don’t log anything in case we do dial which would explain it, but in this case that doesn’t seem to have a dial for AGR6NY2 in-between the two dials that refused tcp://192.168.1.10:22000, so this makes no sense.

I’m experiencing the same problem, the connection bahaviour is really strange. Sometimes I can’t connect at all, then somehow it connects and starts syncing and stops again (devices offline). I have 3 devices, 2 of them are in the same LAN, one is a VPS reachable direct over IP without firewall. Connection type is “dynamic” for the LAN peers and tcp://… for the VPS - which doesn’t make any difference since both are randomly disconnecting. One thing which fixes it in most cases is when I restart the syncthing client - afterwards in most cases the peers connect more or less instantly and continue to sync again. Don’t have any logfiles right now, but will look if I can record anything the next days - but I can remember that I saw the same tcp timeout messages like posted above.

I just wanted to say i am experiencing this too. I noticed that if i remove all my shares that it seems to sync but then eventually just stops. This has been happenign ever since 0.14.40 … its making me pretty sad right now. How do I roll back to a previous version that isn’t broken?

So the reports are being convoluted. Some users report as being able to connect, but the folders don’t sync, others report as failing to connect. If you want to chip in, please state the exact symptoms you are seeing (wether it does not sync or does not connect)

The ones that are having connection issues, can you please upgrade to the latest RC and see if this makes the situation better.

1 Like

A screenshot of at least the affected device(s) and ideally also connected devices is a good start to establish context about your situation and setup. It isn’t a substitute for descriptions and logs, but with screenshots it is easier to talk specifics about your problems and ask clearer questions.

1 Like

I’m a heavy syncthing user and have been for over a year. I have a “main server” that’s running ubuntu 16.04 (linux-amd64) with all my shares, then various copies on:

  1. A windows 10 pro desktop (windows-64bit)
  2. A windows 10 home laptop (windows-64bit)
  3. An android phone (running android 6.0)
  4. A macbook pro (darwin-64)
  5. A raspberry pi 2 model B (arm)

I generally have had pretty good luck with syncthing and found it to be fairly reliable… but as of 0.14.40 I’m having all kinds of issues:

  • The raspberry pi and the windows laptop can’t talk to each other at all even on the same lan.

  • The raspberry pi and the android version both regularly fail to contact the listeners

    [V6YDQ] 04:48:18 INFO: c.S.listenerSupervisor: Failed service ‘dynamic+https://relays.syncthing.net/endpoint’ (2.241128 failures of 2.000000), restarting: false, error: “{dynamic+https://relays.syncthing.net/endpoint dynamic+https://relays.syncthing.net/endpoint} returned unexpectedly”, stacktrace: [unknown stack trace]

  • The windows and mac regularly lose connection to each other and the only solution is to remove each other and all folders and re-sync from the main ubuntu server

I can get logs if that’s helpful but bear in mind a lot of people aren’t going to come here and talk about their issues - so “a handful” of people saying there are issues is probably indicative of much wider-spread issues.

I have tried the latest rc on the rasberry pi and windows laptop and so far things are still not working reliably / at all.

I’m probably going to try downgrading everything to v0.14.39 to see if the issues go away.

Happy Thanksgiving to all and thanks for this great software.

Go for the 0.14.41 current rc.X build e.g. https://github.com/syncthing/syncthing/releases/tag/v0.14.41-rc.6

in this and other threads the recent rc.x builds have eased the situation and the trouble with connecting again.

the developers unfortunately never clarified if or what they have exactly discovered or if they have understood this bug or related bugs and changes since .39 to .40

I have never seen a clear statement on what happened between .39 and .40 Good luck.

Thanks for reporting - we are aware of the issues and have already made changes to address them. Up until now it isn’t clear whether these changes included in the latest RC addresse all of these issues. If you still have connection problems with the latest RC, please do provide logs with the env var STTRACE=connections set.

That’s not really fair: The release information establishes what changed: Release v0.14.40 · syncthing/syncthing · GitHub . The reason why there has not been a “clear” statement as to the connection issues, is that a lot has changed since the last release and only a few of the reports had either a clear cause or were reproducible. Those that were have been addresses (that’s why we are already at RC6). If you want technical details about that, you find them in the discussion on issues and PRs linked in the release notes.

C’mon here, we all know that its not about mere release notes, but I meant that the developers never said or understood the core of the problems of this thread. Even only a few reports but constant number of new reports and odd stuff appeared here on the forum, and eventually rc.x builds of .41 gotton released but you guys never took pains to slightly hint at the userbase, that all people having trouble with current .40 release ought to try and look into rc.x releases of .41 and so on. Anyways, keeping the userbase a bit current on the development and bug side would do some good as this and the other probably related posts and threads have shown. Ultimately would be nice if someone found out what algorithmic stuff or what logics internally have most likely cause the original mess here. Or what combination thereof. Thanks nevertheless for the efforts you guys put into development.

Well we did do some tweaks which are described in the RC release notes, but there are still people complaining with the RCs, so we don’t really have an RCA, as I am not sure it’s solved

There were some issues where KCP was discovering/advertising a ton of addresses, and given connection attempts are sequential it took ages to connect (200 * 10 seconds etc).

1 Like

FYI this issue is not fixed on .41. .39 was working great, and .40 came and all of a sudden I have two machines that can’t see each other… I’ve been seeing the threads that other people have had similar issues, and I figured .41 would fix things up. What can I do to help? It’s broken for me.

Run with STTRACE=discover,connections on both sides, post the logs somewhere, with a note on which are the device IDs you are expecting to be able to connect.

I have it running on a Ubuntu distro on one machine and as a FreeNAS plugin on another. Is there an easy way to add that parameter to how it runs?

You should check with your package maintainer. We ship a few service files with syncthing, but if you haven’t set them up it’s probably something custom you’ll have to handle.

This topic was automatically closed 30 days after the last reply. New replies are no longer allowed.