Syncthing connects, fails to sync, and then disconnects


(Joha1) #1

I’m running syncthing 0.14.40 on a Linux machine and a Windows 7 machine, and I can’t get the two to sync.

It used to work, but I ran into issues when trying to sync a ~6 GB file: it took about 2 days to sync half of it, and all other folders were ignored in the meantime.

In the meantime, I updated the Linux machine (Fedora 25 -> 26), so this might have something to do with it. But if so, syncthing’s output seems kind of lacking.

What’s happening: the two machines appear to find each other and proceed to the “Syncing” status in the web UI, but on both machines this is stuck at 0% with 0 B/s up or down (though both show something like 0.5 kB of uploaded data), and the connection fails after a couple of minutes. The log is essentially a repetition of this:

[SG***] 16:00:18 INFO: Established secure connection to FB*** at 10.*** (relay-server) (TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305)
[SG***] 16:00:18 INFO: Device FB*** client is "syncthing v0.14.40" named "14-***"
[SG***] 16:02:48 INFO: Connection to FBP*** closed: read timeout
[SG***] 16:03:40 INFO: Established secure connection to FBP*** at 10.*** (relay-client) (TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305)
[SG***] 16:03:40 INFO: Device FBP*** client is "syncthing v0.14.40" named "14-***"
[SG***] 16:06:10 INFO: Connection to FBP*** closed: read timeout

This appears on both the Linux and the Windows machine.

I tried removing and re-adding (on one device) a folder that I know is out of sync, but I don’t even get a notification about this folder on the other machine. Every single folder on every machine just shows a nice, green “Up to Date”, even though it is not.

I ran syncthing -reset-database on both, but that did not help either.
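
One thing I did notice from the timestamps: each connection gets closed exactly 150 seconds after it was established. Here is a rough Python sketch for pulling that number out of a log in the format above. The regular expression and the names LINE, connection_lifetimes and SAMPLE are my own, not anything from syncthing, and it assumes the log does not cross midnight.

```python
import re
from datetime import datetime

# Assumed pattern for the short log format shown above (my own guess at a
# sufficient regex, not an official syncthing log grammar).
LINE = re.compile(
    r"\] (\d{2}:\d{2}:\d{2}) INFO: "
    r"(Established secure connection|Connection to \S+ closed)"
)


def connection_lifetimes(log_text):
    """Return the seconds between each 'Established' line and the next 'closed' line."""
    lifetimes = []
    opened = None
    for line in log_text.splitlines():
        match = LINE.search(line)
        if not match:
            continue
        stamp = datetime.strptime(match.group(1), "%H:%M:%S")
        if match.group(2).startswith("Established"):
            opened = stamp
        elif opened is not None:
            # Assumes both timestamps fall on the same day.
            lifetimes.append(int((stamp - opened).total_seconds()))
            opened = None
    return lifetimes


SAMPLE = """\
[SG***] 16:00:18 INFO: Established secure connection to FB*** at 10.*** (relay-server) (TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305)
[SG***] 16:02:48 INFO: Connection to FBP*** closed: read timeout
[SG***] 16:03:40 INFO: Established secure connection to FBP*** at 10.*** (relay-client) (TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305)
[SG***] 16:06:10 INFO: Connection to FBP*** closed: read timeout
"""

if __name__ == "__main__":
    print(connection_lifetimes(SAMPLE))  # [150, 150]
```

I don’t know what that interval means, but it is the same every time.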

Essentially, I have the same issue as here: Syncthing v0.14.38, sender sync stuck at 0%, reciever folder "Up To Date" when it's not [SOLVED]

However, my connection should be OK: ping is under 10 ms, with 40/60 Mbps down/up on the Linux machine (wifi) and 320/180 Mbps on the Windows machine (wired).

What could be the problem? The log isn’t exactly helpful.


(usernamegoeshere) #2

Maybe it’s more related to “our” 0.14.40 fails-to-connect issue over here:

v0.14.40 fails to connect

If you are on Linux and have that 0.14.40 client running, maybe you can provide extensive logs to the developers here and we can sort it out. Something is odd about 0.14.40 regarding connectivity.

You could also, later on when advised, downgrade to 0.14.39 and compare the results, especially the logs.

Thanks for participating.


(Joha1) #3

I saw the thread you linked, but the issue described there seems different to what I am experiencing (though with the same result: nothing getting synced), so that’s why I made a new thread.

Also, when switching to 0.14.39 (the version recently made available in the Fedora repos) on the Linux side, the issue persists. I also don’t think I saw the “syncthing has updated” message between syncthing working and not working, so I don’t know if that is related.

I’ll try to get through what has been posted in the other thread since and see if I can add anything. It’s Friday night where I am though, so it might take a while.


(Simon) #4

In your logs, you should not censor the port part of the addresses; that is an important debug indicator for us. If what you posted is the only log output related to connections, you can run it with the STTRACE=connection environment variable set.


(Joha1) #5

What’s the syntax/behaviour for that? ./syncthing -STR… gives me an error and the help page, and ./syncthing STR… also gets me the help page, but no error. Do I simply set it, and then run it again? If so, here’s the output, but it doesn’t seem like much changed:

[SG***] 08:44:19 INFO: Detected 0 NAT devices
[SG***] 08:44:21 INFO: Established secure connection to FBP*** at 10.***:22067 (relay-client) (TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305)
[SG***] 08:44:21 INFO: Device FBP*** client is "syncthing v0.14.40" named "14-***"
[SG***] 08:44:23 INFO: Joined relay relay://163.***:22067
[SG***] 08:46:51 INFO: Connection to FBP*** closed: read timeout
[SG***] 08:48:08 INFO: Established secure connection to FBP*** at 10.***:22067 (relay-client) (TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305)
[SG***] 08:48:08 INFO: Device FBP*** client is "syncthing v0.14.40" named "14-***"
[SG***] 08:48:19 INFO: Connected to already connected device (FBP***)
[SGCIO] 08:50:38 INFO: Connection to FB*** closed: read timeout

and so on, on both devices. The port :22067 is also reported by the Windows machine for the Linux one.

However, I left my office PC (Windows, wired connection) running over the weekend, and my Linux laptop was used quite a bit on my home wifi. Now I am at “syncing, 97%” on the laptop and at 60% on the Windows machine, where I was also greeted with a message in the web UI that my laptop wants to share a folder with it. I have since allowed this sharing, but I’m stuck at 0 B/s syncing speed.

So maybe it is a similar issue to this: Syncthing v0.14.38, sender sync stuck at 0%, reciever folder "Up To Date" when it's not [SOLVED] ?

Any ideas how to test that? As said, a speed test for the wifi is quite good, but the wifi occasionally has some issues (e.g. when streaming HD content) and I see a dozen other networks here, so I would not be surprised if there is some “congestion”. However, I can’t access the router to get its logs or switch frequencies, since this is not my network. It’s a university network (eduroam) which is also used by the students, whereas my wired connection is a staff connection. So I would not be surprised if some things are blocked on wifi, but I have never encountered an issue and I am not aware of any “they’ll expel you if you torrent” scares among the students.

Regarding the v0.14.40 fails to connect issue: it seems to me like this is a different issue from mine.


(Simon) #6

You set an environment variable before the command, like this:
STTRACE=connection ./syncthing ...
On Windows, if you use SyncTrayzor, there is an option for it. On the command line you can first call this command:
set STTRACE=connection
and I assume it stays set for the lifetime of that command-line window.

And I am sorry, but I don’t understand what the problem is from the last descriptions at all. The logs you posted showed a problem with establishing a connection, but now you seem to have one? Some screenshots and/or logs might help.


(Joha1) #7

Ah, I see.

I’m not 100% certain what the logs are trying to tell me, but I’d say they show from the beginning that I do get a connection (also shown in the UI), and that this connection fails with a read timeout after a few minutes with nothing transferred, except for the information about a new folder over the weekend.

However, the contents of said folder remain unchanged. I’ll add another one later and see what happens overnight, when my laptop will be on my home wifi, but as of now my problem can probably be summed up as:

connection: yes

synchronisation: no

What logs do you need? Here’s what I think should be relevant, after all the “ready to synchronize” messages:

[SGCIO] 2017/11/13 12:30:02.413658 service.go:61: INFO: Detected 0 NAT devices
[SGCIO] 2017/11/13 12:30:02.825342 kcp_listen.go:247: INFO: kcp://0.0.0.0:22020 detected NAT type: Symetric NAT
[SGCIO] 2017/11/13 12:30:02.825389 kcp_listen.go:260: INFO: kcp://0.0.0.0:22020 resolved external address kcp://143.***:11714 (via stun.schlund.de:3478)
[SGCIO] 2017/11/13 12:30:03.442395 static.go:93: INFO: Joined relay relay://79.***:22067
[SGCIO] 2017/11/13 12:30:23.264213 service.go:214: INFO: Failed to exchange Hello messages with FBP*** (5.***:22067): read tcp 10.***:42306->5.***:22067: i/o timeout
[SGCIO] 2017/11/13 12:30:23.264577 service.go:284: INFO: Established secure connection to FBP*** at 10.***:42340-5.***:22067 (relay-client) (TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305)
[SGCIO] 2017/11/13 12:30:23.264772 model.go:1485: INFO: Device FBP*** client is "syncthing v0.14.40" named "14-***"
[SGCIO] 2017/11/13 12:31:41.479905 service.go:247: INFO: Connected to already connected device (FBP***)

That “Failed to exchange Hello messages with FBP***” part seems new to me.

And a screenshot from one of the machines (the other looks very similar):

[screenshot]


(Audrius Butkevicius) #8

To me this smells like a firewall/relay issue. You should actually check logs from both sides to get a full picture.


(Joha1) #9

I think you are right.

I have monitored the situation a bit more and left the wired office machine running overnight, while I took my laptop to my home wifi. After the usual back and forth until I got a connection, it started to sync:

Some stuff that was out of sync for days has also arrived on my desktop, so it works in the other direction from the one pictured as well.

Now, back at work, I’m back to the connection, nothing happens, timeout “routine”.

I guess I have to contact my IT department to figure out what they are blocking on the wifi, but opening that call with “Uh, are you blocking some stuff?” is a bit too general.

Unfortunately I’m not really familiar with networking and protocols. Can anyone tell me what protocols or such syncthing uses that might be prone to getting blocked, or how I can find out what exactly gets blocked, so that I can ask for some specifics?

I haven’t seen anything in the logs so far that indicates a firewall issue. What would I look for?
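
In the meantime, one basic check I could run from both networks is whether a plain outbound TCP connection to the relevant ports succeeds at all. A minimal Python sketch, under some assumptions: tcp_reachable and check_ports are made-up helper names, the host has to be replaced with the real relay or peer address, 22067 is the relay port from my logs, and 22000 is syncthing’s default sync port. A successful connect only shows that the port is not blocked outright; it would not prove that relayed traffic actually gets through.

```python
import socket


def tcp_reachable(host, port, timeout=5.0):
    """Return True if a plain TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and DNS failures alike.
        return False


def check_ports(host, ports, timeout=5.0):
    """Map each port to whether a TCP connection to it succeeded."""
    return {port: tcp_reachable(host, port, timeout) for port in ports}
```

Running something like check_ports("relay.example.org", [22000, 22067]) once on the wifi and once on the wired connection, with the placeholder host swapped for a real address, should at least show whether the two networks behave differently.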


(Peter Ferencz) #10

I was having similar issues with Syncthing 0.14.40 between two Ubuntu servers (16.04). Running ‘uname -a’ on them shows the following:

Linux xxxxxxxx 4.4.0-101-generic #124-Ubuntu SMP Fri Nov 10 18:31:34 UTC 2017 i686 i686 i686 GNU/Linux
Linux xxxxxxxx 4.4.0-101-generic #124-Ubuntu SMP Fri Nov 10 18:29:59 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux

They would initially connect and start syncing, and then they would just disconnect. They are both on the same LAN. The 32-bit server only has ‘Local Discovery’ enabled, as it just occasionally backs up the 64-bit server, which does all the main syncing to other devices.

I stumbled across another forum discussion (Devices become disconnected) and followed the directions there: enabling “Limit Bandwidth in LAN” and then setting a non-zero limit on the incoming and outgoing rates (102400). It looks to have fixed my issue. Just wondering if this bug has crept back in again.

Peter


(Audrius Butkevicius) #11

You should post logs/screenshots. The issue is unlikely to have come back, unless you are running on ARM, which based on the uname output you are not.

