Local discovery multiple instances: bind already in use


#1

I’m using v0.14.38 on various machines. There is a VPN between them and Syncthing uses local discovery. There is one instance of Syncthing per user on each machine (generally 2 or 3).

This used to work but since August 1st something broke and some devices stopped seeing each other. Upon investigation it appears the first instance of syncthing locks onto 0.0.0.0:21027 and other instances can’t use local discovery.

I’ve tried modifying localAnnounceMCAddr and localAnnouncePort, to no avail.

How can I make local discovery work when multiple instances of Syncthing are running?


(Jakob Borg) #2

IPv6 local discovery should work out of the box with multiple instances; you should not tweak the multicast address (for that reason, at least). IPv4 local discovery cannot work with more than one instance.
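The “bind already in use” symptom from the first post can be reproduced with plain sockets. The following is a minimal sketch of generic UDP behaviour, not Syncthing's code: the function name `second_bind_fails` is made up for illustration, and it assumes neither socket sets an address-reuse option. It binds one UDP socket to a wildcard address, then tries to bind a second socket to the same port.

```python
import errno
import socket


def second_bind_fails(host: str = "0.0.0.0") -> bool:
    """Return True if a second plain UDP bind to an in-use port is refused.

    Hypothetical helper for illustration only; no reuse options are set.
    """
    first = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    second = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        first.bind((host, 0))  # port 0 lets the OS pick a free port
        port = first.getsockname()[1]
        try:
            second.bind((host, port))  # same address and port as the first socket
        except OSError as exc:
            return exc.errno == errno.EADDRINUSE
        return False
    finally:
        first.close()
        second.close()


if __name__ == "__main__":
    print("second bind refused:", second_bind_fails())
```

This only illustrates why a second process cannot simply listen on the same UDP port as the first; it says nothing about how Syncthing configures its own sockets.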

There may be platform specific considerations. What platform are you on?


#3

Mhh. There is a Linux Intel VM and multiple Raspberry Pi 3s involved. Do you mean to say that local discovery with multiple instances never worked with IPv4, and that something happened to my IPv6 config? (which is possible)

EDIT: said VM was booted with ipv6.disable=1. I guess August 1st is the day this VM entered service… I’ll revert that. Thanks!


(Jakob Borg) #4

I meant operating system more than hardware. But yes, v4 local discovery “never” worked with multiple instances.

(There were some attempts with v4 multicast way back that weren’t super successful.)


#5

Something is wrong in an instance that uses IPv4. On that machine there are 2 interfaces: one connects to a LAN and one connects to a VPN. Both are in fact bridges. I’m seeing this failure: 172.16.16.63:21027: sendto: operation not permitted. That address is the broadcast address of the LAN bridge.

How can I tell Syncthing to use the broadcast address of the VPN bridge?

To confirm: tcpdump on this machine shows a lot of broadcast traffic to the correct address, but none from the local machine. The tunnel works ok, but on the machine with 2 interfaces the broadcasts are not sent at all (due to the prior failure, I suppose).


(Jakob Borg) #6

Try STTRACE=beacon to see more about what’s going on. VPN tunnels are typically not super broadcast friendly and Syncthing just goes by the interface netmasks.
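To illustrate what “goes by the interface netmasks” means in practice, here is a small sketch that derives the directed broadcast address from an interface address and prefix length. It uses Python's standard `ipaddress` module rather than anything from Syncthing. The host address and prefix lengths in the examples are assumptions chosen for illustration; only the broadcast address 172.16.16.63 comes from the post above.

```python
import ipaddress


def broadcast_address(cidr: str) -> str:
    """Return the directed broadcast address for an interface given as 'address/prefix'."""
    return str(ipaddress.ip_interface(cidr).network.broadcast_address)


if __name__ == "__main__":
    # Assumed LAN bridge configuration: a /26 network whose broadcast is 172.16.16.63.
    print(broadcast_address("172.16.16.10/26"))
    # Assumed VPN bridge configuration: a /16 network.
    print(broadcast_address("172.19.0.255/16"))
```

Each interface with an IPv4 address yields one such broadcast destination, so a machine with a LAN bridge and a VPN bridge ends up with two addresses to send to.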


#7

Ok. So this is on the Pi that has 2 bridge interfaces. I stopped the 2 instances of Syncthing. ip6tables returns all clear, policy accept. I changed to one of the (non-privileged) users and launched “syncthing -no-browser”. From what I understand: the machine receives UDP4 packets (through the tunnel) but can’t send any, as it doesn’t try sending to 172.19.255.255. It sends UDP6 packets but something is broken with joining the multicast group…

Some excerpts from the log file:

[H6IXY] 2017/10/11 17:58:22.277876 tcp_listen.go:69: INFO: TCP listener (172.19.0.255:39818) starting
[H6IXY] 2017/10/11 17:58:22.278726 broadcast.go:86: DEBUG: broadcastWriter@0x12007260 starting
[H6IXY] 2017/10/11 17:58:22.279922 broadcast.go:122: DEBUG: addresses: [172.16.16.63 172.19.255.255]
[H6IXY] 2017/10/11 17:58:22.280166 broadcast.go:148: DEBUG: write udp4 0.0.0.0:53612->172.16.16.63:21027: sendto: operation not permitted
[H6IXY] 2017/10/11 17:58:22.280279 broadcast.go:150: DEBUG: broadcastWriter@0x12007260 stopping
[H6IXY] 2017/10/11 17:58:22.285768 multicast.go:85: DEBUG: multicastWriter@0x1204dc60 starting
[H6IXY] 2017/10/11 17:58:22.286680 multicast.go:124: DEBUG: write udp6 [::]:59962->[ff12::8384]:21027: sendmsg: network is unreachable on write to [ff12::8384]:21027 lo
[H6IXY] 2017/10/11 17:58:22.286794 multicast.go:124: DEBUG: write udp6 [::]:59962->[ff12::8384]:21027: sendmsg: network is unreachable on write to [ff12::8384]:21027 eth0
[H6IXY] 2017/10/11 17:58:22.286966 multicast.go:129: DEBUG: sent 74 bytes to [ff12::8384]:21027 on wap0
[H6IXY] 2017/10/11 17:58:22.287126 multicast.go:129: DEBUG: sent 74 bytes to [ff12::8384]:21027 on lanbr
[H6IXY] 2017/10/11 17:58:22.287237 multicast.go:129: DEBUG: sent 74 bytes to [ff12::8384]:21027 on tap0
[H6IXY] 2017/10/11 17:58:22.287333 multicast.go:124: DEBUG: write udp6 [::]:59962->[ff12::8384]:21027: sendmsg: network is unreachable on write to [ff12::8384]:21027 eth0.191
[H6IXY] 2017/10/11 17:58:22.287475 multicast.go:129: DEBUG: sent 74 bytes to [ff12::8384]:21027 on dom2br
[H6IXY] 2017/10/11 17:58:22.288433 multicast.go:188: DEBUG: IPv6 join lo failed: operation not supported
[H6IXY] 2017/10/11 17:58:22.288486 multicast.go:188: DEBUG: IPv6 join eth0 failed: operation not supported
[H6IXY] 2017/10/11 17:58:22.288525 multicast.go:188: DEBUG: IPv6 join wap0 failed: operation not supported
[H6IXY] 2017/10/11 17:58:22.288561 multicast.go:188: DEBUG: IPv6 join lanbr failed: operation not supported
[H6IXY] 2017/10/11 17:58:22.288598 multicast.go:188: DEBUG: IPv6 join tap0 failed: operation not supported
[H6IXY] 2017/10/11 17:58:22.288633 multicast.go:188: DEBUG: IPv6 join eth0.191 failed: operation not supported
[H6IXY] 2017/10/11 17:58:22.288668 multicast.go:188: DEBUG: IPv6 join dom2br failed: operation not supported
[H6IXY] 2017/10/11 17:58:27.169980 broadcast.go:210: DEBUG: recv 69 bytes from 172.19.20.14:58913
[H6IXY] 2017/10/11 17:58:27.171908 broadcast.go:122: DEBUG: addresses: [172.16.16.63 172.19.255.255]
[H6IXY] 2017/10/11 17:58:27.172319 broadcast.go:148: DEBUG: write udp4 0.0.0.0:56322->172.16.16.63:21027: sendto: operation not permitted
[H6IXY] 2017/10/11 17:58:27.172520 broadcast.go:150: DEBUG: broadcastWriter@0x12007260 stopping
[H6IXY] 2017/10/11 17:58:27.172930 broadcast.go:37: DEBUG: Entering the backoff state.
[H6IXY] 2017/10/11 17:58:52.281309 multicast.go:129: DEBUG: sent 74 bytes to [ff12::8384]:21027 on dom2br
[H6IXY] 2017/10/11 17:58:52.699465 broadcast.go:210: DEBUG: recv 69 bytes from 172.19.0.2:59980
[H6IXY] 2017/10/11 17:58:57.170921 broadcast.go:210: DEBUG: recv 69 bytes from 172.19.20.14:58913

:frowning: EDIT: platform info

syncthing v0.14.39 "Dysprosium Dragonfly" (go1.9 linux-arm) deb@build.syncthing.net 2017-09-25 06:05:21 UTC [noupgrade] 

Linux berck 4.9.35-v7+ #1014 SMP Fri Jun 30 14:47:43 BST 2017 armv7l GNU/Linux

lsb_release -a
No LSB modules are available.
Distributor ID:	Raspbian
Description:	Raspbian GNU/Linux 8.0 (jessie)
Release:	8.0
Codename:	jessie

(Jakob Borg) #8

I’m not in front of a computer to check, but it looks like it aborts the broadcast sender after failing one destination without trying the next, which would be a bug.
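The difference between aborting on the first failed destination and carrying on to the next one can be sketched as below. This is an illustration of the general pattern only, not Syncthing's broadcast code: `send_to_all` and `fake_send` are hypothetical names, and the fake sender merely mimics the “operation not permitted” error seen in the trace for the first address.

```python
from typing import Callable, Iterable, List, Tuple


def send_to_all(
    destinations: Iterable[str],
    send: Callable[[str], None],
) -> Tuple[List[str], List[Tuple[str, OSError]]]:
    """Try every destination, recording failures instead of stopping at the first one."""
    sent: List[str] = []
    failed: List[Tuple[str, OSError]] = []
    for dst in destinations:
        try:
            send(dst)
        except OSError as exc:
            failed.append((dst, exc))  # remember the error, then move on
            continue
        sent.append(dst)
    return sent, failed


def fake_send(dst: str) -> None:
    """Stand-in sender: fails for the LAN broadcast address from the trace."""
    if dst == "172.16.16.63:21027":
        raise PermissionError("sendto: operation not permitted")


sent, failed = send_to_all(
    ["172.16.16.63:21027", "172.19.255.255:21027"], fake_send
)
print("sent:", sent)
print("failed:", [dst for dst, _ in failed])
```

With this pattern the second address is still attempted even though the first one is refused, which is the behaviour the trace shows to be missing.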


#9

Sorry to be a bother :wink:

In this case the break could have happened near the beginning of August (I only run stable releases and some devices were “last seen” about that time.)

Can I install an old release like 0.14.34 without destroying my files?

I would also welcome any hint about the IPv6 error; I have absolutely no idea what’s going on there.

Thanks!


(Jakob Borg) #10

Sure. The v6 “can’t join” errors are expected on all interfaces that don’t have v6 or don’t support multicast, so they are mostly to be expected, which is why they are only visible at trace level.


#11

On second thought, I just stopped my usual instances and decided to run the service for user “admin” who normally has nothing to share :slight_smile: Then I tried different versions:

This is syncthing_0.14.21_armhf.deb (oldest version I had in cache):

[Y6VXW] 2017/10/11 19:56:35.525822 broadcast.go:122: DEBUG: addresses:     [172.16.16.63 172.19.255.255]
[Y6VXW] 2017/10/11 19:56:35.527453 broadcast.go:148: DEBUG: write udp4 0.0.0.0:49282->172.16.16.63:21027: sendto: operation not permitted
[Y6VXW] 2017/10/11 19:56:35.527633 broadcast.go:150: DEBUG: broadcastWriter@0x10e0d620 stopping
... and this time ...
[Y6VXW] 2017/10/11 19:56:35.535197 multicast.go:190: DEBUG: IPv6 join lo success
[Y6VXW] 2017/10/11 19:56:35.535327 multicast.go:190: DEBUG: IPv6 join eth0 success
[Y6VXW] 2017/10/11 19:56:35.535441 multicast.go:190: DEBUG: IPv6 join wap0 success
[Y6VXW] 2017/10/11 19:56:35.535501 multicast.go:190: DEBUG: IPv6 join lanbr success
[Y6VXW] 2017/10/11 19:56:35.535558 multicast.go:190: DEBUG: IPv6 join tap0 success
[Y6VXW] 2017/10/11 19:56:35.535655 multicast.go:190: DEBUG: IPv6 join eth0.191 success
[Y6VXW] 2017/10/11 19:56:35.535713 multicast.go:190: DEBUG: IPv6 join dom2br success

It is very successful, even at joining interfaces that do not have an address :slight_smile:

syncthing_0.14.35_armhf.deb still reports IPv6 joining success.

But with syncthing_0.14.38_armhf.deb I get:

[Y6VXW] 2017/10/11 20:04:49.950703 multicast.go:188: DEBUG: IPv6 join lo failed: operation not supported
[Y6VXW] 2017/10/11 20:04:49.950827 multicast.go:188: DEBUG: IPv6 join eth0 failed: operation not supported
[Y6VXW] 2017/10/11 20:04:49.950869 multicast.go:188: DEBUG: IPv6 join wap0 failed: operation not supported
[Y6VXW] 2017/10/11 20:04:49.950907 multicast.go:188: DEBUG: IPv6 join lanbr failed: operation not supported
[Y6VXW] 2017/10/11 20:04:49.950943 multicast.go:188: DEBUG: IPv6 join tap0 failed: operation not supported
[Y6VXW] 2017/10/11 20:04:49.950980 multicast.go:188: DEBUG: IPv6 join eth0.191 failed: operation not supported
[Y6VXW] 2017/10/11 20:04:49.951016 multicast.go:188: DEBUG: IPv6 join dom2br failed: operation not supported

The bridges both support multicast and have a link-local address (fe80::…)

I would say something happened somewhere between .35 and .38 regarding IPv6 multicast, and for IPv4/multiple interfaces, I can’t say.

NOTE: I haven’t actually checked that multiple instances sync with v0.14.35; I’m assuming so because it was working earlier. If you wish, I can run 2 instances of v0.14.35 and make really sure.


(Jakob Borg) #12

That’s interesting. I would guess that the major difference is a newer Go compiler and standard library - you can see this in the syncthing -version output. Also interesting to know if the v6 discovery actually worked (and now doesn’t) or if it just claimed join success. V6 discovery seems to still work for me at least on Mac and FreeBSD.
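When comparing several installed versions, the Go release can be read straight from the version line. Below is a small sketch that pulls the Syncthing version, Go version, OS and architecture out of such a line. The regular expression is an assumption based only on the version line pasted earlier in this thread, so it may not cover every build variant, and `parse_version` is a made-up helper name.

```python
import re
from typing import Optional, Tuple

# Pattern inferred from the version line quoted in this thread (an assumption).
VERSION_RE = re.compile(
    r"^syncthing (v\S+) .*?\((go[\d.]+) ([a-z0-9]+)-([a-z0-9]+)\)"
)


def parse_version(line: str) -> Optional[Tuple[str, str, str, str]]:
    """Return (syncthing_version, go_version, os, arch), or None if the line does not match."""
    match = VERSION_RE.match(line)
    if match is None:
        return None
    return (match.group(1), match.group(2), match.group(3), match.group(4))


if __name__ == "__main__":
    line = (
        'syncthing v0.14.39 "Dysprosium Dragonfly" (go1.9 linux-arm) '
        "deb@build.syncthing.net 2017-09-25 06:05:21 UTC [noupgrade]"
    )
    print(parse_version(line))
```

Running this over the version line of each release under test makes it easy to see at which release the compiler changed.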


#13

And now for the conclusion, this works for me:

  • syncthing v0.14.35 "Dysprosium Dragonfly" (go1.8.3 linux-arm) deb@build.syncthing.net 2017-08-08 13:30:28 UTC [noupgrade] / Linux 4.9.35-v7+ #1014 SMP Fri Jun 30 14:47:43 BST 2017 RPI3 Raspbian Jessie: multiple interfaces, multiple instances
  • syncthing v0.14.29 "Dysprosium Dragonfly" (go1.8.3 linux-amd64) deb@build.syncthing.net 2017-05-18 05:47:51 UTC [noupgrade] / Linux 3.16.0-4-amd64 #1 SMP Debian 3.16.43-2+deb8u5 (2017-09-19) Intel kvm Debian Jessie: multiple instances. Unfortunately I don’t have anything later than v.29 and prior to v.38 in my archives.
  • syncthing v0.14.39 "Dysprosium Dragonfly" (go1.9 darwin-amd64) teamcity@build.syncthing.net 2017-09-25 06:05:21 UTC / Darwin 16.7.0 Darwin Kernel Version 16.7.0: Thu Jun 15 17:36:27 PDT 2017; root:xnu-3789.70.16~2/RELEASE_X86_64 Mac mini OS 10.12: single instance, lots of interfaces.

They all happily sync over the local network or across the VPN tunnel, as before. Luckily the OSX install is the only one that auto-updates :wink: I’ll be reading the release notes…

Thanks again, HTH.


(Audrius Butkevicius) #14

It’s most likely the version of Go rather than Syncthing, as that part of the code hasn’t been touched for ages. You can download interim releases from GitHub and verify that it stopped working at the point we moved to a newer compiler (which would then mean a bug in the stdlib).


(Jakob Borg) #15

FWIW it’s not entirely broken; I get this with 0.14.39 on Linux:

[XEQ7K] 2017/10/12 08:39:50.001949 multicast.go:85: DEBUG: multicastWriter@0xc4210581c0 starting
[XEQ7K] 2017/10/12 08:39:50.002422 multicast.go:159: DEBUG: multicastReader@0xc421058180 starting
[XEQ7K] 2017/10/12 08:39:50.011377 multicast.go:124: DEBUG: write udp6 [::]:37439->[ff12::8384]:21027: sendmsg: network is unreachable on write to [ff12::8384]:21027 lo
[XEQ7K] 2017/10/12 08:39:50.011427 multicast.go:124: DEBUG: write udp6 [::]:37439->[ff12::8384]:21027: sendmsg: network is unreachable on write to [ff12::8384]:21027 eth0
[XEQ7K] 2017/10/12 08:39:50.011477 multicast.go:124: DEBUG: write udp6 [::]:37439->[ff12::8384]:21027: sendmsg: network is unreachable on write to [ff12::8384]:21027 eth1
[XEQ7K] 2017/10/12 08:39:50.011521 multicast.go:129: DEBUG: sent 68 bytes to [ff12::8384]:21027 on bond0
[XEQ7K] 2017/10/12 08:39:50.011586 multicast.go:129: DEBUG: sent 68 bytes to [ff12::8384]:21027 on docker0
[XEQ7K] 2017/10/12 08:39:50.011605 multicast.go:190: DEBUG: IPv6 join lo success
[XEQ7K] 2017/10/12 08:39:50.011844 multicast.go:190: DEBUG: IPv6 join eth0 success
[XEQ7K] 2017/10/12 08:39:50.011920 multicast.go:124: DEBUG: write udp6 [::]:37439->[ff12::8384]:21027: sendmsg: network is unreachable on write to [ff12::8384]:21027 br-eee37f06dfc6
[XEQ7K] 2017/10/12 08:39:50.011965 multicast.go:124: DEBUG: write udp6 [::]:37439->[ff12::8384]:21027: sendmsg: network is unreachable on write to [ff12::8384]:21027 br-a5b514f482fe
[XEQ7K] 2017/10/12 08:39:50.011997 multicast.go:129: DEBUG: sent 68 bytes to [ff12::8384]:21027 on veth1a2d8dc
[XEQ7K] 2017/10/12 08:39:50.012011 multicast.go:190: DEBUG: IPv6 join eth1 success
[XEQ7K] 2017/10/12 08:39:50.012213 multicast.go:129: DEBUG: sent 68 bytes to [ff12::8384]:21027 on vetha26d41d
[XEQ7K] 2017/10/12 08:39:50.012246 multicast.go:129: DEBUG: sent 68 bytes to [ff12::8384]:21027 on vethf5aeed2
[XEQ7K] 2017/10/12 08:39:50.012273 multicast.go:129: DEBUG: sent 68 bytes to [ff12::8384]:21027 on veth292ab8d
[XEQ7K] 2017/10/12 08:39:50.012299 multicast.go:129: DEBUG: sent 68 bytes to [ff12::8384]:21027 on vetha7a4deb
[XEQ7K] 2017/10/12 08:39:50.012337 multicast.go:129: DEBUG: sent 68 bytes to [ff12::8384]:21027 on veth329c0dc
[XEQ7K] 2017/10/12 08:39:50.012367 multicast.go:124: DEBUG: write udp6 [::]:37439->[ff12::8384]:21027: sendmsg: network is unreachable on write to [ff12::8384]:21027 virbr0
[XEQ7K] 2017/10/12 08:39:50.012393 multicast.go:124: DEBUG: write udp6 [::]:37439->[ff12::8384]:21027: sendmsg: network is unreachable on write to [ff12::8384]:21027 virbr0-nic
[XEQ7K] 2017/10/12 08:39:50.012419 multicast.go:129: DEBUG: sent 68 bytes to [ff12::8384]:21027 on vnet0
[XEQ7K] 2017/10/12 08:39:50.012582 multicast.go:190: DEBUG: IPv6 join bond0 success
[XEQ7K] 2017/10/12 08:39:50.012604 multicast.go:190: DEBUG: IPv6 join docker0 success
[XEQ7K] 2017/10/12 08:39:50.012626 multicast.go:190: DEBUG: IPv6 join br-eee37f06dfc6 success
[XEQ7K] 2017/10/12 08:39:50.012648 multicast.go:190: DEBUG: IPv6 join br-a5b514f482fe success
[XEQ7K] 2017/10/12 08:39:50.012670 multicast.go:190: DEBUG: IPv6 join veth1a2d8dc success
[XEQ7K] 2017/10/12 08:39:50.012692 multicast.go:190: DEBUG: IPv6 join vetha26d41d success
[XEQ7K] 2017/10/12 08:39:50.012714 multicast.go:190: DEBUG: IPv6 join vethf5aeed2 success
[XEQ7K] 2017/10/12 08:39:50.012736 multicast.go:190: DEBUG: IPv6 join veth292ab8d success
[XEQ7K] 2017/10/12 08:39:50.012758 multicast.go:190: DEBUG: IPv6 join vetha7a4deb success
[XEQ7K] 2017/10/12 08:39:50.012781 multicast.go:190: DEBUG: IPv6 join veth329c0dc success
[XEQ7K] 2017/10/12 08:39:50.012804 multicast.go:190: DEBUG: IPv6 join virbr0 success
[XEQ7K] 2017/10/12 08:39:50.012826 multicast.go:190: DEBUG: IPv6 join virbr0-nic success
[XEQ7K] 2017/10/12 08:39:50.012849 multicast.go:190: DEBUG: IPv6 join vnet0 success

The sendmsg complaints are legit; those interfaces don’t have addresses. The ones that do (bond0, docker0, etc) pass without complaint. And apparently on this system it’s fine to join multicast groups on interfaces that don’t have addresses or multicast…

jb@acro:~ $ syncthing --version
syncthing v0.14.39 "Dysprosium Dragonfly" (go1.9 linux-amd64) teamcity@build.syncthing.net 2017-09-25 06:05:21 UTC

(Jakob Borg) #16

I’ve filed a PR to fix this one, it is a bug.

0.14.37 was the first release built with Go 1.9, so it makes sense that that could be the difference.

jb@unu:~/Downloads $ ./syncthing-macosx-amd64-v0.14.36/syncthing --version
syncthing v0.14.36 "Dysprosium Dragonfly" (go1.8.3 darwin-amd64) teamcity@build.syncthing.net 2017-08-10 15:31:25 UTC
jb@unu:~/Downloads $ ./syncthing-macosx-amd64-v0.14.37/syncthing --version
syncthing v0.14.37 "Dysprosium Dragonfly" (go1.9 darwin-amd64) teamcity@build.syncthing.net 2017-08-24 04:26:12 UTC

#17

Mhh, I would have thought the problem was general on Linux, or at least on bridges. Perhaps it is arch (ARM) or kernel-version dependent?

Or there is some idiosyncrasy in my bridge setups. I don’t believe so, but here is a failing one on ARM:

auto lanbr
iface lanbr inet dhcp
bridge-stp on
bridge-maxwait 2
bridge-fd 2
bridge-ports eth0 usb0

(Jakob Borg) #18

What does netstat -rn6 say? Specifically, do you see multicast entries like these for your interfaces?

jb@acro:~ $ netstat -rn6
Kernel IPv6 routing table
Destination                    Next Hop                   Flag Met Ref Use If
...
ff00::/8                       ::                         U    256 5  1416 bond0

#19

With 0.14.35 on Pi3:

root@berck:/home/admin# netstat -rn6 | grep ff00
ff00::/8                       ::                         U    256 4  3170 lanbr
ff00::/8                       ::                         U    256 4 12182 dom2br
root@berck:/home/admin# 

Do you want me to return to the non-working-for-me 0.14.39 and check?


(Jakob Borg) #20

If you like, but if it didn’t before I don’t think it will now… You can grab the dev snapshot (linked at the top) which includes the fix for your IPv4 issue if you like.

You don’t have any iptables rules or similar on these boxes that might interfere?

dom2br sounds like xen, are you running in a container or paravirtualized instance of some kind or so?