Today, I’ve been getting these particular errors on a Windows and an Android device, which basically breaks synchronisation completely.
What happens is as soon as the two devices connect, the connection gets dropped immediately, and the log gets filled with the following on Android:
Connection to 3KHRD6O at 192.168.50.2:22000-192.168.50.253:22000/tcp-client/TLS1.3-TLS_AES_128_GCM_SHA256/LAN-P10-61VQ1QRVBC86IF90LCDSIAQSCQ closed: decompressing message: lz4: invalid source or destination buffer too short
At the same time, the Windows side that is trying to send the file in question has these errors logged:
Connection to VZZQ7PU at 192.168.50.253:22000-192.168.50.2:22000/tcp-client/TLS1.3-TLS_AES_128_GCM_SHA256/LAN-P10-61VQ0HKLSP60Q2JNAEGEKKQJE6 closed: unmarshalling message: string field contains invalid UTF-8
Syncthing is v2.0.0-rc14 (arm64 on Android, and amd64 on Windows).
Have you got any idea what may the culprit be here? I’ve never seen these kind of errors before .
In fact I have seen these, though unfortunately I have as little clue to the cause as you do.
I got the “invalid utf-8” connection failures between my main Linux box and a closet Raspberry Pi and spent quite some time debugging it. Initially I expected some file naming issue but it turned out it was two bytes of the folder ID in the cluster config message that were overwritten with binary garbage – i.e., memory corruption of some kind. (That folder ID comes from the config, not the database, fwiw.) I didn’t manage to narrow down the cause, though it was reproducible for a while accross restarts. It seemed a little too reproducible to be plain hardware error.
Looking back in the logs on that rpi5 I do see some lz4 failures as well.
Linux-arm64 could be the common factor between our setups.
It’s something at the protocol/compression layer, not sure what. I tested builds with both the C and Go versions of SQLite with no difference, and with Go 1.24 and 1.23. Adding some debug messages, we have the sending side making a request (prior to compression):
Maybe with a specific type of file? Unfortunately, I cannot share the one causing the issue for me, as it’s a large zip archive with a lot of private information.
Yeah, no, I mean I have the issue on my own data above, it’s just that when writing a test on that specifically in the lz4 package I don’t see an issue. So I’m still not sure what causes it, specifically.
[RTF25] 2025/05/07 17:30:02 INFO: Lost primary connection to 3QBOJOM at 192.168.16.118:22000-188.xxx.xxx.104:22000/tcp-server/TLS1.3-TLS_AES_128_GCM_SHA256/WAN-P30-61T9EID581F98AB426R5JKSN3I: decompressing message: lz4: invalid source or destination buffer too short (0 remain)
this happened across most of the remote devices causing loads of disconnects. It got so bad that I have been back on v1 since the 7th. I figured as i’m an extreme user it might have been down to resources so been waiting to see if anyone else reported it
My set up is windows to windows and compression set to all data on both ends
edit: only the receiving end is on v2, the sending ends are all on v1
That’s my impression as well, yet there are no real changes to the protocol code in v2 (rc.14 or any other). This very much feels like a problem caused by corruption on the message sending side, or something very low level in the TLS stack which is also worrying.
I’m testing under Windows now, with both “devices” running Syncthing v2.0.0-rc.14 (amd64), using the exact same file that caused problems on Android, and it also fails here. One difference between this and what happened on Android is that the Windows devices don’t disconnect, only the file transfer stops.
The file is stuck in transfer on the receiving side as below:
This happens with “Metadata Only” compression on both sides. I can provide debug logs, etc. if necessary.
Edit:
Something very strange seems to be going on. I removed the file, re-added it, and tried to sync again. That time, there were no issues. Then, I repeated the same process one more time, and now it fails with the connection between the two devices getting dropped immediately with a flood of these errors:
2025-05-16 19:01:16 Lost primary connection to DHK2GDE at [::]:50440-[fe80::de3d:ad5d:fa83:f4bc%Wi-Fi 2]:51355/quic-server/TLS1.3-TLS_AES_128_GCM_SHA256/LAN-P20-620228PAV43T0MC93L2933SNII: unmarshalling header: proto: cannot parse invalid wire-format data 0803000000550813120b6a707474352d65676c68321a16756a7e32303235303531352d3132333335312e7a6970208080803b28808010322008a05aa75761bc4503a1cd638aab851c9f9932c54e4811e723eb97ecd9a78e3548d80300020803000000550814120b6a707474352d65676c68321a16756a7e32303235303531352d3132333335312e7a6970208080903b2880801032202db37b3b6f758fa17399e37b23c4809072a53a895976c889ff41949932bed86e48d90300020803000000550815120b6a707474352d65676c68321a16756a7e32303235303531352d3132333335312e7a6970208080a03b288080103220269ce4d3f8e4d27293a1b76f42da9d3af4a11caa6e5190375e7cc716a23e61b548da0300020803000000550816120b6a707474352d65676c68321a16756a7e32303235303531352d3132333335312e7a6970208080b03b28808010322014fc5e8f4011973212f207a5e689db841745e430368233f0e81d06bb2ae54b8148db0300020803000000550817120b6a707474352d65676c68321a16756a7e32303235303531352d3132333335312e7a6970208080c03b28808010322061d04db1b8246a465361582db2d624e857b4070b5a15249c4537c7b1878a72fa48dc0300020803000000550818120b6a707474352d65676c68321a16756a7e32303235303531352d3132333335312e7a6970208080d03b2880801032201951e12d7b261840e3ecf7d675ee14b0300a018216b2add3a3f7338b279a44d348dd0300020803000000550819120b6a707474352d65676c68321a16756a7e32303235303531352d3132333335312e7a6970208080e03b28808010322020099494ecb9b7ffef6005b7320a76689c0fb60440c1a307c6da9729da67a94c48de030002080300000055081a120b6a707474352d65676c68321a16756a7e32303235303531352d3132333335312e7a6970208080f03b2880801032201dc9995bc641a674dbf3b1fdec9811d3c91203c929526f4e5255b494bcb947e648df030002080300000055081b120b6a707474352d65676c68321a16756a7e32303235303531352d3132333335312e7a6970208080903c288080103220295512a9d9abfb14c6a360b8321b8d4d492499d1885e5ae620a69cdf88cfb23848e1030002080300000055081c120b6a707474352d65676c68321a16756a7e32303235303531352d3132333335312e7a6970208080a03c288080103220752788a3616dc38534f03769a19bfe6c58260ebf07108cbca856a33b5c60fb8e48e2030002080300000055081d120b6a707474352d65676c68321a16756a7e32303235303531352d3132333335312e7a6970208080b03c2880801032208ae458284b644144811444af00e87af07ec21762b3c38927c3b8aecf60b0d64048e3030002080300000055081e120b6a707474352d65676c68321a16756a7e32303235303531352d3132333335312e7a6970208080c03c288080103220dd8e1beea5b71511ed568108c27bff26b2649d0485792f6720189740c9be19d648e4030002080300000055081f120b6a707474352d65676c68321a16756a7e32303235303531352d3132333335312e7a6970208080803c288080103220b51bab75c55722d832eb18d91d3337b1cbb8127e0ab56d44e1054df1c21e580248e00300020803000000550820120b6a707474352d65676c68321a16756a7e32303235303531352d3132333335312e7a6970208080d03c288080103220ada6cb55747d9ee076d5c96f943e31b57214365b0bce3f1fa4af6b18af6dfc0948e50300020803000000550821120b6a707474352d65676c68321a16756a7e32303235303531352d3132333335312e7a6970208080e03c288080103220d111da9fae813c8142984bc0d2609780190041f719116ad3daa00524917aee7848e60300020803000000550822120b6a707474352d65676c68321a16756a7e32303235303531352d3132333335312e7a6970208080f03c288080103220cf75b758ae4aa315b14f8fadcc82bbe02c191a3f1f6fe74f7f0bc4f8bf47b95948e70300020803000000550823120b6a707474352d65676c68321a16756a7e32303235303531352d3132333335312e7a6970208080803d288080103220eb7b3651a02f9dbbb5ecf003e3cc101e377cbc1e444d29ca12c68dc6dfeb228e48e80300020803000000550824120b6a707474352d65676c68321a16756a7e32303235303531352d3132333335312e7a6970208080903d288080103220105f972b44d48b95f846fa8127134e4183742097c551d65c01e4203161cab33648e90300020803000000550825120b6a707474352d65676c68321a16756a7e32303235303531352d3132333335312e7a6970208080a03d28808010322034d7bb2c5f0072473632edd7860530cf877ffb1c449739c79c177985b2a0aef948ea0300020803000000550826120b6a707474352d65676c68321a16756a7e32303235303531352d3132333335312e7a6970208080b03d2880801032206c362217485bd8f92db5fe732b23dc36aadefe7516a36cf46f40ae989363a15a48eb0300020803000000550827120b6a707474352d65676c68321a16756a7e32303235303531352d3132333335312e7a6970208080c03d288080103220a882fce3b88050d5945f3cdcd4d974ac94e2d3298e017fb01832275c25cae38548ec0300020803000000550828120b6a707474352d65676c68321a16756a7e32303235303531352d3132333335312e7a6970208080d03d2880801032207a0fa0628f15f4c84a71d42b01c8a33894d4876e3506d48ca2ff501a71fbc80148ed030002080300000055 (0 remain)
So essentially something is causing on-the-wire corruption of messages. If it’s file data, it gets caught by hashing, if it corrupts strings in protobuf messages it gets caught by that, etc.
Since you can provoke this reasonably reliably, can you check that this really doesn’t happen with current main (i.e., v1)? (I know proving a negative is impossible, but just some checks to see if it seems likely…)
Yeah, with the exact same config, I replaced the binaries with v1.29.7-dev.6.g1dd26489 and tried to sync the file about 10 times (by repeating the same copy-sync-delete steps). No problems at all.
Next, I shut Syncthing v1 down, replaced the binaries with v2 again, and the file got stuck in transfer right on the very first Syncthing launch .