Fresh setup with 2 devices, sync stuck with no progress

Hi,

I’ve made a fresh setup with two new devices (separate from my last topic) running Linux amd64 Syncthing v1.3.3 and got stuck again with no progress, even though the local network connection is running fine.

Device A - has the “master” data, which should be copied over to device B, which started with an empty, fresh Syncthing folder.

Device B - ran for some hours and synced fine, then stopped at “Idle - up to date” even though it still doesn’t have all the files.

This time, I just let Syncthing do its thing. I did NOT reconfigure anything while syncing nor change folder types after initial setup.

I checked the Web UI logs on both devices and didn’t find anything useful pointing to the cause. Just…

Device A logs a lot of lines like the following with the “db” debug facility enabled.

2020-01-28 22:28:15 need folder="wm9iy-ye9fd" device=CYKNZE3-RR7WDE2-CEKGBWG-Y4R2KBN-56HBUPP-YFGJ6QE-S5WSCM7-3OFQOQU name="lanprov/Spiele/RedAlert1/Maps/Snapshot" have=false invalid=false haveV={[]} globalV={[{V4GPC5C 1}]} globalDev=7777777-777777N-7777777-777777N-7777777-777777N-7777777-77777Q4

This file is indeed one of the “problematic” files.
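For completeness, this is roughly how I enabled the “db” facility from a terminal; it can also be toggled at runtime in the web UI under Actions → Logs → Debugging Facilities:

```
# Enable the "db" debug facility via the STTRACE environment variable
# (facility names are comma-separated; -no-browser just skips opening the GUI).
STTRACE=db syncthing -no-browser
```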

I did not disconnect the network or the USB source device, and I’ve verified that the files can be read properly on the source. Is there anything else I could provide to diagnose this further?

I’ve never had problems like this with Syncthing before, and now they hit me one after another, so I wonder whether I caused this unintentionally. It always ran fine in every constellation, and now, when I need it to copy over a set of regular directories and files, it fails unexpectedly.

Thanks for your patience and help. If required, I’ll come back with more log information. I will let this second “test case” sit as it is in case we need it for further diagnosis. (The first one I posted earlier is needed in production.)

In your second image for device B, I can see the deleted “.stfolder” under “Letzte Änderung” (last change). If it has really been deleted, then it’s clear that this peer cannot sync: the folder marker must be present in both peers’ folders.

You should call the endpoint documented at https://docs.syncthing.net/rest/db-file-get.html for one of the missing items on both ends and post the result here.
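That endpoint is GET /rest/db/file; something like the following against each device’s GUI address should do it (the API key is under Actions → Settings, and the folder ID and file path here are taken from your log line; adjust the host and port if your GUI listens elsewhere):

```
# Fetch the database record for one specific file from the local instance.
curl -s -H "X-API-Key: YOUR_API_KEY" \
  "http://127.0.0.1:8384/rest/db/file?folder=wm9iy-ye9fd&file=lanprov/Spiele/RedAlert1/Maps/Snapshot"
```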

Smells like we’re royally screwing something up dropping index entries along the way.

How, I don’t know.

Hold on, one of the sides has ignore patterns; do the missing items match those patterns?

Sorry, yes, I don’t read German, so I thought it was ignores.

It seems we’ve screwed up something here.

Can you run the stindex tool (you need to build it yourself) in idxck mode and see what it outputs?

Yeah the procedure is the same, just a different package. It will be more interesting for the device where the entry is missing, but it might be useful to run against both.
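Roughly along these lines, built from a syncthing source checkout; the mode flag and the index directory name below are from memory and my own setup, so double-check against the tool’s own help output and your actual config path:

```
# Build the stindex utility from the syncthing source tree (assumes a Go toolchain).
git clone https://github.com/syncthing/syncthing.git
cd syncthing
go build ./cmd/stindex

# Run the index consistency check against the database of a *stopped* instance.
# The directory name is the Linux default for this era; it may differ for you.
./stindex -mode idxck ~/.config/syncthing/index-v0.14.0.db
```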

@calmh @imsodin?

Did you have abrupt shutdowns on the device or something like that? Panics?

No, nothing. The service ran stably overnight, and when I came back the next day to check, Syncthing showed a plausible uptime of xx hours.

Can I meanwhile back up the index, nuke it, and try a rescan? I don’t want to block the investigation with a hasty action…

Sure. Just back up the index in case there is something interesting there.
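Something along these lines, assuming the default Linux config location and a systemd user service (adjust the paths and the stop/start commands to your setup):

```
# Stop Syncthing so the database isn't in use.
systemctl --user stop syncthing

# Back up the index database directory (default location on Linux for v1.3.x).
cp -a ~/.config/syncthing/index-v0.14.0.db ~/syncthing-index-backup

# Drop the index; Syncthing will rebuild it with a full rescan on the next start.
syncthing -reset-database

# Start it again and let it rescan.
systemctl --user start syncthing
```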

Something seems awfully screwy with that index, indeed.

I doubt connection attempts could have anything to do with this. Is this something you were able to reproduce?

I will try that next, as it’s easy to let it scan and sync over again…

Same thing: https://github.com/syncthing/syncthing/issues/6304

We updated goleveldb in 1.3.1, and it feels like this is an issue that has been happening roughly since then.

Yes, this seems to be my issue too. I haven’t found a reproducer yet; I just rescanned after a database reset and it worked again.

I’ve now run stress tests for a couple of hours, doing continuous database updates and checking for this issue in parallel, and haven’t found anything on my systems. So it’s not totally systematic, at least.

Do the logs ever contain “Database failed to stop within 10s”?

That happens on every stop for me.

That’s not OK. If you enable the app debug facility, is there also a service which fails to terminate in a timely manner? (That should probably be changed to log at info or even warning level anyway.)
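Something like this should work if you start it from a terminal; STTRACE is the usual way to turn on debug facilities from the environment, and the web UI toggle under Actions → Logs → Debugging Facilities works as well:

```
# Run with the "app" debug facility enabled, then stop it normally and watch
# the shutdown messages.
STTRACE=app syncthing -no-browser

# If it runs as a systemd user service instead (unit name assumed), check the
# journal after a stop; "failed to stop" matches the known database message.
journalctl --user -u syncthing | grep -i "failed to stop"
```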

Yeah I was going to troubleshoot at some point. On mobile right now unfortunately.