Fresh setup with 2 devices, sync stuck with no progress

Hi,

I’ve made a fresh setup with two new devices (separate from my last topic) running Linux amd64 Syncthing v1.3.3 and got stuck again with no progress, even though the local network connection is running fine.

Device A - has the “master” data, which should be copied over to device B, which started with an empty, fresh Syncthing folder.

Device B - ran for some hours and synced fine, then stopped at “Idle - up to date” even though it still doesn’t have all the files.

This time, I just let Syncthing do its thing. I did NOT reconfigure anything while syncing nor change folder types after initial setup.

I checked the Web UI logs on both devices and didn’t find anything useful pointing to the cause. Just…

Device A logs a lot of lines like the following with the “db” debug facility enabled.

2020-01-28 22:28:15 need folder="wm9iy-ye9fd" device=CYKNZE3-RR7WDE2-CEKGBWG-Y4R2KBN-56HBUPP-YFGJ6QE-S5WSCM7-3OFQOQU name="lanprov/Spiele/RedAlert1/Maps/Snapshot" have=false invalid=false haveV={[]} globalV={[{V4GPC5C 1}]} globalDev=7777777-777777N-7777777-777777N-7777777-777777N-7777777-77777Q4

This file is indeed one of the “problematic” files.
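For completeness, this is roughly how I enabled the “db” facility from a terminal; it can also be toggled at runtime in the web UI under Actions → Logs → Debugging Facilities:

```
# Enable the "db" debug facility via the STTRACE environment variable
# (facility names are comma-separated; -no-browser just skips opening the GUI).
STTRACE=db syncthing -no-browser
```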

I did not disconnect the network or the USB source device, and I’ve verified that the files can be read properly on the source. Is there anything else I could provide to diagnose this further?

I’ve never had problems like this with Syncthing before, and now they hit me one after another, so I wonder whether I caused this unintentionally. It always ran fine in every constellation, and now, when I need it to copy over a set of regular directories and files, it fails unexpectedly.

Thanks for your patience and help. If required, I’ll come back with more log information. I will let this second “test case” sit as it is in case we need it for further diagnosis. (The first one I posted earlier is needed in production.)

In your second image for device B, I can see the deleted “.stfolder” under “Letzte Änderung” (last change). If it has really been deleted, then it’s clear that this peer cannot sync: the folder marker must be present in both peers’ folders.

You should call the endpoint documented at https://docs.syncthing.net/rest/db-file-get.html for one of the missing items on both ends and post the result here.
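That endpoint is GET /rest/db/file; something like the following against each device’s GUI address should do it (the API key is under Actions → Settings, and the folder ID and file path here are taken from your log line; adjust the host and port if your GUI listens elsewhere):

```
# Fetch the database record for one specific file from the local instance.
curl -s -H "X-API-Key: YOUR_API_KEY" \
  "http://127.0.0.1:8384/rest/db/file?folder=wm9iy-ye9fd&file=lanprov/Spiele/RedAlert1/Maps/Snapshot"
```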

Smells like we’re royally screwing something up dropping index entries along the way.

How, I don’t know.

Hold on, one of the sides has ignore patterns; do the missing items match those patterns?

Sorry, yes, I don’t read German, so I thought it was ignores.

It seems we’ve screwed up something here.

Can you run the stindex tool (you need to build it yourself) in idxck mode and see what it outputs?

Yeah the procedure is the same, just a different package. It will be more interesting for the device where the entry is missing, but it might be useful to run against both.
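Roughly along these lines, built from a syncthing source checkout; the mode flag and the index directory name below are from memory and my own setup, so double-check against the tool’s own help output and your actual config path:

```
# Build the stindex utility from the syncthing source tree (assumes a Go toolchain).
git clone https://github.com/syncthing/syncthing.git
cd syncthing
go build ./cmd/stindex

# Run the index consistency check against the database of a *stopped* instance.
# The directory name is the Linux default for this era; it may differ for you.
./stindex -mode idxck ~/.config/syncthing/index-v0.14.0.db
```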

@calmh @imsodin?

Did you have abrupt shutdowns on the device or something like that? Panics?

No, nothing. The service ran stably overnight, and when I came back the next day to check, Syncthing showed a plausible uptime of xx hours.

Can I meanwhile back up the index, nuke it, and try a rescan? I don’t want to block the investigation with a hasty action…

Sure. Just back up the index in case there is something interesting there.
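Something along these lines, assuming the default Linux config location and a systemd user service (adjust the paths and the stop/start commands to your setup):

```
# Stop Syncthing so the database isn't in use.
systemctl --user stop syncthing

# Back up the index database directory (default location on Linux for v1.3.x).
cp -a ~/.config/syncthing/index-v0.14.0.db ~/syncthing-index-backup

# Drop the index; Syncthing will rebuild it with a full rescan on the next start.
syncthing -reset-database

# Start it again and let it rescan.
systemctl --user start syncthing
```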

Something seems awfully screwy with that index, indeed.

I doubt connection attempts could have anything to do with this. Is this something you were able to reproduce?

I will try that next, as it’s easy to let it scan and sync over again…

Same thing: https://github.com/syncthing/syncthing/issues/6304

We updated goleveldb in 1.3.1, and it feels like this is an issue that has been happening roughly since then.

Yes, this seems to be my issue too. I haven’t found a reproducer yet; I just rescanned after a database reset and it worked again.

I’ve now run stress tests for a couple of hours, doing continuous database updates and checking for this issue in parallel, and haven’t found anything on my systems. So it’s not totally systematic, at least.

Do the logs ever contain “Database failed to stop within 10s”?

That happens on every stop for me.

That’s not OK. If you enable the app debug facility, is there also a service which fails to terminate in a timely manner? (That should probably be changed to log at info or even warning level anyway.)
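Something like this should work if you start it from a terminal; STTRACE is the usual way to turn on debug facilities from the environment, and the web UI toggle under Actions → Logs → Debugging Facilities works as well:

```
# Run with the "app" debug facility enabled, then stop it normally and watch
# the shutdown messages.
STTRACE=app syncthing -no-browser

# If it runs as a systemd user service instead (unit name assumed), check the
# journal after a stop; "failed to stop" matches the known database message.
journalctl --user -u syncthing | grep -i "failed to stop"
```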

Yeah I was going to troubleshoot at some point. On mobile right now unfortunately.