[v1.8.0] Local and global state swapped between two nodes

Hi,

it’s been a while and I’ve checked again after updating to v1.7.1. Counting with SSH “find all” I now have:

  • dev B/install: 2174 files, 507 folders- 172.623.928.912 B
  • dev C/install: 2197 files, 506 folders- 173.922.801.015 B

Syncthing reports “up2date” on both ends. There are still no failed items and no revert local change buttons visible. Restarting the Syncthing instances doesn’t help either.

From my overlook position, I expect the SR -> RO pair between dev A and dev B is okay. Both show the same file/dir counts. The pair dev B RO -> dev C RO is still wrong (like shown in the first post).

I’ve now exported the file lists of dev B and dev C to compare them manually. (find | sort command, stripping the folder root path so only contents of the /install/… folder will be compared)

It’s remarkable dev C has too many files. Looking at the manual comparison on the OS/FS level, I clearly see it’s out of sync because it dev C missed some “deletions” where I can’t say if it dropped them internally or if dev B “forgot” to announce them to dev C. I have never set any special options on that folder pair before (so never used ignoreDelete, for example). This smells like a bug, but let’s wait what experts say.

Here’s a screenshot, dev B on the left, dev C on the right.

To make your life easier than reading text, the current Web UI output is…

dev A:

dev B:

dev C:

Ok, next I’ve tried to update to v1.8.0-rc.2 on all nodes.

—> dev A + dev B are still properly “in sync”. —> dev B sees dev C as “up2date” and is not transferring anything according to bitrate.

dev C: now sees dev B as “out of sync - 24 elements, ~1.21 GiB”

image

When I click the “24 elements” label, it doesn’t say anything in the popup dialog (another bug?), the window is just empty.

I suspect the “24 elements” are those I’ve found by the manual comparison between devices B and C with the SSH find command output. dev C still shows no button or hint to resolve the issue. Shouldn’t the “revert local changes” button pop up on dev C?

I’ll capture a copy of the DB in case we need it.

I presume it also shows the same (missing 24 files) state for device A?

I suspect if you run index recheck you’ll see problems on one of the devices. By the looks of it, it looks like C caused the trouble here, and claims it sent stuff which it really didn’t, in which case I think running -reset-deltas on both A and B would fix it.

We know this happens sometimes, but we never got to the bottom of why or where this happens.

I guess this gets us closer to “where it happens”, as it seems both A and B have the issue, pointing the fingers at C causing this.

Ok, I’d try this next time when on the cluster. Device A + B don’t show anything out-of-sync, meaning everything shows “up2date”. It’s only been device C causing trouble. I’ll come back when I did the --reset-deltas.

What I was saying is that C should declare both A and B out of sync.

It would be interesting to get the output of: https://docs.syncthing.net/rest/db-file-get.html

From all 3 before you do -reset-deltas.

1 Like

Hi,

before performing any action:

#
$folder = "n4q4z-ohpfz"
$file = "Adobe/AdobeFlashPlayer/Adobe_Flash_Player_CleanUp.exe"
#
"*** devA ***"
c:\windows\curl.exe -s -X GET -H "X-API-Key: $apiKeyA" 'http://'$ipDevA'/rest/db/file?folder='$folder'&file='$file
#
""
"***  devB ***"
c:\windows\curl.exe -s -X GET -H "X-API-Key: $apiKeyB" 'http://'$ipDevB'/rest/db/file?folder='$folder'&file='$file
#
""
"*** devC*** "
c:\windows\curl.exe -s -X GET -H "X-API-Key: $apiKeyC" 'http://'$ipDevC'/rest/db/file?folder='$folder'&file='$file


*** devA ***
{
  "availability": [
    {
      "id": "7E4EKDA-DEXQAEY-DIKXM2L-6XO7YWS-ARPGAW6-CRUSAI7-U2U5UMP-CCZKRAH",
      "fromTemporary": false
    }
  ],
  "global": {
    "deleted": true,
    "ignored": false,
    "invalid": false,
    "localFlags": 0,
    "modified": "2017-09-05T14:25:19.0109486+02:00",
    "modifiedBy": "VBHJ3YR",
    "mustRescan": false,
    "name": "Adobe\\AdobeFlashPlayer\\Adobe_Flash_Player_CleanUp.exe",
    "noPermissions": false,
    "numBlocks": 0,
    "permissions": "0",
    "sequence": 15661,
    "size": 0,
    "type": "FILE",
    "version": [
      "LRORKSI:1",
      "VBHJ3YR:2",
      "VPYAMUE:1",
      "ZTMOUGV:1",
      "7E4EKDA:1"
    ]
  },
  "local": {
    "deleted": true,
    "ignored": false,
    "invalid": false,
    "localFlags": 0,
    "modified": "2017-09-05T14:25:19.0109486+02:00",
    "modifiedBy": "VBHJ3YR",
    "mustRescan": false,
    "name": "Adobe\\AdobeFlashPlayer\\Adobe_Flash_Player_CleanUp.exe",
    "noPermissions": false,
    "numBlocks": 0,
    "permissions": "0",
    "sequence": 15661,
    "size": 0,
    "type": "FILE",
    "version": [
      "LRORKSI:1",
      "VBHJ3YR:2",
      "VPYAMUE:1",
      "ZTMOUGV:1",
      "7E4EKDA:1"
    ]
  }
}

***  devB ***
{
  "availability": [
    {
      "id": "VBHJ3YR-WXRWUR4-GVZBFIL-4I2GE5T-TYHNPZY-NY3PKDS-Y2AFDBO-MZNP4AD",
      "fromTemporary": false
    },
    {
      "id": "VPYAMUE-D44JYED-5DNCNOV-QNNFESH-QLVRLP4-525HK3X-GK5RXN4-QS5LMQY",
      "fromTemporary": false
    }
  ],
  "global": {
    "deleted": true,
    "ignored": false,
    "invalid": false,
    "localFlags": 0,
    "modified": "2017-09-05T14:25:19.0109486+02:00",
    "modifiedBy": "VBHJ3YR",
    "mustRescan": false,
    "name": "Adobe/AdobeFlashPlayer/Adobe_Flash_Player_CleanUp.exe",
    "noPermissions": false,
    "numBlocks": 0,
    "permissions": "0",
    "sequence": 15319,
    "size": 0,
    "type": "FILE",
    "version": [
      "LRORKSI:1",
      "VBHJ3YR:2",
      "VPYAMUE:1",
      "ZTMOUGV:1",
      "7E4EKDA:1"
    ]
  },
  "local": {
    "deleted": true,
    "ignored": false,
    "invalid": false,
    "localFlags": 0,
    "modified": "2017-09-05T14:25:19.0109486+02:00",
    "modifiedBy": "VBHJ3YR",
    "mustRescan": false,
    "name": "Adobe/AdobeFlashPlayer/Adobe_Flash_Player_CleanUp.exe",
    "noPermissions": false,
    "numBlocks": 0,
    "permissions": "0",
    "sequence": 15319,
    "size": 0,
    "type": "FILE",
    "version": [
      "LRORKSI:1",
      "VBHJ3YR:2",
      "VPYAMUE:1",
      "ZTMOUGV:1",
      "7E4EKDA:1"
    ]
  }
}

*** devC*** 
{
  "availability": [
    {
      "id": "7E4EKDA-DEXQAEY-DIKXM2L-6XO7YWS-ARPGAW6-CRUSAI7-U2U5UMP-CCZKRAH",
      "fromTemporary": false
    }
  ],
  "global": {
    "deleted": true,
    "ignored": false,
    "invalid": false,
    "localFlags": 0,
    "modified": "2017-09-05T14:25:19.0109486+02:00",
    "modifiedBy": "VBHJ3YR",
    "mustRescan": false,
    "name": "Adobe/AdobeFlashPlayer/Adobe_Flash_Player_CleanUp.exe",
    "noPermissions": false,
    "numBlocks": 0,
    "permissions": "0",
    "sequence": 15319,
    "size": 0,
    "type": "FILE",
    "version": [
      "LRORKSI:1",
      "VBHJ3YR:2",
      "VPYAMUE:1",
      "ZTMOUGV:1",
      "7E4EKDA:1"
    ]
  },
  "local": {
    "deleted": true,
    "ignored": false,
    "invalid": false,
    "localFlags": 0,
    "modified": "2017-09-05T14:25:19.0109486+02:00",
    "modifiedBy": "VBHJ3YR",
    "mustRescan": false,
    "name": "Adobe/AdobeFlashPlayer/Adobe_Flash_Player_CleanUp.exe",
    "noPermissions": false,
    "numBlocks": 0,
    "permissions": "0",
    "sequence": 31923,
    "size": 0,
    "type": "FILE",
    "version": [
      "LRORKSI:1",
      "VBHJ3YR:2",
      "VPYAMUE:1",
      "ZTMOUGV:1",
      "7E4EKDA:1"
    ]
  }
}
  • LRORKSI = ??? I don’t know what this is, maybe something old device that was removed ???
  • VBHJ3YR = devA
  • VPYAMUE = devC
  • 7E4EKDA = devB

After performing the --reset-deltas on all 3 nodes, the log showed “reinitializing delta index” but the state is exactly as before looking at the Web UI - see old screenshots [v1.6.1] Local and global state swapped between two nodes .

To sum up:

  • devA, devB: No indication of a problem on the web UI. Global and local states match.
  • devC: G: 2195 files, L: 2171 files, says that devB has 24 out of sync items but doesn’t mention them in the popup dialog (dialog empty)

I’ve now got a strong feeling I suffer from something similar to https://github.com/syncthing/syncthing/issues/6855#issuecomment-664478913, too.

As I wrote above, I don’t know the LRORKSI device ID and guess it was a device I’ve removed some time ago and that didn’t clean out well in the cluster?! I’m also using receiveOnly folders like the user in the linked GitHub comment said.

Hi,

good news. Since I’ve explained yesterday that I fresh started devC with v1.8.0-rc.2 and got the constant crashes “panic: device present in global list but missing as device/fileinfo entry” I decided to take the shortcut and start once more a fresh scan after deleting the index-DB folder with v1.8.0-rc.3. Since I did not change anything else I had expected the constant crashing problem to reappear but now everything is up2date and correctly accounted “in sync” on all three devices. Can’t explain this, because rc3 was only about fixing the versioning config bug but OK. It’s short before my vacation so sorry for going so fast now - the customer needs this up and running, that’s prio one. I will provide more logs and debugging if this tricky bug reappears in the future.

Thanks for your help and insights so far. :slight_smile:

EDIT: Conclusion is, the error was on device C because I didn’t tinker anymore with A and B to solve this. The database reset on C did help, as far, as you don’t do restarts manually during “Syncthing is under heavy scanning load”.

@imsodin I don’t know if this is in generic of any help, but I’ve upgraded the cluster to v1.9.0-rc.1 and I observed “something positive”.

The cluster was looking fully in sync and up2date before the upgrade on v1.8.0. File counts were also ok on the ST web UI.

After upgrading to v1.9.0-rc.1, all nodes presented me a list of recent changes on the web UI “out of a sudden”. Every list includes a handful (~ 27) deleted files that physically no longer existed in the shared folders since weeks ago in the shared folders. The cluster somehow still kept them in the database “by error”.

Every Web UI shows a list of deleted items looking like this, but different items. So whatever was wrong “in the past” or in “old versions”, it seems fixed now :slight_smile: . Comparing file counts on the web UI again after upgrade, they all match. Everything is told “up2date” now and I don’t see any more problems. The log also shows no errors, just the normal upgrade messages like “performing db upgrade - it will take while” and so on.

Good work!

Letzte Änderungen
Gerät	Aktion	Typ	Ordner	Pfad	Zeit
device	deleted	file	Installationsquellen	Image-W10/SoftwarePackage/7	2020-08-19 09:57:50
device	deleted	file	Installationsquellen	Image-W10/SoftwarePackage/2020-03-27_Win10_SysPrep_Software - Kopie.zip	2020-08-19 09:57:50
device	deleted	dir		Installationsquellen	Image-W10/DriverPackage/Neuer Ordner	2020-08-19 09:57:50
device	deleted	file	Installationsquellen	Image-W10/DriverPackage/Drv-Upd-1903/2	2020-08-19 09:57:50
device	deleted	file	Installationsquellen	Image-W10/DriverPackage/Drv-Patches-1903/Intel-Network-DrvUpd_v12.18.9.10.msi	2020-08-19 09:57:50
device	deleted	file	Installationsquellen	Image-W10/DriverPackage/Drv-Patches-1903/2	2020-08-19 09:57:50
device	deleted	dir		Installationsquellen	Image-W10/DriverPackage/Drv-Patches-1903	2020-08-19 09:57:50
device	deleted	file	Installationsquellen	Image-W10/DriverPackage/6	2020-08-19 09:57:50
device	deleted	file	Installationsquellen	Image-W10/DriverPackage/5	2020-08-19 09:57:50
device	deleted	file	Installationsquellen	Image-W10/DriverPackage/3	2020-08-19 09:57:50
device	deleted	file	Installationsquellen	Image-W10/DriverPackage/2020-07-08_Win10_SysPrep_Drivers_vhdx.rar	2020-08-19 09:57:50
device	deleted	file	Installationsquellen	Image-W10/DriverPackage/2020-07-06_Win10_SysPrep_Drivers.zip	2020-08-19 09:57:50
device	deleted	file	Installationsquellen	Image-W10/DriverPackage/2020-07-06_Win10_SysPrep_Drivers.rar	2020-08-19 09:57:50
device	deleted	file	Installationsquellen	Image-W10/DriverPackage/2019-08-19_Win10_SysPrep_Drivers.zip	2020-08-19 09:57:50
device	deleted	file	Installationsquellen	Image-W10/DriverPackage/2019-07-22_Win10_SysPrep_Drivers.zip	2020-08-19 09:57:50
device	deleted	file	Installationsquellen	Image-W10/DriverPackage/2018-10-26_Win10_SysPrep_Drivers.zip	2020-08-19 09:57:50
device	deleted	file	Installationsquellen	Image-W10/DriverPackage/2017-08-24_Win10_SysPrep_Drivers.zip	2020-08-19 09:57:50
device	deleted	file	Installationsquellen	Win10CriticalDDrv/6	2020-08-19 09:57:50
device	deleted	file	Installationsquellen	Win10CriticalDDrv/2017-04-25_Samsung_NVME_Legacy_W7-8_Driver_2.1.0.1611_w10_amd64_2016/secnvme.sys	2020-08-19 09:57:50
device	deleted	file	Installationsquellen	Win10CriticalDDrv/2017-04-25_Samsung_NVME_Legacy_W7-8_Driver_2.1.0.1611_w10_amd64_2016/secnvme.inf	2020-08-19 09:57:50
device	deleted	file	Installationsquellen	Win10CriticalDDrv/2017-04-25_Samsung_NVME_Legacy_W7-8_Driver_2.1.0.1611_w10_amd64_2016/secnvme.cat	2020-08-19 09:57:50
device	deleted	file	Installationsquellen	Win10CriticalDDrv/2017-04-25_Samsung_NVME_Legacy_W7-8_Driver_2.1.0.1611_w10_amd64_2016.zip	2020-08-19 09:57:50
device	deleted	dir		Installationsquellen	Win10CriticalDDrv/2017-04-25_Samsung_NVME_Legacy_W7-8_Driver_2.1.0.1611_w10_amd64_2016	2020-08-19 09:57:50
device	deleted	file	Installationsquellen	Win10CriticalDDrv/2017-04-25_Intel_iaNVMe_Win10_1.5.0.1002.zip	2020-08-19 09:57:50
device	deleted	file	Installationsquellen	Win10CriticalDDrv/2017-04-25_Intel_iaNVMe_Win10_1.5.0.1002.exe	2020-08-19 09:57:50

I’d attribute that to fixing https://github.com/syncthing/syncthing/issues/1787. The github count nowadays is approach 7000 - this has been a long time coming :slight_smile:

@imsodin I didn’t expect that to be the problem because all of the folders synced are “managed” by me and I was aware of this issue thus never doing case-only renames from the very beginning.

But: Something strange is happening. I’m not experiencing a negative impact, but two of the three nodes emit “deleted file” events on every restart of Syncthing to the “recent changes” UI. It’s strange because those files physically weren’t touched on disk and there’s also no syncing activity. Seems like a database cleanup routine doing those reports over and over again.

According to the view of the cluster presented above:

  • devA “the sender” doesn’t show any recent changes on the Syncthing UI. Restarting ST or hitting “rescan all” doesn’t show recent changes.
  • devB + devC “the receivers” show a constant list of recent changes (deletes) that are “phantom”. This started after my first manual restart of ST 1.9.0-rc.1 (after the upgrade).

We should maybe snapshot the DB and exchange it?

Here’s how the symptom looks like on devB: On devB:

Gerät	Aktion	Typ	Ordner	Pfad	Zeit
devA	deleted	file	Dokumentation	Witzig/Explorer_Oli_734_PB.jpg	2020-08-19 10:28:56

After restart of devB:

Gerät	Aktion	Typ	Ordner	Pfad	Zeit
devA	deleted	file	Dokumentation	Witzig/Explorer_Oli_734_PB.jpg	2020-08-19 10:39:37

So whenever you restart devA, those phantom deletions show up in the recent changes on the other devices, correct?

Anything noteworthy in the logs on devA? If not, could you restart again with STTRACE=model,db,scanner set please. I suspect the db snapshot might not show that much here, the updated mod time points at a scanning/deletion detection issue.

@imsodin devA shows an empty “recent changes” list on its web UI. devB and devC show the deletions. And if I restart devB (or devC), they show up the same deletions again. I’ll try the traces tomorrow and report back if I find anything useful in them.

@imsodin: I’ve observed more details.

a) It is sufficient to hit “Rescan All” on the affected devices to see the timestamps of the “recent changes / deleted” items jump to current. Not neccessary to fully restart Syncthing to reproduce this.

b) The list of “recent changes / deleted” items does NOT change in regard to the files affected. I first thought this but the cause I got it wrong was the list only shows the last X changes and I have a lot of files affected by this.

c) I shut down Syncthing on devA and devB (during “all up2date” state of course). If I now restart devC (or hit “rescan all”), I can still see the timestamps of the affected files jump up in the recent changes dialog.

d) I’ve turned logging on by “export STTRACE=model,db,scanner” on devB and devC. I can’t find any errors in it. It lists all file names it encounters during scan (sensitive information, can provide via PM if needed). It does NOT have lines about the “deleted phantom files” (see post above for an example). This fits into the picture because the files are (correctly) not on disk. No node has them. The “deleted phantom” file names aren’t in the log if I look through it with CTRL+F.

How to go on from here? Why does Syncthing not remove those files finally from the database if they are relics of the past and have been correctly deleted long ago?

UPDATE: e) Updated the cluster to v1.9.0-rc3. The situation didn’t change. The problem is still there.

If there’s nothing in the db log, it means those files are somehow sent unchanged as updates to the db. I did changes to how existing items in the db are checked for deleted and ignored ones, so that’s a suspect. Unfortunately debug logging is (too) sparse there. I opened a PR with more debug logging, it would be great if you could run that and report back if something about those files turns up in there (scanner debug facility isn’t actually needed): https://github.com/syncthing/syncthing/pull/6918

Deleted items don’t get removed from db, that’s not a problem.

Ok, I’ve manually updated to ( https://build.syncthing.net/repository/download/Syncthing_BuildLinux/78082:id/syncthing-linux-amd64-v1.9.0-rc.3.dev.1.g5d493ebe.tar.gz ) on devC.

Using:

export STTRACE=model,db

I’ve identified one file shown on the recent changes last X list:

[folder “install”]/Utilities/TestDisk_6.13/testdisk_win.exe

Filtering the log for that filename gives me:

[VPYAM] 2020/08/20 18:25:55.608727 folder.go:644: DEBUG: changing localflags on deleted item File{Name:"Utilities/TestDisk_6.13/testdisk_win.exe", Sequence:2721, Permissions:0555, ModTime:2020-08-17 15:44:39 +0200 CEST, Version:{[{LRORKSI 2} {VBHJ3YR 1597671879} {VPYAMUE 1595941224} {ZTMOUGV 1} {7E4EKDA 1}]}, Length:0, Deleted:true, Invalid:false, LocalFlags:0x8, NoPermissions:false, BlockSize:131072}
[VPYAM] 2020/08/20 18:25:55.608734 set.go:243: DEBUG: n4q4z-ohpfz GetGlobal(Utilities/TestDisk_6.13/testdisk_win.exe)
[VPYAM] 2020/08/20 18:25:55.623695 lowlevel.go:200: DEBUG: not inserting unchanged (local); folder="n4q4z-ohpfz" File{Name:"Utilities/TestDisk_6.13/testdisk_win.exe", Sequence:2721, Permissions:0555, ModTime:2020-08-17 15:44:39 +0200 CEST, Version:{[{LRORKSI 2} {VBHJ3YR 1597671879} {VPYAMUE 1595941224} {ZTMOUGV 1} {7E4EKDA 1}]}, Length:0, Deleted:true, Invalid:false, LocalFlags:0x0, NoPermissions:false, BlockSize:131072, Blocks:[], BlocksHash:}

This file isn’t present on disk. When I started Syncthing after the upgrade, I made sure the node had no connection to other Syncthing nodes and kept it offline.

Ok, found it - thanks. It’s RC material so no need for an issue: https://github.com/syncthing/syncthing/pull/6919

1 Like

@imsodin: I’ve tried the build ( https://build.syncthing.net/viewLog.html?buildId=78119&buildTypeId=Syncthing_BuildLinux&tab=artifacts#%2Fsyncthing-linux-amd64-v1.9.0-rc.3.dev.2.g6a7418eb.tar.gz ) from PR 6919 on devC and the problem is solved.

The STTRACE log doesn’t show any errors as far as I can tell.

After that, I updated devB as well and put devB+devC online. They show recent changes that are expected and contain my files I’ve changed yesterday. All good so far.

Downloaded build ( https://build.syncthing.net/repository/download/Syncthing_BuildWindows/78120:id/syncthing-windows-amd64-v1.9.0-rc.3.dev.2.g6a7418eb.zip ) and applied it to devA.

The cluster is running fine now and fully up2date. No more strange things observed. Good work :-)! Thanks.

1 Like

Thank you for noticing, reporting and debugging!