Folders stuck trying to sync existing files

I have encountered this problem today.

I believe this happened after upgrading the Syncthing binaries on Devices A and B. I also tried upgrading the binary on Device C today, but that made no difference. All three devices are running my self-compiled build, which is v1.12.0 with a few tweaks (see details).

I have compared the database entries for a directory from one of the folders that are now stuck in sync, and this is the result.
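
(For reference, this kind of output comes from the REST API's db/file endpoint; a query along the following lines should reproduce it, with the API key, folder ID and path being placeholders and 127.0.0.1:8384 being the default GUI address:)

curl -s -H "X-API-Key: <api-key>" \
  "http://127.0.0.1:8384/rest/db/file?folder=<folder-id>&file=<path/within/folder>"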

Device A:

{
  "availability": [
    {
      "id": "7LDIBZ4-XXXXXXX-XXXXXXX-XXXXXXX-XXXXXXX-XXXXXXX-XXXXXXX-XXXXXXX",
      "fromTemporary": false
    }
  ],
  "global": {
    "deleted": false,
    "ignored": false,
    "invalid": false,
    "localFlags": 0,
    "modified": "2020-11-10T00:21:13.2285041+09:00",
    "modifiedBy": "SWPW3LH",
    "mustRescan": false,
    "name": "xxxxxxx",
    "noPermissions": true,
    "numBlocks": 0,
    "sequence": 361637,
    "size": 128,
    "type": "FILE_INFO_TYPE_DIRECTORY",
    "version": [
      "SWPW3LH:1605597562"
    ]
  },
  "local": {
    "deleted": false,
    "ignored": false,
    "invalid": false,
    "localFlags": 0,
    "modified": "2020-11-10T00:21:13.2285041+09:00",
    "modifiedBy": "SWPW3LH",
    "mustRescan": false,
    "name": "xxxxxxx",
    "noPermissions": true,
    "numBlocks": 0,
    "sequence": 361637,
    "size": 128,
    "type": "FILE_INFO_TYPE_DIRECTORY",
    "version": [
      "SWPW3LH:1605597562"
    ]
  }
}

Device C:

{
  "availability": [
    {
      "id": "D4DZUGP-XXXXXXX-XXXXXXX-XXXXXXX-XXXXXXX-XXXXXXX-XXXXXXX-XXXXXXX",
      "fromTemporary": false
    },
    {
      "id": "7LDIBZ4-XXXXXXX-XXXXXXX-XXXXXXX-XXXXXXX-XXXXXXX-XXXXXXX-XXXXXXX",
      "fromTemporary": false
    }
  ],
  "global": {
    "deleted": false,
    "ignored": false,
    "invalid": false,
    "localFlags": 0,
    "modified": "2020-11-09T15:21:13.2285041Z",
    "modifiedBy": "SWPW3LH",
    "mustRescan": false,
    "name": "xxxxxxx",
    "noPermissions": true,
    "numBlocks": 0,
    "sequence": 108916,
    "size": 128,
    "type": "FILE_INFO_TYPE_DIRECTORY",
    "version": [
      "SWPW3LH:1605597562"
    ]
  },
  "local": {
    "deleted": false,
    "ignored": false,
    "invalid": false,
    "localFlags": 0,
    "modified": "2020-11-09T15:21:13.2285041Z",
    "modifiedBy": "SWPW3LH",
    "mustRescan": false,
    "name": "xxxxxxx",
    "noPermissions": true,
    "numBlocks": 0,
    "sequence": 108916,
    "size": 128,
    "type": "FILE_INFO_TYPE_DIRECTORY",
    "version": [
      "SWPW3LH:1605597562"
    ]
  }
}

Do you have any idea what the culprit may be? I am guessing that I can “fix” the issue with STRECHECKDBEVERY=1s, but I would like to investigate a little bit first before doing so.

Just for the record, the actual local and global states on all three devices are exactly the same.
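
(In case it is useful to anyone else comparing devices, the per-folder counts can be read from the db/status endpoint; a rough sketch, with the API key and folder ID as placeholders:)

curl -s -H "X-API-Key: <api-key>" \
  "http://127.0.0.1:8384/rest/db/status?folder=<folder-id>"
# the globalFiles/globalBytes and localFiles/localBytes fields should match across devices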

v1.12.1-rc.1 may fix such stuck-in-sync problems.


The db/file output looks good, except that A doesn't know that C has the file, i.e. indexes are missing. As Andy said, there have been related fixes in v1.12.1, so I'd fix it now with STRECHECKDBEVERY=1s as proposed, and please report back if it occurs again after upgrading to v1.12.1.
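
(For anyone else following along: STRECHECKDBEVERY is an environment variable; as I understand it, setting it to a tiny value for one startup makes Syncthing run its full database check immediately instead of on the usual schedule. A minimal sketch for a shell startup; on Windows the variable would be set in the environment before launching Syncthing:)

STRECHECKDBEVERY=1s syncthing
# unset the variable again afterwards so the full check does not run on every start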


Thank you both for the tips :slightly_smiling_face:. I will try upgrading to v1.12.1 once the stable release is out and see whether the problem goes away. The files themselves all seem to sync properly despite the error, so this seems to be more of a cosmetic issue right now.

Re-reading the answer, I guess you mean that running the database check is required to fix the errors that have already popped up, and that upgrading to v1.12.1 on its own will not help with the existing ones. Is this correct?

Yes.


I have been trying to fix the problems after upgrading to v1.12.1, but the errors do not seem to go away no matter what…

  1. I tried STRECHECKDBEVERY=1s on all devices first, but the same problematic files were still stuck in sync even after doing so.

  2. Next, I tried -reset-deltas, also on all devices, and this did force them to re-index everything.

  3. However, the problem is still there, although the specific number and size of the stuck files have changed.
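
(For reference, the -reset-deltas in step 2 is just a startup flag; roughly:)

syncthing -reset-deltas
# resets the delta index IDs, so connected devices exchange full indexes again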

I have a feeling that only a complete wipeout of the databases will finally get rid of these errors, although I still have no idea why they popped up in the first place :sweat:.

Here are two examples.

  • 7LDIBZ4 → D4DZUGP

    File “example1” on 7LDIBZ4:

    {
      "availability": [
        {
          "id": "TX7YPBI",
          "fromTemporary": false
        }
      ],
      "global": {
        "deleted": false,
        "ignored": false,
        "invalid": false,
        "localFlags": 0,
        "modified": "2020-09-11T08:54:42.4741678+09:00",
        "modifiedBy": "7LDIBZ4",
        "mustRescan": false,
        "name": "example1",
        "noPermissions": true,
        "numBlocks": 0,
        "sequence": 58406,
        "size": 128,
        "type": "FILE_INFO_TYPE_DIRECTORY",
        "version": [
          "7LDIBZ4:1600015683"
        ]
      },
      "local": {
        "deleted": false,
        "ignored": false,
        "invalid": false,
        "localFlags": 0,
        "modified": "2020-09-11T08:54:42.4741678+09:00",
        "modifiedBy": "7LDIBZ4",
        "mustRescan": false,
        "name": "example1",
        "noPermissions": true,
        "numBlocks": 0,
        "sequence": 58406,
        "size": 128,
        "type": "FILE_INFO_TYPE_DIRECTORY",
        "version": [
          "7LDIBZ4:1600015683"
        ]
      }
    }
    
    

    File “example1” on D4DZUGP:

    {
      "availability": [
        {
          "id": "TX7YPBI",
          "fromTemporary": false
        },
        {
          "id": "7LDIBZ4",
          "fromTemporary": false
        }
      ],
      "global": {
        "deleted": false,
        "ignored": false,
        "invalid": false,
        "localFlags": 0,
        "modified": "2020-09-11T08:54:42.4741678+09:00",
        "modifiedBy": "7LDIBZ4",
        "mustRescan": false,
        "name": "example1",
        "noPermissions": true,
        "numBlocks": 0,
        "sequence": 59163,
        "size": 128,
        "type": "FILE_INFO_TYPE_DIRECTORY",
        "version": [
          "7LDIBZ4:1600015683"
        ]
      },
      "local": {
        "deleted": false,
        "ignored": false,
        "invalid": false,
        "localFlags": 0,
        "modified": "2020-09-11T08:54:42.4741678+09:00",
        "modifiedBy": "7LDIBZ4",
        "mustRescan": false,
        "name": "example1",
        "noPermissions": true,
        "numBlocks": 0,
        "sequence": 59163,
        "size": 128,
        "type": "FILE_INFO_TYPE_DIRECTORY",
        "version": [
          "7LDIBZ4:1600015683"
        ]
      }
    }
    
  • TX7YPBI → D4DZUGP

    File “example2” on TX7YPBI:

    {
      "availability": [
        {
          "id": "7LDIBZ4",
          "fromTemporary": false
        }
      ],
      "global": {
        "deleted": false,
        "ignored": false,
        "invalid": false,
        "localFlags": 0,
        "modified": "2020-11-09T15:21:13.2285041Z",
        "modifiedBy": "SWPW3LH",
        "mustRescan": false,
        "name": "example2",
        "noPermissions": true,
        "numBlocks": 0,
        "sequence": 108916,
        "size": 128,
        "type": "FILE_INFO_TYPE_DIRECTORY",
        "version": [
          "SWPW3LH:1605597562"
        ]
      },
      "local": {
        "deleted": false,
        "ignored": false,
        "invalid": false,
        "localFlags": 0,
        "modified": "2020-11-09T15:21:13.2285041Z",
        "modifiedBy": "SWPW3LH",
        "mustRescan": false,
        "name": "example2",
        "noPermissions": true,
        "numBlocks": 0,
        "sequence": 108916,
        "size": 128,
        "type": "FILE_INFO_TYPE_DIRECTORY",
        "version": [
          "SWPW3LH:1605597562"
        ]
      }
    }
    

    File “example2” on D4DZUGP:

    {
      "availability": [
        {
          "id": "TX7YPBI",
          "fromTemporary": false
        },
        {
          "id": "7LDIBZ4",
          "fromTemporary": false
        }
      ],
      "global": {
        "deleted": false,
        "ignored": false,
        "invalid": false,
        "localFlags": 0,
        "modified": "2020-11-10T00:21:13.2285041+09:00",
        "modifiedBy": "SWPW3LH",
        "mustRescan": false,
        "name": "example2",
        "noPermissions": true,
        "numBlocks": 0,
        "sequence": 361637,
        "size": 128,
        "type": "FILE_INFO_TYPE_DIRECTORY",
        "version": [
          "SWPW3LH:1605597562"
        ]
      },
      "local": {
        "deleted": false,
        "ignored": false,
        "invalid": false,
        "localFlags": 0,
        "modified": "2020-11-10T00:21:13.2285041+09:00",
        "modifiedBy": "SWPW3LH",
        "mustRescan": false,
        "name": "example2",
        "noPermissions": true,
        "numBlocks": 0,
        "sequence": 361637,
        "size": 128,
        "type": "FILE_INFO_TYPE_DIRECTORY",
        "version": [
          "SWPW3LH:1605597562"
        ]
      }
    }
    

Strangely enough, in both cases everything shows up as “Up to Date” on D4DZUGP.

I have tried using -reset-deltas multiple times, and the set of stuck files seems to differ every time the command is run. Also, as before, the actual folders seem to be fully in sync… meaning that the local and global states, etc. all match. It is only Syncthing that seems to be trying to sync the already existing entries.

Is there anything else that I could do to troubleshoot the problem? At the moment, I am a little bit fed up, so I am considering just backing up all the databases, setting everything to RO, and then just doing a full wipeout :wink:.

Which kind of devices do you have?

The three devices in question are a high-end Windows PC (D4DZUGP), a low-end Windows PC (7LDIBZ4), and a low-end Android phone (TX7YPBI). I have more devices connected, but none of them seem to be affected, at least for now.

I have a nasty feeling that the issue may be related to nested folders, but I have not been able to reproduce this behaviour in my test installation yet.

Actually, now that I am thinking about it, this whole thing may have started when I “un-nested” some of my folders. I will need to do some testing to prove it though.

Do you mean that you may have had subfolders that were also connected in Syncthing in the past, or was it only cleaning and optimisation?

I do not think that I understand the question :woozy_face:. What do you mean exactly?

I eventually decided to stop all syncing activity and wipe out the databases one by one. This way I managed to get everything to the “Up to Date” state with no visible errors.

I have also tried very hard to reproduce the issue in my test installation using multiple nested folders, etc., but strangely enough Syncthing always managed to get through it with no errors, so I am really unsure what the problem might have been.

By that I meant a setup where you have a connection between device 1 and device 2 for a directory A that has a subdirectory AB, and then there are also connections such as

Device 1 Directory A <-> Device 2 Directory B   
Device 1 Directory AB <-> Device 2 Directory C

In that case, Directory A and Directory AB would be nested.

Now I see what you meant. What I have been doing though is rather something like this:

Device 1
   Directory A
      Directory B
      Directory C
Device 2
   Directory A
      Directory B
      Directory C
Device 3
   Directory B
   Directory C

I think this is an issue that has been popping up in various forms for the last two to three months. I have mentioned it, and others have too. So far the only cure is to blow away all the index folders and start again, but after a few weeks it comes back.

There’s no obvious cause; take this for example…

[screenshot: the same data as seen under two device IDs on the receive-only end]

It’s physically the same computer, just two user profiles that have created two IDs. This is a send-only computer (the image is from the receive-only end), and both IDs are looking at exactly the same data, yet there’s a difference. Nothing nested, full access, Windows-based installation.

My feeling is that something is upsetting the db: maybe under heavy load the data isn’t being written correctly, or is being missed, and it slowly falls out of sync between the nodes, e.g. Syncthing says the data was written while the db says it wasn’t. This view is based on the observation that when I make changes to the advanced configuration while Syncthing is busy, the changes are not reflected in the config.xml file.

I do not think it is a load problem. In my installations I run around 60 folders, in which log files of a few kB and databases of several GB and more are moved every day, and that works fine. In one case, I transferred a file over the internet over several days, with connection interruptions because the servers were down for a few hours every day. Something like that also works perfectly.

In my experience, it is the changes that the administrator makes that can throw a steady system out of balance.

A few days ago I reported that I switched on a computer in a place that I do not visit very often; it had been switched off for around 3-4 months. Switching on this integrated computer was enough to cause an imbalance without changing anything, and the synchronized files were accordingly out of date. I had also renamed some files in the meantime, and version jumps had also occurred.

However, there may be other circumstances as well. As a rule, I have directories that are exchanged only between my servers, or that the clients (Windows and Linux Mint) additionally exchange with each other, and as a rule I connect every client with every other one. It has happened that I forgot a client in a connection, and adding it later can, in rare cases, also lead to imbalances.

I have now experienced the same issue on two other devices. The same thing happened there, with some folders getting stuck, although in one case the global state of a new folder that was shared to that device also reported fewer files than it should have. Everything did get fixed after -reset-deltas, though.

Apart from the nested folders, my other suspicion is that the issue may have something to do with Syncthing upgrades, and as I compile the binaries myself, I also tend to upgrade them quite often (or at least more often than the official releases).

I have now enabled sendFullIndexOnUpgrade on all my devices. I am going to keep an eye on it to see whether the problem comes back or not. I also want to do more extensive testing, so that I can actually reproduce the issue in a clean environment, but time is limited, so it will probably take a while :sweat:.
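
(For anyone wanting to try the same: sendFullIndexOnUpgrade is an advanced option, reachable via Actions → Advanced in the GUI. If I am not mistaken about the endpoint, it can also be toggled through the config REST API on v1.12+; a sketch with the API key as a placeholder:)

curl -X PATCH -H "X-API-Key: <api-key>" -H "Content-Type: application/json" \
  -d '{"sendFullIndexOnUpgrade": true}' \
  "http://127.0.0.1:8384/rest/config/options"
# assumes the PATCH-able /rest/config/options endpoint from the newer config REST API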

Tomasz, it seems we are both doing and saying the same things; this was my reply to you back in October:

As I mentioned earlier in the thread, clearing the db in any form will fix it, but it’s not a permanent fix as the error comes back. I’m hoping that we can start testing the badger database to see if that resolves it.

As a non-developer I can only make guesses and hope that something clicks with a dev. At the moment my opinion is that either data isn’t being written into the database (e.g. under heavy load), or maybe something gets lost or corrupted during the compaction stage. This is based on a fresh index consisting of lots of small files, which some time later are larger; I wonder if that is the point where the issue creeps in.


Yeah, although that one was supposed to have been fixed in Shutting Syncthing down while pushing files to multiple devices corrupts the database? · Issue #7036 · syncthing/syncthing · GitHub. I have a feeling that these errors are somewhat connected to that, especially since the last two devices had their Syncthing version upgraded in the middle of synchronisation, and right after that some of their files got stuck in sync.

The badger database is dead, as far as I know :wink: (see lib: Remove USE_BADGER experiment (#7089) · syncthing/syncthing@7892547 · GitHub).


I have just encountered the issue again, right after upgrading everything to v1.13.0.

Fortunately, only one device seems to have got stuck this time. Everything had sendFullIndexOnUpgrade enabled, but it did not prevent the problem from happening. I am now quite sure that the problem is related to the upgrade process, as I had no issues whatsoever during the last three weeks when everything was running v1.12.1.

Can you check which (if any) of these steps brings the device (closer to) up to date? Device A is the device you took the screenshot on, device B is the device with the “Syncing” status. Allow a bit of time to pass after each step to see if there’s a change.

  1. On A, pause a folder that is shown as out-of-sync.

  2. Same on B.

  3. On A or B, pause the other device and unpause it.

  4. On A, restart syncthing (without any special options).

  5. Same on B.

  6. On A, restart syncthing with -reset-deltas.

  7. Same on B.
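
(If it helps, steps 3, 6 and 7 can also be done from a terminal; the API key, address and device ID below are placeholders:)

# step 3: pause and then unpause the other device via the REST API
curl -X POST -H "X-API-Key: <api-key>" "http://127.0.0.1:8384/rest/system/pause?device=<device-id>"
curl -X POST -H "X-API-Key: <api-key>" "http://127.0.0.1:8384/rest/system/resume?device=<device-id>"

# steps 6 and 7: restart with the delta indexes reset
syncthing -reset-deltas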


Hi Simon,

Further to our conversation about Android and SD cards…

Well, FWIW, when I gave up on the SD card and instead synced to the internal storage, my syncs too got stuck in the high 90%s.

I started the OpenBSD side with -reset-deltas and the devices immediately re-synced to 100%.

The files that were getting stuck were a mix of directories (size 128 bytes), image files that something (the Android music player?) had put in .thumbnails, and the occasional FLAC file. There were no errors displayed on either side, just that the folders were out of sync.

Hrm…