Out of Sync when nothing changed (linux, macOS)

Do you have failed items in the UI? (This is where screenshots would answer the rest of the qiestions).

So, I let Syncthing run overnight on both device to allow the sync state to stabilize. During this process, there were failed items, but this morning there are none. Of the 17 folders on the iMac, 3 are out of sync and 14 are up to date. All 13 synchronized to the Linux box are up to date.

Although the failures are temporary and I don’t think they should affect final synchronization outcome, I’ll briefly describe what know about them so far. I have observed two reasons for failures: too many open files and generic failure.

Generic failures happen on the Linux box. I don’t know what they mean and have not had a chance to explore them.

Too many open files happens on the iMac. It happens most frequently when Syncthing has a lot of work in progress. When failures are occurring, lsof shows just 250-260 files open for Syncthing and about 17,000 open system wide. Since ulimit -n shows 4864 open files per process and kern.maxfiles is 196,608, too many open files does not yet make since to me–but my knowledge in this area limited and mostly theoretical.

Note: Below I use the terms source and destination. I realize that since both folders are send-receive, that these terms are technically incorrect. I chose these terms because the destination started empty and no changes were made during synchronization. Thus, the end effect should be the same as performing a simple copy from the source to the destination.

THE SOURCE iMac

Figure 1 shows the source folder on the iMac. There is a wide variety of files types in the folder (although this should not matter). Figure 2 shows the four out of sync files. Note the difference between the global and local states. There are 11,502 files that should be synchronized. Leaving out the 4 that are out of sync yields 11,498, which is the local state. There are also 4 ignored files (all macOS .DS_Store files); but, these should not be included in the counts.

I cannot account for the difference between the local and global state counts. find folder -type f | wc -l confirms there are exactly 11,506 files in the source folder as expected (11,498 + 4 ignored + 4 out of sync). I don’t yet know if this difference is relevant. However, I have two other out of sync folders on the iMac; and, the counts are different on those too.

Figure 1. Source Folder 1%20Source%20Folder%20-%20annotated

Figure 2. Out of Sync Files 2%20Out%20of%20Sync%20items


THE DESTINATION Linux Box

Figure 3 shows the the destination folder on the Linux box. This folder started empty and is up to date according to Syncthing. All 11,498 files from the iMac’s local state have been synchronized.

Note: Watch is disabled in the Linux box because all changes will on other devices; thus, watch does not seem necessary.

Figure 3. Destination Folder 4%20Destination%20Folder%20-%20annotated


LOOKING FOR AN EXPLAINATION

Previously I said that I did not believe this was a bug. For now, I still believe this. However, it is not resolved by running rescan on either or both devices.


APPENDIX: DEVICE INFORMATION

The device information for iMac and Linux box are shown in figures 4 and 5 respectively.

Figure 4. Source Device 3%20Source%20Device%20-%20annotated

Figure 5. Destination Device 5%20Destination%20Device%20-%20annotated


Any insight will be greatly appreciated.

Thanks

I suspect it is a bug, maybe related to the ignore patterns? Either the four named files are part of the 11498 files that are common to both machines, and they should sync, or they are part of the 18 ignored files and should be ignored. If they are part of the common files, maybe touch them on the Linux side to kick things loose…

The .stignore on the iMac is below. On the Linux box, .stignore is empty. Nothing in the .stignore should ignore the 4 out of sync items (and if something did, hundreds of other items should also be ignored). Plus, all four out of sync files shown in Figure 2 above are on both devices and their MD5s identical (sorry I didn’t check this before).

So, maybe it is a bug. Anything else I can test?

.stignore

#include .stglobalignore

.stglobalignore (this is a symlink)

# Various temporary files
*~
~$*
.~lock.*
~*.tmp
*.~*
._*
.*.sw?
*.part
*.filepart
*.crdownload
*.kate-swp
*.gnucash.tmp-*
*.unison
.nfs*

# Stuff macOS keeps locked
*.photoslibrary

# Git folders and files
(?d).git
(?d).gitignore

# macOS system folders and files
/.DocumentRevisions-V100
/.Spotlight-V100
/.TemporaryItems
/.Trashes
/.Trash-*
/.apdisk
/.directory
/.fseventsd
(?d).DS_Store
(?d).ds_store
(?d)Icon\r*

# Windows system folders and files
/System Volume Information
(?d)Thumbs.db

# Other system folders and files
/.fuse_hidden*

# Synchronization folders and files
/.csync_fournal.db*
/.sync
/.sync-exclude.lst
/._sync_*
/.synckron.*

Thanks

If you can, ideally use the rest api to see what Syncthing thinks differs:

https://docs.syncthing.net/rest/db-file-get.html

Okay. I touched all 4 out of sync items on the Linux box and kicked off a rescan. The folder changed to up to date on the iMac and the item counts remained the same. So I tried another experiment.

Since I didn’t really need the folder containing the 4 files that were out of sync, I deleted it on the iMac. The local and global state counts dropped to 7,631 and 7,649 respectively on the iMac. On the Linux box, both dropped to 7,631. Again, running find ... | wc -l confirms 7,631 files on both devices.

At this time, short of running a checksum on each file, I believe the folders are identical on both devices. The local state counts on both devices are the same and match actual number of file system objects (see table below). Only the global state count on the iMac seems amiss.

type     iMac   Linux  Command
-------- -----  -----  ----------------------------------------
files    7,631  7,631  find * -type f ! -name .DS_Store | wc -l
symlinks    20     20  find * -type d | wc -l
folders  1,210  1,200  find * -type l | wc -l
-------- -----  -----  ----------------------------------------
Total    8,861  8,861  find * | wc -l

Thanks

On the iMac, I ran and db-file-get as suggested on all items returned by db-browse-get; and, then again on all listed by find (the lists were identical except for ignored items). The db-file-get results matched on all attributes (except for sequence) on all 7,657 items. However, in the middle of all this, the folder finally synced up. I suspect the addition of scripts and temporary files caused re-syncs and one of them finally did the trick.

So, I moved on to a different out of sync folder. This time I was careful not to add any new files that might cause a re-sync. Running db-need-get returned the same 21 items shown in the UI. Running db-file-get on all items returned by db-browse-get found no attribute differences just like above. Example db-need-get and db-file-get results follow.

db-need-get

{
  "page": 1,
  "perpage": 65536,
  "progress": [],
  "queued": [],
  "rest": [
    {
      "deleted": false,
      "ignored": false,
      "invalid": false,
      "localFlags": 0,
      "modified": "2016-11-24T10:33:26-05:00",
      "modifiedBy": "XXXXXXX",
      "mustRescan": false,
      "name": "CARLY SIMON/01 That's the Way I've Always Heard It Should Be.aiff",
      "noPermissions": false,
      "numBlocks": null,
      "permissions": "0777",
      "sequence": 12899,
      "size": 99553168,
      "type": "FILE",
      "version": [
        "XXXXXXX:1"
      ]
    },
    . . . 20 other items
}

db-file-get

{
  "availability": [
    {
      "id": "REMVS26-LZ3DKR5-GABOS7J-SZVI5UD-VQNP7AQ-PQWON7O-LVM2FLF-7OGDXQO",
      "fromTemporary": false
    }
  ],
  "global": {
    "deleted": false,
    "ignored": false,
    "invalid": false,
    "localFlags": 0,
    "modified": "2016-11-24T10:33:26-05:00",
    "modifiedBy": "XXXXXXX",
    "mustRescan": false,
    "name": "CARLY SIMON/01 That's the Way I've Always Heard It Should Be.aiff",
    "noPermissions": false,
    "numBlocks": 760,
    "permissions": "0777",
    "sequence": 12899,
    "size": 99553168,
    "type": 0,
    "version": [
      "XXXXXXX:1"
    ]
  },
  "local": {
    "deleted": false,
    "ignored": false,
    "invalid": false,
    "localFlags": 0,
    "modified": "2016-11-24T10:33:26-05:00",
    "modifiedBy": "XXXXXXX",
    "mustRescan": false,
    "name": "CARLY SIMON/01 That's the Way I've Always Heard It Should Be.aiff",
    "noPermissions": false,
    "numBlocks": 760,
    "permissions": "0777",
    "sequence": 18155,
    "size": 99553168,
    "type": 0,
    "version": [
      "XXXXXXX:1"
    ]
  }
}

Thanks.

You redacted the only useful fields. Unnecessarily.

Apologies. Redacting ids is my default unless I clearly understand their security implications. They were all the same: 3OYVDAZ.

I found that 1.3.0 was released shortly after I installed 1.2.2. After upgrading both the iMac and Linux, Out of Sync remained on the iMac. Following up on your touch suggestion, I then moved each of files the out of sync file out of the sync folder tree, waited for rescan to finished, then moved the files back. Now all is in sync.

I propose an experiment. I’ll delete the sync folder from both devices and recreate it using the same steps as before (create on iMac, let scan finish, add Linux to sync). During this, I’ll check any data structures and API calls you think might shed light on the issue.

Thanks

Right. Then the files (“local” and supposed “need”) are in fact identical and the indication that it needs them is a bug. I think doing something to the file, like updating the mtime on any device, should fix it.

To see how it ends up needed while it shouldn’t, you could enable model,db debug logging during the entire process (this will log all the db operations, model for context). Just do that to a separate log file, definitely not system log or any such, as it will become large. Then either grep for the filenames in question or just make the whole thing available so we can.

Okay. Running with model,db now.

I ran the experiment I described above using v1.3.0. The Syncthing folder were delete on both the iMac and Linux box. The filesystem folder was also deleted on the Linux box. Then,

  1. Recreated the Syncthing folder on the iMac.
  2. Waited for initial scan to complete.
  3. Shared the folder with the Linux box.
  4. Added the folder on the Linux box with Watch for Changes off, Full Rescan Interval 1 day, File Pull Order oldest first.

All worked as expected. No sign of the bug. And, no failures failures during step 2 either.

I am currently running it again with model,db logging. If it succeeds, I am thinking off forcing a rescan of all 15 other Syncthing folder while the initial scan is running to force Too Many Open Files errors, which I think might be related. (Really, its my only current good hypothesis.) Other suggestions welcome.

Thanks, Scott

So I complete the test in my last message and everything work. So, I completely removed my 8 archive folders and began again:

  1. Removed the Syncthing folders from both the iMac and Linux box.
  2. Removed the filesystem folders on the Linux box.
  3. Added model,db debug logging as suggested by @imsodin.
  4. Recreated the 8 Syncthing folders on the iMac (the source) without waiting for scans to complete. This is 2.1TB of data across 47,400 files in 4,810 folders.
  5. Added the 8 folders on the Linux box (the destination).
  6. Waited for synchronization to complete overnight.

This time there is a single file “stuck” out of sync in Music Archive. The other 7 folders complete successfully. Running db-need-get yields the JSON in Figure 1 (note: long, common path prefix elided from figure and logs):

Figure 1. db-need-get results

{
  "page": 1,
  "perpage": 65536,
  "progress": [],
  "queued": [],
  "rest": [
    {
      "deleted": false,
      "ignored": false,
      "invalid": false,
      "localFlags": 0,
      "modified": "2009-11-05T12:11:20-05:00",
      "modifiedBy": "3OYVDAZ",
      "mustRescan": false,
      "name": ".../PACM067/(05) [Trio Santoliquido, Bruno Giuranna] Piano Quartet 2- 1st mvt. - Allegro non troppo.flac",
      "noPermissions": false,
      "numBlocks": null,
      "permissions": "0755",
      "sequence": 17129,
      "size": 119997437,
      "type": "FILE",
      "version": [
        "3OYVDAZ:1"
      ]
    }
  ]
}

Grepping the logs for “(05) [Trio Santoliquido, Bruno Giuranna] Piano Quartet 2- 1st mvt. - Allegro non troppo.flac” yielded 3,175 lines. Of these, 1,637 are for the out-of-sync file; and, 1,538 are for two other files with the same name. I include these other 2 because they were synced in the same time frame as the out-of-sync file.

The only clear anomalies (to me at least) occur at 2019/10/17 16:16:58. There are six log entries caused by too many open files. The Linux box log what appears to be a corresponding generic error. (The iMac and Linux system clocks are within a few milliseconds.) At about this same time, the Out of Sync status appears in the iMac’s UI.

Three log extracts are attached:

  • Syncthing-iMac-BadFile.log (654.6 KB) contains the log entries for the out-of-sync file.
  • Syncthing-iMac-Others.log (808.4 KB) contains the log entries for the other 2 files with the same name.
  • Syncthing-Linux.log (215 Bytes) contains the single Linux log entry that references the file name; however, it is one of the other 2 files, not the out-of-sync file.

There were no error log entries on the iMac for the entire synchronization period.

I’m out of ideas for now. Suggestions welcome.

Thanks, Scott

The file is twice inserted to the db, once when it is detected locally and once when a remote gets it. What’s wrong is that upon adding the file for the remote, it “forgot” about the local file in the global list and thus marks it as needed.

[3OYVD] 2019/10/17 16:03:36.855113 instance.go:100: DEBUG: insert (local); folder="cjpsl-dmvvp" File{Name:".../PACM067/(05) [Trio Santoliquido, Bruno Giuranna] Piano Quartet 2- 1st mvt. - Allegro non troppo.flac", Sequence:17211, Permissions:0755, ModTime:2009-11-05 12:11:20 -0500 EST, Version:{[{3OYVDAZ 1}]}, Length:119997437, Deleted:false, Invalid:false, LocalFlags:0x0, NoPermissions:false, BlockSize:131072, }
[3OYVD] 2019/10/17 16:03:36.855672 transactions.go:105: DEBUG: update global; folder="cjpsl-dmvvp" device=7777777-777777N-7777777-777777N-7777777-777777N-7777777-77777Q4 file=".../PACM067/(05) [Trio Santoliquido, Bruno Giuranna] Piano Quartet 2- 1st mvt. - Allegro non troppo.flac" version={[{3OYVDAZ 1}]} invalid=false
[3OYVD] 2019/10/17 16:03:36.855751 transactions.go:162: DEBUG: new global for ".../PACM067/(05) [Trio Santoliquido, Bruno Giuranna] Piano Quartet 2- 1st mvt. - Allegro non troppo.flac" after update: {{{[{3OYVDAZ 1}]}, 7777777-777777N-7777777-777777N-7777777-777777N-7777777-77777Q4}}
[3OYVD] 2019/10/17 16:03:36.855881 instance.go:108: DEBUG: adding sequence; folder="cjpsl-dmvvp" sequence=17211 .../PACM067/(05) [Trio Santoliquido, Bruno Giuranna] Piano Quartet 2- 1st mvt. - Allegro non troppo.flac
[3OYVD] 2019/10/17 16:17:24.486023 instance.go:53: DEBUG: insert; folder="cjpsl-dmvvp" device=REMVS26-LZ3DKR5-GABOS7J-SZVI5UD-VQNP7AQ-PQWON7O-LVM2FLF-7OGDXQO File{Name:".../PACM067/(05) [Trio Santoliquido, Bruno Giuranna] Piano Quartet 2- 1st mvt. - Allegro non troppo.flac", Sequence:17129, Permissions:0755, ModTime:2009-11-05 12:11:20 -0500 EST, Version:{[{3OYVDAZ 1}]}, Length:119997437, Deleted:false, Invalid:false, LocalFlags:0x0, NoPermissions:false, BlockSize:131072, }
[3OYVD] 2019/10/17 16:17:24.486216 transactions.go:105: DEBUG: update global; folder="cjpsl-dmvvp" device=REMVS26-LZ3DKR5-GABOS7J-SZVI5UD-VQNP7AQ-PQWON7O-LVM2FLF-7OGDXQO file=".../PACM067/(05) [Trio Santoliquido, Bruno Giuranna] Piano Quartet 2- 1st mvt. - Allegro non troppo.flac" version={[{3OYVDAZ 1}]} invalid=false
[3OYVD] 2019/10/17 16:17:24.486264 transactions.go:176: DEBUG: local need insert; folder="cjpsl-dmvvp", name=".../PACM067/(05) [Trio Santoliquido, Bruno Giuranna] Piano Quartet 2- 1st mvt. - Allegro non troppo.flac"
[3OYVD] 2019/10/17 16:17:24.486334 transactions.go:162: DEBUG: new global for ".../PACM067/(05) [Trio Santoliquido, Bruno Giuranna] Piano Quartet 2- 1st mvt. - Allegro non troppo.flac" after update: {{{[{3OYVDAZ 1}]}, REMVS26-LZ3DKR5-GABOS7J-SZVI5UD-VQNP7AQ-PQWON7O-LVM2FLF-7OGDXQO}}

Then later on it tries again and again fruitlessly to get back in sync, without throwing an error:

[3OYVD] 2019/10/17 16:51:14.730160 folder_sendrecv.go:1168: DEBUG: sendreceive/cjpsl-dmvvp@0xc005f18000 taking shortcut on .../PACM067/(05) [Trio Santoliquido, Bruno Giuranna] Piano Quartet 2- 1st mvt. - Allegro non troppo.flac

The too many open files falls square into the middle of the two. So we’d need to know what exactly went on there and generally have more context of what it’s doing. Ideally you could post the entire log. Or at least include all info level logs and above (that should work: grep -P '(INFO|WARNING|PACM067/(05) [Trio Santoliquido, Bruno Giuranna] Piano Quartet 2- 1st mvt. - Allegro non troppo.flac)).

The log files I uploaded include every entry about the file. Logging was set to model,db as you suggested above. I’ll be glad to testagain with any additional logging as necessary. Just let me know what is best.

Thanks, Scott

I don’t need new logs, I need the complete picture of the existing ones. Only the entries containing that filename do not give that. That’s why I proposed to share all logs or to get rid of all the debug logging about other files, all the info/warning level logs in addition to the debug level logging about the specific file you already provided.

Got it (so much for reading regex on a small screen). INFO_WARNING_file.log.bz2 (217.5 KB) contains the results from the grep.

The entire log is about 21M lines and nearly 505MB xz compressed. Even the 1 hour from 16:00-17:00 is 2.5M lines and 64MB compressed with xz. I doubt the forum will allow these to be uploaded. I’ll be glad to FTP drop them or something though.

Thanks, Scott

So, everything seems to be working fine for the past 2 week. However, I have not created any new large sync folders. Were the logs helpful at at all?

Thanks, Scott

They don’t show what exactly happened, but you do have tons of too many open file errors all over the log. I assume that a write operation on the db failed due to that without producing an error, resulting in the inconsistent state I described above. So clearly we (or our db layer) could be more secure (or at least verbose) about such a problem (if my theory is correct), but the main problem is the file handle limitation by the system.

This topic was automatically closed 30 days after the last reply. New replies are no longer allowed.