Sync process times increased

Hello everyone, I have observed some interesting behavior since running v1.10 on my two biggest shared folders. The scanning times have increased a lot: it goes very slowly through the file system and takes a couple of days to complete (both the initial scan and subsequent scans). Log sample for shared folder tmc-vc:

Oct 20 06:20:22 syncthingsrv1 syncthing_tmc[6770]: [WWQT1F] 2020/10/20 06:20:22.451065 walk.go:177: DEBUG: Walk tmc-vc [] current progress 9698748137/12424138566 at 0.0 MiB/s (78%)
Oct 20 06:20:24 syncthingsrv1 syncthing_tmc[6770]: [WWQT1F] 2020/10/20 06:20:24.450609 walk.go:177: DEBUG: Walk tmc-vc [] current progress 9698748137/12424138566 at 0.0 MiB/s (78%)
Oct 20 06:20:26 syncthingsrv1 syncthing_tmc[6770]: [WWQT1F] 2020/10/20 06:20:26.450950 walk.go:177: DEBUG: Walk tmc-vc [] current progress 9698748137/12424138566 at 0.0 MiB/s (78%)
Oct 20 06:20:28 syncthingsrv1 syncthing_tmc[6770]: [WWQT1F] 2020/10/20 06:20:28.451001 walk.go:177: DEBUG: Walk tmc-vc [] current progress 9698748137/12424138566 at 0.0 MiB/s (78%)
Oct 20 06:20:30 syncthingsrv1 syncthing_tmc[6770]: [WWQT1F] 2020/10/20 06:20:30.451323 walk.go:177: DEBUG: Walk tmc-vc [] current progress 9698748137/12424138566 at 0.0 MiB/s (78%)
Oct 20 06:20:32 syncthingsrv1 syncthing_tmc[6770]: [WWQT1F] 2020/10/20 06:20:32.451287 walk.go:177: DEBUG: Walk tmc-vc [] current progress 9698750223/12424138566 at 0.0 MiB/s (78%)
Oct 20 06:20:34 syncthingsrv1 syncthing_tmc[6770]: [WWQT1F] 2020/10/20 06:20:34.451049 walk.go:177: DEBUG: Walk tmc-vc [] current progress 9698750223/12424138566 at 0.0 MiB/s (78%)

# from 06:20:32 until 06:21:06 it seems to be stuck, then

Oct 20 06:21:06 syncthingsrv1 syncthing_tmc[6770]: [WWQT1F] 2020/10/20 06:21:06.450766 walk.go:177: DEBUG: Walk tmc-vc [] current progress 9698751266/12424138566 at 0.0 MiB/s (78%)
Oct 20 06:21:08 syncthingsrv1 syncthing_tmc[6770]: [WWQT1F] 2020/10/20 06:21:08.451085 walk.go:177: DEBUG: Walk tmc-vc [] current progress 9698751266/12424138566 at 0.0 MiB/s (78%)
Oct 20 06:21:10 syncthingsrv1 syncthing_tmc[6770]: [WWQT1F] 2020/10/20 06:21:10.450562 walk.go:177: DEBUG: Walk tmc-vc [] current progress 9698751266/12424138566 at 0.0 MiB/s (78%)

# then it is stuck again for nearly 30s, and so on

From the web interface, where the “Scan Time Remaining” is jumping between 12 hours and > 1 month:

Folder ID: tmc-vc
Folder Path: /data/shared/var/
Global State: 754,936 files, 4,783 directories, ~131 GiB
Local State: 755,683 files, 4,782 directories, ~131 GiB
Reduced by ignore patterns
Out of Sync Items: 1,231 items, ~65.3 MiB
Scan Time Remaining: ~ 3d 20h
Failed Items: 119 items

Another server with exactly the same software setup is working without any issues. The only difference is that the working one is running Syncthing version 1.5, with 20G of RAM. On that version the servers were able to scan the full shared folder within 2-4 hours. In one forum post there was some talk about too little RAM, so I increased it on the affected hosts to 32G.

Server specs

  • OS: Debian 10
  • CPU Cores: 8
  • Mem: 32G
  • FS: XFS on top of DRBD; read benchmark: IOPS=55.0k, BW=215MiB/s (225MB/s) (3118MiB/14509msec)
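
The read figures quoted above look like fio output; a job file roughly along these lines could reproduce such a measurement (every parameter here is an illustrative assumption, not the job actually used — 55k IOPS at a 4k block size works out to the quoted ~225MB/s):

```ini
; hypothetical fio job -- path, block size and queue depth are guesses
[randread-test]
ioengine=libaio
rw=randread
bs=4k
iodepth=32
size=3g
directory=/data/shared
```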

Resources used

Syncthing uses 25.3% of the RAM, has an overall CPU usage between 480% and 620%, and runs at nice level 11 (the default) with priority 31.

Downgrade test

I performed a downgrade to test whether it recovers, and it did: back on version 1.5 it performs the scans way faster; after upgrading to 1.10 again, it slows down.

Question

What can I do to improve the scanning time again?

As you mention 1.10 and downgrading to 1.5, I assume you are running Debian packages? Or more to the point, you haven’t been running 1.9, right?

Could you grab a CPU profile please: https://docs.syncthing.net/users/profiling.html
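
Per those profiling docs, a CPU profile can be enabled by setting the STCPUPROFILE environment variable before starting Syncthing; for a systemd-managed service a drop-in like the following should work (the unit and file names below are examples to adapt to your own service):

```ini
# /etc/systemd/system/syncthing_tmc.service.d/profiling.conf  -- example path
[Service]
Environment=STCPUPROFILE=1
```

After `systemctl daemon-reload` and a service restart, Syncthing should write a syncthing-cpu-*.pprof file when it shuts down.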

Yes, that is right, we are using the Debian packages. I was on 1.9 as well and had similar issues there, but had to downgrade again very quickly, so there was no time for testing/analyzing.

Will provide the profile in a moment.

syncthing-cpu-linux-amd64-v1.10.0-082416.pprof (68.1 KB)

No need to test any more then; my suspected culprit was added in v1.9.0, which matches your experience. And the CPU profile corroborates it: the added case-insensitivity safety mechanism uses almost all of the CPU time. Compared to another report about a slowdown, it isn’t filesystem access here but looping/in-memory operations that use the time.
Anyway, the same config change to fix it for you, and the same info that might help us improve it for everyone, apply as in the other thread:

Of note in the profile: operations through db walking and osutil.IsDeleted result in 46s spent in “case code” out of 88s total, while for filesystem walking the “case code” isn’t even listed as consuming any time at all. Walking the filesystem we process all items along the filesystem tree and then alphabetically; walking the db it’s directly alphabetical. I don’t understand how that would cause the discrepancy though. Needs some more thought.

Thanks for the fast first analysis. So I guess you mean I should just set caseSensitiveFS=true on the Syncthing services, right? Do you also want me to provide some numbers about files and structure, as you asked in the linked post?

Sorry for not having been clear: Yes, to make your setup as fast as before, set caseSensitiveFS to true. And yes, to potentially gain insight into why you experience such a strong slowdown (that’s unexpected) and thus find something to optimize, please also provide this info on files and structure.
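
For reference, in config.xml this is a per-folder element; with the folder shown earlier it would look something like the fragment below (the folder element’s other attributes and children are omitted here). The same option should also be reachable per folder through the GUI’s advanced configuration editor.

```xml
<folder id="tmc-vc" path="/data/shared/var/">
    <!-- other folder attributes/children omitted -->
    <caseSensitiveFS>true</caseSensitiveFS>
</folder>
```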

Sync folders that are working fine

foldername: cnfg

  • total subdirs: 2
  • total files: 0

foldername: kfl

  • total subdirs: 2
  • total files: 128

foldername: scrm

  • total subdirs: 2
  • total files: 0

foldername: bafls

  • total subdirs: 388231
  • total files: 197994
  • size: ~28,6G

foldername: bpr

  • total subdirs: 1757
  • total files: 300

foldername: bll

  • total subdirs: 8
  • total files: 120

foldername: blklst

  • total subdirs: 0
  • total files: 0

foldername: cbs

  • total subdirs: 2
  • total files: 0

foldername: cnn

  • total subdirs: 4
  • total files: 4

foldername: eis

  • total subdirs: 990
  • total files: 3613092
  • size: ~16,7G

foldername: rrts

  • total subdirs: 3
  • total files: 1

foldername: iptc

  • total subdirs: 2
  • total files: 1

foldername: monit

  • total subdirs: 4
  • total files: 1

foldername: mos

  • total subdirs: 351
  • total files: 2354

foldername: rik

  • total subdirs: 3
  • total files: 6

foldername: sa

  • total subdirs: 3
  • total files: 17

foldername: stt

  • total subdirs: 17
  • total files: 2349

foldername: tdo

  • total subdirs: 10
  • total files: 1068

Sync folders that are causing issues

foldername: bchan

  • child dirs: 16
  • total subdirs: 83
  • total files: 808661
  • size: ~9,5G
  • list of biggest sized/amount dirs:
    1. dwnld
    • subdirs: 6
    • files: 220390
    • size: 6,3G
    2. abt
    • subdirs: 16
    • files: 304069
    • size: 3,2G
    3. pocssd
    • subdirs: 1
    • files: 6947
    • size: 1,9G
    4. coor
    • subdirs: 1
    • files: 27310
    • size: 109,2M
    5. naaed
    • subdirs: 1
    • files: 123015
    • size: 490M
    6. nzed
    • subdirs: 1
    • files: 27742
    • size: 110M
    7. nons
    • subdirs: 1
    • files: 98598
    • size: 393,7M

foldername: cot

  • child dirs: 550
  • total subdirs: 4784
  • total files: 750307
  • size: ~133G
  • list of biggest sized/amount dirs:
    1. ct1
    • subdirs: 10
    • files: 53452
    • size: 66G
    2. ct2
    • subdirs: 10
    • files: 43953
    • size: 11,6G
    3. ct3
    • subdirs: 11
    • files: 19311
    • size: 7,4G
    4. ct4
    • subdirs: 10
    • files: 3012
    • size: 6,9G

The rest of the subdirs are smaller than 6,5G and do not contain that many files.

foldername: rcit

  • total subdirs: 166
  • total files: 490975
  • size: ~92,7G
  • list of biggest sized/amount dirs:
    1. out
    • subdirs: 143
    • files: 153174
    • size: 74,2G
    2. in
    • subdirs: 19
    • files: 337810
    • size: 19,8G
    3. tmp
    • subdirs: 2
    • files: 0

I applied the config setting for each folder around 4 hours ago and restarted the Syncthing services:

<caseSensitiveFS>true</caseSensitiveFS>

and the scanning time is still slow

For folder cot:

  • Scan Time Remaining: ~ > 1 month
  • scan progress so far: 55%

For folder bchan:

  • Scan Time Remaining: ~ 8d 9h
  • scan progress so far: 2%

It is always jumping between a couple of hours and 1 month.

Any idea what I can still improve there? Shall I recreate the CPU profile?

many thanks

Thanks for the info! That’s consistent with the CPU profile: you have a huge number of files per directory, and for every item in the directory we need to compare its name with all of the others, which just takes a lot of time.
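
This is not Syncthing’s actual code, but a toy sketch of why that per-item comparison hurts in huge flat directories: the work grows quadratically with the number of siblings (the function name and logic here are illustrative only):

```python
def case_collisions(names):
    """Naive O(n^2) scan: compare every name against all of its
    siblings, counting pairs that differ only by case -- roughly the
    shape of work a per-directory case-insensitivity check implies."""
    collisions = 0
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if a != b and a.lower() == b.lower():
                collisions += 1
    return collisions

# For a directory like "dwnld" above with 220390 files, this approach
# implies on the order of 220390**2 / 2, i.e. ~24 billion comparisons.
print(case_collisions(["Readme", "readme", "data.txt"]))  # → 1
```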

The CPU profile would be good to see. Even if it might not contain viable info to improve the current situation, it can at least show whether the case setting took effect or something went wrong there.

There we go: syncthing-cpu-linux-amd64-v1.10.0-115359.pprof (77.9 KB)

The caseSensitiveFS option is working: now most time is spent in the db and matching ignores. The total CPU time used in the 30s profiling interval is virtually the same as before. Looks like that wasn’t the culprit in the first place.

The usual suspect for the bottleneck would be storage; however, that should affect 1.5 and 1.10 equally. Still, just to know: Is there significant seeking/iowait? Is the db on the same storage as the folders/data? Does changing maxFolderConcurrency to 1 make any difference?
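
maxFolderConcurrency is a global option rather than a per-folder one; in config.xml it lives under the options element (fragment below, other options omitted), and should also be editable via the GUI’s advanced configuration:

```xml
<options>
    <!-- other options omitted -->
    <maxFolderConcurrency>1</maxFolderConcurrency>
</options>
```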

That was fast, thanks. The iowait is always around 0.13 according to iostat -x:

%user   %nice %system %iowait  %steal   %idle
 2.06   31.87    4.99    0.13    0.00   60.95

Yes, the db is on the same disk, to make switchovers between servers easier, as the disk is a DRBD device.

I will try maxFolderConcurrency and see what it changes.

This topic was automatically closed 30 days after the last reply. New replies are no longer allowed.