Very slow scanning (50kbps) on Ubuntu 18.04 server VM

Hi all,

I have been using Syncthing for offsite and local backups of an Ubuntu web server that deals with a lot of files (attachments).

All the nodes are Ubuntu 18.04 VMs on a Proxmox hypervisor. The folder I am syncing is send-only on the host and receive-only on the destinations.

It’s about 400 GB of data, about 1.1 million files in one folder. All three hosts have been running fine over the last few years. A full scan on the slowest host takes about 11 hours, running at 5 MB/s on spinning disks (files and DB).

I recently set up a new VM for the web server with a fresh Syncthing install. I removed the folders from all the nodes (leaving the data in place) and set up a new folder as before.

The destination nodes rescanned the data overnight, but the host has not even completed 1%. It is scanning at between 50 kbps and 100 kbps. The files are on a four-drive RAID 10 array of mechanical disks (sdb), and the DB folder is on a four-disk RAID 10 SSD array (sda).

Syncthing seems to be mostly stuck on one thread running at 100% (the VM has 10 threads).

top - 13:40:38 up 16:37,  1 user,  load average: 1.26, 1.40, 1.45
Tasks: 211 total,   1 running, 117 sleeping,   0 stopped,   0 zombie
%Cpu(s):  2.7 us,  3.1 sy, 11.4 ni, 82.1 id,  0.3 wa,  0.0 hi,  0.1 si,  0.3 st
KiB Mem : 25661780 total,  5813364 free, 10338184 used,  9510232 buff/cache
KiB Swap:  8388604 total,  8388604 free,        0 used. 14903748 avail Mem

PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
20233 workpool  31  11 4209228 2.563g  16852 S 142.9 10.5   2:08.32 syncthing
1503 workpool  20   0 5515768 2.361g  36372 S  31.9  9.6 140:47.11 java
1427 www-data  20   0  245420  18692   9396 S   2.7  0.1   3:48.84 nginx
1303 mysql     20   0 23.422g 4.539g  35544 S   2.3 18.5  18:12.81 mysqld

Disk idle times are very high:

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           1.03    9.29    3.87    0.31    0.27   85.23

Device             tps    MB_read/s    MB_wrtn/s    MB_read    MB_wrtn
loop0             0.21         0.00         0.00         12          0
loop1             0.00         0.00         0.00          0          0
sdb               2.23         0.07         0.01       4287        418
sda              16.48         0.06         0.68       3443      40541

I played around with various tuning options to no effect. Eventually, I nuked the config and reinstalled Syncthing with defaults. Now it’s not even connected to the other nodes; I only added the folder for scanning, and it’s still not getting anywhere.

The log file is not showing any errors.

Syncthing version is v1.18.3, Linux (64-bit Intel/AMD); previously I was using v1.7.1.

Am I possibly looking at a bug, or is there something else I should look at?

What does it report hashing speed as, on startup?

2021-10-29 14:34:14 Single thread SHA256 performance is 397 MB/s using crypto/sha256 (396 MB/s using minio/sha256-simd).
2021-10-29 14:34:15 Hashing performance is 328.84 MB/s
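Those numbers look healthy. For a rough cross-check of single-thread SHA-256 throughput outside Syncthing, something like this works (a sketch; the 64 MB size and /dev/zero input are arbitrary test choices, not anything Syncthing uses):

```shell
# Time hashing 64 MB of zeros in a single thread; the implied MB/s can be
# compared against the figure Syncthing logs at startup.
time dd if=/dev/zero bs=1M count=64 status=none | sha256sum
```

If that lands in the hundreds of MB/s, as Syncthing's own startup numbers do here, hashing is clearly not the bottleneck.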

I guess I would try this in isolation: a single folder with a single large file, and see what that looks like. Perhaps the RAID doesn’t like concurrent file access.
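A minimal version of that isolation test could look like this (the path and the 64 MB size are placeholders; in practice you would use a much larger file):

```shell
# Build a throwaway folder containing a single large file
mkdir -p /tmp/st-scan-test
dd if=/dev/urandom of=/tmp/st-scan-test/bigfile bs=1M count=64 status=none
# Baseline: sequential read speed of the same file outside Syncthing
dd if=/tmp/st-scan-test/bigfile of=/dev/null bs=1M
```

Then add /tmp/st-scan-test as a new folder in Syncthing and compare its scan rate against the raw sequential read speed.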

So Syncthing is running within the VM. Are those iostats from the hypervisor? How are the VMs accessing the storage? Which type of controller is configured within Proxmox?

The host<->guest interaction can be quite tricky as it adds another layer of possible bottlenecks.

I have done some additional testing, I don’t think the disk configuration has any influence on this problem.

  1. The previous VM, with the same data, is running on the same hardware and hypervisor. The only difference I can see is that the Syncthing version is much newer.

  2. I mounted the DB folder onto various disk configurations to test; no change in performance in any of the tests.

  • I even tried a single NVMe disk with no RAID, with iothread=1 or 0 and SSD emulation on and off.
  3. Iostat on the VM shows very little disk activity related to Syncthing while it is running at 100% CPU. sdb holds the data I am scanning; sdd contains the config folder with the DB.
Device             tps    kB_read/s    kB_wrtn/s    kB_read    kB_wrtn
loop0             0.00         0.00         0.00          0          0
loop1             0.00         0.00         0.00          0          0
sda             107.10       416.80       130.80       4168       1308
sdb               0.00         0.00         0.00          0          0
sdd               0.10         1.60         0.00         16          0

top - 10:31:25 up 4 days,  2:47,  1 user,  load average: 2.40, 1.79, 1.50
Tasks: 216 total,   1 running, 121 sleeping,   0 stopped,   0 zombie
%Cpu(s):  0.7 us,  3.3 sy,  7.1 ni, 88.5 id,  0.2 wa,  0.0 hi,  0.0 si,  0.2 st
KiB Mem : 25661780 total,  1886504 free, 22857688 used,   917588 buff/cache
KiB Swap:  8388604 total,  2655136 free,  5733468 used.  2384764 avail Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
30932 workpool  31  11 4212212 2.723g   7536 S 100.7 11.1  23:28.51 syncthing
18102 workpool  20   0 6627008 3.288g  10840 S  12.3 13.4 101:51.07 java
 1367 mysql     20   0 23.885g 0.015t   7040 S   2.3 62.0  61:40.27 mysqld
31929 workpool  20   0   42716   3924   3184 R   0.3  0.0   0:00.04 top

  4. The host IO delay is sitting between 0 and 1%.

Any other avenues I can investigate?

You can grab a CPU profile to see what it’s doing: Profiling — Syncthing v1 documentation

CPU profile files uploaded: syncthing-cpu-linux-amd64-v1.18.4-182542.pprof (18.2 KB), syncthing-cpu-linux-amd64-v1.18.4-182706.pprof (17.9 KB)

The reason is most likely that v1.7.1 predates the measure that was added to prevent case conflicts:

The time is spent resolving the case of paths, a third of it in syscalls, which in itself seems excessive. It’s also peculiar that this takes so many resources: unless you have a crap-ton of directories with very little in them, those lookups should be cached well. As in, there should be exactly one call to list directory contents per directory, and that call should be quick.
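One way to sanity-check that theory is to compare directory and file counts under the synced path. A self-contained sketch on a throwaway tree (in practice you would point DATA at the real data folder instead):

```shell
# Build a small demo tree; substitute the real synced path for DATA in practice
DATA=$(mktemp -d)
mkdir -p "$DATA/a/b" "$DATA/c"
touch "$DATA/a/f1" "$DATA/a/b/f2" "$DATA/c/f3" "$DATA/f4"
# A very high directory count relative to files would mean the per-directory
# listing cache used by the case checks buys little
echo "dirs:  $(find "$DATA" -type d | wc -l)"
echo "files: $(find "$DATA" -type f | wc -l)"
```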

Anyway, given that you are on Linux, you can set the caseSensitiveFS configuration variable to true on all your folders to disable those checks.
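For reference, this is an advanced per-folder option; in config.xml it sits inside the folder element (the id, path, and type below are placeholders), and it can also be toggled in the GUI under the folder’s Advanced settings:

```xml
<folder id="data" path="/srv/data" type="sendonly">
    <!-- Declare the filesystem case sensitive, skipping the case-conflict checks -->
    <caseSensitiveFS>true</caseSensitiveFS>
</folder>
```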

Solved!

Scanning at 132 MB/s now.

Thanks for the help

Should I switch caseSensitiveFS to true on all Linux hosts, since their filesystems are case sensitive?


You can. The others seem to “behave normally”, i.e. the performance impact is not noticeable, so there’s no real need.

FYI, after upgrading Syncthing on my other two hosts, they presented with the same issue, so I had to switch caseSensitiveFS to true there as well. Those two hosts are LXC containers, also running on Proxmox.

Just wanted to pop in and say I experienced this issue too. Ubuntu 20.04 LTS in a VM (KVM); the folder being scanned is mounted via CIFS over a virtualized 10G NIC. Before setting caseSensitiveFS the scan was running at 9 MB/s; it is around 150 MB/s after setting it to true.