0.7MiB/s, is this slow scan rate normal for a single 5400rpm hard disk?

Hi there,

I started using Syncthing yesterday and am currently in the process of completing the initial scan of my 2.7TB of data, stored on a single SATA 5400rpm hard disk.

My setup: an OmniOS VM running on an ESXi host; the physical 5400rpm SATA hard disk is connected to an LSI 2008 HBA card that is passed through from ESXi to OmniOS. This hard disk forms a single-disk zfs pool in OmniOS.

OmniOS has been given 4 vCPUs and 48GB of RAM.

I have been seeing extremely slow scanning speeds from this hard disk. The Syncthing log shows:


    2020-05-28 09:13:31 real to hash: Setup/BTW_hlp.chm
    2020-05-28 09:13:31 Walk oeit5-ptrea [] current progress 136619995756/2488826695378 at 0.7 MiB/s (5%)
    2020-05-28 09:13:33 Walk oeit5-ptrea [] current progress 136620126828/2488826695378 at 0.7 MiB/s (5%)
    2020-05-28 09:13:35 Walk oeit5-ptrea [] current progress 136621175404/2488826695378 at 0.7 MiB/s (5%)
    2020-05-28 09:13:36 real to hash: Setup/btrez.dll
    2020-05-28 09:13:37 real to hash: Setup/btw_hlp.chm
    2020-05-28 09:13:37 Walk oeit5-ptrea [] current progress 136623608591/2488826695378 at 0.7 MiB/s (5%)
    2020-05-28 09:13:39 Walk oeit5-ptrea [] current progress 136623739663/2488826695378 at 0.7 MiB/s (5%)
    2020-05-28 09:13:41 Walk oeit5-ptrea [] current progress 136624132879/2488826695378 at 0.7 MiB/s (5%)

I understand that scanning and hashing tax the processor and hard disk heavily. I have checked and can confirm that the bottleneck appears to be the hard disk: there is next to no activity on the processor, but the disk utilisation sits at 100% the whole time.
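
(For anyone who wants to reproduce the CPU-side check on OmniOS, something along these lines works; the options are just one reasonable choice, not necessarily the only way.)

    # per-thread CPU microstates, refreshed every 5 seconds (illumos/OmniOS);
    # a CPU-bound scan would show the hashing threads high in USR/SYS here
    prstat -mL 5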

Below is the iostat output:


iostat -xtc tank sd2 5
                 extended device statistics                    tty         cpu
device    r/s    w/s   kr/s   kw/s wait actv  svc_t  %w  %b  tin tout  us sy dt id
tank     26.9    8.3 3167.9  489.4 74.6  4.6 2246.4  96 100    0    4   1  1  0 98
sd2      27.0   10.3 3169.5  489.6  0.0  4.6  124.0   0 100
                 extended device statistics                    tty         cpu
device    r/s    w/s   kr/s   kw/s wait actv  svc_t  %w  %b  tin tout  us sy dt id
tank      3.4    1.8  390.4   83.0 75.5  4.4 15395.5 100 100    0   87   0  1  0 99
sd2       3.4    2.4  390.4   83.0  0.0  4.4  767.4   0 100
                 extended device statistics                    tty         cpu
device    r/s    w/s   kr/s   kw/s wait actv  svc_t  %w  %b  tin tout  us sy dt id
tank     18.0    3.4 2284.6  190.8 75.1  5.7 3768.6 100 100    0   62   0  1  0 99
sd2      18.0    4.2 2284.6  190.8  0.0  5.7  256.3   0 100

My question is: is this normal for a 5400rpm hard disk, hashing at 0.7 MiB/s? Note that I have done some benchmarking on this hard disk, and it can easily saturate a 1Gbps network when serving reads to another server from the same zfs pool.
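
(If anyone wants to reproduce a raw sequential-read check locally rather than over the network, something like the following works; the file path is just an example, and the file should be larger than RAM, otherwise ZFS may serve it straight from the ARC.)

    # rough local sequential-read test; path is an example only
    dd if=/tank/some/large/file of=/dev/null bs=1024k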

Thanks.

It’s probably not, but there are a lot of variables at play here.

I have looked at the log file and discovered these two interesting lines:


2020-05-27 21:27:21 Single thread SHA256 performance is 179 MB/s using minio/sha256-simd (139 MB/s using crypto/sha256).
2020-05-27 21:27:21 Hashing performance is 150.81 MB/s

Are these performance results calculated purely from my server's processor, without the hard disk being taken into account?

Regards.

This is a benchmark using random data, i.e. no disk is involved.
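
If you want to sanity-check the CPU side independently of Syncthing, OpenSSL's built-in benchmark also hashes in-memory buffers only (the numbers won't match Syncthing's exactly, since block sizes and implementations differ):

    # CPU-only SHA-256 throughput on in-memory buffers; no disk I/O involved
    openssl speed sha256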

Disk shows svc_t (time to service a request) > 10s and 100% wait time, so yeah, it’s just slow. If it’s doing more than one thing at a time it’s probably all just waiting for disk seeks. If you’re scanning lots of small files I think this is expected, though setting the folder config hashers to 1 might help.
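
For reference, that option lives in the folder element of config.xml (the id below is the one from your log; the path and elided attributes are just placeholders). The same setting is exposed in the GUI's advanced folder options.

    <folder id="oeit5-ptrea" path="/tank/data" ...>
        <!-- path above is a placeholder; keep your existing attributes -->
        <hashers>1</hashers>
    </folder>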

I have already configured hashers=1 in the configuration file.

Now I have made a further discovery. I fired up another server: FreeNAS 11.3 U3.1 (the latest as of now), with the same 2.7TB of data on a zfs pool. The hard disk is almost the same, just a little faster at 5900rpm.

FreeNAS is installed on a bare metal Dell R805 server, dual processors, 64GB of RAM.

I installed the Syncthing plugin from FreeNAS (version 1.4). After I added the data folder to Syncthing for scanning, it reached a scanning rate of 79MiB/s, way faster than the same data on the OmniOS VM on the other server.

Here is the output of various commands:


2020-05-28 21:08:26 Walk oeit5-ptrea [] current progress 1014855510981/2540460805898 at 79.4 MiB/s (39%)

freenas# iostat -x ada0
                        extended device statistics
device       r/s     w/s     kr/s     kw/s  ms/r  ms/w  ms/o  ms/t qlen  %b
ada0          47      12   5846.7    210.3    75     8   111    61   65  13

freenas# /usr/sbin/iostat -xI ada0
                        extended device statistics
device           r/i         w/i         kr/i         kw/i qlen   tsvc_t/i      sb/i
ada0       8354039.0   2110009.0 1035778954.5   36264560.0   43   657780.0   22913.2

last pid:  4232;  load averages:  1.46,  1.31,  1.27    up 1+23:49:24  21:15:03
64 processes:  1 running, 63 sleeping
CPU:  0.1% user, 14.0% nice,  3.0% system,  0.5% interrupt, 82.3% idle
Mem: 201M Active, 2104M Inact, 350M Laundry, 58G Wired, 1589M Free
ARC: 54G Total, 7273M MFU, 46G MRU, 42M Anon, 222M Header, 490M Other
     51G Compressed, 57G Uncompressed, 1.11:1 Ratio
Swap: 2048M Total, 2048M Free

There are many differences between these two servers; the factors that are almost the same are the actual data and the similarly fast hard disks (both are desktop hard disks).

It makes me wonder whether there is something wrong with file scanning on the OmniOS server, but I cannot pinpoint what the issue might be.

I have done some more experiments and am reporting the results back here.

Experiment 1: Syncthing on FreeNAS VM

On the same ESXi host, I installed a brand-new FreeNAS 11.3 U3.1 VM, the same version as the one running on the bare metal server. I then exported the 2.7TB zpool from the OmniOS VM and imported it into the FreeNAS VM. Finally, I installed the Syncthing plugin on the FreeNAS VM and fired it up to scan the same zpool.

Result: not only was the scanning speed unbearably slow, the whole FreeNAS VM responded like a snail. For example, if I typed the "ls" command at the console, it took 3 to 5 seconds for the output to show up. Yet, checking both from ESXi and inside the VM, the processors, RAM and hard disks were practically idle, with next to no activity at all.

The moment I stopped the Syncthing plugin, the FreeNAS VM came back to life and everything became responsive and snappy again. Somehow the Syncthing plugin brought the whole FreeNAS VM to its knees.

Experiment 2: Syncthing on the latest OmniOS r151034 VM

I installed the latest OmniOS r151034 in a new VM on the same ESXi host, exported the same 2.7TB zpool from the FreeNAS VM, imported it into the newly created OmniOS r151034 VM, then downloaded Syncthing and started it up.

Results: the scanning speed reached 80-90MiB/s and stayed there continuously.

Summary:

  • Same ESXi host
  • Same physical 5400rpm hard disk
  • Same LSI 2008 HBA card passed through to the VM
  • Same set of user files
  • Almost the same VM specs (4 x vCPU, 48GB RAM, all have open-vm-tools installed)

With different OSes on the three VMs, different Syncthing scanning speeds resulted:

  • OmniOS r151028: 0.7MiB/s
  • FreeNAS 11.3 U3.1: 0.5-0.6MiB/s
  • OmniOS r151034: 80-90MiB/s

I can only conclude that Syncthing's slow scanning speed is related to the OS. Luckily zfs is very portable, so I could easily export and import the same pool between different OSes.
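
For anyone curious, moving the pool between VMs really is just an export on one side and an import on the other (the pool name is from my setup; the disk/HBA has to be visible to the target VM first):

    # on the VM currently owning the pool
    zpool export tank

    # on the target VM: list importable pools, then import by name
    zpool import
    zpool import tank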

Empirical experimentation is good. However, I don't think this conclusion makes sense. More likely you are looking at the effects of scanning small files vs large ones, or of hot vs cold caches.

I tend to think that a scanning rate of about 80-90MiB/s is what should be expected from my setup, as I have seen a similar rate from another physical setup of mine containing the same set of user files on a hard disk of similar speed.

There has got to be something wrong with the OmniOS r151028 installation that I have, but I don't yet know the root cause.

My newly installed OmniOS r151034 actually went up to 100MiB/s last night during the second half of the 2.7TB zpool scan. Small vs large files, or hot vs cold caches, are likely not applicable here, as each scan had to go through the same files, and there are many of them (over 260k files).

In any case, I appreciate everyone's participation in this issue of mine.
