I have finally found a configuration that works for me. What I actually did:
(a) switched from ubuntu-syncthing to the officially supported syncthing version (v1.29.7, Linux (64-bit ARM))
(b) set Max Folder Concurrency to 16 (much higher than you would normally do with spinning disks)
(c) used the Tuning tips to optimize the metadata operations (my settings below; the equivalent gluster volume set commands are sketched after the output), and I think there is further optimization potential
Volume Name: wdVolume
Type: Distributed-Disperse
Volume ID: 5b47f69a-7731-4c7b-85bf-a5014e2a5209
Status: Started
Snapshot Count: 0
Number of Bricks: 2 x (2 + 1) = 6
Transport-type: tcp
Bricks:
Brick1: tPi5Wb.my.name:/mnt/glusterBricks/extWd18a/data
Brick2: 192.168.129.9:/mnt/glusterBricks/extWd18b/data
Brick3: tPi5Wb3.my.name:/mnt/glusterBricks/extWd18c/data
Brick4: tPi5Wb.my.name:/mnt/glusterBricks/extWd5a/data
Brick5: 192.168.129.9:/mnt/glusterBricks/extWd5b/data
Brick6: tPi5Wb3.my.name:/mnt/glusterBricks/extWd5x/data
Options Reconfigured:
storage.linux-io_uring: on
server.event-threads: 16
performance.rda-cache-limit: 1Gb
performance.io-thread-count: 64
performance.quick-read: on
performance.io-cache: on
performance.read-ahead: on
disperse.stripe-cache: 10
disperse.other-eager-lock: off
disperse.eager-lock: off
cluster.lookup-optimize: on
performance.cache-max-file-size: 1MB
performance.qr-cache-timeout: 600
performance.xattr-cache-list: security.*,system.*,trusted.*,user.*
performance.cache-samba-metadata: on
performance.nl-cache-positive-entry: on
performance.nl-cache-timeout: 600
performance.nl-cache: on
performance.parallel-readdir: on
performance.readdir-ahead: on
network.inode-lru-limit: 200000
performance.md-cache-timeout: 600
performance.cache-invalidation: on
performance.stat-prefetch: on
features.cache-invalidation-timeout: 600
features.cache-invalidation: on
disperse.shd-max-threads: 16
performance.cache-size: 1024MB
features.scrub: Active
features.bitrot: on
storage.fips-mode-rchecksum: on
transport.address-family: inet
cluster.disperse-self-heal-daemon: enable
diagnostics.client-log-level: DEBUG
storage.build-pgfid: on
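For reference, this is roughly how the options were applied, one by one with gluster volume set. This is only a sketch of the subset I consider most relevant for the metadata-heavy scanning, using the wdVolume name from the output above; adjust it to your own volume.

# Run on one of the gluster nodes; values taken from the volume info above.
gluster volume set wdVolume features.cache-invalidation on
gluster volume set wdVolume features.cache-invalidation-timeout 600
gluster volume set wdVolume performance.cache-invalidation on
gluster volume set wdVolume performance.md-cache-timeout 600
gluster volume set wdVolume network.inode-lru-limit 200000
gluster volume set wdVolume performance.parallel-readdir on
gluster volume set wdVolume performance.readdir-ahead on
gluster volume set wdVolume performance.cache-size 1024MB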
I'm quite happy with this, thanks for your ideas and input.
For your use case, I'd even set it to unlimited (-1), or otherwise leave it at the default (0), which then corresponds to the number of CPUs of your server. I don't think more concurrency can hurt with a remote filesystem like this.
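For reference, a minimal sketch of changing this without the GUI; I'm assuming the advanced option is called maxFolderConcurrency, and the GUI address and API key below are placeholders for your own instance:

# Hypothetical values: 127.0.0.1:8384 and YOUR_API_KEY stand in for your setup.
curl -s -X PATCH \
  -H "X-API-Key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"maxFolderConcurrency": -1}' \
  http://127.0.0.1:8384/rest/config/options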
The very first thing on that link is enabling metadata caching - if it doesn't do that by default, that seems very likely to be the main change that helped a ton. I don't see the metadata-cache string from that operation in your settings, though, but there are some other cache-related ones - probably that just manifests differently in the settings?
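If I read the docs right, that metadata caching step is applied as a group profile, which would explain why no literal metadata-cache string shows up - it just expands into individual options. A rough sketch, reusing the wdVolume name from above:

# Assumption: "metadata-cache" is a group profile, so volume info only shows the
# individual options it sets, not the group name itself.
gluster volume set wdVolume group metadata-cache
# Check a couple of the options it should have touched:
gluster volume get wdVolume performance.md-cache-timeout
gluster volume get wdVolume features.cache-invalidation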
Yes, this is probably the most useful setting in general for such a use case (16 was my choice to make sure there is enough concurrency).
I'm not sure if this would work, because I do not use much CPU during the critical scanning phases. Possibly this would lead to infinite scanning. Someone should test this idea.
Yes, I executed this command, but there is no special output for it in gluster volume info ...
Ceph has been solid for me. My only note is to heed the warning, if you didn't see it, that you shouldn't run the kernel-mode driver if you're mounting the filesystem on the Ceph node itself. Use the FUSE user-mode mount instead. If you're mounting the filesystem on a different machine, then you can use the kernel driver.
I don't know why this is, but I didn't notice it and did have some filesystem issues, which I was able to recover from, and I haven't had any problems since switching.
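For reference, the two mount flavours look roughly like this; the mount point, monitor address and secret file are placeholders for your own cluster:

# FUSE (user-mode) mount - the one recommended on the Ceph node itself:
sudo ceph-fuse /mnt/cephfs

# Kernel-mode mount - fine from a separate client machine (placeholder mon address and secret):
sudo mount -t ceph 192.168.1.10:6789:/ /mnt/cephfs -o name=admin,secretfile=/etc/ceph/admin.secret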