How to limit folder scans

Hi there. I currently have about 20 folders in Syncthing, roughly 2 TB in total. All of that data is stored on a USB HDD, and it’s not the fastest one: about 80 MB/s at most, 40 MB/s on average. [Screenshot_20240905_113522] I’d like to know how to limit scans for my folders. My goal is to have only 1 folder scan and 1 sync operation running at a time, because my HDD can’t handle more.

Right now 4 folders are scanned at the same time, so it takes about 18 hours to scan all my folders (2 TB). This is unacceptable.

Debian 12, Syncthing v1.27.11, Linux (64-bit Intel/AMD)

Thanks in advance.

UPDATE: From what I understand, there is only one option, maxFolderConcurrency, which covers both scans and download/upload operations. Is there any other option to limit scans and download/upload operations separately?

Yes, there is unfortunately no way to limit them separately. To overcome this, I basically hack the code and compile my own version of Syncthing that only applies the limit to scanning but not syncing.

This is because with slow HDDs, you do want to limit scanning to 1 or 2 folders at a time, but then you’re stuck in a situation where a single folder downloading a very large file basically blocks syncing of all other folders (even if they just need to sync a few bytes).

In addition to raw throughput, the choice of filesystem, possibly how it’s mounted, and the size/number of files also make a big difference.

From your screenshot, I see that the volume on the USB HDD is NTFS.

In Debian, mount.ntfs symlinks to the ntfs-3g driver, which relies on FUSE. As the number of files increases, the I/O overhead can get quite large.

Roughly how many files are in that 2TB NTFS volume?

Good idea, I was thinking about moving to ext4. Right now it looks like this:

- Files: 136,643
- Folders: 15,685
- Total amount of data: 1.61 TiB
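
As a rough sanity check on those numbers, here is a small shell sketch that estimates how long it takes to read all of the data once at a steady rate. It assumes a full scan has to read every byte (as when files are hashed for the first time), treats the "Mbytes/second" figures from the first post as MiB/s, and ignores seek overhead. read_hours is a made-up helper name, not part of Syncthing or any standard tool.

```shell
#!/bin/sh
# Estimate the hours needed to read a given amount of data once at a steady rate.
# Usage: read_hours <size_in_GiB> <rate_in_MiB_per_s>
read_hours() {
    awk -v gib="$1" -v rate="$2" \
        'BEGIN { printf "%.1f\n", gib * 1024 / rate / 3600 }'
}

# 1.61 TiB is about 1649 GiB (1.61 * 1024).
read_hours 1649 40   # average speed quoted above -> 11.7
read_hours 1649 80   # best-case speed quoted above -> 5.9
```

Both estimates are well under the 18 hours reported above, which would be consistent with several concurrent scans making the drive seek back and forth, though that is only a guess from these figures.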

Good, so likely a relatively small number of files per folder (Syncthing’s scanning time for folders with tens of thousands of files can get pretty slow, and is also negatively impacted by the filesystem type).

Depending on the cluster size, plus how often new files are added and/or existing files are updated, file fragmentation can be quite a drag on performance with an NTFS volume.

The NTFS-3G driver is great, but if there’s an option to use another filesystem instead of NTFS, definitely go for it.

Ext4 doesn’t require a constant round trip between user space and kernel space so you’ll see a performance boost (I use all kinds of FUSE-based filesystems every day, but only for applications that aren’t I/O intensive).

Thank you for the information. Do you know if there’s any way to test HDD performance on my Debian installation without destroying data? :grinning: Something like the Windows app CrystalDiskMark.

Back to the main topic: for each folder I generally have <maxConcurrentWrites>2</maxConcurrentWrites>, and I’ve also set <maxFolderConcurrency>1</maxFolderConcurrency>. Looks good to me.
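
To double-check what is actually set, something like the following could pull those two settings out of the config file. This is only a sketch: show_concurrency is a hypothetical helper, and the config path in the usage comment is an assumption (the real location depends on the Syncthing version and how it was installed).

```shell
#!/bin/sh
# Print the concurrency-related settings found in a Syncthing config file.
# Usage: show_concurrency <path-to-config.xml>
show_concurrency() {
    grep -E -o \
        '<(maxFolderConcurrency|maxConcurrentWrites)>[^<]*</(maxFolderConcurrency|maxConcurrentWrites)>' \
        "$1"
}

# Example call (path is an assumption, adjust it to your setup):
#   show_concurrency "$HOME/.local/state/syncthing/config.xml"
```

Each folder should show up with its own maxConcurrentWrites line, while maxFolderConcurrency appears once.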

For a console/terminal connection there’s hdparm (if it’s not already installed, it’s the hdparm package):

hdparm -t <device>

hdparm is mainly intended for setting drive parameters such as adjusting sleep time, but it has a simple benchmarking feature for testing raw throughput.

hdparm -t /dev/sdb

The -t option times reads from the disk itself, flushing the cache first to avoid skewing the results, while -T times reads from the cache to show how the CPU, cache and RAM are impacting transfer speeds.
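
Since a single run can be noisy, it helps to repeat the test a few times on an otherwise idle system and average the results. A minimal sketch of that, assuming the usual "Timing buffered disk reads: ... = N MB/sec" result line from hdparm -t; avg_hdparm is a made-up helper name:

```shell
#!/bin/sh
# Average the MB/sec figures from one or more `hdparm -t` runs read on stdin.
# Relies on the result line ending in "= <number> MB/sec",
# so the value is the second-to-last field.
avg_hdparm() {
    awk '/Timing buffered disk reads/ { sum += $(NF - 1); n++ }
         END { if (n) printf "%.2f MB/sec over %d runs\n", sum / n, n }'
}

# Usage (needs root; /dev/sdb is the example device from above):
#   for i in 1 2 3; do hdparm -t /dev/sdb; done | avg_hdparm
```

If no matching result lines are found, the function prints nothing rather than a misleading zero.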

For a desktop environment, there are several options including:

On a related note, the badblocks utility is well worth keeping in mind, especially when working with external drives, which tend to be subject to more damage. To run a read-only scan with progress display on storage device /dev/sdb:

badblocks -s /dev/sdb

(It’s generally a standard tool in most Linux distros.)
