Very long initial scan

Sorry for the probably dumb question, but can I do something about the initial scan? It starts every time I start syncthing, takes a very long time, and slows down the whole computer.

You can let it finish. To reduce resource usage (which may make your computer more responsive while probably making the scan take longer), you can:

  • Set the number of hashers per folder to one, if you are on Linux where it defaults to a higher number. This is in the advanced settings.
  • Pause folders so that it’s only scanning one at a time, if you have several.

Thank you. Yes, I have several folders. I have ~289 GB of local state and it takes about half an hour on a 7200 RPM drive. Is that normal? And is it necessary to do this on every start?

UPDATE: I’ve checked on an old laptop, and there the initial scan of the same data finishes one or two minutes after start and does not slow down the computer. So it seems I have some problem on the new laptop.

One difference between them is that on the new one I’m using an encrypted home folder …

Storing syncthing’s database on an encrypted filesystem is not wise.

Why?

I simply have an encrypted home folder. Maybe I can move ~/.config/syncthing somewhere under /opt (and it will be on an SSD), but what about data security?

What about the data that lives on your disk, why is that not all encrypted as well? So you are suggesting keeping the database, which holds no data, just information about the data, encrypted, while the actual data sits in plain text?

We don’t understand each other. Right now I have everything in my home folder, which is encrypted. You wrote that it is not a good idea to have the syncthing database on an encrypted filesystem. So my idea was to move the ~/.config/syncthing folder to an unencrypted filesystem, which in my case is, for example, /opt, and of course make a link to it in my home folder.

But I don’t know whether the syncthing database contains sensitive data or not. I guess it does?

And will this help with my problem of a very long and heavy initial directory scan?

Well, that depends on your definition of sensitive. It contains filenames, sizes, who has what, etc. There are other files that determine your device ID, such as the certificate files, which if stolen would allow someone to impersonate your device.

I guess you can move the database via symlink to an unencrypted location, and leave the certificates in the encrypted home dir. This should help.
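For illustration, the move-and-symlink step could look roughly like the sketch below. It is only a sketch under assumptions: the helper name `relocate_with_symlink` is made up, both paths are supplied by you, and syncthing should be stopped before the directory is touched.

```python
import os
import shutil


def relocate_with_symlink(src: str, dst: str) -> None:
    """Move the directory `src` to `dst` and leave a symlink at `src`.

    Hypothetical helper: `src` would be the database directory inside the
    syncthing config folder, `dst` a location on the unencrypted disk.
    """
    src = os.path.abspath(src)
    dst = os.path.abspath(dst)
    if os.path.islink(src):
        raise ValueError(f"{src} is already a symlink")
    if not os.path.isdir(src):
        raise ValueError(f"{src} is not a directory")
    if os.path.exists(dst):
        raise ValueError(f"{dst} already exists")
    # Make sure the parent of the new location exists.
    os.makedirs(os.path.dirname(dst), exist_ok=True)
    # shutil.move falls back to copy-and-delete across filesystems.
    shutil.move(src, dst)
    # The old path keeps working because it now points at the new location.
    os.symlink(dst, src, target_is_directory=True)
```

Only the database directory is passed in, so the certificate files stay where they are in the encrypted home dir.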

If the data you are scanning is also in an encrypted filesystem, then it’s somewhat obvious why performance sucks.

Yes, all the data is on an encrypted filesystem. I understand that there will be some performance penalty, but that much? On the machine without encryption it takes several minutes (9:41 syncthing started, 9:44 last directory scanned), and on the one with encryption it takes about 45 min and renders the whole computer nearly unusable … The data is the same on both computers.

Well as you can see syncthing works fine when there is no encryption involved on the same data, so why would you think poor performance is caused by syncthing, and not encryption?

Or just plain different machines with different performance characteristics. Encrypted filesystems as such are not a problem, and usually the encryption overhead itself isn’t significant. Slow rotary disks and limited memory for disk cache are probably a much larger cause of subpar performance.

(Though I’m sure there are exceptions with odd userspace encryption implementations on Linux or whatever.)

But the machine with the encrypted filesystem is the better one. Both of them are quad cores with 7200 RPM disks for data. I can’t believe that encryption slows it down more than ten times.

Me neither, it’s probably something else.

Yes. But what? What exactly does it do during the initial scan?

And my problem is that it runs the initial scan after each machine restart. Shouldn’t it run only once, when the directory is added?

You may be conflating two kinds of initial scans. One is when a folder is added - all files need to be read and hashed. The other is the regular scan that happens periodically (every 60s by default) and at startup, the one at startup being “initial”. This is just a check of metadata of all files, and of course reading and hashing if the metadata has changed or the file is new.

In both cases it means looking at the metadata of all files, and reading the corresponding database entries for them. Both of which can be slow on slow disks.

If you think it’s taking too long, whichever it is, you need to look at what your system is doing. Check how busy the disks are, how busy the CPUs are, how much memory is in use. One of those three is probably the bottleneck. Note that performance on rotary disks is mainly a matter of operations per second (MB/s being much less relevant), and you can count on about 100-200 operations/s for a single 7200 RPM disk. That’s, barring optimizations and cache, how many files can be checked per second during a metadata only scan.
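To put rough numbers on that, here is a back-of-the-envelope sketch. It assumes one disk operation per file and no help from the cache, which is a simplification, and the file count of 500,000 is made up for illustration, not taken from this thread.

```python
def estimate_scan_seconds(file_count: int, ops_per_second: float) -> float:
    """Rough duration of a metadata-only scan, assuming one disk op per file."""
    if ops_per_second <= 0:
        raise ValueError("ops_per_second must be positive")
    return file_count / ops_per_second


# Hypothetical folder with 500,000 files on a single 7200 RPM disk.
for ops in (100, 200):
    minutes = estimate_scan_seconds(500_000, ops) / 60
    print(f"{ops} ops/s: about {minutes:.0f} minutes")
```

With a warm disk cache the real time can be far lower, so treat the result as a worst-case order of magnitude.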

So my concern is the scan after startup, not the one after adding a folder. Syncthing uses only a few percent of CPU during the scan, despite a high load average (above 10). The HDD LED is lit continuously, so the slowest element is probably the HDD.

But there must be some bigger problem. I repeat: I have two similar machines, both with 7200 RPM disks and the same data (synchronized with syncthing). On one of them the scan takes several minutes (4 the last time), and on the other it takes about 45 min and the computer is nearly useless until the scan is complete. The second one (where it takes 45 min) is the better one (32 GB RAM vs. 12 GB, newer Core i7). The differences are the system version (Ubuntu 17.10 on the better one, the one with the problem, vs. 16.04 on the old one) and the fact that the new machine has an encrypted home.

Could there be some problem that results in a full scan with full hashing after each syncthing start?

If that’s what’s happening you can see it in the GUI. When hashing files it shows a percentage indicator and a hashing speed and stuff.

Well, there is nothing like that; there is simply a blue ‘Scanning’.

Then it’s looking for metadata changes, and the issue is probably either slow metadata lookup on the files or slow database reads.
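One way to separate the two would be to time a plain metadata walk of the folder, outside syncthing, on both machines. The sketch below is not what syncthing does internally; `time_metadata_walk` is a made-up helper that simply calls `lstat` on every entry, and it says nothing about database read speed.

```python
import os
import time
from typing import Tuple


def time_metadata_walk(root: str) -> Tuple[int, float]:
    """Walk `root`, lstat every file and directory, return (entries, seconds)."""
    entries = 0
    start = time.perf_counter()
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            try:
                os.lstat(os.path.join(dirpath, name))
            except OSError:
                # Entry vanished or is unreadable; skip it.
                continue
            entries += 1
    return entries, time.perf_counter() - start


# Example use (the path is a placeholder for one of your synced folders):
# entries, seconds = time_metadata_walk("/path/to/synced/folder")
# print(f"{entries} entries in {seconds:.1f} s")
```

Run it right after a reboot, before the disk cache is warm, since that is the situation at startup. If the walk alone takes most of the 45 minutes on the new machine, the filesystem metadata lookups are the slow part; if it finishes quickly, the database reads are the more likely suspect.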