High CPU on FreeBSD with v1.20+

I have two FreeBSD systems syncing the same sets of files. I had to increase the kern.maxfiles sysctl when first setting them up, but they worked well up until v1.20+.

[edit: Third server in the cluster is CentOS 7, Syncthing v1.20.1, operating normally.]

Global state is 90,262 files, 8,127 directories, ~343 GiB.

  1. Disabling file system watching does not seem to reduce the system load. Even scanning a set with 10 files bumps CPU to 95% for 15-20 seconds. Previously, Syncthing CPU usage was below a few percent almost all the time.
  2. One system has UFS, the other has ZFS. UFS has become almost unusable unless I pause all but the smallest sets. ZFS CPU usage has jumped to between 70-80% during rescans, but otherwise the system is still usable.

Reverting to v1.19.2 on the UFS system seems to return the CPU usage to a sane level: it peaks at 95% for a few seconds when unpausing large sets, stays below 60-70% for 15-20 seconds while scanning all sets, then drops to less than 1% overall.

I will leave upgrades disabled and watch this version to confirm, but so far it seems to undo the CPU usage problem.

Both systems are current/updated FreeBSD 12.2-RELEASE-p15.

Syncthing is the only application on these servers, nothing other than stock FreeBSD install is running. Top reports 513M Wired, 43M Free memory (typical).

Ideas on what changed that has caused high CPU?

Grab a CPU profile? See "Profiling" in the Syncthing documentation.

CPU profiles for UFS system on v1.19.2:

CPU Profiles for ZFS system on v1.19.2:

CPU Profiles for ZFS system on v1.20.1 – The first is right after upgrade, the latter two are after all scanning is complete and all sets are up to date:

CPU Profiles for UFS system on v1.20.1 – First right after upgrade (all sets scanning), latter after all sets are up to date:

Click “rescan”:

After all sets are up to date:

“At rest” with periodic rescan on schedule (watching disabled):

Periodic Rescans appear to use significantly higher CPU on v1.20.1 than v1.19.2.

Looks like a lot of CPU time is spent in the case resolution stuff. Presumably your filesystems are case sensitive and you can set the corresponding option, cutting out all of that. Though, this isn’t something that has changed recently, this code has been in there for a while.

There was the filesystem wrapping change, which means that previously operations from walking didn’t go through case resolution, and now do → much more usage of that during scans.

Unexpectedly, most of the time is spent not in filesystem operations but in time.Now(), which looks like a somewhat costly operation on FreeBSD. We call that on every cache hit. It would be possible to optimise that (caching the time for a quick win, or handling expiry differently/more efficiently in the cache). Then again, if it only affects FreeBSD badly, disabling it is an even quicker and bigger win.

Was there a recent change in the use of time.Now() that would have exacerbated the CPU usage?

After the CPU profiling above I reverted to v1.19.2 and the CPU usage dropped and usability improved significantly, so that seems to confirm to me that there was something in v1.20 that caused this – although correlation is not causation.

No, but this happened since:

Thanks. What was the “filesystem wrapping change”?

What’s written there: Much more usage of the case resolution code during scanning. If you are interested in the technicals: lib: Get rid of buggy filesystem wrapping by imsodin · Pull Request #8257 · syncthing/syncthing · GitHub

Thanks, I think I understand where things went off the rails.

FWIW: Setting caseSensitiveFS on sets that are only shared on *nix systems, which covers the majority of global files, did not appear to improve the CPU usage before reverting to v1.19, but I could do more testing with this if that would be useful.

For what it’s worth, that option isn’t about who it’s shared with, only about how your local fs behaves. So you can set it to case sensitive and share with case insensitive systems.

Ohhh, that’s a helpful clarification!

FWIW Dept: I’m forgetting about my home XigmaNAS server, which is also running FreeBSD 12.2, ZFS, and Syncthing v1.20.1 (patched to autoupgrade, unlike stock XigmaNAS). This Syncthing instance has not shown any high CPU symptoms, despite having a similar size global state (but not syncing with any of the first cluster).

This server has 8 GB of RAM and 4 CPU cores, while the others above are VMs with only 1 GB of RAM and 1 CPU core, so I assume that makes a lot of the difference.
