RAM Usage Increased in 1.2.0?

Hi:

(Forked from Web based Syncthing not showing info - I understand this is not related, but link it here just in case it is.)

I’m seeing Syncthing getting killed by the kernel’s OOM killer on one of my NAS systems because it’s using too much RAM. This wasn’t a major issue prior to 1.2.0: I can see 3 such kills in my kernel log for May/June, but 25 so far in July/August. I haven’t knowingly done anything to dramatically increase my usage, and this NAS isn’t massively out of sync with the rest of the cluster.

Is an increased RAM footprint expected in 1.2.0+?

Here’s a screenshot of the UI currently:

And a heap profile:

syncthing-heap-linux-amd64-v1.2.1-175005.pprof (391.1 KB)

Please let me know if I can provide any further information to help pinpoint if there’s anything generally amiss. If needs be, I’m happy to run through the RAM-trimming suggestions that have been provided previously.

Thanks!


No increase is expected. Your profile shows some things that surprise me, though with literally millions of files they may not actually be unexpected.

The things that surprise me: first, the filesystem watcher apparently keeping a lot of state:

I guess maybe it remembers all directories it watches, which is like a million of them in your case.

Why do we need 185 megs of crypto handshake stuff? How many connections do you have anyway?

And lastly, why does the versioning cleaner need 250 megs? Probably it tracks all directories, again…

I think these are all things that make sense, probably.

Your scale is unusual. I’d allocate more RAM to the NAS.

Oh and ❤️ kudos for taking a profile and showing a screenshot. Being able to talk about something concrete is nice.


Hi:

Thanks for looking in - much appreciated. I’ll try to fill in some context:

Hmmm - definitely wasn’t seeing this level of being stamped on prior to 1.2.0. Is it safe to downgrade back to 1.1.4 for testing? (From an internal consistency perspective that is - I understand QUIC won’t work of course.)

I have 16 Remote Devices defined on this machine - some rarely come online, but some are constantly on.

There’s a lot of data churn in some of the shared folders, and I’ve currently got 365 days of staggered versioning defined on all the shared folders (of which there are 17).
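(For reference - and I may be misremembering the exact keys - the per-folder versioning stanza in config.xml looks something like this, with maxAge being 365 days expressed in seconds:)

```xml
<versioning type="staggered">
    <!-- 365 days * 86400 s/day = 31536000 seconds -->
    <param key="maxAge" val="31536000" />
    <!-- how often the cleaner sweeps the version store -->
    <param key="cleanInterval" val="3600" />
</versioning>
```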

I know - and I know I’m asking a lot for quite a niche case. Unfortunately the NAS units are limited to 6GB RAM maximum (according to the manufacturer; anecdotally they appear to support 16GB, but I’m wary of going beyond the manufacturer’s recommendation on this).

You’re very kind; I find this tool so useful, and I’m so grateful for the hours of work you, Audrius, Simon and the rest of the team put in. I just wish I could contribute more to the project!


Yes, you can downgrade to 1.1.4. The database format is unchanged, but the config format changed slightly for QUIC and crash reporting – you can just set the config version back to 28 on the first line of the config and Syncthing 1.1.4 will be happy.
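Concretely, something like this - 1.2.x will have bumped the version attribute higher, and 28 is what 1.1.4 expects:

```xml
<!-- First line of config.xml: set the version attribute back to 28 -->
<configuration version="28">
```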

One thing though: upgrades and downgrades (any version change, really) mean a full index transfer instead of the usual delta-since-last-connect exchange. That by itself causes a lot more database churn and memory usage directly after the version change, and could be part of what you’re seeing.

Thanks Jakob. I’ll give that a try - if for no other reason than to see if the memory usage is similar on the earlier version.

The TLS crypto stuff is coming from qtls, which I reckon stores session resumption tickets. The watcher I can’t comment on.
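As a rough illustration only - this is not Syncthing’s or qtls’s actual code, but qtls is a fork of crypto/tls, which has the same shape - client-side resumption state lives in a session cache, and each cached ticket is per-peer state that outlives the connection it came from:

```go
package main

import (
	"crypto/tls"
	"fmt"
)

func main() {
	// Each entry in the cache is a resumption ticket kept around between
	// connections; across many peers, these buffers add up.
	cfg := &tls.Config{
		ClientSessionCache: tls.NewLRUClientSessionCache(64), // bounded cache
	}
	fmt.Println(cfg.ClientSessionCache != nil)
}
```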

Perhaps we should open a ticket with the quic-go folks about the memory usage there.

Hello:

Ok - I’ve got 1.1.4 installed on this unit, and will see what happens.

In the meantime - and in case it’s of use as a comparison - just looking at another NAS (same model) that I look after, I’m seeing much lower memory usage (even with v1.2.1):

This unit has even more files, but only 12 shared folders and only 2 Remote Devices defined. Here is a heap profile:

syncthing-heap-linux-amd64-v1.2.1-140417.pprof (1.8 MB)

No offence taken if this is of no use! :wink:

EDIT:

Sorry - this might be useless information here. Some context:

  • Although this Syncthing instance initially appeared to be running smoothly, I found one of its partners had stalled (as per Web based Syncthing not showing info - #33 by AudriusButkevicius). Unfortunately I wasn’t able to grab a heap profile, but I could see the UI reporting 6GB of RAM usage;
  • I restarted this instance and it came back up as normal;
  • However, at this point the NAS profiled just above seemed to stall.

Unfortunately I haven’t got any logging running on these units so I can’t provide anything concrete at present.

I suspect you’re seeing the effect of the initial index transfer. I also know for a fact that the database parameters (very low level tuning stuff) are crap for large setups. There is a better set of parameters to plug in, but it’s not configurable in 1.1.4/1.2.1. There’ll be debug options to set it in 1.2.2 and something more sensible in the future.

This is, however, a fix for blocking and bad performance rather than for high memory usage.

It does indeed on Linux and any other system with an underlying watcher that doesn’t support recursive watches. This is due to supporting multiple watches in the same tree and having to know about existing watches both at watch setup and at event dispatch.
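A minimal sketch of why that costs memory, using the generic fsnotify package rather than Syncthing’s actual watcher (the root path is hypothetical, but the principle is the same): on Linux, inotify watches are per-directory, so watching a tree recursively means registering - and remembering - every directory in it.

```go
// Sketch with github.com/fsnotify/fsnotify, not Syncthing's actual
// watcher: non-recursive backends need one watch per directory.
package main

import (
	"io/fs"
	"log"
	"path/filepath"

	"github.com/fsnotify/fsnotify"
)

func main() {
	w, err := fsnotify.NewWatcher()
	if err != nil {
		log.Fatal(err)
	}
	defer w.Close()

	// One entry per directory. With ~a million directories, this
	// bookkeeping alone is a meaningful chunk of resident memory.
	watched := make(map[string]bool)

	root := "/path/to/folder" // hypothetical folder root
	err = filepath.WalkDir(root, func(path string, d fs.DirEntry, err error) error {
		if err != nil {
			return err
		}
		if d.IsDir() {
			if err := w.Add(path); err != nil {
				return err
			}
			watched[path] = true
		}
		return nil
	})
	if err != nil {
		log.Fatal(err)
	}
	log.Printf("watching %d directories", len(watched))

	// Newly created directories must be added (and remembered) as well,
	// which is why the watcher needs to know about existing watches both
	// at setup time and when dispatching events.
	for ev := range w.Events {
		if ev.Op&fsnotify.Create != 0 && !watched[ev.Name] {
			// A real implementation would stat ev.Name and, if it is a
			// directory, walk and watch its subtree as above.
		}
	}
}
```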


So this is as expected and designed then.
