As a brief aside: I’ve seen ignore caching enabled on some other Syncthing instances I look after - I definitely wouldn’t have manually enabled this, so I wonder if it got enabled in a past RC, before being defaulted to disabled? Just thought I’d highlight this, in case anyone else is having RAM issues…
Thanks for the heads up, glad to hear it’s working nicely for you!
I just looked at the git log: the default for ignore caching has been disabled since 2016-04-03, and it was also disabled for existing files back then. Since that commit there have been no changes to that setting, so I have no clue why it’s enabled for some of your folders.
Thanks Simon. I can’t explain the ignores caching being enabled then - I can’t remember turning it on - but that must be me!
On a related note: I’ve been reworking Jakob’s puller queue limiting patch ( https://github.com/calmh/syncthing/commit/82c5fc008ce82a02b5d5ad1bab124d4648dca724 ) to apply to the current build (0.14.51, plus commits up to yesterday). I think I’ve got it right - it builds successfully - but I’d be really grateful if a more experienced pair of eyes could look over it before I deploy it.
Your commit won’t work: you’re applying the limit after the files were queued, i.e. the RAM usage has already occurred (see the sketch below). Also, @calmh closed the PR with a comment about a corner case:
This has significant merge conflicts now, and I also discovered another corner case where this wasn’t great, be back later
Maybe that’s minor or not a problem at all, if you know what the corner case is. However, I wouldn’t deploy it without confirmation that this is the case.
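To illustrate the first point, here’s a minimal sketch of the difference (made-up types and numbers, nothing like the actual puller code): the memory budget has to be checked while items are being queued, because once something is in the queue its memory is already allocated, and trimming the queue afterwards gains nothing.

```go
package main

import "fmt"

// file is a stand-in for a queued item; Size is the memory we expect the
// pull of that item to hold. These are made-up types, not Syncthing's.
type file struct {
	Name string
	Size int64
}

// queueWithBudget checks the budget while queuing: once the estimated
// memory use would exceed the budget, further items are deferred to a
// later iteration instead of being queued. Trimming the queue after it
// was filled would not help, because by then the memory for the queued
// entries has already been allocated.
func queueWithBudget(candidates []file, budget int64) (queued, deferred []file) {
	var used int64
	for _, f := range candidates {
		if used+f.Size > budget {
			deferred = append(deferred, f)
			continue
		}
		used += f.Size
		queued = append(queued, f)
	}
	return queued, deferred
}

func main() {
	candidates := []file{{"a", 40}, {"b", 30}, {"c", 50}, {"d", 10}}
	queued, deferred := queueWithBudget(candidates, 80)
	fmt.Println("queued:", queued)
	fmt.Println("deferred:", deferred)
}
```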
Oops - will look again and try to understand the function flow…
Really? I can’t find that - I can only see the (linked) commit which I used as the basis for this rework, and that only mentions the issue with renames falling across batches of pulls.
Sorry to keep pestering - but I’m seeing some things in the logs, and I want to make sure it’s not something to be worried about…
I’m seeing lots and lots of entries like 2018-10-13 14:08:57 sendreceive/*****-*****@0xc0001aab00 parent not missing <file path here>, for different files. As far as I can see, the listed files are ones which exist on the other node in the cluster, but don’t yet exist in place on the machine that is producing the log entries.
In this situation, I’m running the queue limiting version, and the machine is in a disk full situation (well below 1% free space, but still 90GB free).
Are the log entries just noise caused by the specific circumstance here?
That’s a perfectly normal debug level log line when pulling. Arguably a useless one, but nevertheless it’s debug level, so nothing you should trouble yourself with unless you are investigating something related.
Just curious, having seen your screenshot:
Which value do you use for fs.inotify.max_user_watches?
And does this also contribute to the memory footprint?
Sorry - I feel like I’m being a real pain, but I’m trying to solve an issue where I have two partly-synced machines with a huge dataset - and I keep running into problems!
So - I’m running the rebase of Jakob’s queue limiting that Simon very kindly provided - but I’m still not seeing any progress.
Looking into it, it seems that I’m being scuppered by my disk full condition: every file that is being queued for processing (up to the queue size limit) is failing - and so the next block of files contains exactly the same entries as before - and consequently fails, and so on.
Thus I’m never getting far enough through my list of files to get any deletions processed - and so I can’t free up any space to allow normal sync to resume.
Rather than press anyone for yet more custom patches, is it possible to human-read the list of outstanding sync operations to identify the deletions and manually do them?
You might have found one of the corner cases Jakob meant
The problem is that queuing files is aborted due to the memory limit before deletions are processed. Usually those files would then not be requeued on the second run, but because they fail due to the disk-full condition, they get queued again. The two features (queue limiting and processing deletions when the disk is full) are just not compatible (the way they are implemented now).
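To make the deadlock concrete, here’s a rough sketch (hypothetical types, not the real puller) of why every pass queues the same items: the cap selects the same leading entries each time, they all fail while the disk is full, and the deletions sitting behind the cap are never reached.

```go
package main

import "fmt"

// item is a stand-in for one needed change: Delete marks a deletion,
// everything else is a pull that needs free disk space. Hypothetical
// types, not Syncthing's.
type item struct {
	Name   string
	Delete bool
}

// onePass mimics a single puller iteration with a hard cap on how many
// items get queued. While the disk is full every pull fails, so those
// items stay needed; deletions sit behind the cap and are never reached,
// which means the next pass queues exactly the same failing items.
func onePass(needed []item, limit int) (stillNeeded []item) {
	n := limit
	if n > len(needed) {
		n = len(needed)
	}
	for _, it := range needed[:n] {
		if it.Delete {
			continue // a deletion would succeed and drop off the list
		}
		// the pull fails with "no space left on device": keep the item
		stillNeeded = append(stillNeeded, it)
	}
	// anything beyond the cap never entered the queue and is still needed
	stillNeeded = append(stillNeeded, needed[n:]...)
	return stillNeeded
}

func main() {
	needed := []item{{"a", false}, {"b", false}, {"c", false}, {"old", true}}
	for pass := 1; pass <= 3; pass++ {
		needed = onePass(needed, 3)
		fmt.Printf("after pass %d, still needed: %v\n", pass, needed)
	}
}
```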
You can open the list of out of sync items in the web UI and prioritize items (i.e. put them at the head of the queue).
Unfortunately I have 8.5 million out-of-sync items, so the list takes a long time to display and page through.
So can I double-check something before I dive in elsewhere?
This setup just has two nodes; there are a lot of files which have been reorganised into different folders on the other node - but this node has only partially caught up with the move. If I were to delete the remaining files in their old location on this node, that won’t cause any deletion of the corresponding files on the other node, will it? (Because they now have a different file path…)
I know this will then result in increased pull times as this node will then have to transfer data from the other node (rather than just copying items into place) - but it’s the only way I can think of to get around this deadlock. (Unfortunately adding more disk capacity temporarily won’t help - although a Synology NAS volume can be expanded to include an additional disk, it then can’t be shrunk again once the requirement has passed.)
Syncthing doesn’t keep any links between moved files. It just looks at what it is pulling, and if it sees two files with equal content, one removed and one added, it does a move instead. So indeed you are good to remove those files manually, if they were already removed on the other device.
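For illustration, a simplified sketch of that idea (made-up types and hash matching, not the actual implementation): during a pull, deletions and additions with identical content get paired up and turned into local renames, and anything unpaired is handled as an ordinary pull or deletion. Once the old copies are gone locally there is nothing left to pair against, so the data gets transferred instead of renamed into place - which matches what you already expected.

```go
package main

import "fmt"

// change is a stand-in for one entry in the pull queue: either a new
// file to fetch or a deletion, identified here by a content hash.
// Hypothetical types; the real matching works on the files' block lists.
type change struct {
	Path   string
	Hash   string
	Delete bool
}

// planMoves pairs deletions and additions that carry identical content
// and turns each pair into a local rename so the data is never fetched
// again. Unpaired changes remain plain pulls or deletions. This is only
// the shape of the idea, not the actual implementation.
func planMoves(changes []change) (renames [][2]string, rest []change) {
	deletedByHash := map[string]string{} // content hash -> old path
	for _, c := range changes {
		if c.Delete {
			deletedByHash[c.Hash] = c.Path
		}
	}
	paired := map[string]bool{} // old paths that became renames
	for _, c := range changes {
		if c.Delete {
			continue
		}
		if old, ok := deletedByHash[c.Hash]; ok && !paired[old] {
			renames = append(renames, [2]string{old, c.Path})
			paired[old] = true
			continue
		}
		rest = append(rest, c) // no matching deletion: ordinary pull
	}
	for _, c := range changes {
		if c.Delete && !paired[c.Path] {
			rest = append(rest, c) // no matching addition: ordinary deletion
		}
	}
	return renames, rest
}

func main() {
	changes := []change{
		{Path: "old/a.txt", Hash: "h1", Delete: true},
		{Path: "new/a.txt", Hash: "h1"},
		{Path: "old/b.txt", Hash: "h2", Delete: true},
	}
	renames, rest := planMoves(changes)
	fmt.Println("renames:", renames)
	fmt.Println("plain pulls/deletes:", rest)
}
```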