Excessive RAM Usage - Continued Continued


#1

Dammit - just missed the closure of my previous thread… :laughing:

Hello:

Following on from Excessive RAM Usage and Excessive RAM Usage - Continued

Thought I’d just feed back on this: since disabling ignore caching and applying the puller queue limit, my RAM usage has been miraculously low:

Many thanks for your help in addressing this.

As a brief aside: I’ve seen ignore caching enabled on some other Syncthing instances I look after - I definitely wouldn’t have manually enabled this, so I wonder if it got enabled in a past RC, before being defaulted to disabled? Just thought I’d highlight this, in case anyone else is having RAM issues…

Thanks,

Pants.


(Simon) #2

Thanks for the heads up, glad to hear it’s working nicely for you!

I just looked at the git log: ignores caching has been disabled by default since 2016-04-03, and it was also disabled for existing files back then. There have been no changes to that setting since that commit, so I have no clue why it’s enabled for some of your folders.


#3

Thanks Simon. I can’t explain the ignores caching being enabled then - I can’t remember turning it on - but that must be me!

On a related note: I’ve been reworking Jakob’s puller queue limiting patch ( https://github.com/calmh/syncthing/commit/82c5fc008ce82a02b5d5ad1bab124d4648dca724 ) to apply onto the current build (0.14.51, plus commits up to yesterday). I think I’ve got it right - it builds successfully - but I’d be really grateful if a more experienced pair of eyes could look over it before I deploy it.

My commit is here: https://github.com/MoisieSyncthing/syncthing/commit/ddcd0c698aa08e44fa60da755aed7680c52887cf

And, FYI, my reason for patching onto 0.14.51: to take advantage of the ‘deletions still run when the disk is full’ feature.


(Simon) #4

Your commit won’t work: you’re applying the limit after the files have been queued, i.e. the RAM usage has already occurred. Also, @calmh closed the PR with a comment about a corner case:

This has significant merge conflicts now, and I also discovered another corner case where this wasn’t great, be back later

Maybe that’s minor or not a problem, if you are aware of what the corner case is. However I wouldn’t deploy it without confirmation that this is so.


(Jakob Borg) #5

Cryptic me is the best me. Even I don’t know what I was talking about anymore…


#6

Hi - thanks for chiming in!

Oops - will look again and try to understand the function flow… :grin:

Really? I can’t find that - I can only see the (linked) commit which I used as the basis for this rework, and that only mentions the issue with renames falling across batches of pulls.


(Simon) #7

Didn’t really use my brain while resolving conflicts, so use at own risk :wink:


#8

Hi Simon:

Amazing - thanks so much! I owe you a pint! :grinning:


#9

Hello again:

Sorry to keep pestering - but I’m seeing some things in the logs, and I want to make sure it’s not something to be worried about…

I’m seeing lots and lots of entries of 2018-10-13 14:08:57 sendreceive/*****-*****@0xc0001aab00 parent not missing <file path here> for different files. As far as I can see, the listed files are ones which exist on the other node in the cluster, but don’t yet exist in place on the machine that is producing the log entries.

In this situation, I’m running the queue limiting version, and the machine is in a disk full situation (well below 1% free space, but still 90GB free).
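
To make the "below 1% free, but still 90GB free" situation concrete, here is a minimal Python sketch of a percentage-based free-space check. The function name, the 1% default threshold parameter and the 40 TB volume size are assumptions for illustration only; this is not Syncthing's actual code, and the real volume size is not stated in this thread.

```python
def below_min_free(free_bytes: int, total_bytes: int, min_free_pct: float = 1.0) -> bool:
    """Return True when free space, as a percentage of capacity, is under the threshold."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    free_pct = 100.0 * free_bytes / total_bytes
    return free_pct < min_free_pct


GB = 10**9
TB = 10**12

# Hypothetical 40 TB volume with 90 GB free: 0.225 % free, so the check trips
# even though 90 GB sounds like plenty in absolute terms.
print(below_min_free(90 * GB, 40 * TB))
```

On a large volume a percentage threshold can therefore report "disk full" while tens of gigabytes are still available.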

Are the log entries just noise caused by the specific circumstance here?

Many thanks as always,

Pants.


(Simon) #10

That’s a perfectly normal debug level log line when pulling. Arguably a useless one, but nevertheless it’s debug level, so nothing you should trouble yourself with unless you are investigating something related.


#11

Thanks yet again Simon!


#12

Just curious, while seeing your screenshot: Which value do you use for fs.inotify.max_user_watches? And does this also contribute to the memory footprint?
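
For anyone wanting to check their own value, a minimal sketch that reads the setting from procfs on Linux. The path `/proc/sys/fs/inotify/max_user_watches` is the standard location for this sysctl; the function name is just an illustration, and it returns `None` where the file does not exist (for example on non-Linux systems).

```python
import os
from typing import Optional

INOTIFY_WATCHES_PATH = "/proc/sys/fs/inotify/max_user_watches"


def read_max_user_watches(path: str = INOTIFY_WATCHES_PATH) -> Optional[int]:
    """Return the configured inotify watch limit, or None if the file is absent."""
    if not os.path.exists(path):
        return None
    with open(path, "r", encoding="ascii") as handle:
        return int(handle.read().strip())


print(read_max_user_watches())
```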


#13

Hi Le0n:

I don’t yet have the file watcher enabled for my large folders, so I can’t answer this at present.

I did play with it a while back, but I think I hit the limits of what my NAS could cope with, so I turned it off for now.


#14

Hello again:

Sorry - I feel like I’m being a real pain, but I’m trying to solve an issue where I have two partly-synced machines with a huge dataset - and I keep running into issues! :scream_cat:

So - I’m running the rebase of Jakob’s queue limiting that Simon very kindly provided - but I’m still not seeing any progress.

Looking into it, it seems that I’m being scuppered by my disk full condition: every file that is being queued for processing (up to the queue size limit) is failing - and so the next block of files contains exactly the same entries as before - and consequently fails, and so on.

Thus I’m never getting far enough through my list of files to get any deletions processed - and so I can’t free up any space to allow normal sync to resume.

Rather than press anyone for yet more custom patches, is it possible to human-read the list of outstanding sync operations to identify the deletions and manually do them?

Many thanks!


(Simon) #15

You might have found one of the corner cases Jakob meant :slight_smile:
The problem is that queuing files is aborted due to the memory limit before deletions are processed. Usually these files would not be requeued on the second run, but as they are failing due to the disk full condition, they get queued again. The two features (queue limiting and processing deletions when the disk is full) are just not compatible (the way they are implemented now).
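
A toy Python model of that interaction, to show why no progress is made. Everything here is a simplification and an assumption: the names (`Item`, `pull_once`, `run`), the idea that deletions sit at the end of the needed list, and the fact that freed space is not modelled. It is not how Syncthing's puller is actually written.

```python
from collections import namedtuple

# Hypothetical stand-in for a needed item; not a Syncthing type.
Item = namedtuple("Item", ["name", "is_delete"])


def pull_once(needed, queue_limit, disk_full):
    """Queue at most queue_limit items in order; return the names that succeeded."""
    batch = needed[:queue_limit]
    done = []
    for item in batch:
        # Deletions need no space; anything else fails while the disk is full.
        if item.is_delete or not disk_full:
            done.append(item.name)
    return done


def run(needed, queue_limit, disk_full, max_iterations=5):
    """Repeat pull iterations until nothing succeeds; return what is still needed."""
    remaining = list(needed)
    for _ in range(max_iterations):
        done = set(pull_once(remaining, queue_limit, disk_full))
        if not done:
            break  # the same failing batch would be queued again
        remaining = [item for item in remaining if item.name not in done]
    return remaining


needed = [Item("a", False), Item("b", False), Item("c", False), Item("old", True)]
print([item.name for item in run(needed, queue_limit=2, disk_full=True)])
```

With a limit of 2 and a full disk, the batch is always `a` and `b`, both fail, and the deletion of `old` is never reached. With a limit large enough to cover the whole list, the deletion does get processed.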

You can open the list of out of sync items in the web UI and prioritize items (i.e. put them at the head of the queue).


#16

Thanks Simon - I thought as much!

Unfortunately I have 8.5m out of sync items, so the list takes a long time to display and flick through pages.

So can I double-check something before I dive in elsewhere?:

This setup just has two nodes; there are a lot of files which have been reorganised into different folders on the other node - but this node has only partially caught up with the move. If I were to delete the remaining files in their old location on this node, that won’t cause any deletion of the corresponding files on the other node, will it? (Because they now have a different file path…)

I know this will then result in increased pull times as this node will then have to transfer data from the other node (rather than just copying items into place) - but it’s the only way I can think of to get around this deadlock. (Unfortunately adding more disk capacity temporarily won’t help - although a Synology NAS volume can be expanded to include an additional disk, it then can’t be shrunk again once the requirement has passed.)

Many thanks for your thoughts!


(Simon) #17

Syncthing doesn’t make any links between moved files. It just looks at what it is pulling and if it sees two files with equal content, one removed one added, it does a move instead. So indeed you are good to remove those files manually, if they were already removed on the other device.
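
A rough Python sketch of that idea: pair each added path with a removed path that has the same content, and treat the pair as a move. The function name and the use of a single content hash per file are assumptions for illustration; Syncthing's real implementation differs in detail.

```python
def plan_changes(removed, added):
    """removed/added map path -> content hash. Returns (moves, deletes, downloads)."""
    by_hash = {}
    for path, content_hash in removed.items():
        by_hash.setdefault(content_hash, []).append(path)

    moves, downloads = [], []
    for path, content_hash in added.items():
        candidates = by_hash.get(content_hash)
        if candidates:
            # Same content was removed elsewhere: rename locally instead of downloading.
            moves.append((candidates.pop(), path))
        else:
            downloads.append(path)

    # Removed paths that were not paired with an addition are plain deletions.
    deletes = sorted(p for paths in by_hash.values() for p in paths)
    return moves, deletes, downloads


# Old copy still present locally: the change can be handled as a move.
print(plan_changes({"old/a.bin": "h1"}, {"new/a.bin": "h1", "new/b.bin": "h2"}))
# Old copy already deleted by hand: nothing to pair with, so both files are downloaded.
print(plan_changes({}, {"new/a.bin": "h1", "new/b.bin": "h2"}))
```

The second call shows the cost mentioned above: once the old copies are gone locally, the data has to be transferred from the other device again.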


#18

Thanks Simon - much appreciated, as always!

