As a brief aside: I’ve seen ignore caching enabled on some other Syncthing instances I look after - I definitely wouldn’t have manually enabled this, so I wonder if it got enabled in a past RC, before being defaulted to disabled? Just thought I’d highlight this, in case anyone else is having RAM issues…
Thanks for the heads up, glad to hear it’s working nicely for you!
I just looked at the git log: the default for ignore caching has been disabled since 2016-04-03, and it was also disabled for existing files back then. Since that commit there have been no changes to that setting, so I have no clue why it’s enabled for some of your folders.
Thanks Simon. I can’t explain the ignores caching being enabled then - I can’t remember turning it on - but that must be me!
On a related note: I’ve been reworking Jakob’s puller queue limiting patch ( https://github.com/calmh/syncthing/commit/82c5fc008ce82a02b5d5ad1bab124d4648dca724 ) to apply to the current build (0.14.51, plus commits up to yesterday). I think I’ve got it right - it builds successfully - but I’d be really grateful if a more experienced pair of eyes could look over it before I deploy it.
Your commit won’t work: you’re applying the limit after the files were queued, i.e. the RAM usage has already occurred (see the sketch below). Also, @calmh closed the PR with a comment about a corner case:
This has significant merge conflicts now, and I also discovered another corner case where this wasn’t great, be back later
Maybe that’s minor or not a problem at all, if you know what the corner case is. However, I wouldn’t deploy it without confirmation that this is the case.
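To illustrate the first point, here’s a minimal sketch of the difference (made-up types and numbers, nothing like the actual puller code): the memory budget has to be checked while items are being queued, because once something is in the queue its memory is already allocated, and trimming the queue afterwards gains nothing.

```go
package main

import "fmt"

// file is a stand-in for a queued item; Size is the memory we expect the
// pull of that item to hold. These are made-up types, not Syncthing's.
type file struct {
	Name string
	Size int64
}

// queueWithBudget checks the budget while queuing: once the estimated
// memory use would exceed the budget, further items are deferred to a
// later iteration instead of being queued. Trimming the queue after it
// was filled would not help, because by then the memory for the queued
// entries has already been allocated.
func queueWithBudget(candidates []file, budget int64) (queued, deferred []file) {
	var used int64
	for _, f := range candidates {
		if used+f.Size > budget {
			deferred = append(deferred, f)
			continue
		}
		used += f.Size
		queued = append(queued, f)
	}
	return queued, deferred
}

func main() {
	candidates := []file{{"a", 40}, {"b", 30}, {"c", 50}, {"d", 10}}
	queued, deferred := queueWithBudget(candidates, 80)
	fmt.Println("queued:", queued)
	fmt.Println("deferred:", deferred)
}
```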
Oops - will look again and try to understand the function flow…
Really? I can’t find that - I can only see the (linked) commit which I used as the basis for this rework, and that only mentions the issue with renames falling across batches of pulls.
Sorry to keep pestering - but I’m seeing some things in the logs, and I want to make sure it’s not something to be worried about…
I’m seeing lots and lots of entries like 2018-10-13 14:08:57 sendreceive/*****-*****@0xc0001aab00 parent not missing <file path here>, for different files. As far as I can see, the listed files are ones which exist on the other node in the cluster, but don’t yet exist in place on the machine that is producing the log entries.
In this situation, I’m running the queue limiting version, and the machine is in a disk full situation (well below 1% free space, but still 90GB free).
Are the log entries just noise caused by the specific circumstance here?
That’s a perfectly normal debug level log line when pulling. Arguably a useless one, but nevertheless it’s debug level, so nothing you should trouble yourself with unless you are investigating something related.
Just curious, having seen your screenshot:
Which value do you use for fs.inotify.max_user_watches?
And does this also contribute to the memory footprint?
Sorry - I feel like I’m being a real pain, but I’m trying to solve an issue where I have two partly-synced machines with a huge dataset - and I keep running into problems!
So - I’m running the rebase of Jakob’s queue limiting that Simon very kindly provided - but I’m still not seeing any progress.
Looking into it, it seems that I’m being scuppered by my disk full condition: every file that is being queued for processing (up to the queue size limit) is failing - and so the next block of files contains exactly the same entries as before - and consequently fails, and so on.
Thus I’m never getting far enough through my list of files to get any deletions processed - and so I can’t free up any space to allow normal sync to resume.
Rather than press anyone for yet more custom patches, is it possible to human-read the list of outstanding sync operations to identify the deletions and manually do them?
You might have found one of the corner cases Jakob meant
The problem is that queuing files is aborted due to the memory limit before deletions are processed. Usually those files would then not be requeued on the second run, but because they fail due to the disk-full condition, they get queued again. The two features (queue limiting and processing deletions when the disk is full) are just not compatible (the way they are implemented now).
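To make the deadlock concrete, here’s a rough sketch (hypothetical types, not the real puller) of why every pass queues the same items: the cap selects the same leading entries each time, they all fail while the disk is full, and the deletions sitting behind the cap are never reached.

```go
package main

import "fmt"

// item is a stand-in for one needed change: Delete marks a deletion,
// everything else is a pull that needs free disk space. Hypothetical
// types, not Syncthing's.
type item struct {
	Name   string
	Delete bool
}

// onePass mimics a single puller iteration with a hard cap on how many
// items get queued. While the disk is full every pull fails, so those
// items stay needed; deletions sit behind the cap and are never reached,
// which means the next pass queues exactly the same failing items.
func onePass(needed []item, limit int) (stillNeeded []item) {
	n := limit
	if n > len(needed) {
		n = len(needed)
	}
	for _, it := range needed[:n] {
		if it.Delete {
			continue // a deletion would succeed and drop off the list
		}
		// the pull fails with "no space left on device": keep the item
		stillNeeded = append(stillNeeded, it)
	}
	// anything beyond the cap never entered the queue and is still needed
	stillNeeded = append(stillNeeded, needed[n:]...)
	return stillNeeded
}

func main() {
	needed := []item{{"a", false}, {"b", false}, {"c", false}, {"old", true}}
	for pass := 1; pass <= 3; pass++ {
		needed = onePass(needed, 3)
		fmt.Printf("after pass %d, still needed: %v\n", pass, needed)
	}
}
```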
You can open the list of out of sync items in the web UI and prioritize items (i.e. put them at the head of the queue).
Unfortunately I have 8.5 million out-of-sync items, so the list takes a long time to display and page through.
So can I double-check something before I dive in elsewhere?
This setup just has two nodes; there are a lot of files which have been reorganised into different folders on the other node - but this node has only partially caught up with the move. If I were to delete the remaining files in their old location on this node, that won’t cause any deletion of the corresponding files on the other node, will it? (Because they now have a different file path…)
I know this will then result in increased pull times as this node will then have to transfer data from the other node (rather than just copying items into place) - but it’s the only way I can think of to get around this deadlock. (Unfortunately adding more disk capacity temporarily won’t help - although a Synology NAS volume can be expanded to include an additional disk, it then can’t be shrunk again once the requirement has passed.)
Syncthing doesn’t keep any links between moved files. It just looks at what it is pulling, and if it sees two files with equal content, one removed and one added, it does a move instead. So indeed you are good to remove those files manually, if they were already removed on the other device.
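For illustration, a simplified sketch of that idea (made-up types and hash matching, not the actual implementation): during a pull, deletions and additions with identical content get paired up and turned into local renames, and anything unpaired is handled as an ordinary pull or deletion. Once the old copies are gone locally there is nothing left to pair against, so the data gets transferred instead of renamed into place - which matches what you already expected.

```go
package main

import "fmt"

// change is a stand-in for one entry in the pull queue: either a new
// file to fetch or a deletion, identified here by a content hash.
// Hypothetical types; the real matching works on the files' block lists.
type change struct {
	Path   string
	Hash   string
	Delete bool
}

// planMoves pairs deletions and additions that carry identical content
// and turns each pair into a local rename so the data is never fetched
// again. Unpaired changes remain plain pulls or deletions. This is only
// the shape of the idea, not the actual implementation.
func planMoves(changes []change) (renames [][2]string, rest []change) {
	deletedByHash := map[string]string{} // content hash -> old path
	for _, c := range changes {
		if c.Delete {
			deletedByHash[c.Hash] = c.Path
		}
	}
	paired := map[string]bool{} // old paths that became renames
	for _, c := range changes {
		if c.Delete {
			continue
		}
		if old, ok := deletedByHash[c.Hash]; ok && !paired[old] {
			renames = append(renames, [2]string{old, c.Path})
			paired[old] = true
			continue
		}
		rest = append(rest, c) // no matching deletion: ordinary pull
	}
	for _, c := range changes {
		if c.Delete && !paired[c.Path] {
			rest = append(rest, c) // no matching addition: ordinary deletion
		}
	}
	return renames, rest
}

func main() {
	changes := []change{
		{Path: "old/a.txt", Hash: "h1", Delete: true},
		{Path: "new/a.txt", Hash: "h1"},
		{Path: "old/b.txt", Hash: "h2", Delete: true},
	}
	renames, rest := planMoves(changes)
	fmt.Println("renames:", renames)
	fmt.Println("plain pulls/deletes:", rest)
}
```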