v1.8.0-rc.3 panic during "upgrade restart"

Catfriend1 · July 28, 2020, 12:38pm

Hi,

I did the Syncthing upgrade a few minutes ago because of the versioning bug that’s been solved in rc3. After hitting the upgrade button on the Web UI, the upgrader restarted Syncthing without me doing anything.

Syncthing came back to life again. It was scanning before the update and continued scanning a folder after the update, according to the Web UI.

I noticed that three panic logs were created shortly after the upgrade started.

Link: https://pastebin.com/2HewtFfM

All three look similar, pointing at d.(*Snapshot).GetGlobal and “panic: device present in global list but missing as device/fileinfo entry”.

This might be related to https://github.com/syncthing/syncthing/issues/6855, but my instance did not go offline forever. Looking only at the Web UI, I wouldn’t even have noticed there was a panic.

Kind regards, Catfriend1

imsodin · July 28, 2020, 12:57pm

Meaning it recovered after panicking three times? That’s weird.

And there’s no “Checking db due to upgrade” in the pastebin - did it occur after updating to rc.3?

imsodin · July 28, 2020, 1:58pm

Don’t you have the logs (not panic files) anymore?

And responding to your message on the PR (because the restarting stuff discussed here is essential to it):

“can I run this straight off the PR’s teamcity build to get more diagnose info what’s wrong with my cluster? I expect the panic to happen again as soon as Syncthing v1.8.0-rc.3 finishes the rescan on the node where it crashed constantly before.”
(quoted from: lib/db: Include blocks in db check (ref #6855) by imsodin · Pull Request #6861 · syncthing/syncthing)

That PR “enhances” the check/repair that happens on upgrade. As such, if it is relevant to your problem, it will temporarily fix it. That would tell us your problem was related to blocks, but it would also remove the actual problem (i.e. no more info on it would be available). If you do this, please first run without the PR with STRECHECKDBEVERY=1s, just in case the db check really didn’t run, and then apply the PR if that didn’t help.

As to the device C and what you describe on the issue: The old device ID (shortID) showing in the version is perfectly normal, even if that device doesn’t exist anywhere anymore. That’s not an indication of a problem but working as expected.

All in all I still don’t have an idea what might be causing these panics.

Edit: Actually, if you have the db of the device that started panicking after a full db reset while scanning, I’d be interested in that. I know I asked for and got lots of dbs from you without delivering anything; I hope you still have some faith left.
