I wanted to add some “assert-style” panics to the code and run integration-tests in the hope of getting something tangible regarding https://github.com/syncthing/syncthing/issues/5340. The I came across this very weird issue:
Are we missing something with the suture-should-not-restart-panics in that code path?
If the code path is from the HTTP server it would swallow the panic (though also print it) and we’d hang on the lock next time through. addFile though shouldn’t get called that way, afaik.
Pity, I was hoping this was some not so obvious consequence of an obscure language spec that you knew and not a weird problem that should be impossible and might also be specific to my system/syncthing integration tests/moon phase/…
Well, it’s not just you - I see the same thing. Panic before the lock panics as it should. Panic inside the lock is swallowed. By what, I don’t know, but it’s bad and it shouldn’t happen.
Edit: Oh, I know. addFiles is called from (s *FileSet) Update(...) and it panics somewhere in the middle. But defers run on panic, and we have a defer s.meta.toDB(...) in there which tries to acquire the same lock while unwinding from the panic. So the panic never completes.
If addFiles used a defer unlock instead (and we don’t panic precisely between the lock and the defer-unlock, which should be impossible) it works.
Thanks a lot for finding the cause!
Apparently Go’s panic isn’t panicked enough if it lets itself be restrained by something as feeble as a simple lock
I found something completely puzzling concerning metadataTracker, which means it’s either broken or I am too tired already. To rule out the second I’ll go to sleep and have a look another time. I have been sleeping
I think this is a double whammy of not using deferred unlock and doing stuff that takes locks in a defer. Both of those are probably suboptimal practice.