This database closing stuff

calmh · July 28, 2019, 6:32pm

So, there’s all this magic with handling errClosed, having wrapper snapshot types and wrapper iterator types and now I’d need wrapper transaction types etc. Can we not just wait for all services to shut down before closing the database, and ensure similar things in tests? At the moment I’m suffering backtraces like these which took me a few minutes to figure out WTF it’s saying

fatal error: sync: Unlock of unlocked RWMutex
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x4748c76]

goroutine 335 [running]:
runtime.throw(0x4c36cb7, 0x20)
	/usr/local/go/src/runtime/panic.go:617 +0x72 fp=0xc00a1db030 sp=0xc00a1db000 pc=0x402fb82
sync.throw(0x4c36cb7, 0x20)
	/usr/local/go/src/runtime/panic.go:603 +0x35 fp=0xc00a1db050 sp=0xc00a1db030 pc=0x402fb05
sync.(*RWMutex).Unlock(0xc0000e4140)
	/usr/local/go/src/sync/rwmutex.go:124 +0xeb fp=0xc00a1db090 sp=0xc00a1db050 pc=0x408010b
runtime.call32(0x0, 0x4daecf0, 0xc00015f4b0, 0x800000008)
	/usr/local/go/src/runtime/asm_amd64.s:519 +0x3b fp=0xc00a1db0c0 sp=0xc00a1db090 pc=0x405e31b
panic(0x4b50ec0, 0x53da110)
	/usr/local/go/src/runtime/panic.go:522 +0x1b5 fp=0xc00a1db150 sp=0xc00a1db0c0 pc=0x402f6b5
runtime.panicmem(...)
	/usr/local/go/src/runtime/panic.go:82
runtime.sigpanic()
	/usr/local/go/src/runtime/signal_unix.go:390 +0x411 fp=0xc00a1db180 sp=0xc00a1db150 pc=0x4046421
github.com/syndtr/goleveldb/leveldb.(*Transaction).Commit(0x0, 0x0, 0x0)
	/Users/jb/go/pkg/mod/github.com/syndtr/goleveldb@v1.0.1-0.20190318030020-c3a204f8e965/leveldb/db_transaction.go:191 +0x56 fp=0xc00a1db310 sp=0xc00a1db180 pc=0x4748c76
github.com/syncthing/syncthing/lib/db.readWriteTransaction.close(0x4dab700, 0x0, 0x0)
	/Users/jb/dev/github.com/syncthing/syncthing/lib/db/transactions.go:152 +0x3d fp=0xc00a1db350 sp=0xc00a1db310 pc=0x47a955d
runtime.call32(0x0, 0x4c52778, 0xc00064c9e0, 0x1800000018)
	/usr/local/go/src/runtime/asm_amd64.s:519 +0x3b fp=0xc00a1db380 sp=0xc00a1db350 pc=0x405e31b
panic(0x4b50ec0, 0x53da110)
	/usr/local/go/src/runtime/panic.go:522 +0x1b5 fp=0xc00a1db410 sp=0xc00a1db380 pc=0x402f6b5
runtime.panicmem(...)
	/usr/local/go/src/runtime/panic.go:82
runtime.sigpanic()
	/usr/local/go/src/runtime/signal_unix.go:390 +0x411 fp=0xc00a1db440 sp=0xc00a1db410 pc=0x4046421
github.com/syndtr/goleveldb/leveldb.(*Transaction).NewIterator(0x0, 0xc003f23b00, 0x0, 0x0, 0x0)
	/Users/jb/go/pkg/mod/github.com/syndtr/goleveldb@v1.0.1-0.20190318030020-c3a204f8e965/leveldb/db_transaction.go:80 +0x53 fp=0xc00a1db4e8 sp=0xc00a1db440 pc=0x47471a3
github.com/syncthing/syncthing/lib/db.(*readWriteTransaction).NewIterator(0xc0000b5580, 0xc003f23b00, 0x0, 0xc003f23b00, 0x0)
	<autogenerated>:1 +0x7b fp=0xc00a1db538 sp=0xc00a1db4e8 pc=0x47b506b
github.com/syncthing/syncthing/lib/db.checkGlobals(0x4db3880, 0xc0000b5580, 0x4dbd860, 0xc003f208a0, 0xc000636d28, 0x7, 0x8, 0xc00015fcc0)
	/Users/jb/dev/github.com/syncthing/syncthing/lib/db/instance.go:472 +0x14a fp=0xc00a1db898 sp=0xc00a1db538 pc=0x478243a
github.com/syncthing/syncthing/lib/db.(*instance).checkGlobals(0xc0000b5560, 0xc000636d28, 0x7, 0x8, 0xc00015fcc0)
	/Users/jb/dev/github.com/syncthing/syncthing/lib/db/instance.go:468 +0x165 fp=0xc00a1db918 sp=0xc00a1db898 pc=0x47822b5
github.com/syncthing/syncthing/lib/db.(*FileSet).recalcCounts(0xc00015fc40)
	/Users/jb/dev/github.com/syncthing/syncthing/lib/db/set.go:94 +0x231 fp=0xc00a1db9f8 sp=0xc00a1db918 pc=0x478fdc1
github.com/syncthing/syncthing/lib/db.NewFileSet(0x4c251e6, 0x7, 0x4dbf480, 0xc00a124230, 0xc0001f20f0, 0x4c25ab3)
	/Users/jb/dev/github.com/syncthing/syncthing/lib/db/set.go:82 +0x4f0 fp=0xc00a1dbb18 sp=0xc00a1db9f8 pc=0x478f8d0
github.com/syncthing/syncthing/lib/model.(*model).RestartFolder(0xc001244000, 0x4c251e6, 0x7, 0x0, 0x0, 0x0, 0x4c25ab3, 0x8, 0x0, 0xc00a0d3800, ...)
	/Users/jb/dev/github.com/syncthing/syncthing/lib/model/model.go:470 +0x84b fp=0xc00a1dc3f0 sp=0xc00a1dbb18 pc=0x499534b
github.com/syncthing/syncthing/lib/model.(*model).CommitConfiguration(0xc001244000, 0x0, 0xc000f4db80, 0x2, 0x2, 0xc00a11f3b0, 0x1, 0x1, 0x0, 0x0, ...)
	/Users/jb/dev/github.com/syncthing/syncthing/lib/model/model.go:2518 +0xe35 fp=0xc00a1de388 sp=0xc00a1dc3f0 pc=0x49b6575
github.com/syncthing/syncthing/lib/config.(*wrapper).notifyListener(0xc0011f6000, 0x4dab080, 0xc001244000, 0x0, 0xc000f4db80, 0x2, 0x2, 0xc00a11f3b0, 0x1, 0x1, ...)
	/Users/jb/dev/github.com/syncthing/syncthing/lib/config/wrapper.go:239 +0x20d fp=0xc00a1df1a0 sp=0xc00a1de388 pc=0x46dddbd
github.com/syncthing/syncthing/lib/config.(*wrapper).notifyListeners.func1(0xc0011f6000, 0xc0011af880, 0xc0011afc00, 0x4dabc00, 0xc00a1148d0, 0x4dab080, 0xc001244000)
	/Users/jb/dev/github.com/syncthing/syncthing/lib/config/wrapper.go:230 +0x166 fp=0xc00a1dffa8 sp=0xc00a1df1a0 pc=0x46e5506
runtime.goexit()
	/usr/local/go/src/runtime/asm_amd64.s:1337 +0x1 fp=0xc00a1dffb0 sp=0xc00a1dffa8 pc=0x4060011
created by github.com/syncthing/syncthing/lib/config.(*wrapper).notifyListeners
	/Users/jb/dev/github.com/syncthing/syncthing/lib/config/wrapper.go:229 +0x2a7

because I’m not correctly handling the fact that the database might be closed and yet we’re calling newReadWriteTransaction on it (and have no error return)…

I guess another option is to take the bull by the horn and bubble all errors? At the highest level that should result in some Serve() returning when it can’t handle it, which is the desirable result anyway…

calmh · July 28, 2019, 8:55pm

I’m gonna bubble.

And I’m going to do it as a second commit on top of the transaction stuff, because both of those things touch like every line of code and the conflicts from doing both separately in parallel would kill me.

AudriusButkevicius · July 28, 2019, 10:16pm

Sure, makes sense.

imsodin · July 29, 2019, 5:52am

The problem all this wrapping solves is that goleveldb panics when its methods are called and that iterators must be released before closing. There are no guarantees that once all services stopped the db isn’t accessed anymore. Guaranteeing that would require additional mechanisms in model (and maybe others) to protect/track each and every db access.

calmh · July 29, 2019, 6:22am

Are you sure? I haven’t investigated this yet for iterators, but for the transaction thing I get ErrClosed at all stages (creating the transaction, writing to it, committing it) which means I can abort. It’s just that at the moment we have no way of returning the error, and if I for example do further calls on an ErrClosed-transaction then it will panic.

imsodin · July 29, 2019, 6:37am

I am pretty sure for NewIterator, however not so much anymore whether it was only that or if there’s also other methods that behave badly (i.e. don’t just return ErrClosed). In my memory it feels like there must have been more problems than just iterators (but that might be entirely wrong) and the documentation has a should on not calling any more methods, so I’d be cautious:

It is not safe to close a DB until all outstanding iterators are released. It is valid to call Close multiple times. Other methods should not be called after the DB has been closed.

calmh · July 29, 2019, 6:39am

Fair enough, we might need to keep the iterator wrappers. I’ll check it out along the way.