Beta test of Syncthing build with alternate database backend ("boltdb")

well, it’s been just about 20 days… no db corruption.

Syncthing does seem to crash regularly, without negative effects. Restart it, and all is well. I don’t know if this should be pursued in this thread…?

Please post a few crash dumps. Not having any database problems may just be a sign of boltdb not actually checking for correctness as leveldb does. If there is corruption due to bad drivers or hardware, those crashes may be the remaining sign.

Bolt fsyncs on data write and meta write and will do a consistency check on the meta pages but you’re right that it doesn’t perform a checksum on the data pages. You can run “bolt check” on the data file if you want to verify integrity. It’ll walk the whole b+tree and verify that all pages are either accessible or are moved to the free list.

If you’re seeing crashes but the application recovers fine then I don’t think it would be a data corruption issue. I’d be curious to see the stack traces as well.
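For anyone following along, here is a toy sketch of the kind of check described above: walk the tree from the root and confirm that every page is either reachable or on the free list. This is not Bolt's code. The `page` type, the `checkPages` function, and the page numbers are made up for illustration, and a real check looks at far more than this.

```go
package main

import (
	"fmt"
	"sort"
)

// page is a hypothetical, heavily simplified page: it only records
// the IDs of its child pages.
type page struct {
	children []int
}

// checkPages walks the tree from root and classifies every known page.
// orphans are pages that are neither reachable nor on the free list.
// conflicts are pages that are reachable and also marked free.
func checkPages(pages map[int]page, root int, free []int) (orphans, conflicts []int) {
	reachable := map[int]bool{}
	stack := []int{root}
	for len(stack) > 0 {
		id := stack[len(stack)-1]
		stack = stack[:len(stack)-1]
		if reachable[id] {
			continue // already visited
		}
		reachable[id] = true
		stack = append(stack, pages[id].children...)
	}

	freed := map[int]bool{}
	for _, id := range free {
		freed[id] = true
	}

	for id := range pages {
		switch {
		case reachable[id] && freed[id]:
			conflicts = append(conflicts, id)
		case !reachable[id] && !freed[id]:
			orphans = append(orphans, id)
		}
	}
	sort.Ints(orphans)
	sort.Ints(conflicts)
	return orphans, conflicts
}

func main() {
	// Page 1 is the root with children 2 and 3; page 4 is free;
	// page 5 is referenced by nothing and is not on the free list.
	pages := map[int]page{
		1: {children: []int{2, 3}},
		2: {}, 3: {}, 4: {}, 5: {},
	}
	orphans, conflicts := checkPages(pages, 1, []int{4})
	fmt.Println("orphans:", orphans, "conflicts:", conflicts)
}
```

Note that a structural walk like this says nothing about whether the bytes inside a page are still the ones that were written, which is the gap discussed in this thread.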

No logs of the crashes were created, at least none in %localappdata%/Syncthing. Since my previous post I have started Syncthing with the -verbose option.

Should I also use STTRACE=all to make sure nothing gets lost?

There should certainly be a panic file if it panics, or at least output in syncthing.log (on Windows) if something else weird happens. The STTRACE setting doesn’t affect crash logging either way. How have you determined that it crashed to start with?

I checked on it at one point, and the next time I had a look the console window was gone. The host had not gone to sleep, and it hadn’t rebooted or similar.

Hello! I have been using the Syncthing build with the alternate database backend (“boltdb”) for almost 2 weeks and have never had a database error. For myself, I concluded that ST with the old database backend contains bugs, because database failures occurred often and I had to completely remove the index files. My question is: the main ST version is now 0.11.13, but the version with “boltdb” is still 0.11.10. Will the alternate version be updated?

Just found this after a mention in a different thread I started. If this looks like the future of ST, I’d love to test an updated build! I’m having a ton of db issues on what I use as the “server” for the clients to connect to. If I’m gonna have to rebuild… I’ll probably try this one anyhow.

I’ve updated the build at http://build.syncthing.net/job/syncthing-bolt/ to be v0.11.13+bolt.

Thanks!

error log : http://46e854b36581e262.paste.se/

Thanks for the updated build. Installed on OS X 10.10.4.

Database rebuilt successfully with ~210,000 files ~113GB.

Whatever issue(s) I was having with the old goleveldb seem to have completely vanished with this new version.

It also rebuilt in significantly less time than the previous version. Wow.

That error is essentially equal to what leveldb was telling you - the database is corrupt and returning incorrect data. I think that at least in your case we can rule out that it’s a database bug causing the corruption.

I assume it just reads garbage in this case?

@calmh, while I understand what you’re saying (and I would have the same reaction in your shoes), this crash was perhaps not a hardware issue. I came to the machine because other nodes showed it as disconnected. I tried to stop ST with Ctrl-C, and the console window didn’t respond. About a minute later, Windows crashed and made a crash dump file. Anyway, ST started up normally after this. Also, ST has been stable for nearly a month on the boltdb builds, on the exact same hardware as before. If the hardware is/was at fault with the regular db, well…it has cured itself.

Yeah, I don’t know. Leveldb checksums all data, so any discrepancy is detected. Boltdb does not, so we’ll only discover an issue when the format gets broken, not if for example a file name gets changed from foo to fop or a block checksum gets changed. The Windows crash isn’t a good sign either. I’m just saying that you seem to have a lot of crashes and database corruption issues that most users don’t, so your system is slightly suspicious.
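To make the foo/fop point concrete, here is a minimal sketch of per-record checksumming using Go's standard `hash/crc32` package. It is not how either database lays out its data. The `record` type and its helpers are hypothetical, and CRC-32C is picked only for illustration.

```go
package main

import (
	"fmt"
	"hash/crc32"
)

// castagnoli is the CRC-32C table from the standard library.
var castagnoli = crc32.MakeTable(crc32.Castagnoli)

// record is a hypothetical stored entry: a payload plus the checksum
// that was computed when the payload was written.
type record struct {
	payload []byte
	sum     uint32
}

// newRecord copies the payload and stores its CRC-32C alongside it.
func newRecord(payload []byte) record {
	p := append([]byte(nil), payload...)
	return record{payload: p, sum: crc32.Checksum(p, castagnoli)}
}

// verify reports whether the payload still matches the stored checksum.
func (r record) verify() bool {
	return crc32.Checksum(r.payload, castagnoli) == r.sum
}

func main() {
	r := newRecord([]byte("foo"))
	fmt.Println("fresh record verifies:", r.verify())

	// Simulate on-disk corruption: "foo" silently becomes "fop".
	r.payload[2] = 'p'
	fmt.Println("corrupted record verifies:", r.verify())
}
```

Without the stored checksum, "fop" is a perfectly valid file name and nothing in the structure of the database would flag it.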

Noted, and I mostly agree. Except that I used to have corruption with goleveldb daily (on average), whereas on bolt a crash happens only after weeks of uptime. On the same hardware & OS.

You seem to be saying that bolt is less strict with db consistency issues? If this is the case, might it not be worth considering creating some kind of db maintenance app?

Yep, he is:

I think I’ve found the cause for ST crashing the OS, without any mention in the ST logs. Have a look at the screenshot, ST is consuming all available memory - and then some - causing Windows to crash.

While that’s an impressive amount of memory usage, I’m pretty sure that Windows won’t just crash if a process is using too much memory. If memory serves, it will display multiple warnings saying that it’s running low on memory, then it will start killing processes, but it won’t crash.