Please test Badger database backend

Syncthing uses a database to store index information and other things. This is pretty essential to how it works. Currently we’re using a LevelDB implementation, and it’s served us well but we’re seeing a lot of unexplained crashes and other worrying things. For this reason we’ve looked at alternatives. Badger is one alternative and we’ve implemented this as a database backend. However it’s a fundamental change that requires careful testing before any kind of roll out.

This is where you can help. As of Syncthing 1.7.0 it’s easy to switch between the LevelDB and Badger database backends, without having to reindex or reinstall.

There are usage details on the docs site, but the short of it is that if you start with the USE_BADGER environment variable set the database will be converted and you’ll use Badger. If you restart without the environment variable the conversion will be done in the other direction and you’ll use LevelDB as always.
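For anyone launching Syncthing from a script or wrapper, the switch described above can be sketched roughly as follows. This is only an illustration, not the documented procedure: the helper names `backend_env` and `start_syncthing` are made up for this example, the binary is assumed to be called `syncthing` and to be on the PATH, and the value `1` simply mirrors the `USE_BADGER=1` invocation used later in this thread. The docs site remains the authority on the details.

```python
import os
import subprocess


def backend_env(use_badger, base=None):
    """Return a copy of the environment with USE_BADGER set or removed.

    Hypothetical helper: per the post above, starting with the variable
    set selects Badger, and starting without it selects LevelDB.
    """
    env = dict(os.environ if base is None else base)
    if use_badger:
        env["USE_BADGER"] = "1"  # assumed value, as in USE_BADGER=1 below
    else:
        env.pop("USE_BADGER", None)  # absent variable means LevelDB
    return env


def start_syncthing(use_badger, binary="syncthing"):
    """Start Syncthing with the chosen backend (binary name is an assumption)."""
    return subprocess.Popen([binary], env=backend_env(use_badger))


if __name__ == "__main__":
    # Only show what would be passed; nothing is launched here.
    print("USE_BADGER" in backend_env(True))
    print("USE_BADGER" in backend_env(False))
```

Stop the running instance before switching, and back up the database directory first, since the conversion happens at startup.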

We’d love to hear stories of success, but horror stories (should they occur) are the essential ones. Back up your data. :slight_smile:

(For what it’s worth I’ve run all my machines on Badger for the last month without ill effects.)


Later edit

:warning: Don’t test this on Windows, as there are issues. (See below)

There seems to be a huge size difference between the old and the new database. The old index-v0.14.0.db folder was only 155 MB, while the new indexdb.badger takes 2.4 GB. Is this expected?

It’s certainly bigger, though not quite as extreme as that in my testing.

jb@kvin:..on Support/Syncthing % du -sh index*
966M	index-v0.14.0.db.migrated.20200529135506
2.4G	indexdb.badger

Supposedly it has compression similar to leveldb, but apparently the overhead is larger. It would be interesting to see what differs in your database compared to mine, structure-wise. Can you grab a current stindex tool and run something like this (shown here with the output from my database):

root@lun0:~# systemctl stop syncthing@syncthing
root@lun0:~# USE_BADGER=1 ./stindex -mode account /data/syncthing/.config/syncthing/indexdb.badger
 0x00:  390664 items,  26506 KB keys +  57024 KB data, 67 B +  145 B avg,    610 B max
 0x01:  214886 items,  13938 KB keys +  19341 KB data, 64 B +   90 B avg,    306 B max
 0x02: 8624516 items, 783682 KB keys +  34498 KB data, 90 B +    4 B avg,    242 B max
 0x03:       2 items,      0 KB keys +      0 KB data, 72 B +   15 B avg,     87 B max
 0x04:      21 items,      0 KB keys +      0 KB data, 22 B +   17 B avg,     77 B max
 0x06:      11 items,      0 KB keys +      0 KB data,  9 B +    4 B avg,     20 B max
 0x07:      39 items,      0 KB keys +      0 KB data,  9 B +    2 B avg,     41 B max
 0x08:      10 items,      0 KB keys +      0 KB data,  9 B +    8 B avg,     17 B max
 0x09:       5 items,      0 KB keys +      2 KB data,  5 B +  404 B avg,    422 B max
 0x0a:       5 items,      0 KB keys +      0 KB data, 15 B +    8 B avg,     28 B max
 0x0b:  175778 items,   2285 KB keys +  11707 KB data, 13 B +   66 B avg,    234 B max
 0x0d:  177111 items,   5844 KB keys + 872020 KB data, 33 B + 4923 B avg, 101109 B max
 0x0e:  115679 items,  10399 KB keys +      0 KB data, 89 B +    0 B avg,    238 B max
 Total 9698727 items, 842657 KB keys + 994594 KB data.

The rows represent various item types in the database (files, block entries, etc.). The sizes reported are actual, useful data – what we put into the database, not what it actually stores after compression and overhead and whatever.
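If you want to compare two such listings without eyeballing them, a small script can pull the numbers out. The following is a rough sketch, assuming the output keeps exactly the column layout shown above; the regular expression and the function names are mine, not part of stindex.

```python
import re

# Matches rows such as:
#  0x00:  390664 items,  26506 KB keys +  57024 KB data, 67 B +  145 B avg,    610 B max
ROW = re.compile(
    r"^\s*(0x[0-9a-f]{2}):\s*(\d+) items,\s*(\d+) KB keys \+\s*(\d+) KB data"
)


def parse_account(text):
    """Return (key_type, items, keys_kb, data_kb) for every per-type row."""
    rows = []
    for line in text.splitlines():
        match = ROW.match(line)
        if match:  # the "Total" line and anything else is skipped
            key_type, items, keys_kb, data_kb = match.groups()
            rows.append((key_type, int(items), int(keys_kb), int(data_kb)))
    return rows


def totals(rows):
    """Sum items, key KB and data KB over all parsed rows."""
    return (
        sum(row[1] for row in rows),
        sum(row[2] for row in rows),
        sum(row[3] for row in rows),
    )


SAMPLE = """\
 0x00:  390664 items,  26506 KB keys +  57024 KB data, 67 B +  145 B avg,    610 B max
 0x02: 8624516 items, 783682 KB keys +  34498 KB data, 90 B +    4 B avg,    242 B max
 0x0d:  177111 items,   5844 KB keys + 872020 KB data, 33 B + 4923 B avg, 101109 B max
"""

if __name__ == "__main__":
    for row in parse_account(SAMPLE):
        print(row)
    print(totals(parse_account(SAMPLE)))
```

Because each row is rounded to whole KB, sums computed this way may differ slightly from the `Total` line that stindex prints.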

It could also be the case that it’ll shrink a bit after compaction and is at its largest just after migration.

It also seems to use a bit more RAM in my setup than leveldb did, but this may be down to tuning – which we really haven’t started on, yet.

This is the result.

 0x00:  507694 items,  53393 KB keys + 117144 KB data, 105 B +  230 B avg,    894 B max
 0x01:  246456 items,  24899 KB keys +  24767 KB data, 101 B +  100 B avg,    489 B max
 0x02:  629665 items,  71702 KB keys +   2518 KB data, 113 B +    4 B avg,    414 B max
 0x03:       4 items,      0 KB keys +      0 KB data,  72 B +   15 B avg,     87 B max
 0x04:      54 items,      1 KB keys +      1 KB data,  22 B +   20 B avg,    117 B max
 0x05:     219 items,     33 KB keys +      6 KB data, 151 B +   30 B avg,    185 B max
 0x06:      26 items,      0 KB keys +      0 KB data,   9 B +    7 B avg,     20 B max
 0x07:      17 items,      0 KB keys +      0 KB data,   9 B +    9 B avg,     41 B max
 0x08:      40 items,      0 KB keys +      0 KB data,   9 B +    8 B avg,     17 B max
 0x09:      17 items,      0 KB keys +      7 KB data,   5 B +  439 B avg,    634 B max
 0x0a:       5 items,      0 KB keys +      0 KB data,  16 B +   14 B avg,     46 B max
 0x0b:  224179 items,   2914 KB keys +  23246 KB data,  13 B +  103 B avg,    415 B max
 0x0c:      49 items,      3 KB keys +      0 KB data,  77 B +    0 B avg,    187 B max
 0x0d:   15488 items,    511 KB keys +  22627 KB data,  33 B + 1460 B avg, 100725 B max
 0x0e:  132673 items,  17984 KB keys +      0 KB data, 135 B +    0 B avg,    410 B max
 Total 1756586 items, 171444 KB keys + 190320 KB data.
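To put the two listings side by side, the on-disk size can be divided by the useful data that stindex reports. This is a back-of-the-envelope sketch under loud assumptions: that "KB" in the stindex output means 1024 bytes, and that both 2.4 GB figures quoted in this thread are really GiB. The function names are illustrative only.

```python
KIB = 1024  # assumption: stindex "KB" means 1024 bytes


def useful_gib(keys_kb, data_kb):
    """Useful payload (keys plus values) in GiB, from the stindex Total line."""
    return (keys_kb + data_kb) * KIB / KIB**3


def overhead(on_disk_gib, keys_kb, data_kb):
    """How many times larger the on-disk database is than its useful payload."""
    return on_disk_gib / useful_gib(keys_kb, data_kb)


# Totals copied from the two listings in this thread; 2.4 is the reported
# size of indexdb.badger in both cases (assumed to be GiB).
first = overhead(2.4, 842657, 994594)
second = overhead(2.4, 171444, 190320)

if __name__ == "__main__":
    print(f"first database:  {first:.1f}x")
    print(f"second database: {second:.1f}x")
```

Under those assumptions the first database is roughly 1.4 times its payload on disk, while the second is roughly 7 times, which is why the second case looks so extreme.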

There is a problem though. After using stindex, Syncthing refused to start, failing with the following error:

[start] WARNING: Using experimental badger db
[start] WARNING: Error opening database: During db.vlog.open: Value log truncate required to run DB. This might result in data loss

Removing the variable to re-convert the database did nothing, so in the end I had to wipe out the database to be able to start Syncthing again.

Okay, that’s certainly a horror story to begin with… Nothing about the database statistics jumps out at me either. :confused: What OS was this on? Looks like this is some sort of known issue on Windows with unclean close of the database, which is not very promising.

Windows 10 Enterprise LTSC 2018 x64.

Syncthing had seemed to work fine initially though. It was only after using the stindex tool that everything broke down.

Why are you not using SQLite? It is basically in everything (OS/CPU) and has an active community. Seems like the DBs you are choosing are side projects and are basically programmatic dead ends. Surely there is a Go interface to SQLite?

SQLite is written in C. The only reason we are able to support this many platforms is that we use native Go to cross compile. The moment we pull in a C dependency we’ll have to abandon arm, freebsd and a bunch of other platforms.

Also, our data is not relational, so I suspect that having a transactional database handle non-transactional data would perform much worse than what we already have.

Badger is maintained by Dgraph, so it’s not some side project; there is a whole commercial product based on it.

Ok, fair enough. I didn’t realize you were just using a key-value datastore and not a relational db. Thanks for the insight.

I also see a rather big increase in db sizes. I think leveldb is compressing everything, while badger compresses only values. And then they are using either snappy or zstd depending on whether cgo is available. Given there was lots of activity, I expected that to be resolved by now by adding a Go-only zstd implementation, but it hasn’t happened yet (https://discuss.dgraph.io/t/use-pure-go-zstd-implementation/8670). In addition we will probably have to run GC much more frequently and tweak a few parameters (https://github.com/dgraph-io/badger/issues/718).

I have two annotations regarding what you posted:

  • Is it to be expected that v1.7.0 keeps its index in the subdirectory /index-v0.14.0.db, despite upgrading the database format?
  • C doesn’t mean platform dependent; in fact K & R specifically made C a long time ago to allow platform-independent programming. What is platform dependent is the libraries, but there are ways to circumvent this. And Go AFAIK has an interface to C/C++ source code, to be able to use the existing code base. So your argument IMHO can’t be 100% right. Though that doesn’t mean that SQLite is the database to choose, of course.

No.

And cross compiling C is an entirely different matter than cross compiling Go, C’s theoretical platform independence notwithstanding.

Will this break compatibility with v1.0.1 devices?

This doesn’t affect the wire protocol at all, so no, no difference.

So…

  • It’s stable for me on 64 bit Mac/Linux/FreeBSD, but then again so is leveldb.
  • It breaks on Windows during shutdown unless we’re super careful. This isn’t great, but it might be technically fixable.
  • The database size is rather larger. There may be tweaks and it’s not the end of the world regardless I think.
  • The 32 bit build breaks now and then because they don’t test that it compiles apparently.
  • They recently discontinued bug tracking in favour of “tracking” bugs on their forum, which to me is a sign of insanity somewhere in the organization.

I was hoping for a more stable database backend, with good ongoing maintenance, but the last two points leave me rather worried.

We can put in some engineering effort to try to fix the technical problems, but the question is whether this seems like time well spent.

The strikes against leveldb on the other hand are lots of unexplained crashes which may or may not be leveldb’s fault, and the fact that it’s maintained on a hobby/free time basis only.

The increased size may be a big deal on Android though, as there the database is located on the /data/ partition, which is usually not that large, and I do not think that there is an option to change its location in the Syncthing application.

I had a go at it and tried it for the last 24 hours. It was a lot slower for me, with the db on an SSD. Anyway, the PC locked up while I was doing some heavy work on it, and I had no choice but to pull the plug.

When SyncTrayzor restarted, I got:

WARNING: Error opening database: During db.vlog.open: Value log truncate required to run DB. This might result in data loss

Also, the db folder under badger is four times larger.

So I’m going back to the old one, but I won’t start Syncthing for a while as I need to get work done and I can’t be IO bound on Syncthing.

But there are no other errors in the logs syncthing.log (6.3 KB), so I’m guessing badger hates not being shut down gracefully.

Apparently that’s the case on Windows, specifically. It seemed to be a known bug, although now all bugs are closed.

It’s clearly not a bug then?

We should do the same: close all issues and tell people to come and be ignored on the forum.