How does index db work?

Just curious:

(And full disclosure I think I’m probably looking for a manual/readme page I’m missing. It’s early.) XD

How does the index-{n.n.n}.db folder work? What I mean is: are the .ldb files routinely cleaned up and restructured, or do they just sort of accrue?

I know you don’t restore them in the event of a system crash, and that they get rebuilt when the systems compare files on the near end to the far end, but is there any benefit/need to clearing them routinely, or does (as I suspect) the program handle that? How about the log?

Also, how large do they eventually get? If I knew how they worked (format/info stored/method of function) I could probably figure that out without too much ado, but since I’m asking inane questions, I thought I’d toss that one in.

To further clarify, I’m looking at this with an eye to maintaining a clean, lean system, long term, for possibly 100-200 syncing devices, so I’d like to establish practices to maintain it BEFORE it gets to be a problem/out of hand. = )

Those are the files backing the database. We just use a database library and don’t really control the lifecycle of these files. You should not mess around with them, and they can get as big as the database gets.

You can read about how LSM-tree databases implement their storage if you want to understand compaction and so on.
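To give a rough idea of what compaction means, here is a toy sketch in Python. It is not the database library's actual algorithm or file format; the `compact` function, the `TOMBSTONE` marker, and the file names are made up for illustration. The point is only that newer writes land in new sorted files, and a background merge later folds them together and discards superseded or deleted entries, which is why the .ldb files come and go on their own.

```python
# Toy model of LSM-tree compaction. This is an illustration only,
# NOT the real on-disk format or merge strategy of any database library.

TOMBSTONE = None  # hypothetical marker meaning "this key was deleted"


def compact(tables):
    """Merge sorted tables (oldest first) into a single sorted table.

    Each table is a list of (key, value) pairs. For every key only the
    newest value survives, and keys whose newest value is a tombstone
    are dropped entirely.
    """
    merged = {}
    for table in tables:  # later (newer) tables overwrite earlier ones
        for key, value in table:
            merged[key] = value
    # Drop deleted keys and return the survivors in key order.
    return sorted((k, v) for k, v in merged.items() if v is not TOMBSTONE)


# Two "files": an older one and a newer one with an update and a delete.
old = [("a.txt", "v1"), ("b.txt", "v1")]
new = [("a.txt", "v2"), ("b.txt", TOMBSTONE), ("c.txt", "v1")]

compacted = compact([old, new])
print(compacted)  # [('a.txt', 'v2'), ('c.txt', 'v1')]
```

Two input tables holding five entries shrink to one table with two entries, so the directory does not simply grow with every write.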

Followup question: So if those are auxiliary files, where is the main database located?

These files are the database. They are auxiliary to us in the sense that we don’t interact with or manage them: we just give the DB a directory and off it goes. We don’t care what it does there or what the internal structure is.

Ah!

Then I guess my question could be further amended to “what do you DO with the database?”

As in: what data is being written, how often, and how is it managed? Is it, say, an entry for each individual file on the remote/local system along with accompanying data? Are there separate entries for versioned files? Do previous records of files get removed and compacted regularly or are they marked as deprecated entries and left in the DB? Does it get compacted or reindexed on service restart? System reboot? Not at all?

I’m interested in particulars, and since I don’t expect you to rattle off an answer to all these (and similar questions, unless you really REALLY want to :grin:) if there’s a page/resource that has the DB and table layout and a quick overview of what data is stored and how it’s handled, I’d truly appreciate it. = )

The proto messages stored in the db are here: https://github.com/syncthing/syncthing/blob/master/lib/protocol/bep.proto
It’s more or less all FileInfos, roughly one per file per configured device (files × number of configured devices). There’s some more overhead on top, but I think it’s negligible compared to that.
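As a back-of-the-envelope sketch of that "files × devices" rule, something like the following Python can give an order of magnitude. The average size of one stored FileInfo (`avg_fileinfo_bytes`) is a pure assumption here, not a measured or documented figure, and the function name is made up; real entries vary with path length, block count, and so on, and the on-disk size also depends on how the database compresses and compacts its files.

```python
def estimate_index_bytes(num_files, num_devices, avg_fileinfo_bytes=300):
    """Rough index size: one FileInfo per file per configured device.

    avg_fileinfo_bytes is an ASSUMED average, chosen only for illustration.
    """
    return num_files * num_devices * avg_fileinfo_bytes


# Hypothetical setup: 100,000 files shared with 150 devices.
estimate = estimate_index_bytes(num_files=100_000, num_devices=150)
print(f"{estimate / 1e9:.1f} GB")  # 4.5 GB under the assumptions above
```

The useful takeaway is the scaling rather than the number itself: doubling either the file count or the number of devices a folder is shared with roughly doubles the index.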

We never do any manual reindexing/compaction/…

Thanks! That’s what I was looking for! :slight_smile: