This came up in "Is Syncthing v2 with very large folders possible?" (#64 by bt90). I'm creating a dedicated topic to discuss possible solutions for "Checkpoint starvation" (Issue #10559, syncthing/syncthing on GitHub).
To fully solve this, we need to address three contributing factors:
- Long-running transactions. A checkpoint can only write back pages from the WAL that no reader/writer needs anymore. If a transaction is open for hours, we can't remove pages written to the WAL after its start. I'm currently working on a potential fix that improves this for our vacuum handling, but there might be other long-running transactions.
- Truncating the WAL requires no active readers/writers. This one is especially tricky while Syncthing is still processing a large folder, because there is almost constant reading/writing to the database. The problem is that queries might cause the WAL to grow very large temporarily. As our application-level checkpointer is timer-based, we might never hit a time window without an active reader, so we can't truncate the WAL file afterwards.
- Auto checkpoints. We don't disable the auto-checkpoint functionality of SQLite. That means that any commit will also try to do a passive checkpoint once the WAL has reached a size of 1000 pages. As checkpoints of any kind are exclusive, we're blocking our own checkpointer.
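
To make the second and third points more concrete, here is a rough sketch in Go of what turning off auto checkpoints and detecting a blocked truncate could look like. It is not Syncthing's actual code: the names `checkpointResult`, `disableAutoCheckpoint` and `truncateCheckpoint` are made up for illustration, and I'm assuming a `database/sql` handle backed by some SQLite driver. The only SQLite-specific parts are `PRAGMA wal_autocheckpoint` and `PRAGMA wal_checkpoint(TRUNCATE)`, which returns one row with a busy flag, the number of frames in the WAL, and the number of frames checkpointed.

```go
package main

import (
	"context"
	"database/sql"
	"fmt"
)

// checkpointResult mirrors the single row returned by PRAGMA wal_checkpoint.
// (Hypothetical type, for illustration only.)
type checkpointResult struct {
	Busy         int // 1 if the checkpoint was blocked from completing
	WALFrames    int // frames in the WAL, -1 if the database is not in WAL mode
	Checkpointed int // frames written back to the database file
}

// blocked reports whether a reader or writer kept the checkpoint from
// running to completion, which means the WAL file was not truncated.
func (r checkpointResult) blocked() bool {
	return r.Busy != 0
}

// disableAutoCheckpoint turns off SQLite's automatic passive checkpoints.
// The setting is per connection, so this has to run on every connection
// that commits, not just once per database.
func disableAutoCheckpoint(ctx context.Context, conn *sql.Conn) error {
	if _, err := conn.ExecContext(ctx, "PRAGMA wal_autocheckpoint = 0"); err != nil {
		return fmt.Errorf("disable auto checkpoint: %w", err)
	}
	return nil
}

// truncateCheckpoint asks SQLite to checkpoint everything and truncate the
// WAL file, and returns what SQLite reported about the attempt.
func truncateCheckpoint(ctx context.Context, db *sql.DB) (checkpointResult, error) {
	var r checkpointResult
	row := db.QueryRowContext(ctx, "PRAGMA wal_checkpoint(TRUNCATE)")
	if err := row.Scan(&r.Busy, &r.WALFrames, &r.Checkpointed); err != nil {
		return r, fmt.Errorf("truncate checkpoint: %w", err)
	}
	return r, nil
}
```

With something like this, the timer-based checkpointer could at least log or count how often `blocked()` is true, which would tell us how bad the starvation is on a given setup before we decide how to fix it.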