Dealing with checkpoint starvation

This came up in Is Syncthing v2 with very large folders possible ? - #64 by bt90. I’m creating a dedicated topic in order to discuss possible solutions for Checkpoint starvation · Issue #10559 · syncthing/syncthing · GitHub.

To fully solve this, we need to work on the three things that contribute to it:

  • Long-running transactions. A checkpoint can only write back pages from the WAL that no reader/writer needs anymore. If a transaction is open for hours, we can’t remove any pages written to the WAL after its start. I’m currently working on a potential fix that improves this for our vacuum handling, but there might be other long-running transactions.
  • Truncating the WAL requires no active readers/writers. This one is especially tricky while Syncthing is still processing a large folder: there is almost constant reading and writing to the database, and queries might cause the WAL to grow very large temporarily. As our application-level checkpointer is timer based, we might never hit a time window without an active reader, so we can’t truncate the WAL file afterwards.
  • Auto checkpoints. We don’t disable SQLite’s auto-checkpoint functionality. That means any commit will also try to do a passive checkpoint once the WAL reaches 1,000 pages. As checkpoints of any kind are exclusive, we’re blocking our own checkpointer.
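For illustration, this is what taking over checkpointing from SQLite looks like: disable the auto-checkpoint so commits stop competing for the exclusive checkpoint lock, then run a truncating checkpoint explicitly. A minimal Python `sqlite3` sketch (the pragmas are identical from Go or any other binding):

```python
import os
import sqlite3
import tempfile

db = os.path.join(tempfile.mkdtemp(), "test.db")

# isolation_level=None puts the connection in autocommit mode, so pragmas
# and the explicit checkpoint are not wrapped in an open transaction.
conn = sqlite3.connect(db, isolation_level=None)
conn.execute("PRAGMA journal_mode=WAL")

# Disable SQLite's automatic checkpointing (default: every 1000 WAL pages)
# so that only our own checkpointer runs checkpoints.
conn.execute("PRAGMA wal_autocheckpoint=0")

conn.execute("CREATE TABLE kv (k TEXT PRIMARY KEY, v TEXT)")
for i in range(100):
    conn.execute("INSERT INTO kv VALUES (?, ?)", (str(i), "x" * 100))

# Run the checkpoint ourselves. TRUNCATE backfills the whole WAL into the
# database and resets the WAL file to zero bytes; the returned row is
# (busy, wal_frames, checkpointed_frames).
busy, log, ckpt = conn.execute("PRAGMA wal_checkpoint(TRUNCATE)").fetchone()
print(busy, os.path.getsize(db + "-wal"))
```

With no other readers or writers the checkpoint reports `busy == 0` and the `-wal` file is truncated to zero bytes.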

Maybe we can replace the timer-based checkpointer and move the logic into our connection pool? Releasing a connection back to the pool would run our heuristic and invoke a checkpoint if needed. That would improve the checkpointer’s chances of hitting a point of “radio silence” and would also scale better with high load on the database.
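As a sketch of that idea (hypothetical design, not Syncthing’s actual code, and the `CheckpointingPool` name and 256 KiB threshold are made up for illustration): every release of a connection checks the WAL size and, past the threshold, runs the checkpoint from the connection that has just gone idle.

```python
import os
import queue
import sqlite3
import tempfile

class CheckpointingPool:
    """Pool whose release path owns checkpointing: instead of a timer,
    every release runs the heuristic, so checkpoints naturally happen at
    moments when a connection has just gone idle."""

    # Hypothetical threshold: only bother checkpointing past 256 KiB of WAL.
    WAL_THRESHOLD = 256 * 1024

    def __init__(self, path, size=2):
        self.path = path
        self.pool = queue.Queue()
        for _ in range(size):
            conn = sqlite3.connect(path, isolation_level=None,
                                   check_same_thread=False)
            conn.execute("PRAGMA journal_mode=WAL")
            # Auto-checkpoints are disabled; the pool is now responsible.
            conn.execute("PRAGMA wal_autocheckpoint=0")
            self.pool.put(conn)

    def acquire(self):
        return self.pool.get()

    def release(self, conn):
        try:
            wal = self.path + "-wal"
            if os.path.exists(wal) and os.path.getsize(wal) > self.WAL_THRESHOLD:
                # The releasing connection runs the checkpoint itself; if
                # other connections are active this simply does less work.
                conn.execute("PRAGMA wal_checkpoint(TRUNCATE)")
        finally:
            self.pool.put(conn)

# Demo: grow the WAL past the threshold, then release the connection.
db = os.path.join(tempfile.mkdtemp(), "pool.db")
pool = CheckpointingPool(db)
conn = pool.acquire()
conn.execute("CREATE TABLE blobs (b BLOB)")
for _ in range(200):
    conn.execute("INSERT INTO blobs VALUES (?)", (b"x" * 4096,))
pool.release(conn)  # WAL is now well past 256 KiB, so release truncates it
```

The idle connections still sitting in the pool hold no read transactions, so they don’t block the truncating checkpoint.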

Being able to truncate the WAL would mandate some kind of exclusive locking. If the WAL grows to a critical size of multiple gigabytes, we need to act; otherwise we end up in a positive feedback loop and database operations grind to a halt.


From the SQLite WAL documentation:

Whenever a write operation occurs, the writer checks how much progress the checkpointer has made, and if the entire WAL has been transferred into the database and synced and if no readers are making use of the WAL, then the writer will rewind the WAL back to the beginning and start putting new transactions at the beginning of the WAL. This mechanism prevents a WAL file from growing without bound.

tl;dr: the WAL will grow indefinitely until we’re able to complete a checkpoint
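The starvation is easy to reproduce with two connections and nothing but the standard `sqlite3` module: as long as a reader still holds a snapshot that pins part of the WAL, a truncating checkpoint reports busy and the file cannot be reset; the moment the reader finishes, the same checkpoint succeeds.

```python
import os
import sqlite3
import tempfile

db = os.path.join(tempfile.mkdtemp(), "starve.db")

writer = sqlite3.connect(db, isolation_level=None)
writer.execute("PRAGMA journal_mode=WAL")
writer.execute("PRAGMA busy_timeout=0")  # fail immediately instead of waiting
writer.execute("CREATE TABLE t (x)")
writer.execute("INSERT INTO t VALUES (1)")

# A second connection opens a read transaction: it now holds a snapshot
# that pins part of the WAL.
reader = sqlite3.connect(db, isolation_level=None)
reader.execute("BEGIN")
reader.execute("SELECT x FROM t").fetchone()

# More writes land in the WAL behind the reader's snapshot ...
writer.execute("INSERT INTO t VALUES (2)")

# ... so a TRUNCATE checkpoint cannot complete: busy == 1, WAL not reset.
blocked = writer.execute("PRAGMA wal_checkpoint(TRUNCATE)").fetchone()[0]

# Once the reader finishes, the same checkpoint succeeds and truncates.
reader.execute("COMMIT")
done = writer.execute("PRAGMA wal_checkpoint(TRUNCATE)").fetchone()[0]
print(blocked, done)
```

This prints `1` for the blocked attempt and `0` for the successful one; in the real workload the “reader finishes” moment may simply never come while a large folder is being processed, which is exactly the starvation.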
