Is Syncthing v2 with very large folders possible ?

Realistically speaking, no. Both Android and iOS are restricted by available storage and its I/O performance. Androids filesystem abstraction layer is terrible, which renders large folders in Syncthing nearly unusable.

Still reading Golang and SQLite documentation…

Meanwhile the timings in the debug logs made me have a look again at the file_names/file_versions as the selection of the next chunk to work on took a significant part of the process (~20-25%, the rest being the actual DELETE).

I found a simpler way to do it in SQL that requires only one query in the common case on large folders (every chunk except the last where there isn’t enough rows to make a complete chunk at the current chunk size) instead of the two queries I used before.

Tested on my laptop and pushed, will wait a bit before sending it to process the huge folder.

Thanks a lot for your time

I’ve benefited from Open Source projects for more than 30 years now. Syncthing is one of many. So I probably have more thanks to give than receive.

I’m very much into the “if it itches, scratch it then publish” mindset.

Fortunately for my current itch, we have had bad weather for days here and it doesn’t seem to want to go away so I’m a bit stuck in front of the keyboard and monitors.

I really want to know the full reasonning behind the updateLock sync.Mutex. I’ve only seen Consider better reaction to SQLITE_BUSY · Issue #10529 · syncthing/syncthing · GitHub

Could a developer (@calmh ? You seem the main author of basedb.go) stop by and complete my understanding below ?

From my understanding the lock is intended to protect against an SQL_BUSY error returned to a writer (and maybe against concurrent incompatible updates but I’ve not yet seen it). The Github issue above seems to describe a read that is not protected by the Mutex though.

We can minimize the chance of SQL_BUSY by making sure writers don’t work for too long (this is how I’m dealing with the GC, vacuum and analyze right now so this is a nice side-effect) but there are devices where IO bottlenecks can temporarily be so bad (cheap SSDs suddenly deciding that they have some work to do behind the scenes instead of serving IO requests) that they will trigger this whatever we do.

If there is no other reason than protection agains SQL_BUSY, given that it will probably always be a possibility unless the readers are blocked by the same lock I think the only solution is to accept SQL_BUSY can happen on rare occasions and deal with it by retrying the operation later. It could be done by using methods wrapping dbx or modifying the existing wrappers to automatically retry where it makes sense.

The only long write operation left I know of is the checkpoint(TRUNCATE) that I currently restrict to 1/day.

That said I’ll have to check SQLite documentation if there is a way to truncate without risking blocking other processes. I’m a bit pessimistic about that but I’ve not yet fully absorbed this subject.

I’ve just disabled updateLock by adding this to basedb.go :

// Experiment: disable updateLock
type ourUpdateLock struct {
}

func (fakelock *ourUpdateLock) Lock() {
}
func (fakelock *ourUpdateLock) Unlock() {
}

...

type baseDB struct {
...
        updateLock       ourUpdateLock
...

I’m monitoring my laptop instance with the code above and a script creating a file (with different content to avoid any shared entries that would lower the load on the DB) in a shared folder every 100ms (fish shell) :

for i in (seq 1 10000); echo $i; echo $i > $i; sleep 0.1; end

I don’t see any problem :

  • the folder in Syncthing properly detects the new files in near realtime,
  • my phone which shares the folder is lagging a bit but makes progress fetching the new files
  • the GC does its job
  • I see no error reported in the logs.

The updateLock ensures there is only one writer at a time, which more or less prevents SQLITE_BUSY. The linked issue shows that there is a corner case where it can happen anyway, which I guess is more likely for you guys with the chonky databases. It effectively doesn’t happen for anyone else, though.

So, in principle we want & need 100% protection against operations getting a SQLITE_BUSY response, because there is no graceful handling of it by most callers. Currently it’s apparently not 100% but 99.99something% which is suboptimal but not a disaster. That is not a reason to make it worse.

In the linked thread it’s said that we can’t do checkpoint(restart) with concurrent readers as there is a risk the readers get SQLITE_BUSY. That then seems to me like we need a mutex on that checkpoint(restart) vs readers. That’s annoying as we’ve currently been able to make do without any mutex on readers.

It’s a solution but it means the same mutex (probably a multiple reader/single writer mutex to allow read concurrency but forbid reads when a write happens) would have to be added to each and every read access to the DB.

In my maybe naive view this is not much different than wrapping all accesses to deal with SQL_BUSY.

Wrapping SQL_BUSY gives us an advantage : you can use “PRAGMA busy_timeout” to control how long a reader or writer can be blocked before getting SQL_BUSY. If you use a Mutex you effectively block accesses even when SQLite would be happy to process them concurrently.

You’re welcome to try what you like, of course, but I can guarantee it will be much more complex to try to transparently handle getting an error halfway through iterating a select result or whatever.

This will be more work than simply adding a Mutex around the whole iterations that’s for sure.

But that seems worth it :

  • you gain concurrency/performance and delegate more of the locking to the database which would simplify the code,
  • the wrappers will log the SQL_BUSY before retrying which will help find sources of pain and fix bugs,
  • after it’s done I’ll have a complete view of all DB accesses as I would have searched for them all :slight_smile: That would probably help me if I want to experiment on any other part later.

I think wrapping the DB accesses in an intermediate layer (funcs in baseDB when they don’t already exist) and avoiding direct sql/sqlx access would be the right way to pursue this : it would add a point of flexibility when wanting to control how the queries hit the DB, log the SQL, the query timings, …

Could you have a quick glance at my code to stop me if I do horrible things ? I’m fairly confident about achieving the goals but I’m a bit afraid on the “how” part and that the first time you’ll look at my code style you’ll want to smash your monitor to save your eyes…

As you wrote this will be long, better if I take good habits early instead of rewriting most of it at the end. Ideally I would like to leave the code base in a better shape not a worse shape (on the readability/maintainability scale) after I finish.

I very much doubt it will be cleaner or simpler in any way, but I obviously can’t prove this to you, it’s just a feeling based on experience. I don’t see what concurrency we’d gain as SQLite can’t ever have more than a single writer at a time. I can also say it’s not obvious to me how to handle a sudden error with retry in the middle of iterating a result set, but perhaps you have some ideas.

All that said, the actual SQLite documentation as I understand it seems to say we’re doing everything correctly and shouldn’t ever get a SQLITE_BUSY on a reader in this scenario:

SQLITE_CHECKPOINT_FULL: This mode blocks (it invokes the busy-handler callback) until there is no database writer and all readers are reading from the most recent database snapshot. It then checkpoints all frames in the log file and syncs the database file. This mode blocks new database writers while it is pending, but new database readers are allowed to continue unimpeded.

SQLITE_CHECKPOINT_RESTART: This mode works the same way as SQLITE_CHECKPOINT_FULL with the addition that after checkpointing the log file it blocks (calls the busy-handler callback) until all readers are reading from the database file only. This ensures that the next writer will restart the log file from the beginning. Like SQLITE_CHECKPOINT_FULL, this mode blocks new database writer attempts while it is pending, but does not impede readers.

(Keeping in mind we already guarantee there are no writers at this point, given the updateLock.)

There is another document on when we can expect to see SQLITE_BUSY to queries in WAL mode: Write-Ahead Logging

None of those scenarios apply to us.

Which is to say, I don’t think we understand this problem correctly, so it may be premature to implement solutions.

(Also not 100% sure why you’re moving down this path but I didn’t attentively read all the context above. Is it the spurious SQLITE_BUSY in the linked GitHub issue that is a problem for large databases?)

Yes but it can have readers with a writer active. If we forbid reads during a write we lose this concurrency.

Depends on how the result set is implemented exactly : you can have anything between a result set fully in memory, fetched by paginating with multiple queries, using a db-side cursor (if SQLite supports them) with increased work along them. The only one worrying me a bit I can think of is the CURSOR based iteration.

I agree on the lack of understanding of the cause of this particular issue. I can’t find the reason for it either. My only weak theory : SQLite might have undocumented behavior/bugs when a SQL statement takes far more time than even the busy_timeout (I’ve not yet found the default value for it btw).

It may be a bit premature but my own experience with concurrent accesses tells me that blocking them at the highest level for simplicity comes back later to bite you when you need everything to go smoothly under load.

It doesn’t manifest in my instances, no. But the updateLock with the mainline implementation blocks Syncthing from performing any worthwhile work for hours at a time. I didn’t dare touch it initially and arguably splitting the maintenance into smaller chunks is probably better for my goal of having a folder synced as it avoids load spikes that could very well be a problem anyway.

Yeah that was never on the table of what I was suggesting, at least. I could see blocking readers during wal_checkpoint(restart) if that was required, but the docs indicate it really isn’t so I see no reason for that either.

Right, so breaking up the maintenance into smaller chunks (and optimising the queries etc) is the solution to that I guess, which I saw you already had a branch up for.

Yes. Working on moving the concurrency control deeper in the layers would be on another branch. Given what I know I don’t know… I’d say there’s a 25% chance it fails to bear fruit so it is more of an experimental branch.

Do you think you will eventually merge the smaller chunks branch (db_maintenance_performance) ?

I don’t think it is ready yet. I believe it should be more tested and as I wrote earlier there’s a distinct possibility that my code style needs to be reviewed…

But if it is on the path to be merged I probably want to create the - let’s call it “busy-retry” to make it short - new branch on top of the smaller chunks one.

We’ll absolutely merge improvements to making the db maintenance less impactful.

Removing write locks and just spinning retries on sqlite_busy sounds both risky and undesirable, trading lock waits for busy loops, so I would currently say that’s something we don’t want – but that’s not a religious stance, if you bring the tests and benchmarks to prove it’s better, then it’s better.

I won’t “just” spin retries :slight_smile: I just hope I won’t lack motivation before finishing : I’ve not had the opportunity to code for several days straight for too long.

Not “just” spinning is why I see potential in this approach : the retry itself is just a way to replicate the lock behavior at a deeper level and with more fine-grain control to avoid long freezes. It can provide better concurrency which would mainly be beneficial to large instances and that’s my initial motivation but that’s only the tip of the iceberg.

What can really make it useful is that you can do so much more where you implement the retry instead of having a Mutex locked and being completely blind.

I’ve touched on it earlier (easier debug with lock duration, SQL statement being blocked and anything you can get your hands on that could help like what SQLite can provide on its internal state).

It would also make something like displaying a message on the GUI : “The database is overwhelmed, your storage might be too slow.” possible by simply having the opportunity to get the information.

You think you found a bug in SQLite ? Activate debug, send the log of all statements made to the DB to SQLite’s devs with a copy of the DB.

Anyway code will speak for itself. I almost hope it continues to rain…

The default is 0, which might explain the 0.001% of cases where this happens for readers. From my previous search on that matter, it seems to happen if a read transaction falls together with some kind of WAL management lock. While the lock is only held for a fraction of a millisecond, it’s enough to trigger the busy callback. The default timeout of 0 boils down to “fail instantly” in that case.

Setting a sensible low non-zero busy timeout is probably going to fix our reader errors. But I agree with @calmh that we need some kind of mutex to aid checkpoints.

I don’t think so. A mutex lock is managed by the runtime, uses zero CPU, and has fairness (queueing) built in. A spin lock has none of that. The concurrency in both cases is precisely one, as far as writers go.

If the problem is that we’re holding the updateLock too long in the maintenance routine then that needs to be split up and the lock released between transactions, but spinning isn’t going to help any.

Honestly I don’t have a solution for checkpoint(TRUNCATE) and I expect our large instance to block during the checkpoint (which will probably happen for the first time tonight if I don’t restart it).

Whatever the code, the real problem is that SQLite must wait for all readers to finish to be able to move all the content of the WAL to the DB file. It’s not clear to me yet if the checkpointer work in a single pass where all accesses must be stopped when truncating or if it can incrementally advance up to the point in WAL where the earliest transaction was opened/current query started, pause and then advance when the transactions/queries close/finish.

It’s quite tricky to move data to the DB file while queries are being processed so I suspect they have to process the whole file after everyone is finished…

That’s rough : imagine that with a 200GiB WAL.

This also is documented: Write-Ahead Logging

It checkpoints pages up to the oldest reader, and resumes from that point on next pass. However, it can’t actually truncate the WAL file until it reaches the end with no active readers, which does require reaching a point of idleness. (We could, obviously, enforce such idleness with a read/write mutex for checkpointing vs readers, if we wanted to. Currently we don’t and just assume that at some point things will be in sync and we’re happy.)

If our transactions are short, this isn’t as bad as it looks like. While the WAL file is large, the amount of non-checkpointed pages in it might be tiny.