This guy claims that he managed to reimplement a Syncthing protocol node in a way that is 7 times faster. Supposedly he uses some kind of lock-free data structure to handle metadata.
Honestly, it sometimes feels like Syncthing is slower than it could be, especially when handling lots of small files on a LAN. Can you please look into it and see whether some of these optimizations could be ported to Syncthing?
And yes, I know this is a bit of a hot take. But it seems there is room for improvement.
This is indeed kind of Russian-school :) Alexandrescu-style hardcore C++ (syncspirit included); yes, Russians like that.
A nice project that reached its goals, but nothing to port to Syncthing mainline; it's a different story anyway.
UPD: well, except the obvious point that the DB layer needs a redesign, but that is kind of background noise in every performance topic, something everyone seems to agree on. A lot of work, and most probably only the owner can do it.
Actually that does not matter; in theory it could be just a file, because the DB is read on start and periodically flushed to disk. It is not used for runtime synchronization between threads.
At the moment libmdbx is used as the KV database; however, a lot of its features (cross-thread functionality) are not used, so I might consider something faster/more lightweight :)
But couldn't you then have a mismatch between the files that were written to disk and what the database thinks was done?
Anyway, I thought that's always why the process was slow in the existing implementation… to be absolutely sure that the integrity of the database and the files on disk is sound.
As long as the DB is consistent and it reflects some previous state, then after a restart the non-matching files are considered "new". It does not matter much whether "new" files appeared because they were modified "by the user" or "by the program": all new files will be re-hashed and added to the DB. If the "new" files appeared because they were downloaded, they are identical to the remote (cluster) files, so they will not be sent to peers. Only the metadata changed, which does not seem to do much harm. And even that can be handled, if needed.
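The restart reconciliation described above can be sketched like this (a rough illustration under my own assumptions about the index shapes; the function and names are invented, not syncspirit's actual code):

```python
import hashlib

def reconcile(disk_files, db_index, remote_index):
    """Decide what to do with files whose on-disk state doesn't match the DB.

    disk_files:   {path: bytes}             current file contents on disk
    db_index:     {path: sha256 hexdigest}  last state the DB flushed
    remote_index: {path: sha256 hexdigest}  what the cluster says the file is
    Returns (to_announce, to_send): paths whose metadata must be updated,
    and the subset whose content actually differs from the cluster.
    """
    to_announce, to_send = [], []
    for path, data in disk_files.items():
        digest = hashlib.sha256(data).hexdigest()
        if db_index.get(path) == digest:
            continue  # matches the previous DB state: nothing to do
        # "new" file: re-hash it and update the DB entry
        db_index[path] = digest
        to_announce.append(path)
        # if the hash equals the remote version, the file was downloaded
        # before the restart; only metadata changes, no data is re-sent
        if remote_index.get(path) != digest:
            to_send.append(path)
    return to_announce, to_send
```

The key point is the last check: a file that was downloaded but not yet recorded in the DB re-hashes to the remote version's digest, so it costs a re-hash and a metadata update, never a re-transfer.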
> Anyway, I thought that's always why the process was slow in the existing implementation… to be absolutely sure that the integrity of the database and the files on disk is sound.
Consider the case where you need to be sure that every I/O is confirmed by hardware, i.e. that the bits are actually physically fixed on the device. That would be terribly slow. The other approaches just mitigate the consequences of losses, so I think it is more correct to handle those rare cases in the way described above.
If it is safe on the streets of a city, it seems a bit redundant to wear armor, right?
The file on the peer might have changed in the meantime, so now you have either a spurious conflict or an unintended revert (data loss), depending on what you implemented.
That is precisely what the fsync calls we use guarantee, and yes, they're pretty slow. I think it's worth it. Fast is only fun until you lose data; then you typically care less about how fast the data was lost.
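For what it's worth, the pattern those fsync calls buy is the classic write, fsync the file, rename, fsync the directory dance. A minimal generic sketch of that pattern (not Syncthing's actual code):

```python
import os

def durable_write(path, data):
    """Write data to path so it survives a crash at any point.

    The temp-file + rename gives atomicity (readers see the old or the
    new file, never a torn one); the two fsyncs give durability for the
    file contents and for the directory entry itself.
    """
    tmp = path + ".tmp"
    fd = os.open(tmp, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        os.write(fd, data)
        os.fsync(fd)              # contents reach stable storage
    finally:
        os.close(fd)
    os.rename(tmp, path)          # atomic replace
    dfd = os.open(os.path.dirname(path) or ".", os.O_RDONLY)
    try:
        os.fsync(dfd)             # the rename itself reaches stable storage
    finally:
        os.close(dfd)
```

Each call is at least two device round trips, which is exactly where the per-small-file slowness comes from.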
I only skimmed this thread, but I'm entirely unsurprised you can make something 7x faster than Syncthing by making different tradeoffs than we do. Probably you can be hundreds of times faster in the bad corner cases. We optimise for data protection and correctness as the number one goal. We'll try to make it as fast as possible, of course, but within the constraints we've chosen.
This type of thinking also forbids batching. In a consistency-first approach you interleave file (f)syncs with DB transactions, which are then per file.
With SQLite's write-ahead logging (simplified), this means copying a lot of the relevant index pages into the WAL on each record update (not the intent record itself, but the index pages: most likely a whole page each time, and many of them), so WAL growth is unavoidable. No matter how properly and when you do your commits, it is a waste of resources for large index trees.
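A back-of-envelope illustration of that amplification argument (the page size, tree depth, and cost model are all simplifying assumptions of mine, not measurements of SQLite):

```python
PAGE = 4096  # assumed page size in bytes

def wal_bytes(files, indexes=3, depth=3, batch=1):
    """Rough WAL bytes written for `files` record updates.

    Each transaction copies ~depth dirtied pages per index into the WAL.
    With batching, updates in the same transaction tend to dirty the
    same upper-tree pages, so we charge the non-leaf path cost once per
    transaction (an optimistic, illustrative model, not SQLite's actual
    accounting).
    """
    transactions = -(-files // batch)        # ceiling division
    leaf_pages = files                       # assume one dirtied leaf per record
    upper_pages = transactions * indexes * (depth - 1)
    return (leaf_pages + upper_pages) * PAGE
```

Under these assumptions, 10,000 per-file transactions write roughly 70,000 pages of WAL, while batches of 100 write about 10,600 pages: the leaf writes are the same, but the upper-tree copies shrink with the transaction count.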
From this side of things, how feasible do you see any batching here, @calmh? Updating in a single transaction, so that the into-WAL copy of the relevant index tree is written only once. It is an interesting point. Or, given this strict interleaving idea, do you rule it out completely?
You're right, there is batching, which is a tradeoff towards performance. To single-mindedly focus on correctness to the exclusion of all else, we'd probably need something like three syncs per file: first committing a transaction describing what we intend to do, then doing it and syncing the file (plus the directories upwards), then committing the result, so that it would be possible to roll back on recovery, etc. We don't do anything that complicated, deeming it good enough to have the files synced to stable storage and the database updated reasonably close in time.
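A toy version of that hypothetical three-commit scheme, using sqlite3 for the metadata (the intent table and helper names are invented for illustration, not a proposal for Syncthing's actual schema):

```python
import os
import sqlite3

def sync_file_strictly(db, path, data):
    """Three commits per file: record the intent, do the durable write,
    record completion.  After a crash, any row still marked 'intended'
    names a file that must be re-verified or rolled back on recovery."""
    db.execute("CREATE TABLE IF NOT EXISTS intent (path TEXT PRIMARY KEY, state TEXT)")
    # 1: commit what we are about to do
    db.execute("INSERT OR REPLACE INTO intent VALUES (?, 'intended')", (path,))
    db.commit()
    # 2: write the file and fsync it (real code would also sync the
    #    directories upwards, as in the post above)
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        os.write(fd, data)
        os.fsync(fd)
    finally:
        os.close(fd)
    # 3: commit the result
    db.execute("UPDATE intent SET state = 'done' WHERE path = ?", (path,))
    db.commit()

def unfinished(db):
    """Paths whose write may not have completed before a crash."""
    return [r[0] for r in db.execute("SELECT path FROM intent WHERE state != 'done'")]
```

Two DB commits plus a file fsync per file is exactly the cost profile that makes this "that complicated" and slow, which is why a looser close-in-time guarantee is an understandable compromise.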
I don't think I understand the bits about WAL copies of index trees etc., sorry.
Perhaps some compromises could be made in the receive-only folder case, but there could be gotchas there too.
I wonder if, for large batches of small files, these could be held "transfer/assembly complete" but not renamed, and then renamed and the DB updated in batches of 5 or 10 files. But for my use cases I'm okay with the performance I have.
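That idea could look roughly like this (a hypothetical helper of my own; it assumes the temporary files are already fully assembled and fsynced to stable storage before finalization):

```python
import os
import sqlite3

def finalize_batch(db, completed, batch_size=10):
    """completed: list of (tmp_path, final_path) pairs whose contents are
    already durable on disk.  Rename them and update the DB in groups,
    so the per-transaction cost is paid once per batch instead of once
    per file."""
    db.execute("CREATE TABLE IF NOT EXISTS files (path TEXT PRIMARY KEY)")
    for i in range(0, len(completed), batch_size):
        batch = completed[i:i + batch_size]
        for tmp, final in batch:
            os.rename(tmp, final)            # still atomic per file
        db.executemany("INSERT OR REPLACE INTO files VALUES (?)",
                       [(final,) for _, final in batch])
        db.commit()                          # one transaction per batch
```

The window where a crash leaves renamed files not yet recorded in the DB grows to one batch, which the re-hash-on-restart behaviour described earlier in the thread would absorb.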