Document explaining architecture and algorithm needed

cwg · May 21, 2019, 3:19pm

I will refrain from adding yet another “please document syncthing architecture” issue on github like others did (only to see it closed) and instead try posting here.

This project is of great interest and already has a certain age and maturity, but there’s still no document that explains how syncthing actually works!

Although I know something about software engineering, networking, and specifically synchronization software, I still fail to develop a clear mental model of syncthing. Short of reading the source, I think that I have intensely browsed all the available information (documentation, forum).

The “getting started” guide barely explains how syncthing works, while the “specifications” section focuses on technical details of messages while neglecting the big picture. Some information can be extracted from “usage” by reading between the lines, but all of this is not sufficient, at least for me.

For example, from the various bits that I have found it seems to me that syncthing does something more than realtime pairwise synchronization (=running unison continuously on the same network of peers). It seems to me that after a cluster has been established, the different members of a cluster first exchange their index databases until everybody has a view of the complete situation. Then each peer decides which files it needs to get (how exactly?) and from where and tries to obtain them (How? What if peer A wants the file from peer C, but it’s only connected directly to peer B?). I do not know whether this understanding is correct, but even if it is, it is still unclear to me what happens when files are modified during synchronization.

I think that a design document on syncthing would be very helpful for advanced users and would motivate contributions.

calmh · May 21, 2019, 3:52pm

I’m sure having something comprehensive, readable, and correct in a single place would be useful. As it is, there are some building blocks for it.

https://docs.syncthing.net/users/syncing.html

https://www.kastelo.net/blog/2018-06/syncthing-scanning/

https://www.kastelo.net/blog/2018-06/syncthing-syncing

AudriusButkevicius · May 21, 2019, 5:07pm

I don’t think this is a Lockheed Martin project where we have to have a design document.

How it works is constantly changing and maintaining a document like this would kill contributor time.

Sure, we should roughly sketch out a few things (a paragraph each), how discovery, relays, scanning, pulling work, but not some sort of “what if” scenarios, as those are unbounded.

Best way to understand how things work in opensource is to read the code.

Sure, enterprise architwats want pretty UML which they assume will teach them everything they need to know, but we don’t have a pack of interns doing nothing that we could throw this task to.

cwg · May 21, 2019, 8:08pm

Thanks for the quick reply! The two blog posts are very informative, I didn’t find them on my own before. Perhaps it would be a good idea to link them (the first one should be enough, since it’s a series) from https://docs.syncthing.net/ for example.

What remains unclear to me now is what exactly happens between the scanning and syncing that are described in the above documents. How are the indexes propagated? How exactly is the “best version” determined for each file? These questions have been posed by others as well, see for example the thread “What is the "best" version?”.

If you could find some time to complete the series with a third post that clarifies these questions then I’d say the blog post series will provide what I was looking for!

cwg · May 21, 2019, 8:15pm

Audrius, I was not asking for any pretty diagrams, nor low-level technical details. But I have the impression that the basic mode of operation of Syncthing has remained unchanged since the beginning of the project, yet it is not explained anywhere clearly. In my opinion, the project can only profit from a better explanation of the basic design of syncthing.

AudriusButkevicius · May 21, 2019, 11:26pm

Basic operation yes, but “what ifs” definitely did.

Indexes are sent after a scan. There is an integer that is incremented every time an item is added to the index (or modified). A separate routine subscribes to LocalIndexUpdated events (the same event that is dispatched by the events API), routinely checks that number, and sends the entries that are newer than the number that was previously sent.
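
To make the sequence-number idea concrete, here is a minimal sketch in Python. It is not Syncthing's code: the names `LocalIndex`, `IndexSender`, `newer_than` and `on_local_index_updated` are made up for illustration, and the "sending" is just appending to a list.

```python
from dataclasses import dataclass, field


@dataclass
class LocalIndex:
    """Hypothetical local index: every add or modify bumps one counter."""
    sequence: int = 0
    entries: dict = field(default_factory=dict)  # file name -> (sequence, version)

    def update(self, name, version):
        # The counter only ever grows; each entry remembers the value it got.
        self.sequence += 1
        self.entries[name] = (self.sequence, version)

    def newer_than(self, seq):
        # Entries whose sequence number is above `seq`, oldest first.
        items = [(s, n, v) for n, (s, v) in self.entries.items() if s > seq]
        return sorted(items)


class IndexSender:
    """Hypothetical sender: remembers the highest sequence number already sent."""

    def __init__(self, index):
        self.index = index
        self.last_sent = 0
        self.sent_batches = []  # stands in for the network in this sketch

    def on_local_index_updated(self):
        # Called when the (assumed) "local index updated" event fires.
        batch = self.index.newer_than(self.last_sent)
        if batch:
            self.sent_batches.append(batch)
            self.last_sent = batch[-1][0]
        return batch
```

With this sketch, a second call to `on_local_index_updated()` without any change in between returns an empty batch, and a file that is modified again shows up with a fresh, higher sequence number.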

Best version is determined using vector clocks. There is plenty of information explaining how these work, but we essentially pick the clock with the highest increment since we’ve last observed it.

Concurrent clock increments cause conflicts.

If it’s a tie (I incremented my clock by one, you did too), the decision which wins is visible here: https://github.com/syncthing/syncthing/blob/master/lib/protocol/bep_extensions.go#L129
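
For readers who have not met vector clocks before, the comparison can be sketched as follows. This is the textbook vector clock ordering written in Python, not a copy of Syncthing's implementation; the function name `compare` and the dict representation (device ID to counter) are assumptions made for the example. The tie-breaking rule for conflicts lives in the code linked above and is deliberately not reproduced here.

```python
def compare(a, b):
    """Compare two vector clocks given as {device_id: counter} dicts.

    Returns "equal", "greater" (a is newer), "lesser" (b is newer),
    or "concurrent" (each side has changes the other has not seen).
    """
    devices = set(a) | set(b)
    # A missing device counts as counter 0.
    a_ahead = any(a.get(d, 0) > b.get(d, 0) for d in devices)
    b_ahead = any(b.get(d, 0) > a.get(d, 0) for d in devices)
    if a_ahead and b_ahead:
        return "concurrent"  # this is the case that produces a conflict
    if a_ahead:
        return "greater"
    if b_ahead:
        return "lesser"
    return "equal"
```

For example, `compare({"A": 2, "B": 1}, {"A": 1, "B": 1})` is `"greater"` because only A's counter moved, while `compare({"A": 2}, {"B": 1})` is `"concurrent"` because both sides incremented their own counter without having seen the other's change.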

cwg · May 22, 2019, 7:49am

Thanks! What remains unclear to me is how the indices are sent around. Say that there’s a chain of three peers like this:

A-B-C

Does everybody (at least after some time of communication) have a copy of everybody else’s index? Or only of the immediate neighbors? Or perhaps from A's point of view it’s indistinguishable (and does not matter) whether it’s B or C that actually has the best version (because what matters for A is that the file can be requested from B)?

Say that some file is first changed on B and later on C, but no transfer of the file contents has started yet (perhaps because other files are being transferred). Now the connection between B and C fails. Will A patiently wait until C is back before requesting the file (does it even know about C?), or will it already get the “best available” version from B?

AudriusButkevicius · May 22, 2019, 11:14am

A maintains a global index of what every device it can see has.

So C’s change would be sent to B first and appear in B’s global index, but global indexes are not forwarded, just local ones. So B would forward C’s entry to A once B has downloaded that item and updated its local index with the new version.

A has no awareness of C if it’s not linked to it.
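
The A-B-C case can be played through with a small toy model. This is only a sketch under simplifying assumptions: the `Device` class and its methods are invented for the example, versions are plain integers instead of vector clocks, and "pulling" just copies the version number.

```python
class Device:
    """Toy device: only its local index is ever announced to neighbours."""

    def __init__(self, name):
        self.name = name
        self.local = {}       # file -> version this device actually has
        self.remote = {}      # neighbour name -> local index it last announced
        self.neighbours = []

    def link(self, other):
        self.neighbours.append(other)
        other.neighbours.append(self)

    def announce(self):
        # Send a copy of the LOCAL index to direct neighbours only.
        for n in self.neighbours:
            n.remote[self.name] = dict(self.local)

    def global_version(self, file):
        # Best version among this device and the neighbours it can see.
        candidates = [self.local.get(file, 0)]
        candidates += [idx.get(file, 0) for idx in self.remote.values()]
        return max(candidates)

    def change(self, file):
        # A local modification creates a version newer than anything seen.
        self.local[file] = self.global_version(file) + 1
        self.announce()

    def pull(self, file):
        # Fetch the best visible version; only then does it enter the
        # local index and get announced further down the chain.
        best = self.global_version(file)
        if best > self.local.get(file, 0):
            self.local[file] = best
            self.announce()
            return True
        return False
```

Linking `a`-`b` and `b`-`c` and then calling `c.change("f")` makes the new version visible in B's global view but not in A's. Only after `b.pull("f")` does A learn about it, and A's `remote` never contains an entry for C.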

cwg · May 22, 2019, 11:33am

OK, I see. So every node is only aware of its immediate neighbors and requests the best version of a file that it can see among them. This makes sense and seems indeed better than continuously running a pair-wise synchronizer (like Unison) for each link.

Thanks, I now finally have a rather clear mental model of how Syncthing works! I still suggest adding at least a paragraph somewhere in the documentation that explains the above.