Synchronization algorithms

jodogne · October 15, 2014, 6:59am

Dear all,

Thanks for this great software!

I am interested in knowing more about which file synchronization algorithm is built in Syncthing. I am aware of the Block Exchange Protocol, but the latter only gives low-level information about how peers exchange information, and does not provide a high-level description of how the synchronization state of each file is maintained across all nodes.

Does Syncthing use its tracker to manage a central index of all the available files in a given repository (as in Napster)? Does it automatically create a star-shaped topology for each repository by electing one master node that is responsible for the indexing of this repository (as recommended by Unison)? Does it build a Distributed Hash Table (DHT), as in BitTorrent? Does it use some Version Vectors flavor to track the various versions of each file, together with their adding/removal?

As a side question, what is the Syncthing tracker responsible for, besides maintaining an index of all the (IP address, port) pairs for each peer that is active in this repository?

Thanks for your answer!

Regards, Sébastien-

AudriusButkevicius · October 15, 2014, 8:48am

So @calmh probably can explain it probably in much more detail then I can, but this is my (perhaps incorrect) understanding of it:

There is not tracker, there is only an announce server which is used for discovery ONLY. There is no master, all peers are of equal importance.

Each device is responsible of advertising its index updates (the diff between the old and the new) to all other devices as they discover changes to the folder. (I think the initial exchange contains the whole index) All devices store each others indexes, so that this way a node can work out what files are out of date, as each file has a sort-of lamport clock based versions.

Files which were deleted are still tracked, but marked as deleted.

As nodes pull the needed files, they update their own index, and as a result advertise it, which then makes remote peers understand that this peer now has what we need, hence they can start pulling data from it.

bigbear2nd · October 15, 2014, 8:58am

Are these ever removed from the index? In some cases, the index might get quite big. Also in other cases i dont want to send all the filenames which have ever been in a folder to newly connecting node…

AudriusButkevicius · October 15, 2014, 9:09am

I don’t think there are alternative ways of doing this, but it is still marginal compared to the index of files which are there, as you spend 64 bytes for every 128kb. Again @calmh might correct me.

calmh · October 15, 2014, 11:04am

Deleted files are currently not removed from the index. Ideally that should happen once every node in the cluster knows about the delete.

bigbear2nd · October 15, 2014, 1:23pm

Ah, ok. I am more concerned about the privacy, then.

I think this might be quite tricky to realize on larger clusters, where ring configurations and seldom online devices exist. Is this a standard problem in IT?

PS: You really got attached to “node”… I also delete every single time the word “node” and write “device” instead.

bigbear2nd · October 15, 2014, 1:27pm

I created an issue for this:

calmh · October 15, 2014, 2:25pm

Habits, habits… In theory, this causes unbound growth of the index. In reality, I don’t think this is a problem. If you continually create and delete files with completely random names, you’ll grow the index size by a few tens of bytes per name.

To clean it out, we don’t really need to have all ~~nodes~~devices online at the same time, we just need to know that every involved device has announced a version number higher than the delete version at some point. This we could track.

jodogne · October 15, 2014, 2:30pm

Thanks for your explanations! This is much clearer to me now

Please could you tell me where the data structure that encodes the index of a node is defined in the source code of Syncthing?

Also, when you write “sort-of Lamport clock based versions”, do you have a pointer to the file that implements this data structure, and/or a reference paper about the used algorithm?

calmh · October 15, 2014, 2:33pm

All the data structures for the wire protocol + on disk index are in the internal/protocol package, messages.go. For the lamport clock, there’s a pretty trivial internal/lamport package, and the Wikipedia article is pretty good; also references the original paper.

AudriusButkevicius · October 15, 2014, 2:34pm

https://github.com/syncthing/syncthing/blob/master/internal/protocol/message.go https://github.com/syncthing/syncthing/blob/master/internal/protocol/message_xdr.go

It’s just an integer representing a version number. It does not have eventual consistency part of the original paper.

jodogne · October 16, 2014, 9:31am

Once again, thank you for all these explanations!

Sébastien-

user555 · December 4, 2014, 3:03pm

Hi, I have been testing your software for quite a while and I was wondering if one client is trying to ask several clients (with the same folder) at once to transfer different blocks of a file to speed up the total transfer?

I wanted to look up the algorithm, but could not find it in your code.

Another question is referring to the block transfer in general. As far as I understand it, if for example the transfer stops after having reveived 10 of 100 blocks, you are able to restart at block 11, to avoid double transfer, right? How do you make sure that you have received all blocks without damage? Are the blockhashes re-calculated and compared with the index after receival? Thanks!

Frank

AudriusButkevicius · December 4, 2014, 3:32pm

The implementation is here: https://github.com/syncthing/syncthing/blob/master/internal/model/puller.go#L465

By default we pull 16 blocks at a time, pulling a block from least busy node. Least busy means the one with least requests currently in-flight from your machine.

There are multiple ways we reuse blocks.

First of all, we check if there already exists a temporary file for the file we are trying to download, for example if the transfer was terminated, we remove all the blocks which are already there from the list of blocks we need.

Then we can reuse blocks from other files, for example two files share some same blocks (or the existing file which we are about to replace), we will not pull the blocks from other nodes, instead we will copy the blocks for the other file which we share to the temp file. (Though a block is 128kb * N where N is the block index in the file, so if two files are nearly identical, but one of them has one extra byte at the beginning, that shifts all the blocks by one byte making all of them different)

Blocks that are still missing, we pull.

Once we’ve got all the blocks to the temp file, we replace the old file with the temp file, and rehash it to make sure that the content matches.

If it doesn’t match, we pause for a minute and retry again the same set of steps.

This means that if a file which we are trying to download is changing as we download it, it might take a few minutes for it to get in sync.

One thing we don’t do (which I would like to do), is allow other nodes pull blocks for files which have not yet been fully downloaded (proper torrent style). Currently a file only becomes available for other nodes to pull from once the file has been successfully synced across.

uok · December 5, 2014, 12:38am

Currently a file only becomes available for other nodes to pull from once the file has been successfully synced across

Does Syncthing also use existing files/blocks across all shared folders on the same device?

AudriusButkevicius · December 5, 2014, 8:42am

Yes, but that might change as we are allowing someone to access a folder we might have not shared with.

This is not a problem today, as you cannot request blocks by hash, and you have to request them by file/folder, but might become in the future.