The implementation is here:
By default we pull 16 blocks at a time, each from the least busy node.
"Least busy" means the node with the fewest requests currently in flight from your machine.
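To make "least busy" concrete, here is a minimal sketch rather than the actual implementation. The function name `pick_least_busy`, the node names and the `in_flight` counter are all made up for illustration; the only thing taken from the description above is the rule itself (fewest in-flight requests wins).

```python
from collections import Counter

def pick_least_busy(candidates, in_flight):
    """Return the candidate node with the fewest requests in flight.

    candidates: nodes that have the block we want (hypothetical names).
    in_flight:  Counter mapping node -> requests currently outstanding.
    """
    if not candidates:
        raise ValueError("no node has this block")
    # Counter returns 0 for nodes we have not asked anything yet.
    return min(candidates, key=lambda node: in_flight[node])

in_flight = Counter({"node-a": 3, "node-b": 1, "node-c": 2})
chosen = pick_least_busy(["node-a", "node-b", "node-c"], in_flight)
in_flight[chosen] += 1  # the new request is now in flight too
```

With the counts above, `node-b` gets the request. A real puller would also decrement the counter when a request completes.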
There are multiple ways we reuse blocks.
First, we check whether a temporary file already exists for the file we are trying to download (for example, because an earlier transfer was terminated). Any blocks that are already in it are removed from the list of blocks we need.
Then we can reuse blocks from other files. If two files share some blocks (or the existing file which we are about to replace already has them), we do not pull those blocks from other nodes; instead we copy them from the local file into the temp file.
(Note that blocks are cut at fixed offsets, 128 KiB * N where N is the block index in the file, so if two files are nearly identical but one of them has one extra byte at the beginning, every block is shifted by one byte and all of them end up different.)
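The shifting effect is easy to see in a small sketch. This assumes SHA-256 as the block hash and uses made-up test data; `block_hashes` is a hypothetical helper, not a function from the real code.

```python
import hashlib

BLOCK_SIZE = 128 * 1024  # fixed block size, as described above

def block_hashes(data, block_size=BLOCK_SIZE):
    """Hash each fixed-offset block of data (SHA-256 is an assumption)."""
    return [
        hashlib.sha256(data[offset:offset + block_size]).hexdigest()
        for offset in range(0, len(data), block_size)
    ]

# Three blocks of non-repeating test data.
original = bytes(i % 251 for i in range(3 * BLOCK_SIZE))
prepended = b"\x00" + original  # one extra byte at the beginning
appended = original + b"\x00"   # one extra byte at the end

shared_after_prepend = set(block_hashes(original)) & set(block_hashes(prepended))
shared_after_append = set(block_hashes(original)) & set(block_hashes(appended))
```

Prepending a byte leaves no reusable blocks, while appending a byte keeps all three original blocks reusable and only adds one new short block.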
We pull whatever blocks are still missing.
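Put together, the reuse steps amount to deciding, per block, where it comes from. The sketch below is only an illustration under assumed data structures: `wanted` is the list of block hashes for the new version, `temp_blocks` maps block index to the hash already present in the temp file, and `local_index` maps a hash to a (file, block index) location in some other local file. None of these names come from the real code.

```python
def plan_blocks(wanted, temp_blocks, local_index):
    """Decide for each block whether to keep, copy or pull it.

    Returns a list of (index, action, source) tuples, where action is
    "keep" (already in the temp file), "copy" (available in another
    local file) or "pull" (must be requested from another node).
    """
    plan = []
    for index, block_hash in enumerate(wanted):
        if temp_blocks.get(index) == block_hash:
            plan.append((index, "keep", None))
        elif block_hash in local_index:
            plan.append((index, "copy", local_index[block_hash]))
        else:
            plan.append((index, "pull", None))
    return plan

wanted = ["h0", "h1", "h2", "h3"]
temp_blocks = {0: "h0"}                  # left over from an interrupted transfer
local_index = {"h2": ("other.bin", 5)}   # block 5 of another local file
plan = plan_blocks(wanted, temp_blocks, local_index)
```

Here block 0 is kept, block 2 is copied locally, and only blocks 1 and 3 go over the network.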
Once all the blocks are in the temp file, we replace the old file with the temp file and rehash it to make sure that the content matches.
If it doesn’t match, we pause for a minute and retry the same set of steps.
This means that if a file which we are trying to download is changing as we download it, it might take a few minutes for it to get in sync.
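The verify-and-retry loop could look roughly like this. It is a sketch under assumptions: `download` stands in for the whole "assemble the temp file" step, SHA-256 of the whole content stands in for the rehash, and `max_attempts` is something I added so the example terminates (the description above does not mention a limit).

```python
import hashlib
import time

def sync_until_verified(download, expected_hash, retry_delay=60.0,
                        sleep=time.sleep, max_attempts=5):
    """Run download() until its content hashes to expected_hash.

    Returns the number of attempts used. The sleep function is
    injectable so the loop can be exercised without really waiting.
    """
    for attempt in range(1, max_attempts + 1):
        content = download()
        if hashlib.sha256(content).hexdigest() == expected_hash:
            return attempt
        sleep(retry_delay)  # pause for a minute, then redo the same steps
    raise RuntimeError("content never matched; giving up")

# Simulate a file that was still changing during the first attempt.
expected = hashlib.sha256(b"final contents").hexdigest()
versions = iter([b"half-written", b"final contents"])
delays = []
attempts = sync_until_verified(lambda: next(versions), expected, sleep=delays.append)
```

In this run the first attempt fails the hash check, we wait once, and the second attempt succeeds.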
One thing we don’t do (which I would like to do) is allow other nodes to pull blocks from files which have not yet been fully downloaded (proper torrent style). Currently a file only becomes available for other nodes to pull from once it has been successfully synced.