So the point of the v0.9 release is supposed to be improved scalability and performance. I’ve started doing some light benchmarking to check that I’m on the right track, and I think it looks good. In this test I have two nodes and a single repo. The repo contains 11 GiB in 175,000 files. Node 1 has all the files and is indexed and ready; node 2 is completely blank. I measure CPU and memory usage for node 2 from startup to “in sync”.
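For the curious, sampling a process like this doesn’t need anything fancy. Here’s a minimal sketch of the kind of measurement involved, assuming Linux and reading straight from /proc; this is not the actual tooling used for the graphs below, just an illustration of where the numbers come from:

```python
import os
import time

def sample(pid):
    """Return (RSS in KiB, cumulative CPU time in clock ticks) for a
    process, read from /proc. Linux-only; a rough stand-in for whatever
    tool produces the CPU/memory graphs."""
    with open(f"/proc/{pid}/status") as f:
        rss_kib = next(int(line.split()[1])
                       for line in f if line.startswith("VmRSS:"))
    with open(f"/proc/{pid}/stat") as f:
        # The comm field may contain spaces, so split after the closing ")".
        fields = f.read().rsplit(")", 1)[1].split()
        utime, stime = int(fields[11]), int(fields[12])  # fields 14 and 15
    return rss_kib, utime + stime

if __name__ == "__main__":
    # Example: sample this process itself. In the benchmark you'd sample
    # the syncthing PID once a second from startup until "in sync".
    rss, cpu = sample(os.getpid())
    print(f"RSS: {rss} KiB, CPU ticks: {cpu}")
```

Dividing the delta in CPU ticks between two samples by the sampling interval (times the clock tick rate) gives the CPU percentage plotted on the graphs.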
Here’s v0.8.20 (the x scale is in seconds since start):
There’s an immediate spike up to 150 MiB of RAM when it receives the index, then it … crunches? that index at 100% CPU for almost five minutes. I honestly don’t know what’s happening there, but it’s reproducible. This is measurement I should have done earlier, because that’s creepy. Then it starts syncing at about the 280 second mark, finishes at about 450 seconds, and then continues crunching at 100% CPU until I interrupt it. Memory is allocated along the way as files are synced and added to the local index, ending up just over 250 MiB.
Here’s v0.9.0 beta (the scales are the same as above for easy comparison):
Here we can see an initial increase in memory over the first 60 or so seconds as indexes are received and the pulling starts. There’s no real delay before it starts syncing, as there was with v0.8. It’s done syncing at about 400 seconds, CPU usage drops to almost nothing, and memory usage stays constant at just under 100 megs. Some things to note here:
v0.9 actually takes longer to sync and uses more CPU doing it, counted from when syncing starts. This is a reasonable tradeoff because we’re doing a lot of database transactions in the meantime. In v0.8 we skip all that since the “database” is just a map in RAM. Even so, v0.9 finishes the race a little earlier, since it skips the initial index crunching. In an actual setup with non-infinite bandwidth between the nodes the difference should be smaller.
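The cost difference between the two approaches is easy to feel in a toy example. This sketch is not Syncthing’s actual code; it just contrasts updating an in-RAM map (the v0.8 style) with persisting batches of updates to disk and paying for the fsync (illustrative of the v0.9 style):

```python
import os
import tempfile
import time

n = 10_000

# v0.8-style: the index is just a dict in RAM; updates are pure memory writes.
index = {}
t0 = time.perf_counter()
for i in range(n):
    index[f"file-{i}"] = i
ram_s = time.perf_counter() - t0

# v0.9-style (illustrative only): updates are serialized and committed to
# disk in batches, trading CPU and I/O for a bounded memory footprint.
fd, path = tempfile.mkstemp()
t0 = time.perf_counter()
with os.fdopen(fd, "w") as f:
    for i in range(n):
        f.write(f"file-{i} {i}\n")
        if i % 1000 == 999:
            f.flush()
            os.fsync(f.fileno())  # commit the batch to stable storage
disk_s = time.perf_counter() - t0
os.remove(path)

print(f"in-RAM map: {ram_s:.4f}s, persisted batches: {disk_s:.4f}s")
```

The persisted version is slower per update, but its working set no longer has to hold every file record at once, which is the whole point of the change.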
Memory usage differs by a factor of about 2.5 (just under 100 MiB versus just over 250 MiB). But given more data, v0.8 would balloon further while v0.9 will stay constant.
There’s for sure more to do to reduce the memory footprint of v0.9, but just getting it to a constant level, instead of proportional to the amount of synced data, is a good step.