Advice on profiling

Hi,

I am seeking advice on the best way to analyze Syncthing's performance, and perhaps propose some enhancements.

Context:

I want to sync big files (400-800 GiB) between three instances. The machines have 32 GB of RAM and are connected over a 1 Gbit/s network.

Observations:

  • Small files (up to 100 GiB): transfer speeds around 25-30 MiB/s with low memory usage (1-2 GB). All is good.
  • Big files (400-800 GiB): transfer speeds around 1-3 MiB/s with very high memory usage (6-9 GB). The transfer takes forever.

I have already done CPU and memory profiling, but I am struggling to find an explanation for the poor performance.

Could anybody provide me with some guidelines on what to look at, or on how to interpret the profiling files?

You need to look at the resulting profiles using go tool pprof. From then on it’s all handiwork: trying to figure out whether what you’re seeing is reasonable and, if not, what to optimize… I’ll be happy to help if you share profiles of things that seem broken.

That confirms what I understood from the Profiling Go page. I have collected pprof files (CPU and heap) by following the Profiling guide.
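
Concretely, I started Syncthing with the profiling environment variables described in the guide, roughly like this (double-check the exact variable names and output locations against the Profiling guide for your version):

    # Per the Profiling guide; names may vary between Syncthing versions.
    STCPUPROFILE=1 STHEAPPROFILE=1 syncthing

    # The CPU profile (cpu-*.pprof) is written when Syncthing exits; heap
    # profiles (heap-*.pprof) are written as heap usage grows.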

Where should I send the files?

Thanks.

Post them here if you like. Otherwise, the first step is usually to run go tool pprof $binaryName $profileName and then run the command web to get a graphical overview of what’s in the profile. From there it can be narrowed down as appropriate. The procedure is the same for CPU, memory, and block profiles.
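
For example, a session might look like this (the profile file name here is just a placeholder; yours will differ):

    go tool pprof syncthing cpu-1234.pprof
    (pprof) top20         # the 20 functions accounting for the most samples
    (pprof) web           # render the call graph in a browser (needs graphviz)
    (pprof) list <regex>  # annotated source for matching functions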

Here are the files:

  • syncthing-instance1.zip is for the instance that has all the files (~2 TiB).
  • syncthing-instance2.zip and syncthing-instance3.zip are the two other instances.

After using the pprof tool to display the top 20 objects, I noticed that the memory usage it shows is far from what Syncthing reports in the GUI. Is this related to the way Go reports memory usage in the pprof files?

syncthing-instance1.zip (52.5 KB) syncthing-instance2.zip (76.4 KB) syncthing-instance3.zip (83.0 KB)

Yes… Measuring memory usage is tricky; concepts like “memory” and “in use” are somewhat fuzzy and defined differently from system to system. In general, though:

  • Go manages memory for the heap, for goroutine stacks and probably a small amount of runtime overhead.
  • Most long-lived allocated memory lives in the heap.
  • Since Go is a garbage-collected language there is some overhead: usually the memory allocated for the heap can be up to twice the memory actually in use by the program.
  • The GUI reports all memory requested from the OS by the Go runtime, minus memory returned to the OS. This includes the heap, stacks and overhead.
  • The heap profile only looks at the in-use heap memory, so expect total usage to be higher due to garbage collection, stacks, etc. (see the example after this list).
  • Go returns memory to the OS by marking the pages as unused and candidates for discarding. If the OS doesn’t currently need the memory, the pages stay in RAM and may or may not be counted in the process’s resident set.
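
If you want to compare the two views yourself, here’s a rough sketch; the REST endpoint and pprof flags are the standard ones, but the API key, address, and file names are placeholders to adapt:

    # The GUI figure comes from the Go runtime's counters, also exposed over
    # REST ("sys" is roughly what's been requested from the OS, "alloc" the
    # in-use heap):
    curl -s -H "X-API-Key: yourkey" http://localhost:8384/rest/system/status

    # The heap profile defaults to in-use space; compare with cumulative
    # allocations to see churn:
    go tool pprof -inuse_space syncthing heap-1234.pprof
    go tool pprof -alloc_space syncthing heap-1234.pprof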

Your profiles look reasonable enough to me, and in particular don’t show the high memory usage that you report above? However, this being on top in the heap profile is somewhat concerning and probably an effect of syncing those large files:

We should try to optimize that. In the meantime, can you try setting disableTempIndexes in the advanced settings and see what effect that has?
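
For reference, it’s a per-folder option; a sketch of where to find it, assuming a default Linux install (adjust the path for your setup):

    # In the GUI: Actions -> Advanced -> (your folder) -> disableTempIndexes.
    # Or, with Syncthing stopped, flip it directly in config.xml:
    grep disableTempIndexes ~/.config/syncthing/config.xml
    #   <disableTempIndexes>false</disableTempIndexes>   <- set to true, restart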

Also, in the CPU profile, there’s a lot of time attributed to optimizing ignore patterns, which is #3394.

And again, the profiles are a bit broken in that the deep stack traces don’t make any sense. I reported that to the Go team but so far there’s been no response, so maybe it’s just me, or just Syncthing, that’s broken. I’ll have to look into it, I guess. :confused: