Proposal: Delay hashing performance checks

The default frequency ramp-up of many CPU configurations and power plans is quite slow (it takes seconds), and as a result it has a significant impact on the startup hash benchmarks (which run for milliseconds).

There was an attempt to handle it by alternating between algorithms during the bench, but apparently it’s not enough (the whole bench is shorter than the nominal ramp-up period).

Across many machines and over a long period of time I consistently see widely varying results: on the same machine, the “hashing performance” ranges from ~100 MB/s if ST is started while the PC is idle up to ~350 MB/s if ST is started while the PC is fully awake and busy (this can be simulated by simply killing a busy ST instance and restarting it immediately). So in real-world use, the hash choices depend on how the CPU speed-up happens to line up with ST startup (basically random), despite the decent effort that went into making a correctly calculated decision.

The (simple?) solution: move the hash performance benches down the startup queue, i.e. do something else first for a while (heavy, but something that doesn’t need the hashing choices set yet, e.g. loading a config). That will keep the CPU busy for several seconds during its speed-up ramp, and then, once it’s running at full speed, do the hash checks.

P.S. I know nothing burns down if the wrong choice is made (weak hash on/off; minio vs crypto). But IMO this (simple?) change may keep the effort made so far from going to waste :wink:

Or we could warm the CPU up for a second or so.

Yeah, that’s the instinctive solution, but thinking about it: “warm-up” for the sake of “warm-up” would just waste CPU cycles and add an unnecessary delay (I’m thinking 3 to 5 sec.) to the already lengthy startup time.

So if the same can be obtained by just shifting the startup routines’ order, why not?

(the answer may be “because benchmark routines can’t be performed later due to some internal dependencies”, but idk, you know this better than me for sure…)

I don’t think we’d do anything meaningful to warm it up anyway.

It’ll vary even more if the computer is loaded by something else at the same time. I don’t think this is worth worrying about.

What I mean is to do it sequentially, not in parallel:

  1) finish loading something → (CPU is fully awake) → 2) then run benchmark

I mean that if you’re already loading all cores with something else, the benchmark will also only show a fraction of the available performance. There are many variables that affect the benchmark result, and we can’t affect most of them. But we only have two use cases for the benchmark: algorithm selection, and general statistics.

For algorithm selection we only need to get it right most of the time, and the only requirement for that is that two benchmarks run repeatedly within a few milliseconds of each other can be compared.

For general statistics we really only need to make sure to do the same thing every time, so we can compare changes over time and correlate to code or compiler improvements, general user base changes, etc.

The actual value for a specific run of the benchmark on any given device almost never matters.

But the argument here is that if you wake up from sleep, the cores are clocked down: you run benchmark 1, which performs badly as the cores are clocking up, then you run a second benchmark, which probably has the cores clocked up, and you end up with skew. I think it’s fair to do a throwaway round of benchmarks just to get the CPU to clock up.

We already run each benchmark thrice, interleaved, and select the best.
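
For illustration only, here is a rough Go sketch of what “interleaved, best of three, plus an optional throwaway round” could look like. It is not the actual Syncthing code: the `candidate` type, `throughput` and `pickBest` are made-up names, and `crypto/sha256` vs `crypto/sha512` merely stand in for the real minio vs crypto comparison.

```go
package main

import (
	"crypto/sha256"
	"crypto/sha512"
	"fmt"
	"time"
)

// candidate is a hypothetical pairing of a name with a function that hashes
// one buffer. It is not a type taken from Syncthing.
type candidate struct {
	name string
	hash func(data []byte)
}

// throughput hashes data in a loop for roughly d and returns MB/s.
func throughput(c candidate, data []byte, d time.Duration) float64 {
	start := time.Now()
	n := 0
	for time.Since(start) < d {
		c.hash(data)
		n += len(data)
	}
	return float64(n) / time.Since(start).Seconds() / 1e6
}

// pickBest first runs `warmup` rounds whose results are thrown away, then
// `rounds` interleaved rounds (A, B, A, B, ...), keeping each candidate's
// best value. It returns the name and best value of the winner.
func pickBest(cands []candidate, measure func(candidate) float64, warmup, rounds int) (string, float64) {
	if len(cands) == 0 {
		return "", 0
	}
	for i := 0; i < warmup; i++ {
		for _, c := range cands {
			measure(c) // result intentionally discarded
		}
	}
	best := make([]float64, len(cands))
	for i := 0; i < rounds; i++ {
		for j, c := range cands {
			if v := measure(c); v > best[j] {
				best[j] = v
			}
		}
	}
	winner := 0
	for j := range cands {
		if best[j] > best[winner] {
			winner = j
		}
	}
	return cands[winner].name, best[winner]
}

func main() {
	data := make([]byte, 128<<10)
	cands := []candidate{
		{"sha256", func(b []byte) { _ = sha256.Sum256(b) }},
		{"sha512", func(b []byte) { _ = sha512.Sum512(b) }},
	}
	measure := func(c candidate) float64 {
		return throughput(c, data, 20*time.Millisecond)
	}
	name, rate := pickBest(cands, measure, 1, 3)
	fmt.Printf("selected %s at %.0f MB/s\n", name, rate)
}
```

Passing the measurement in as a function makes the selection logic easy to exercise with a fake “ramping” CPU. Note that a single throwaway round only helps if the ramp-up is shorter than that round; if the whole bench is shorter than the ramp, the skew between the first and second candidate remains.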

Good enough I guess.

Well, but that’s the point, it’s not really good enough. I’ve been monitoring the hashing selections on 2 of my machines for quite a while, and it’s random at best (while there is a clear winner between the two).

Anyways, I’ll try to compile it locally and test some of the ideas mentioned. I’ll post my findings back here.

I’m not convinced that this is worth doing anyway, but how about a “benchmark for next time” process?

get saved result
if new possible result introduced {
    run benchmark
    save result
}
if there is a result saved {
    delay long enough for things to have calmed down
    run benchmark
    save result
}

This makes a couple of assumptions:

  1. the best algorithm doesn’t change frequently
  2. it is expensive or inconvenient to switch algorithms mid-run

If the first isn’t true, why benchmark at all?

If the second isn’t true, we could switch immediately.