Should "rolling" replace "weak" hash terminology?

This has come up a ton of times with the startup benchmark logging, where we report that we choose the “weak” hash even though it is slower than the “strong” hash. Just now someone interpreted it in terms of security:

What I (an interested layman) associate with the “strength/weakness” of a hash is how likely collisions are (which usually also correlates with how costly it is to compute). So I assume the hash used for detecting shifted blocks is weak (or at least weaker than SHA-256) to minimize the performance impact. Calling this hash function weak by itself makes sense, but the combination of the weak hash for shift detection and the “normal” hash would probably be better dubbed “rolling”. This would also have the benefit that googling the term actually brings up relevant things, not just generic blah on “weakness of some hash functions”. If I am completely on the wrong track with my reasoning, please relieve me of that particular ignorance, and if not, I would propose a bit of s/weak/rolling/gc on our code.
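For readers wondering what “rolling” means here, below is a minimal sketch of an rsync-style rolling checksum built on Adler-32 sums. It is only an illustration under assumptions: the function names (`checksum`, `roll`, `find_block`) are hypothetical, and it is not claimed to be the hash or the code this project actually uses. It shows the idea from the post above: a cheap hash is slid over the data one byte at a time to spot a shifted block, and the strong hash (SHA-256) is only computed to confirm a candidate.

```python
import hashlib
from typing import Tuple

MOD = 65521  # largest prime below 2**16, the modulus used by Adler-32


def checksum(block: bytes) -> Tuple[int, int]:
    """Compute the two Adler-32 running sums over a whole block from scratch."""
    a, b = 1, 0
    for byte in block:
        a = (a + byte) % MOD
        b = (b + a) % MOD
    return a, b


def roll(a: int, b: int, size: int, out_byte: int, in_byte: int) -> Tuple[int, int]:
    """Slide a window of `size` bytes forward by one byte in constant time."""
    a_new = (a - out_byte + in_byte) % MOD
    b_new = (b - size * out_byte + a_new - 1) % MOD
    return a_new, b_new


def digest(a: int, b: int) -> int:
    """Pack both sums into the usual 32-bit Adler-32 value."""
    return (b << 16) | a


def find_block(data: bytes, block: bytes) -> int:
    """Return the first offset in `data` where `block` occurs, or -1.

    The weak rolling hash is checked at every offset; the strong hash is
    only computed when the weak hash matches.
    """
    size = len(block)
    if size == 0 or len(data) < size:
        return -1
    target_weak = checksum(block)
    target_strong = hashlib.sha256(block).digest()
    a, b = checksum(data[:size])
    offset = 0
    while True:
        if (a, b) == target_weak:
            window = data[offset:offset + size]
            if hashlib.sha256(window).digest() == target_strong:
                return offset
        if offset + size >= len(data):
            return -1
        a, b = roll(a, b, size, data[offset], data[offset + size])
        offset += 1
```

With this sketch, `find_block(b"XXhello world", b"hello world")` locates the block at offset 2 even though two bytes were inserted in front of it. The weak hash is "weak" in the collision sense, which is exactly why the strong hash is still needed to confirm a match.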

I agree. If it helps avoid the impression (from the startup benchmark output) that we switched to using a weaker algorithm and it became slower, all the better. Go wild with renaming.

Ah! I was looking into how to stop using Weak Hashing on the assumption it was a bad thing. Changing the name or having a better description in the log files would help reduce confusion.

Yeah.

I wanted to quickly replace “weak” with “rolling” in logging, but that introduces a discrepancy between user-facing logs and config names (and protocol and internal variable names, but that’s not really a problem): weakHashThresholdPct and weakHashSelectionMethod. In the context of the protocol it totally makes sense to speak of “weak” hashes, as there is nothing rolling about them there. I see two options:

  1. Replace “weak” with “rolling” in logs as originally intended - users wanting to adjust the weakHash... settings can hopefully make the connection.
  2. Extend “weak hash” to “rolling weak hash” or even “additional rolling weak hash” - makes the connection to config options easier but is clunky.

I am aware I am making a mountain out of a molehill with this “decision”, but the more insignificant the choice, the harder it is to make a non-arbitrary decision :stuck_out_tongue:

I’d just do a config migration between the two variables and call it rolling everywhere. It’s still rolling in the protocol, as you have to roll the data to get that value.
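To make the migration idea concrete, here is a rough sketch assuming the config is available as a plain dictionary of option names to values. The new option names (`rollingHashThresholdPct`, `rollingHashSelectionMethod`) and the `migrate` helper are hypothetical, chosen only to illustrate the rename; nothing here reflects how the project actually stores or migrates its config.

```python
# Old option name -> proposed new name. The new names are hypothetical.
RENAMES = {
    "weakHashThresholdPct": "rollingHashThresholdPct",
    "weakHashSelectionMethod": "rollingHashSelectionMethod",
}


def migrate(config: dict) -> dict:
    """Return a copy of `config` with the old option names renamed.

    If a new name is already present, its value wins and the old entry
    is simply dropped, so running the migration twice is harmless.
    """
    migrated = dict(config)
    for old, new in RENAMES.items():
        if old in migrated:
            value = migrated.pop(old)
            migrated.setdefault(new, value)
    return migrated
```

Unrelated options pass through untouched, and a config that has already been migrated comes back unchanged.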

I guess that’s the clean thing to do, but it means a change in both config and protocol “just for cosmetics”…

I’d say just change the log messages. Nothing controversial there. Config and envvars etc are not worth it imho.