Picking up our recent discussion, I thought the release candidate testing phase was meant to catch exactly this kind of situation? Bugs introduced in the new version should be fixed before it is declared final.
I was quite aware of that regression, so I tried to be quick when reviewing #8147 and #8157 so the fixes could be picked up before release. So how can we make sure such things do get picked up? Have a label for “release-critical” PRs? Let co-maintainers manage the release branch as well, not only main via PRs?
I feel quite confident with using Git in general, but the branching / release model for Syncthing always has me thinking twice and is obviously error-prone. What I learned from the other thread (don’t base merging decisions on release timing) is all good. But in the opposite direction, when handling a release, merges since the last -rc tag should definitely be skimmed for cherries to be picked IMHO. Even better, cherry-pick regression fixes to the release branch right after they are merged on main, then tag the next -rc.X.
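The “cherry-pick right after merge, then tag the next -rc.X” idea could look roughly like the following. This is a minimal sketch in a throwaway demo repo; the branch name `release`, the version numbers, and the file contents are all invented for illustration, not taken from the actual Syncthing repo.

```shell
set -e
demo=$(mktemp -d)
cd "$demo"
git init -q -b main
git config user.email demo@example.com
git config user.name Demo
echo v1 > app.txt
git add app.txt
git commit -qm "some feature"
git tag v1.19.0-rc.2              # pretend the last RC was tagged here
git branch release                # hypothetical release branch frozen at the rc.2 point
echo v2 > app.txt
git commit -qam "fix regression"  # the regression fix lands on main as usual
fix=$(git rev-parse HEAD)
git checkout -q release
git cherry-pick -x "$fix"         # pick it over right away so it isn't forgotten
git tag v1.19.0-rc.3              # ...and tag the next RC immediately
```

The `-x` flag records the original commit hash in the cherry-picked commit message, which keeps the release-branch history traceable back to main.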
I don’t think the process is especially complex or error prone. The “branching model” is entirely linear, i.e. nonexistent, most of the time. We release an rc.1 at t=0. We release an rc.2 two weeks later. We release the stable version one week later, after having rc.2 out for a week of testing. Bugfixes merged to main between rc.1 and rc.2 get into the release.
The corrected fix for the bug in question was merged yesterday, which is way too late to make it into a release today.
Now, as I did the release, I could have decided that that bug was a release blocker and released an rc.3 today instead of the stable, pushing the release cycle forward. Or we can decide that it’s critical that we immediately push out a 1.19.1-rc.1 and release that as an emergency patch release in a week. Neither of those seems proportionate, and if either were, someone should advocate for it and explain why.
I guess our perceived importance of the matter differs. For me it was clearly something needing an urgent fix. And the fix could have been really minor as well, with no potential to break anything: only one line to fix the regression, as mentioned in #8146 (comment).
What good is the testing period between -rc.2 and the final version, if regressions (not random bugs) found in that time frame are not fixed? In doubt, I’d rather squeeze in another few days on an -rc.3 for such a case than skip the regression fix. We don’t need to wait a week after -rc.2 either when the regression fix is ready, but can just go ahead and tag -rc.3 immediately. That won’t even push the cycle.
The one week waiting period is for pushing something to stable after having released it as RC – the point being to have the RC out in the wild for a while to catch potential disasters. We could have pushed an rc.3 with the fix in question yesterday, but (if we’re to follow some sort of procedure) that still needs to mature for a week before release.
There are two conflicting goals for the release process:
Release precisely according to a predictable schedule
Release only full featured, bug-free things
We by necessity don’t adhere 100% to either – the reality is always somewhere in the spectrum in between. Regressions are bad, and I’m a bit sad we let this one slip in. Yet I don’t think it’s the end of the world.
I don’t think anything can “mature” when it’s not allowed to change in the same period. In that sense, the procedure should be bypassed / changed for only the very narrow topic of regression fixes, defined by:
Stuff that worked in the last release,
then got broken in the development cycle,
was trivially fixed before the scheduled release date,
with a small change that obviously concerns only that regression.
For the case at hand, all four conditions were met. That would be my prerequisite list for making an -rc.3 release immediately, which gives maximum extra testing time without delaying the cycle. The same applies to regression fixes after -rc.1, by the way, so the people who found the regression have the most time available for testing the fix.
Non-trivial regression fixes are a different can of worms, effectively needing to revert feature additions if they went in too soon. Let’s keep that out of scope for now.
Yeah I mostly agree, in general. What we’ve done previously to make that happen is to flag it on the PR – “this should go into / trigger a new RC”.
In this case that didn’t happen, and the fix was arguably non-trivial; I did an initial attempt which was trivial, but not the whole truth and met with some resistance. The more proper fix was first buggy and broke more of the GUI than it fixed – and that was really the point where we should have triggered the rc.3 since that was the fix for the regression. We’d then have to have released an rc.4. Which is entirely the point about why we don’t just merge-and-release in quick succession.
So, again, process is fine, but we’re applying it as humans and my judgement was that this was a bit messy and the existing rc.2 would survive until next release.
So is there an official way to apply my judgment as well and indicate it? I guess nobody sees the whole picture at all times, so it would be good to give more people a chance to flag such changes. Just like we do with reviews: more eyes, more perspectives, and usually more problems caught before the PR is even merged.
You say the process is fine, but the part where you said “we have an rc.1 and an rc.2 on a timed basis” is different from the situation you just described, no?
Regarding time between merging a fix and releasing a new RC: Another precondition could be added:
The fix was verified by the user reporting the regression.
In all the usual cases recently, that would have been the sharp eyes and excellent testing habits of @tomasz86, who also caught the brokenness in the first fix. For such users it’s okay to test the fix by building it themselves, but with people waiting for an RC to download, we have a chicken-and-egg problem.
I think the official-est way is to highlight on a PR that it is an important thing that we should not forget to include in the next RC, or even make sure to create a new RC for:
@syncthing/maintainers: I think this should trigger a new RC
Everyone concerned gets highlighted, and the intention is preserved in the history of the world.
Just to be clear, my “so, process is fine, but” above means “so, it’s all good and well to have a process and attempt to follow it, but”, not that the process is awesome and ever sufficient. Rather the opposite. But yeah, the rc.1 is quite strictly timed, the rc.2 is always present and released no later than two weeks after rc.1, so that part is usually quite process-following.
I don’t think this is something we need more process to solve, if that’s what you’re saying. Maybe more communication.
Sorry about the process stuff, I did misunderstand what you wanted to say.
I fully agree that the predictable nature of RCs is a strong point, especially if -rc.2 is tagged “no later than …”.
The change I would welcome in “the process” is to be more liberal with tagging RCs after -rc.1. IMO it should only ever be regression fixes after that “merge period”, nothing else. Hence also why I got confused and started advocating for separate branches, to avoid effectively holding off everything else from merging to main.
What problems do you expect if we have quicker, regression-driven RCs in between? The current story would have resulted in back-to-back rc.3 and rc.4, yes. But people living on rc versions should expect to get hit by bugs occasionally, and the accelerated approach would actually minimise the time they are affected.
No real problem, other than that it looks ugly and uncoordinated to keep releasing broken RCs. In this case I just didn’t feel it was justified, and (incorrectly, clearly) didn’t get the feeling anyone else thought it was either. Had there been a note on the PR similar to the above I’d have been happy to drop another RC and delay the release a bit.
As you say, ideally by cherry-picking that change into a release-next branch immediately at that point so it’s not forgotten, or as soon as possible after the PR is merged if it’s not urgent enough to cause a new RC then and there; in that case nothing needs to be remembered for more than a couple of minutes.
The way I see the process as it is currently is roughly:
rc.1, strictly scheduled, whatever is in main goes in, nothing complicated.
rc.2, immediately if and when an important problem in rc.1 is discovered and fixed, otherwise two weeks after rc.1 containing all non-scary improvements since rc.1.
Most commonly rc.2 is just a later point in main than rc.1 – maybe all the way to where main happens to be at rc.2-release-time, otherwise up to and including the safe stuff and stopping before whatever feature would go better into the next real release.
rc.3, rc.4 etc as needed to fix problems in rc.2. Now we’re late in the cycle though so we’re minimizing changes, and this is likely to be releases from a specific release branch going out from the rc.2 point.
This is roughly what’s happened for the last while at least.
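The mostly linear flow outlined above can be sketched as nothing more than tags at different points of main. Again a throwaway demo repo with invented version numbers and commit contents, just to show that rc.2 is “a later point in main than rc.1”, and the stable tag promotes the rc.2 commit rather than whatever is on main’s tip by then:

```shell
set -e
demo=$(mktemp -d)
cd "$demo"
git init -q -b main
git config user.email demo@example.com
git config user.name Demo
echo a > f.txt; git add f.txt; git commit -qm "feature A"
git tag v1.20.0-rc.1            # rc.1: whatever is in main at t=0
echo b >> f.txt; git commit -qam "safe improvement B"
git tag v1.20.0-rc.2            # rc.2: simply a later point on main
echo c >> f.txt; git commit -qam "scary feature C, aimed at the next cycle"
git tag v1.20.0 v1.20.0-rc.2    # stable promotes the rc.2 commit, not main's tip
```

Only when an rc.3/rc.4 becomes necessary does a separate release branch (off the rc.2 point) come into play at all.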
Changes between rc.1 and rc.2 just get less testing.
Contrary to your recent statements, merging big, “risky” stuff should be thoughtfully avoided in that period. Meaning I need to stop thinking about a feature when it’s done, then grok it again when review-and-merge-scary-changes time comes.
The time to test regression fixes is artificially shortened, leading to higher risk (when the fix is itself broken) of delaying the final release (which I’d like to avoid as well).
Embracing the concept of a slightly nonlinear history with commonly parallel main and release branches would be part of a solution to this I think. The main difference being that release is managed by maintainers only using cherry-picks (as now), but main is controlled by the regular PR workflow including write access (and therefore possibly deciding to merge something) for others like me.
Regarding the timed RC points, many other projects seem to talk about that as “freezes”. So we’d have a soft feature freeze or better “dev preview snapshot” (rc.1 time, still open for small changes, main otherwise blocked), a hard feature freeze (two weeks later, where rc.2 happens now), followed by a final period of only regression fixes. These freeze points don’t necessarily have to correlate 1:1 with the RC tags (more in between please for regression fixes). And they don’t have to correlate 1:1 with working on specific branches (main should ideally be open for merging next-release features anytime).
Note that in such a context the “rc.1” label is misleading if it’s really a development preview where more (small) features are likely to be added, not a “candidate that will become the release unless …”
Given we have few RC testers (sadly), not many people will be annoyed by that. And regarding an “ugly” (non-linear) commit history: I for one would be happy to see the context of a change (was it a regression fix on release, or just the first, untested and potentially buggy shot at some change on main being queued for the next release).
My personal opinion is that a combination of using the release-next branch when “scary changes” occur after rc.1 and/or pinging @syncthing/maintainers (didn’t know that, nice) would be helpful in preventing bugfixes from slipping through the cracks.
The reasoning to change from “regressions only” after rc.1 to “no scary changes” was that it’s a disservice to users and also annoying to us if a bugfix might take up to 7 weeks from merging to being released. With that same reasoning I’d advocate to liberally cherry-pick after scary changes get merged after rc.1. Either by pinging maintainers or even opening a PR against release-next.
I had a bit of a slow/low-investment phase when it comes to Syncthing, especially releases (except android…); I’ll strive to up my involvement here again too.
Nope, I really don’t think you should ever do that. Just merge anything. If something comes after a scary fix, and is an important and safe bugfix, bring it to our attention and it will be cherry-picked. That isn’t going to happen a lot, and not merging brings all the problems you describe.
Yeah right, our RCs could maybe better be previews, or betas, or what not - does it matter?
There must not be any first, untested and potentially buggy shots on main.
Sure, I just violated this rule repeatedly with a stream of buggy UI changes (which would have been even buggier without the good reviews, no complaints there), but that just highlights a combination of me not being invested enough in these UI changes and the UI being an untested mess. It doesn’t change that this must be avoided as best we can, and a similar streak on the backend would not be acceptable to me (as in, I’d ask for tests from myself).
Thanks for the clarification. I guess we don’t have an immediate need for changing the habits then. Now I understand the reasoning better and will try to heed these guidelines. The main point of misunderstanding was really that -rc.1 does not mark the “end of a feature development cycle”, but -rc.2 does. I was under the firm impression that I usually have no more than one week after release to finish and advocate for merging a feature.
So yes, the naming is kind of important there, a -pre.1 (or -preview or -tp.1 though alphabetically following -rc) would much better convey the purpose of the release. After all, they are announced and downloadable on the public releases page. For auto-updates of course it won’t matter at all.
Fully agree that this is really hard to get right on the UI side. I repeat, we are very lucky to have @tomasz86 diligently testing every button regularly. For myself I can say that many features I have never even used (versioning restore, ignore patterns, introducers, etc.) and wouldn’t know what to expect. Really thankful we have a good and strict compiler toolchain on the backend with helpful warnings.
I’m thankful for the kind words, but really, it’s usually more pure luck than extreme diligence on my part.
It’s only that I’ve recently been working on a few of my own PRs, and when doing so, I always try to do it on the latest main. Thanks to this, I was able to discover the latest bugs very quickly. Otherwise, to tell the truth, we only run stable releases on the real devices, as I’m too anxious about the data to run RCs (even if the danger is low), so normally I’m not that fast to “catch brokenness”.
That said, I indeed do try to click on each and every button during my testing process, including those that I don’t really use myself (e.g. Auto Accept). Also, it’s often said that developers don’t like working on graphical interfaces, but for the end user the GUI is everything, so I think it’s worth it to always make 200% sure everything’s working as intended.