I want to discuss how we should do code review. Ideally, the discussion should result in a set of guidelines we can agree on. This post is not a suggestion for those guidelines; it is merely my thoughts on code review in a more or less structured manner as a start, things to agree or disagree on.
Purpose of code review
There are several reasons for code review. On one hand, it’s a filter against bad stuff. Obviously, we want to avoid shipping broken stuff. We want to catch bugs earlier and not afflict our users with them. Perhaps more importantly, we want to avoid adopting code we’re uncomfortable maintaining. Any given piece of code is a little bit like a pet; it will grow and need care and feeding over many years.
On the other hand, especially with newer contributors, it’s also a potential teaching moment, a conversation with someone who maybe never interacted with our project before.
Stages of code review
When looking at a new PR, I apply review in stages. This goes from the more general to the more specific. Something like this, where we don’t reach the next stage until we’ve cleared the previous ones:
- Conceptual. To begin with, is this a feature or change we want? Does it fit into Syncthing? Does it solve a problem we want to solve? Does it do it reasonably, or is it entirely misguided? Problems at this level typically generate a “wait, let’s back off and talk about what you’re trying to do” response without looking at the details of what the code does.
- Architectural. This is looking at the code but at a high level. Does it attempt to shoehorn something into the
model
that should rather be added as a filesystem layer? Is it adding many config options to a versioner when it should instead create a new versioning kind? That sort of thing. If the architecture is problematic, there’s no reason to look at details in the code because it will likely need to be largely rewritten. - Technical. Now, we really look at the code. We read to understand, find bugs, tease out unclear or missing things, etc.
- Stylistic. I avoid this type of review unless it is egregious, in which case it should have been an issue already at the architectural step… The code won’t look like I wrote it, and that’s fine; we all do things differently.
In many cases the first steps are implicit and never seen; it’s unlikely that a minor bugfix PR has conceptual problems. However, it’s a mistake to perform a technical review on an architecturally or conceptually unsound change – it wastes everyone’s time and communicates the wrong expectations.
Amount and intensity of review
There is also such a thing as over-review. The effort put into the review should be proportionate to the consequences of getting it wrong. The cost of mistakenly accepting a bug might just be that we need to fix it afterwards – unless it’s a bug that causes data corruption or another catastrophic failure.
The cost of trying to perfect every little detail before merging may be that the merge never happens, and we lose both the contribution and the contributor. Judgement calls need to be made, considering the complexity and risk of the change and the surrounding code.
Rejecting a change
At some point, we may need to say no. This might be at the conceptual review step; it doesn’t matter how nice the code is if we don’t want to do what it does. This is fairly unusual.
Much more common is that a PR gets stuck at the architectural review stage: it’s not good enough to merge, even with the best of intentions, and it’s not moving forward for a long time. Perhaps someone else can step up and adopt the PR, doing the required changes and taking it over the finish line. When that doesn’t happen and it lingers on for months or years, it needs to be closed and removed from consideration.
Getting stuck at technical review is worse. We’ve already implicitly agreed that the change is in principle sound and should be accepted and we’re looking at possibly minor details. At some point the reviewer might consider just doing the final adjustments themselves and merging it. If that doesn’t seem feasible, closing may be the better option.
Timeliness
This is a tough nut to crack. Ideally, code review would be quick & immediate. In practice, it’s both hard and sometimes boring to review changes…
The recipient matters; old, grizzled contributors can take some “abuse” in terms of having their PR reviews linger, while that’s a very bad first impression to give a new contributor who enthusiastically wanders into our project. Prioritising the new and the simple is reasonable.
There are ways of having GitHub systematically assign reviewers, but I’m not sure we’re a large or diverse enough team for that to work out. There are areas of the code that only a single person really understands well (and it’s not always me).
Ganging up
As a reviewer, inviting someone else to review for a second opinion is, of course, fine. Otherwise, I often feel that the fewer cooks the better the soup. For a newer contributor, getting conflicting advice from multiple reviewers will be very confusing and possibly a blocker since they may not be able to judge whose advice to take, and what is just advice as opposed to must-follow directions. The more the contributor and the reviewers know and understand each other the easier it gets, obviously.