Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Discussion: Suggestions on best practices to include in PR chapter #48

Open
joelostblom opened this issue Oct 4, 2024 · 4 comments
Open
Labels
discussion Tasks related to discussions or brainstorms

Comments

@joelostblom
Copy link

@lwjohnst86 at long last I managed to clean up my thoughts somewhat and paste them into this issue. There are still several resources I would want to read through, but I think they would only introduce tweaks and no fundamental change to this workflow. This more on the "best practices" side of things than the mechanics, which is something I think is really important and does not come naturally just because someone knows how to mechanically go through the steps of creating a PR or leaving a review. Let me know what you think!

PR best practices

Motivation

Why bother learning how to write proper PRs? A few excerpts from other articles that I think cover the most important points and we could summarize:

Bad PRs make it harder to collaborate, slowing down the development process. When a PR is hard to review, it gets fewer comments and disagreements, which leads to bugs and degraded code quality. Put together, these problems are ultimately much more costly than getting the code right the first time. Bugs create negative user experiences or slow time to market. Degraded code quality makes the codebase harder to build upon in the long term.

At the extreme, bad PRs can lead to “tick the box” style reviewing in which engineers approve without bothering to read the code. Negligent review defeats the purpose of creating a PR in the first place. Code developed without proper collaboration also inhibits the growth of junior developers, who need thoughtful feedback to learn. Having a robust review culture can help prevent skill stagnation. Here we outline several best practices to ensure that PRs are easy to review and serve their purpose to improve the development process.

https://doordash.engineering/2022/08/23/6-best-practices-to-manage-pull-request-creation-and-feedback/

A reviewer generates high-quality feedback when you allow them to focus on the interesting parts of your code. If you require them to untangle your code or police simple mistakes, you both suffer.

Improving code review technique helps your reviewer, your team, and, most importantly: you.

  • Learn faster: If you prepare your changelist properly, it directs your reviewer’s attention to areas that support your growth rather than boring style violations. When you demonstrate an appreciation for constructive criticism, your reviewer provides better feedback .

  • Make others better: Your code review techniques set an example for your colleagues. Effective author practices rub off on your teammates, which makes your job easier when they send code to you.

  • Minimize team conflicts: Code reviews are a common source of friction. Approaching them deliberately and conscientiously minimizes arguments.

The golden rule: value your reviewer’s time

This advice sounds obvious, but I often see authors treat their reviewers like personal quality assurance technicians. These authors make zero effort to catch their own errors or to design their changelist for reviewability.

Your teammate arrives at work each day with a finite supply of focus. If they allocate some of it to you, that’s time they can’t spend on their own work. It’s only fair that you maximize the value of their time.

Reviews drastically improve when both participants trust each other. Your reviewer puts in more effort when they can count on you to take their feedback seriously. Viewing your reviewer as an obstacle you have to overcome limits the value they offer you.

https://mtlynch.io/code-review-love/

For the PR author

  • Start by creating an issue where the change is proposed.
    • Creating a PR right away without discussing the changes
      can lead to wasted time and effort,
      either via substantial re-writes,
      or even that the PR is rejected.
  • Review your own code before creating a PR
    • Don't simply check for typos; if you were reading this code for the first time, what might confuse you? Code should be understandable without having to read a long PR discussion about what it does; could you re-write the code in a clearer way (it's not just for your immediate reviewer, but also for anyone reading the code years from now)?
      (TODO Link to something about how to write clear code?)
  • Write a clear description of what the PR is about.
  • A PR should contain just one semantic/logical change
    • Could be a single commit or a connected set of several commits,
      but should not introduce multiple unrelated changes.
      • E.g. if you are changing some logic of how a dataframe is wrangled,
        you shouldn't also create a new visualization in another part of the codebase
        just because you happened to think about it around the same time.
        Only if these changes are semantically/logically connected
        should they go in the same PR.
  • A PR should be as small as possible
    • Aim for PRs under ~400 (or "a few hundred", TODO how arbitrary is this number?) lines of code changed.
      • This will make it easier for the reviewer to find the time to review the PR
        and also to understand the changes.
      • Quick reviews ensures that PRs don't go stale
        and reduces the chances of merge conflicts.
        It also often means that less time is wasted
        reminding yourself what this PR is about
        as the changes are fresh in the mind of both the author and the reviewer,
    • Limiting PRs to just one semantic/logical change
      naturally makes them shorter.
  • Describe what each change is doing in each commit message.
    • Make sure your commits are atomic, using git add -p is helpful for this
      and also prevents accidentally committing more than needed
      (e.g. committing entire files when there are only some lines that changed).
      It allows allows you to review your changes.
    • A commit message should include the WHAT and also the WHY (in most cases)
      • Just typing git commit to bring up the editor
        instead of git commit -m 'message'
        can encourage the writing of more elaborative messages.
        Together with git's "verbose mode",
        this can also serve as a review of the changes
        (the "patch" shows in the editor).
    • A fun example of a good commit message https://dhwthompson.com/2019/my-favourite-git-commit (this might seem long to beginners although I agree with making it this long if there is a need; maybe complementing with a shorter good message would be helpful)
  • The commits that are merged into the main branch,
    should follow the conventional commit guidelines.
    That means if you are adding the PR code via merging/rebasing,
    then all commits should be formatted like this.
    If you are squashing commits,
    then only the PR title needs to follow this style.
    • In general, squash and merge is recommended
      as this gives a clearer history on main
      that still saves all the individual commits on the feature branch
      if the changes made in a PR are needed to be bisected further.
  • Respond to your reviewers comments when you address the changes they suggested
    and request a new review (don't silently update the code).
  • Be grateful rather than defensive
    when the reviewer finds issues without your code
    and asks you to change it.
    • It would have been a much bigger concern for you
      if your bugs made it into production.
  • Avoid getting personal in the discussion,
    talk about the code, not the person reviewing it
    (avoid using "you" as much as possible).
    • Be patient with conversations.
      Your reviewer might be incorrect
      or you might be misunderstanding something.
      The only way to find out is through effective communication.
  • Ask for clarification.
    • Sometimes you receive review comments that you don't understand.
      Instead of guessing what the reviewer means,
      ask for clarification and come to an agreement before making changes.

For the PR reviewer

  • Automate checks where possible: e.g. syntax issues, code formatting, PR conventions, software tests.
  • Mechanics: https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/reviewing-changes-in-pull-requests/reviewing-proposed-changes-in-a-pull-request
  • Read commit messages to understand why a change was introduced.
    • If it is really hard to understand what the code does without reading the commit message,
      suggest that the code is rewritten in a clearer manner
      or an explanatory comment is added.
  • Review quickly, within 1-3 days.
    PRs go stale if they sit for days or weeks.
    This means that the author and the reviewer will lose context
    and the team will move forward slowly.
    If you don't have time to review quickly,
    ask you management to re-prioritize your time to benefit the organization.

Resources

@lwjohnst86
Copy link
Member

Very nice! Very similar to what I was thinking to have in the chapter, and how we do things in Seedcase 🥳🥳

@lwjohnst86 lwjohnst86 added the discussion Tasks related to discussions or brainstorms label Oct 7, 2024
@signekb
Copy link
Member

signekb commented Oct 7, 2024

Very nice! (and hello, @joelostblom ! 🥳 )
I think there's some every nice and interesting points here that we should definitely include in the chapter!

The points under For the PR author made me think about where it's best practice to add what information (in issues vs PR descriptions vs commit messages):

  • For instance, in the second example PR (Disable uri-reference format check in jsonsschema#2771), the description of the PR seems like some of it would make more sense to include in an issue in the form of a bug report? At least some of headings are similar to what you might expect in a bug report (problem and how to reproduce).
  • Similarly, the good commit message example is very long and the description would be something that I would expect in a PR description - not in a commit message? But now that I think about it, when we do a squash merge, we can actually get the PR description in the commit message. For me, that makes sense; having the PR description in the squash merge commit message, but not having such long explanations in each commit message.
  • In addition to the points above, I think there's a balance to be struck in how long the descriptions should be that's very much in line with the "keeping PRs small for a faster and smoother review process" point. I, at least, felt a bit overwhelmed by the length of the descriptions in the linked PRs under For the PR author.

What are your thoughts on this, @lwjohnst86 and @joelostblom ?

(Neither of those points are fully fleshed thoughts btw, just the thoughts that popped up while I was reading the issue ☺ )

@joelostblom
Copy link
Author

Hello @signekb 👋 Nice to e-meet you! Thanks for sharing your thoughts on this, here are a few comment on what you wrote:

...the description of the PR seems like some of it would make more sense to include in an issue in the form of a bug report?

I agree that for that particular PR, the "Problem" and "Reproduce" headings could have gone in an issue. I believe the reason they went in the PR was that they captured what was happening in a few different issues with a general description (which could have been a third issue instead, closing the other two).

Similarly, the good commit message example is very long ...

Yes, I think (but haven't confirmed) that this message is likely from a git, but non-github (or at least non-PR based) flow where I tend to see more elaborate messages. As you say, with squash merge, then PR descriptions go into the commits and make the difference between these two flows less apparent which I think is convenient.

... I think there's a balance to be struck in how long the descriptions should be that's very much in line with the "keeping PRs small for a faster and smoother review process" point. I, at least, felt a bit overwhelmed by the length of the descriptions in the linked PRs under For the PR author.

I understand that the long descriptions for these PRs can be overwhelming and it would make sense to include examples of a simpler PR as well. The PRs I listed were either touching something deep within the code base (as for "Remove deep validation...") or added a large new feature (as for "Integrate VegaFusion into JupyterChart"). It could have been useful to break those up into smaller PRs with shorter descriptions, I don't think it was easily possible for either of those as they both represent "one unit of change" in some sense. Given that restriction, it was incredibly useful to have these long descriptions when reviewing because it allowed me to understand the PR author's thought process and understand better why they chose to solve the problem this way, both of which would have been hard to recontstruct from the code alone or a shorter message. But again, all for adding in simpler PR examples too.

@signekb
Copy link
Member

signekb commented Oct 15, 2024

@joelostblom You too! 👋
Thanks for elaborating. That makes a lot of sense: Naturally, there's different needs for different contexts (like a non-GitHub/non-PR-based flow would need more elaborate commit messages and longer descriptions can be necessary with PRs that change something "deep" in the codebase).

Again: thanks for you inputs! They're super useful for when we start going into the specific sessions/chapters! 🎉”

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
discussion Tasks related to discussions or brainstorms
Projects
Status: Todo
Development

No branches or pull requests

3 participants