How to do a good code review

English | Polish

This article was originally published in Polish in issue 112 (March/April 2024) of the Programista magazine.

Conducting good code reviews is a skill worth mastering. In this article, we will discuss the advantages and disadvantages of this process and explore the types of projects where it is most beneficial. We will consider the best approach to take when reviewing code, how to effectively carry out the process, which aspects of the code to focus on, and finally – how to write comments on the code in a way that benefits the project. The goal is to ensure that the review process serves as an opportunity for fruitful communication among team members rather than a source of conflict.

Have you been assigned your first code review and don't know where to start? What should you focus on when reviewing someone else's code? How should you format your comments? Or maybe the code you’ve written will be reviewed by others, and you're unsure of what to expect or how to respond? If so, this article is for you! Even if you're an experienced programmer, you'll find a set of tips here — some you might agree with, some you might not — but they will certainly give you something to think about.

We're talking, of course, about the process in software development where one programmer reviews another programmer's code and is tasked with providing comments and approving it if everything seems fine. This practice can take various forms in different organizations and teams. It might be as informal as calling a colleague over to glance at the code on your screen and share their thoughts. On the other hand, it could be a fully formalized process using specialized tools within a version control system (e.g., a "pull request" on GitHub or Perforce Swarm), requiring the reviewer’s approval before a merge or submit can proceed.

We will discuss code review as a process within the context of working in a specific organization, within a particular team, on a given project. However, the same principles can also apply to open-source projects, where feedback provided by various users — via "issues" and "pull requests" on GitHub, for example — also serves as a form of code review for publicly available code.

When and why to do a code review?

Deciding whether to implement the practice of code reviews in a project should be made on a case-by-case basis, taking into account the nature of the project and understanding the pros and cons of this practice. The first advantage that likely comes to mind is improving code quality. Having an additional pair of eyes review each new or modified piece of code offers the opportunity to evaluate it from a different perspective, potentially catching and addressing any errors before they make it into the repository. These errors might include logical mistakes, improper handling of edge cases (e.g., when input data is empty), memory leaks, performance issues, or even security vulnerabilities.

However, there are other, less obvious benefits to performing code reviews. Introducing this practice ensures that at least two people have seen and, to some extent, understand each piece of code. This leads to the desirable dissemination of knowledge within the team, rather than isolating each engineer in their own “silo” of expertise. As a result, someone else may find it easier to work on that code in the future if needed. In other words, the so-called "bus factor" increases. Finally, the mere awareness that a teammate will review our code — not just the compiler — can motivate us to be more diligent in writing code that not only works correctly but is also well-thought-out and readable.

We can also, of course, list the disadvantages of conducting code reviews. The first and most obvious is the valuable time it requires from developers. Every moment spent reviewing and commenting on someone else's code is time not spent writing or debugging your own code. From the perspective of those managing resources and developers' time, this might even seem like a waste, or at least a factor that reduces productivity. Additionally, the need to break away from one's tasks to focus on a different part of the program can be mentally taxing.

A common argument made by opponents of incorporating code reviews into the process is that errors in the code should instead be caught by well-written and regularly executed tests — unit tests, integration tests, etc. — or by manual testing performed by the QA team, as well as by other automated tools like static code analysis. This is, of course, true, just as it's true that a person reading the code cannot "mentally test" as many possible scenarios as good tests can check.

Nevertheless, one does not exclude the other. In fact, you could say that automated tests and code reviews complement each other perfectly. A reviewer might notice a scenario not covered by the tests. Moreover, tests do not assess critical aspects of the code, such as its structure, readability, the appropriateness of variable and function names, comments, or included documentation, all of which can be crucial for future code maintenance.

Another potential drawback of code reviews is that they can prolong the process. The need for every code change to be approved, similar to the requirement for passing automated tests, can result in even the smallest change being committed to the repository only after one or several days. This situation might be acceptable depending on the nature of the project and how quickly the code needs to be iterated. The efficiency of the review process is also significantly influenced by how it is organized. If any team member can approve a change, the process can move faster than when every review must be approved by a single team leader, who might be very busy or on vacation.

To summarize this section, whether or not it's worthwhile to conduct code reviews depends on the nature of the project. If you're developing a prototype, an internal tool, or a temporary solution, it might not be necessary since the highest code quality may not be required. In such cases, the speed of development, ease of modification, and the agility of the entire process are more important.

On the other hand, if the project involves the development and maintenance of a system that will be delivered to clients, serves as the main product of the company, or acts as a platform on which other software operates (e.g., a compiler or an operating system), then code quality should be a priority. However, it’s always worth maintaining at least a minimal level of code quality, even if not enforced by extensive processes. After all, we know that "temporary solutions often become permanent." Table 1 can serve as a summary of this chapter.

Not necessarily Yes
Less critical project, e.g., an auxiliary tool Important project, e.g., the main product of our company
Time is key – quick iterations Quality is key – minimizing errors
Temporary project, e.g., a prototype Long-term project – existing or will be maintained for a long time
Project for a narrow audience, e.g., internal use Project for a wide audience, e.g., publicly available on the Internet
Security and reliability are not crucial, e.g., a single-player game run locally Security and reliability are crucial, e.g., a web server, platform for other software

Table 1. Is code review worth it for a given project?

How to approach code review?

In this article, you won't find specific examples of code or programming constructs to promote or avoid. Therefore, this article can be useful to you regardless of the programming language you use. Instead, we'll focus on the general approach to conducting code reviews. But first, let's start at the beginning…

When performing a code review, just like when programming, good programming practices apply. You’re likely familiar with many of these practices, whether they're formalized or not, regardless of whether you've taken courses, studied computer science, or are self-taught and a practitioner. I'm referring to principles of correct, efficient, and simple use of the available constructs in a given language and libraries. Everything you would keep in mind while writing your own code is also worth considering when reviewing someone else's code.

Organizational, team, or project-specific guidelines also apply. If you have a documented "coding standard" that dictates specific naming conventions for variables (CamelCase or snake_case?) or error handling strategies (throw exceptions or return numeric error codes?), it's important to check for adherence to these rules during code reviews. Such a document is valuable as it can resolve many ambiguities and serve as a reference point. Even if no formal standard is documented, every team that has been working together for a while will have certain unwritten rules (often referred to as "tribal knowledge") that are worth discovering and adhering to.

Lastly, personal etiquette is crucial during code reviews. Ultimately, comments in a review are a form of communication between people. We’re writing them for a living person, not for a compiler or ChatGPT. We will delve into the specifics of writing review comments later, but it’s important to emphasize now that "netiquette" — rules of politeness and good manners that should apply on the Internet, and indeed always and everywhere — also applies here. In the context of code reviews, understanding the art of giving and receiving feedback is particularly useful. It’s worth exploring this topic further.

In a certain large project, there was a rule that every code change had to be reviewed by a lead programmer who oversaw the entire codebase. Let's call him "Big Ben." He was the one who laid the foundations of the project many years ago, and now, even though its maintenance has been handed over to a new team, he still ensures that everything aligns with his original vision. He requires verbose comments explaining even the most obvious aspects of the code. He forbids the use of smart pointers and other modern language features. Everything must be named exactly as he wants. There is no room for discussion about details or personal opinions. It’s easy to imagine how the developers on this team feel about him and their work on the project.

The above story is entirely fictional, and any resemblance to real people or projects is purely coincidental. However, no one would want to end up on such a project. Similarly, when conducting code reviews, we don't want to become such a "Big Ben." Therefore, it’s important to start with the basics and consider what mental approach would be appropriate when performing a code review.

The first and most important principle to keep in mind, whether you are conducting a code review or receiving one, is to let go of your ego. This may sound like some profound wisdom from a distant Eastern guru or a hippie who took too much LSD, but in practice, it means something very simple: do not become too attached to your code and do not take feedback personally. One might say, "Remember, your code is not you!"

The code you write does not exist for its own sake but serves a purpose. An experienced programmer knows that often, modifying one key line of code can be more valuable to the project than adding hundreds of new lines. Even more valuable and satisfying can be removing thousands of lines of code that have become unnecessary. After all, shorter code means fewer corners where undiscovered bugs might lurk. Therefore, it’s not worth becoming overly attached to the lines of code you have added.

This raises a question about responsibility. If we shouldn’t become attached to the code we’ve written and fiercely defend it, should we avoid referring to it as “our code” and instead treat all code as a collective effort? I believe one can overdo it in either direction. Treating every line of code as a collective creation, rather than something we wrote, could lead to shirking responsibility for its quality, errors, readability, and maintenance, much like how people didn’t care for property during the communist era when everything was “shared,” “state-owned,” and thus “nobody’s,” which additionally discouraged careful work.

In my opinion, to find a balance between these extremes, it’s best to approach code reviews like an engineer or mechanic rather than an artist. It’s not good to treat your code as a masterpiece inspired by "inner creativity," where nothing should be altered and everything must remain exactly as it is for others to admire. With such an approach, every comment about the code can be seen as a personal attack.

On the other hand, approaching code review like a mechanic means thinking of code changes as if you’re repairing a car in a workshop. The car might be dirty, old, or a bit rusty, but once you’ve fixed a particular system, you can be satisfied with how it now operates. You take responsibility for the repair, ensuring it was done carefully and according to best practices. Perhaps some compromises were made, and certain parts were patched together with makeshift solutions like tape, but you know precisely what you did and why. Ultimately, the goal is for the car to run smoothly and for the customer to be satisfied.

When it comes to responsibility for specific parts of the code, a related question is who should conduct the code reviews. There is no one-size-fits-all answer. This decision may depend on the specifics of the project and team or may even arise naturally through informal communication among programmers. One option is to assign code reviews to one or a few highly experienced individuals holding senior positions. However, there is a risk that such an authoritative figure might harm the project by imposing rigid or outdated views, much like the aforementioned "Big Ben." Another option is to have each module, file, or directory assigned a "owner" — the person who wrote the code, knows it best, and is responsible for maintaining it.

Paradoxically, involving various team members, including "junior" developers, in code reviews can also be a solution. These less experienced developers, even if they are not yet familiar with the project or have limited programming experience, can bring a fresh perspective to the review process. Their questions can spark interesting discussions, and they will undoubtedly learn a great deal from conducting these reviews.

How to conduct a code review?

With the general approach discussed, let’s move on to specifics. How do we go about reviewing someone else's code? Where do we start? This article doesn't focus on specific tools or programming languages but assumes that we have a convenient tool that allows us to view code and clearly highlights the changes made — added, modified, and deleted lines.

Code changes can be reviewed from various perspectives — either at a more general level or focusing on details. This is illustrated in Figure 1. At a high level, we look at which files, classes, and possibly entire directories and modules have been added. We try to understand their architecture — the logical significance of each element and the relationships between them. This is akin to learning any new code that we need to work with. The only difference here is that in a code review, we have a clear distinction between functions or classes that are new (added by the author of the code we are reviewing) and those that remain unchanged. The latter may not even be visible in the view comparing just the changes, but sometimes we need to refer to them to get a complete picture of the overall architecture.


Figure 1. Different levels at which code can be reviewed

When we feel that the entire picture is becoming overwhelming, it can be helpful to draw a class diagram on a piece of paper, in UML notation or any other method, possibly one of our own design. This helps visualize and understand the relationships between classes — inheritance, composition, and others. Another approach to grasping the overall structure of the code is to compile and run it in a debugger, stepping through the code to trace the control flow and understand which functions call which others, under what conditions, and so forth.

On the other hand, code changes can also be reviewed from a detailed perspective, meaning examining the code line by line. If the review involves only a few lines of code, we will obviously focus on this level only. As we read each line, we can mentally check several things. Is the arithmetic or logical expression correct? Is the pointer or reference being accessed here guaranteed never to be null, preventing a program crash? Is the variable named according to the coding standards, and does its name reflect its purpose?

Both aspects are important and complement each other. Comments provided in a code review can address both the overall structure of the code and specific instructions. Focusing solely on the overall structure might cause us to miss many errors in specific instructions, while examining the code line by line could result in overlooking the overall organization of functions and classes. However, it's challenging to simultaneously consider both aspects.

Therefore, it's beneficial to approach the review of a more extensive code change in multiple "passes." Start by quickly and superficially reviewing the entire codebase — "scrolling through" to get an idea of how much code there is to examine, which parts of the project are affected, and trying to identify key classes and functions that have been added, modified, or removed. This initial pass helps determine which parts of the code would benefit from a more detailed examination first, rather than reviewing it from top to bottom.

Understanding new code builds gradually. Even if we are familiar with a project and its existing code, we need to remember that we don't know the new code or the author's intentions from the outset of the review. Therefore, it's wise not to rush into making comments. What might initially seem like an error could turn out to be correct and intentional once we have a clearer understanding of the overall proposed change. It's beneficial to jot down ideas for comments separately and incorporate them into the official review only after thoroughly reviewing and understanding the entire code. In the meantime, many of these initial thoughts might change or turn out to be incorrect.

The number of lines of code is not a measure of project quality or programmer productivity. Similarly, the number of comments in a review should not be an end goal or a metric for evaluating anything or anyone. One change in the code might seem perfect to the reviewer, meeting all quality criteria without requiring any corrections, while another might raise concerns in many areas. Humorously, the quality of code is sometimes measured by the number of "WTFs per minute" heard during its review. However, this number depends on both the code author and the reviewer.

There is also a saying: “Give a programmer a review of 10 lines, and they’ll provide 10 comments. Give them a review of 1000 lines, and they’ll just write LGTM (Looks Good To Me).” This second scenario could indicate that the reviewer did not thoroughly analyze the entire new code. However, it’s worth noting that reviewing small changes is generally easier and more effective. Therefore, it’s better to break down large tasks and substantial new code into smaller, more manageable chunks. This can be done by working on a separate branch and merging the changes into the main branch only when the entire functionality is complete.

What to pay attention to?

When reviewing code, various types of feedback may come to mind that we might want to share with the author. It’s important to recognize that not all feedback is equally significant or worth arguing over if the other party disagrees and has a different opinion. One way to categorize feedback is into objective issues (e.g., logical errors that could lead to program crashes) and subjective issues, which might be a matter of personal preference (e.g., the best way to name a variable). This situation is illustrated in Figure 2.


Figure 2. Types of possible code feedback sorted by importance

However, this categorization is not absolute, and we should never be 100% certain of our perspective. What might seem to us as an obvious error might not actually be one if the author has ensured that a certain pointer or reference will never be null at that point due to another part of the code we overlooked. On the other hand, the naming of a variable can be important if the proposed name is unclear or misleading.

In general, it’s advisable to give the code author some leeway and not impose rigid rules on the overall structure of the code or the naming of its elements according to our preferences. Let them "express themselves." After all, they are the code author, and we are merely the reviewer. It’s useful to have a written "coding standard" in the project and possibly use tools for automatic formatting, like Clang Format. This way, fewer minor details are left to subjective interpretation.

There will always be questions for which there is no objectively best answer, such as: "Is this function name the best possible here?" It’s worth asking a meta-question: "Is this important?" Comments on less critical and more subjective matters can certainly be made, but only the objective and significant issues (like code errors) should be debated and insisted upon for change.

One example of aspects to consider when analyzing code from a high level is the division into classes, their semantic meaning, and the relationships between them. It's important to think about whether the proposed architecture is the best possible for the given use case or if it’s perhaps overly complicated. Some programmers, fascinated by the classic book from "Gang of Four," try to apply design patterns everywhere to make their code as general and future-proof as possible. They often overlook that future changes to the code might be unpredictable, and the more complicated the design, the harder it will be to modify it later.

At an intermediate level of detail, we can analyze the performance of the proposed algorithm. Performance issues are sometimes dismissed with the quote popularized by Donald Knuth: “premature optimization is the root of all evil.” However, this quote is one of the most misused and misunderstood in computer science and is often taken out of context. Knuth was referring to the improvement of minor inefficiencies and micro-optimizations that could make code less readable and harder to maintain, such as rewriting code in assembly.

It is true that when optimizing performance, it’s important to first use a profiler to measure performance and identify which parts of the code to focus on — those that are bottlenecks in the system. Nonetheless, when designing algorithms and data structures, it's valuable to keep good performance practices in mind, as well as considerations for security. These aspects should also be evaluated during code reviews. After all, an algorithm with poorer computational complexity, such as O(n²), may work well with test data, but as data size increases, it may become so inefficient that users might report the program as hanging (though for small data fitting in cache, such an algorithm might even be faster).

Focusing on the lowest level of detail, we can evaluate the appropriateness of variable and function names, as well as comments in the code. Many argue that comments should not be necessary at all if the code is inherently readable. While I believe this is too extreme, good comments and readable code are indeed closely linked.

Listing 1 provides examples of what can be considered bad comments and how they can be improved. The programming language used here is C++. While the choice of these specific examples may be subject to discussion, it’s generally accurate to say that bad comments are those that merely replace readable code. It is problematic to repeat in a comment what the code does if it is already clear. It is also unhelpful to explain the purpose of a variable or function in a comment rather than giving it a meaningful name. Finally, using ASCII Art-style comments to divide a function into sections instead of refactoring the code into separate functions is also a poor practice.

Listing 1. Examples of poor comments and their improved versions

return buffer_size; // Returns buffer size.
-->
return buffer_size;

// 7% is the percent of users to be selected for discount.
// Formula calculates number of users to receive discount.
AssignDiscount( GetTotalUserCount() * 7 / 100 );
-->
constexpr int percent_users_with_discount = 7;
int users_with_discount = GetTotalUserCount() * 7 / 100;
AssignDiscount( users_with_discount );

void RunProgram()
{
    ///////////////////////////////////////////////////////
    // Initialization
    …
    ///////////////////////////////////////////////////////
    // Calculations
    …
    ///////////////////////////////////////////////////////
    // Writing the results
    …
}
-->
void RunProgram()
{
    Initialize();
    Calculate();
    WriteResults();
}

However, comments can be useful for explaining aspects that are difficult to express through code alone. A good comment might explain an obscure algorithm, what it does, what it's called, or where it originated. Similarly, comments can specify assumptions about variables or function parameters, such as whether a pointer can be null and what that implies (e.g., if a parameter is optional), special values (e.g., a returned index of -1 means an element was not found), the range of numeric values (e.g., it should be between 0 and 100), or the units in which a value is expressed (e.g., meters, milliseconds, percentages).

A separate category includes comments documenting the interface of a function, class, or entire library. Here, it’s valuable to describe each function and its parameters, as well as each structure and its fields, so users can understand how to use these elements without delving into their internal implementation. These comments should also be reviewed and treated as part of the documentation, alongside Word documents or Confluence pages that provide a broader overview of the code. In fact, comments and documentation can be the same when using tools like Doxygen or Sphinx, which generate documentation from comments written in a specific format.

The aspects of code to consider during a review mentioned above are just a few examples. It's not possible to cover them all, especially without focusing on a specific programming language. Therefore, doing a good code review is a skill best learned through practice.

How to write comments?

Finally, let's consider the best way to craft comments during a code review. As previously mentioned, politeness and professionalism are crucial. While this might seem obvious, it can be challenging. Comments in reviews are an asynchronous form of communication similar to comments on articles, social media posts, and other online discussions, where even well-meaning and serious individuals sometimes resort to unprofessional behavior.

When writing comments in a review, it's essential to control your emotions. A good rule of thumb is not to write anything you wouldn’t say face-to-face. It’s helpful to read your comment again before sending it. This helps catch typos and other text errors and gives you a moment to reflect on its content. Additionally, in a professional context, such as code reviews as part of your job, maintaining professionalism in behavior and communication is key. Jokes and off-topic discussions might be better suited for company channels on Discord or Slack.

We’ve discussed how, when writing code, it’s important to detach your ego and not take feedback personally. This principle also applies in reverse: when providing comments on someone else’s code, focus on the code itself rather than the author. This situation is illustrated in Figure 3. “You’re a bad programmer because you wrote this code this way” is an example of a very poor comment. “This code is not written in the best way” sounds better. Even better would be: “In my opinion, this code could be improved because (…) Wouldn’t it be better to write it like this: (…)?” This latter version incorporates several best practices, which we’ll discuss in detail.


Figure 3. Focus on the code, not the author

First and foremost, humility is crucial when doing code reviews. We should never be 100% certain of our judgments but should always consider that we might be wrong. Therefore, it’s helpful to start with phrases like: “In my opinion…,” “According to me…,” or “It seems to me that…” Ultimately, we are expressing our views — especially when dealing with subjective matters (e.g., “I think it would be better to place this function in class X”) or even when something seems like an obvious error (e.g., “According to my knowledge, this function won’t work correctly with such parameters”).

Secondly, the statement “because (…)” is where we provide justification. It’s valuable to support your opinion with arguments. If you see an error, provide examples of input data or variable values for which the result will be incorrect. If you believe a function call doesn’t comply with the library’s specification, include a link to the documentation that explains its correct usage. For C or C++ and its standard library, this might be a specific page from cppreference.com. If the code seems inconsistent with generally accepted programming best practices, it might be worth including a link to a relevant webpage found online to support your claim that a particular language construct is discouraged or to show how it should be used.

Thirdly, a good comment should include a specific suggestion for change: “Wouldn’t it be better to write it like this: (…)?” It’s beneficial to be positive and not only point out the problem but also offer possible solutions or alternatives. Instead of simply saying, “Variable names i and j are unclear,” you might also add, “Perhaps it would be better to name them group_index and element_index?”

Finally, the suggestion “Wouldn’t it be better to write it like this: (…)?” is intentionally phrased as a question. This form is not only more polite but also encourages a response and further discussion. If it turns out that we were mistaken, missed something, the code is indeed correct, or for other reasons, the author disagrees and does not want to make changes, it will be natural for them to respond to such a question, e.g., “No, because in my code (…) and therefore I believe this is the better implementation.”

Returning to the role of the code author who responds to review comments, it is important to clearly signal whether we agree with the reviewer and will implement or have already implemented the requested changes, or if we disagree and wish to debate the points (providing supporting arguments for our stance). Depending on team customs, this could be a more formal statement like “OK, I agree, I will fix this,” or even just an emoji like “thumb up” 👍. It’s also nice to thank the reviewer for their comments.

Summary

To answer the question of whether code reviews are worthwhile, the short and general answer is: it depends. However, if, in a given project, we have decided — or someone else has decided — that code reviews should be part of the process, it is important to perform them diligently and adhere to best practices as described in this article. In this way, the review will benefit the project rather than merely consuming valuable time.

When conducting code reviews, we should also ensure that this process does not become a source of conflict between people but, on the contrary, serves to unite them in a common cause. Code reviews can be viewed as an additional channel of communication among team members, where the common focus on the code serves as a pretext for collaboration.

Adam Sawicki
February 2024

This article has been translated from Polish with help of the free version of ChatGPT.

Comments

[Download] [Dropbox] [pub] [Mirror] [Privacy policy]
Copyright © 2004-2024