Review Before Check In
Unreviewed code makes its way into releasable code, which causes a higher incidence of bugs and more effort in generating releases.
- You DoInspections according to a FormalReviewProcess
- Developers are allowed to check in code "at will"
- You are using NamedStableBases?
- Moving version labels requires work
- You want to prevent version control "thrash"
Require that no code can be checked in to the baseline version control system without having gone through the code review process.
Note that you can use the IndependentDevelopment?
pattern to manage versions of code prior to checking in to the main code base. This helps prevent "trash" (multiple check-outs/check-ins for code review fixes).
If you are using NamedStableBases?
, you might want to check code in before a review, but not label it for a release. This solves the problem, but does not prevent "thrash", plus requires someone to make sure the file is labeled correctly later.
All code in the version control system is "guaranteed" to have been through the FormalReviewProcess
, providing comfort in its quality level. Checkins may be delayed for days until the reviewers can hold the review, possibly resulting in schedule delays, slow feedback, and additional checkin conflicts.
This is one of the CodeReviewPatterns
Actually, I've been using my own version of this for about 6 years now, first
with the editor CodeWright
on an MS-Windows platform and the version control
system MS SourceSafe
, currently using XEmacs on HP-UX and the version control
system IBM's CMVC. What I do as part of my personal development process is that
whenever I finish making changes to a file, I first proof the changes by doing
a side-by-side visual differencing of the file with my changes versus the most
recent version of the file under version control before I check in my changes.
I have found that doing this mini-code review causes me to think through the
changes I have made more thoroughly, and in at least a couple of cases has
caused me to realize that I didn't do the correct change or I missed something,
then I will rework to fix the problem. The main reason why I do this is to
catch silly errors like the accidental insertion of garbage characters (which
the compile will usually catch) or the accidental deletion of lines or words
(which scares me to death, because generally the compiler won't catch these
kinds of errors, and then you've introduced some unknown bug which is now
lurking in the code waiting to be discovered at the most inopportune time).
-- Ed Remmell
Would a comprehensive suite of UnitTests help with some of these problems?
Pretty much all the projects I've worked on use CVS for revision control. I religiously practice several forms of review before checking in a file, via this commit procedure.
- If I've not done a 'cvs update' recently (in the last half hour or so), I do one. (Not really review, but this helps ensure that you don't collide with other developers.)
- Make sure all changed files compile. (Seems obvious, but you'd be surprised how many times I've seen commits that had no hope of compiling.)
- If I'm changing public interfaces, recompile everything that could possibly be affected, preferably the entire system (or as much of it as I can manage). This may not be practical on really large projects, but you'd be surprised at how quickly you can build on modern computers if you've got a reasonable build system. (Blink.com is currently about 160K lines of Java, in nearly 700 files, and builds in less than 40 seconds on a 600 MHz Pentium III, giving me the good fortune to be able to rebuild our entire code base from scratch dozens of times per day.)
- I always generate and review a list of which files I'm going to commit, and which other files in my checkout are changed but are not being committed. (The generation of the list of changed files is actually done in by the 'cvs update' step above.) If a file is changed and I don't know why it's changed, I go find out before doing anything else.
- I do a 'cvs diff' on the entire set of files I'm about to commit and read through my changes. (Well, I'll admit I skimp a bit if I'm committing the results of a massive RotoTill. But one shouldn't be doing that sort of thing all that often.)
- I commit. But that's often not the end of it. For Blink, I wrote a little perl script I call MayorGiuliani? which does pre-commit checking, and won't allow the commit if the file fails certain style and other checks. If I get stomped on by the jackboot, I have to fix the problem before I commit. (And yes, I am more than a little anal retentive.)
As a side note, one of the glories of our build system is that if even one file doesn't compile, you won't be able to build any part of the system. (This is accidental, actually.) So developers face a lot of peer pressure not to commit broken things. We also have a `Broke the Build' jar into which offenders place $2 every time they break the build and another developer notices.
All of this is really just promoting the idea that when you put something into the revision control system, you're done with it, and when you're done with something it should be 'production quality'. In most circumstances, you don't commit stuff that you're not ready to release. This is particuarly true when you're modifying code that's already running in a production release (as opposed to adding new functionality that hasn't been through the testing cycle yet).
The above are good practices, but they're not review as generally understood: having other people go over the code. For what it's worth, ExtremeProgramming would lead you to do some of these things at least as often, plus running a suite of UnitTests for the whole system, before checking in - but would
not require a separate "review", since all your code was reviewed as it was created by you and a co-worker PairProgramming.
wrt reading through the diff before committing, I found after doing this a few times that I benefitted from automating it. Now I have a couple of scripts called diffprep and diffcommit. diffprep generates a diff and prepends an octothorpe (#) to each line. Then read thorugh the diff, inserting (what will be) my commit log messages into it as I go - without
leading # characters.
diffcommit is an ugly perl one-liner which hoiks my comments out of the diff, figures out which files they apply to, and generates 'cvs commit' commands to safely store those files away, using those commit messages. Most of the development time was spent battling the StupidShellQuotingRules?
. -- DanBarlow
The big problem with code review before checkin, is that the developer has already done their unit tests:
To the best of their knowledge, there are no bugs in the code.
So they have little motivation to change it just to improve some abstract concept, like some other person's idea of "quality."
Making further changes means they'll have to do the manual testing half of the work again, because they might introduce bugs, and this ain't no fun.
Summary of recommondations I've heard is that you really have to do code reviews before the coder runs the program the first time.
People argue about if you should do it before or after the first clean comple.
I'd say in the case above, the problem is that the unit tests are too hard
to run. When I'm working on classes that have unit tests (sad to say, few
in our current source tree do, though that situation is improving), I compile
and unit test in one statement:
* bin/utest -m com.blink.util.unit_tests.EmailAddress
The -m flag rebuilds any changed files before running the given unit test.
Yes, in most development environments unit tests are too hard to run -- because they have no automated unit tests at all: Every test is a one-off ad-hoc run that is dreamed up, run and checked by hand each and every time.
I use JakartaAnt
to compile my code and every now and thenI also run a task called CheckStyle
which is downloadable at (http://jakarta.apache.org/ant/external.html
). This enforces me to always follow the CodingConventions
as I go along and the code looks good mostly through the day. -- PeterAxelsson
The biggest problem I see with ReviewBeforeCheckin
is that it encourages large changes. Having a review before check in does not make sense if you do small changes, particularly iterative changes. I am trying to get my developers to do daily check ins, and eventually get to multiple check ins per day. We can't do this and have reviews, too. -- WayneMack
Small changes are no less subject to being bad so they need reviewing. It is the small quick change that can often have the worst results. If the review process is reasonable then it should be able to be done quickly. If it is heavy then...dump it.
Small changes are less subject to being bad and big changes are more subject to being bad. That is the reason I am encouraging small changes. Problems are not random occurrences and complexity is a significant factor. How can you possibly have your developers turn in code every 4 hours and have reviews? The developers would spend all of their time in review meetings. -- WayneMack
Then we have a different experience base. I've found small changes just as likely to be bad and more likely to be overlooked. It just takes one small mistake to cause corruption, memory leaks, bad threading behaviour, etc. The problems can take many resources later to find and fix. A large chang is just a lot of small changes aggregated. There is no reason at all to suspect they have different inherent quality. As for the review meeting i ask a few people to
come over to my cube and we walk through the changes and the code. I email the people when I'm about ready and we setup a time. Usually bugs are not found because I unit test heavily, but often enough there are things i didn't consider that need to be addressed. If I can't find anyone, I wait. It's far more important to have quality than it is to have frequent check-ins. And it's not like we are talking about waiting a week between checkings, we are just talking about waiting a bit longer. This doesn't seem a lot to ask.
Huh? Defect injection rates are proportional to lines of code written. A big checkin has more lines of code, ergo more defects (on average).
Furthermore, with more frequent checkins, it makes it easier to locate a bug that didn't occur in prior versions -- you can binary search the projects version history to find the checkin that introduced the bug. Since it's a small checkin (since you're doing them so frequently), you have less code you have to read to figure out what went wrong.
This whole pattern is suspect. In the Real World, I was recently instructed to check my code in before
code review. And, given the advantages of checking code in more frequently (easier to track changes, earlier exposure of buggy code), I can't see an argument for delaying checkins.
The argument that frequent checkins makes it easier to locate bugs assumes the bug is found immediately following a checkin. If a unit test failed to find the bug right away, then you may go a few revisions before finding it and as a result won't know in which revision the bug was introduced.
I have recently left a project with a large, close-coupled, undocumented codebase, with very little unit testing; having had this negative experience in a CVS system, I can definitely recommend:
- 1. code reviews should precede checkin: catch bugs, bad style, or plain wrong functionality or design before you commit them, so that you do not have to do another change to correct them. Keep the change to a single commit and you simplify the commit history.
- 2. unit test as you code, review the unit tests before the code review (to check the funtionality of the interface), review, retest before a commit. The code on the main development branch should always pass all the reviews, unit tests, and functional tests that you have.
- 3. frequently review and commit on change branches to keep the main branch clean and the change good.