Suppose that a team instituted PairProgramming and UnitTests upstream from reviews. Might the problems found by reviews reduce so far in number and importance, that the reviews were no longer economic? -- RonJeffries
s upstream of reviews, would probably result in fewer errors to be identified during the review. But this does not imply the review is no longer cost-effective. If anything, most of the empirical evidence in industry would suggest the opposite conclusion from the above paragraph.
The amount of measurable evidence of the benefit of code reviews is so widespread and so overwhelming that several studies have suggested no other single "process improvement" activity has even come close to matching its beneficial effect on product quality. And the mountain of empirical evidence suggests that it is faster and more efficient at reducing bugs than testing in the long run
. Some zealots of FormalReview?
s even go so far as to say that, if done "right", functional testing (and its associated personnel and support costs) can be significantly decreased, or dispensed with altogether.
In fact, HarlanMills
' much ballyhooed CleanroomSoftwareDevelopment?
methodology does exactly this latter extreme, and certainly has its fair share of proponents singing its praises (all impartial of course ;-) :-)
that I personally am not saying that I like or dislike Cleanroom methods. Only that it's an example of the opposite extreme, which has its own experts to rationalize it with numbers.
Remember that it's not just a matter of whether you find more/less overall total errors as a result of the reviews, there's also the question of how much more quickly they can be traced back to the offending code (or whatever it was) and then fixed. This is what makes code reviews numbers look so impressive in the various empirical studies that compare productivity and cycle-time savings on the overall lifecycle in order to justify the time-hit early on for the review.
We all know that testing proves only the presence of errors; it does not prove their absence. Code reviews (actually, design reviews and requirements reviews too, even reviews of UnitTest
s ;-) are certainly no different in this respect. Some have suggested that "there is no error that can be detected by inspection that cannot be detected by a test", but in fact there are many classes of errors for which this isnt true (as is vice-versa). Functional testing tends to be better at validation ("build the right thing") to ensure that you are in fact satisfying the customer. Reviews tend to be better at verification ("build the thing right").
I fully agree that PairProgramming
can be seen as a sort of live, two-person ContinuousReview
. However, one of the primary purposes of a review is to have a so called "fresh pair of eyes" to look at the code from a fresh perspective (one less entrenched in, or attached to, the work itself than its author(s). So "fresh eyes" here means "not the programmer." While PairProgramming
uses two heads rather than one, it doesn't quite pass this criteria. CollectiveCodeOwnership
might, but does it happen before the code is released
or afterward? If it is before, then that's your fresh pair of eyes right there. If not, is there perhaps a fresh pair of eyes looking at the code when you RefactorMercilessly
? That would do it too. If you choose not to do it anyway, that's fine, but be aware that you are making this explicit trade-off.
Also remember that reviews serve multiple purposes, and different kinds of reviews emphasize these purposes in different priority. Inspections are more focused on error-detection, and communication/teaching takes a second-seat. Walkthroughs on the other hand emphasize communication/teaching of the knowledge embodied by the artifact, and hope that error-detection will come about as a side-effect of such understanding.
You say that "One of the biggest drawbacks to code reviews is that they take place after a relatively large amount of code has been produced." But I really don't see how this is a required element of reviews. It is solely dependent upon how much you choose to review at once, and is almost left entirely to your discretion. Furthermore, I would think that if you RefactorMercilessly
to say things OnceAndOnlyOnce
in the code, all the while using small classes with small methods, then I don't see why a review for a single object/class of code would necessarily be a "relatively large amount of code."
Are code reviews time consuming? You betcha!
And the more formal your reviews are, the more time consuming they tend to be. But who says they have to be formal? Just because formal inspections may be too
much of a time-consumer for you doesn't mean you can't do informal reviews rather than dispensing with reviews altogether. You don't need to get rid of reviewing altogether, just dispense with the high ceremony and conduct them more informally instead. That would still seem to be in alignment with XP IMHO.
For some references on reviews, take a peek at:
- Software Inspection : An Industry Best Practice; by David A. Wheeler (Editor), Bill Brykczynski, Reginald N., Jr Meeson; IEEE Computer Society Press, February 1996, ISBN 0818673400 ; URL: http://www.computer.org/cspress/catalog/bp07340.htm
- The WWW Formal Technical Review Archives at http://www2.ics.hawaii.edu/~johnson/FTR/ (particularly the link to the FTR library pages)
- Papers with "Review" and/or "Inspection" in their title by Michael Fagan (see http://www.mfagan.com/)
- Papers by Robert G. Ebenau as well as the book Software Inspection Process (ISBN 0070621667 )
- Daniel P. Freedman and Gerald M. Weinberg's Handbook of Walkthroughs, Inspections, and Technical Reviews (ISBN 0-932633-19-6 , see http://www.dorsethouse.com/books/hdbk.htm)
- TomGilb and Dorothy Graham's book Software Inspection, which is one of the more comprehensive ones (also see http://www.result-planning.com)
Aside from the CleanroomDevelopment?
, there are several "purists" who argue (with numbers and data no less) that if done "full-tilt", effective reviews can remove the need for UnitTest
ing (provided you do it "their way" of course). Fagan is one such purist. I think Ebenau & Strauss argue this as well. And Gilb & Graham have some stories of various shops that apparently reached this conclusion based on their own data+experience as well (you should be able to find pointers to some of these in the above resources).
However, I think most of us here would agree that eliminating testing is too extreme. Sommerville's "Software Engineering" text thoughtfully mentions the views of these "static verification purists" suggesting testing be eliminated altogether and then declares in the next sentence that "This is nonsense." But he isn't for the other extreme either (eliminating reviews altogther) and suggests it is equally ill-advised. Most would recommend a more balanced approach between these two extremes.
There is a plethora of evidence to suggest that reviews, even in the face of extensive FnctionalTest?
ing still give a tremendous bang for the buck. I would expect PairProgramming
to remove some of the errors that would have been found during reviews. I have no reason to suspect it would do this so drastically as to render reviews cost-ineffective. I would need some convincing empirical evidence for that, and it would have to be very convincing indeed, and repeated in many different scenarios before it could begin to counter the existing mountain of empirical and anecdotal support for reviews.
The main issue here is the time-consuming process of formal inspections right? Go ahead and throw out the ceremony, there is still something there of great value (even if you do PairProgramming
). Keep it as informal as you like; You still get lots of feedback and communication (as well as finding and fixing/resolving any defects/issues that arise). Don't throw them out because you don't like one particular method of conducting them; Rather, find a way of conducting them that works for the whole team, while still passing the "fresh pair of eyes" criteria. I have no doubt you could succeed at this if you wanted to. Reviews will be what you choose to make of them, but you have to fit the review to you instead of the other way around. If you sincerely want them to work, they can
(and if you don't, rest assured they won't).
I can't code without UnitTest
s, so I don't understand the comment about how you should have a review after you code but before you test. I would love to have an independent review of the UnitTest
s, however. If they all run, the code is cool. If not, there's more work to do. -- KentBeck
Are you saying your code+test feedback loop is too tight and fine-grained to make this practical? -- BDA
The code+feedback loop is much too tight and fine-grained. Working in Java, I might write a test in 20 seconds, compile in 20 and run an extensive test suite in 60. Then do it again. It's not always that fast, but it's pretty tight. (My guess is that Smalltalkers would find it excruciatingly slow! ;-) -- KielHodges
Not to suggest that PairedPairProgramming?
isn't better than PairProgramming
, but recently I had an experience of PairProgramming
in a shop that just does reviews; we did PairProgramming
instead of the review, it went faster, and got better results than the reviews. Our reviews were a combination inspection/walkthrough; we were supposed to give refactoring advice, although due to the near-total lack of EgolessProgramming
, that was rarely done (and rarely accepted well.) I found in the PairProgramming
segments, when I was riding, I found more bugs, found more places to refactor, than during the reviews. I also learned a lot more about the code. When I was driving, I also got much better feedback than normally.
As far as the UnitTest
s go, I'll agree that waiting until after review to run them would be painful. I've only occasionally had formalized UnitTest
s, but even doing a few by hand can work in a very tight cycle (code, test all the breakpoints obviously affected by that code). I'd rather not estimate an exact time, because that frequently happens in flow, where time has no meaning and a pause for review would be suicidal.
I can't write the code unless I'm running the tests. I'm not done until the tests run. I'm done when the tests run. That's why I can't imagine waiting for a review until after I've written the code and test separately. I don't write them separately. It's just not possible for me any more.
Now, given that style of working (CodeUnitTestFirst
) and PairProgramming
, what could reviews add? Another way of asking the question is, under what circumstances would I want to hold reviews, too? If I wasn't confident. If my defect rate was higher than we wanted. If the consequences of a defect was high enough.
I would also hold reviews of the test cases if the team needed to learn to write tests better. -- KentBeck
I've seen people hold reviews to guarantee policy conditions. For instance, if you want to avoid distributed deadlock, you have to refrain from doing this, this and this. -- MichaelFeathers
methodology that is sufficiently effective at reducing bugs makes it uneconomical to introduce an additional process. No known process will catch all bugs in a non-trivial, real-world system and the fewer the bugs that exist, the fewer that will be found. My impression is that C3 gets too few defects in production to make it worthwhile to add a new practice. -- KielHodges
I don't think there is a need for a third or fourth set of eyes looking at a piece of code after it has been written. If I thought it was necessary, I would have put it in. But I could be wrong. But only experience (yours, not mine) will confirm that. My experience so far is embedded in XP as specified so far. -- KentBeck
But you already HAVE a third or fourth set of eyes looking at a piece of code after it has been written! I bet that it is a rare method that is not refactored two or three times after the first pair gets through with it. Each time a method is refactored, it is reread. Refactoring makes people read code very carefully, and so ensures that it gets reviewed. -- RalphJohnson
I have removed all my comments from this page. For more on adding additional components such as reviews to XP, see ExtremeReviews
I have trimmed many of my comments and consolidated most of them into my initial comments at the beginning of this page -- BradAppleton
Even with pair programming that is only one other person. In a large project a change may impact numerous people and reviews are good way to communicate that changes are happening and what those changes are. Code ownership doesn't mean people will see these changes again for a long time. TDD are only unit tests. In a large system it is often the case that people do something that works in a unit test, but won't work in a system test. In a large system it is often the case that people don't do something that they need to do. It is often the case the people forget tests that they should have run but didn't. These things happen all the time and reviews are a good way of picking them up. If these issues can't or don't happen in your world then good for you, but there are many worlds in which they do happen.