I like short songs!
You can remember about 5-9 things without being a Vulcan (see SevenPlusOrMinusTwo):
- Your programming language (long term memory, doesn't count here)
- Your editor (long term memory, doesn't count here)
- Your current goal
- Say, a LoopInvariant or a MethodSignature
- Say, the names, meanings, and uses of two or three variable names
- What some bit of code appears to really be doing, and therefore why it does or doesn't accomplish its apparent goal.
For any given bit of program, that's probably more than 8 already, so basically any amount of reading code is a mentally strenuous task. The more code you have to read before you can form units of meaning larger than line-level constructs in your programming LanguageOfChoice, the harder your task.
Be lazy. Make it easy on yourself. Never write a method that would require hard thinking to analyze. Factor out units of semantic meaning, even if they are only ever used once. Define them reasonably close to each other until and unless you identify a clear opportunity for meaningful reuse. Use ShortMethods.
This does not prescribe any particular order of authorship. It simply states a desirable end result and a rationale. For the clash between TopDown, see the section on egg eating in GulliversTravels.
It's also more complex than the above, because one can push and pop clusters of short-term context on one's mental stack. (See SevenPlusOrMinusTwo.)
A hypothetical scenario:
It is 10 o'clock on the first day of coding for the new project. We have decided to try out this ExtremeProgramming.
I have taken on my first EngineeringTask. It happens to be a complex mathematical algorithm. I write UnitTests. I code a fifty-line method. I run the unit tests, and debug my method until it works. I remove internal duplication in my method. It is still forty lines long. I rerun the unit tests.
I compare what I have done to my novice understanding of XP. I have already done the simplest thing
- That makes my new functionality work
- That keeps existing functionality working (there isn't any)
- That satisfies OnceAndOnlyOnce (there isn't any external duplication yet)
But, before I release my code, I break it into ten methods, averaging four lines apiece.
Why do I do this? In other words, since the rule OnceAndOnlyOnce doesn't cause me to write short methods at first, what rule does?
- Breaking up methods makes them SelfDocumentingCode, without comments. Each time that you pull out a piece of the method, you have to create a name for it. Well-chosen names allow you to avoid comments.
- Small chunks are more easily understood than larger chunks. It is easier to reuse things that you understand well. The smaller your Legos, the more flexibly you can build.
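As a rough illustration of the first point, here is a minimal Python sketch. The invoice example and every name in it (invoice_total, subtotal_of, after_bulk_discount, with_tax) are hypothetical and not taken from this page; the only claim is that each commented step can become a named method.

```python
# Before: one function, with a comment marking each step.
def invoice_total_long(prices, tax_rate):
    # add up the line items
    subtotal = 0.0
    for price in prices:
        subtotal += price
    # apply a bulk discount on large orders
    if subtotal > 100.0:
        subtotal *= 0.9
    # add sales tax
    return subtotal * (1.0 + tax_rate)


# After: each commented step is pulled out and named, so the comments go away.
def invoice_total(prices, tax_rate):
    subtotal = subtotal_of(prices)
    discounted = after_bulk_discount(subtotal)
    return with_tax(discounted, tax_rate)


def subtotal_of(prices):
    return sum(prices)


def after_bulk_discount(amount):
    return amount * 0.9 if amount > 100.0 else amount


def with_tax(amount, tax_rate):
    return amount * (1.0 + tax_rate)
```

Both versions compute the same number. Whether the second reads better depends on how well the names are chosen, which is exactly the point under discussion.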
See also ToNeedComments
- If you took the 50-line method out of a textbook and wanted to make it look just like the source, it is probably the simplest thing. Renaming the methods to make them self-documenting isn't important if nobody is going to read the code. In this case, you might tell them to read the textbook.
- Overuse of short methods is a little like using goto: it obscures the flow of control by forcing the reader to jump around and look at a lot of little pieces of code in no obvious order. Keeping each method to 6-8 lines is not helpful if a reader has to remember the details of a few dozen methods in several classes to figure out how they all fit together.
We've all been through this: you look at A's method, and all it does is call B's method. B calls C's method, which calls D's method, and so on until you finally get to a method that actually does something. Then, you have to mentally unwind the stack to see what happens next. You wish that there was just one big definition somewhere that shows you all the important stuff in one place.
If you're browsing down the stack a->b->c->d->..., either you believe there's a problem with the definition of the called methods, or they're badly named. Unless I believe there's a bug in it, I'll tend to assume that (nth-fibonacci-number 10) pretty much DoesWhatItSaysOnTheTin, and am no more likely to go and look at its implementation than I am to start reading the standard library functions.
It's easy to get into the trap of logical step 1 calls logical step 2 calls logical step 3, etc. I did that quite often for years before I realized what was wrong with it and what else could be done. An example of where this trap tends to occur is in nested loops. To make the code more legible, we make a method out of the loop body, thinking that's the best refactoring path, but that results in the ugly 1 calls 2 calls 3... pattern. The better refactoring is usually to replace nested loops with consecutive set operations such that step 1 processes a vector, producing a new vector as input for step 2, etc.
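A minimal Python sketch of the difference, assuming a made-up task (collect the upper-cased long words from some lines of text); all function names here are hypothetical. The first version is the "1 calls 2 calls 3" shape that results from extracting loop bodies. The second runs consecutive steps, each taking a whole list and producing a new one.

```python
# Chained version: each extracted loop body calls the next one down.
def report_chained(lines):
    result = []
    for line in lines:
        process_line(line, result)
    return result


def process_line(line, result):
    for word in line.split():
        process_word(word, result)


def process_word(word, result):
    if len(word) > 3:
        result.append(word.upper())


# Consecutive version: every step maps one list to a new list.
def all_words(lines):
    return [word for line in lines for word in line.split()]


def long_words(words):
    return [word for word in words if len(word) > 3]


def shouted(words):
    return [word.upper() for word in words]


def report_pipeline(lines):
    words = all_words(lines)
    words = long_words(words)
    return shouted(words)
```

In the second version the top-level method lists the steps in order, and each step can be read and tested without following a call chain.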
I tend to find that the unit tests provide the "one big definition" that tells me what this method is supposed to do. If you test only public/protected methods which are composed of calls to private ShortMethods, then it's a matter of identifying how the private methods contribute to the result that is desired in the public/protected method. -- AdewaleOshineye
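One way to picture this, as a sketch only: the Slugger class below and its tests are hypothetical, written with Python's standard unittest module. The tests exercise only the public method, and the private short methods are read in light of what the tests say the result should be.

```python
import unittest


class Slugger:
    """Turns a title into a URL slug. Only slug() is meant to be public."""

    def slug(self, title):
        words = self._words(title)
        words = self._without_articles(words)
        return self._joined(words)

    def _words(self, title):
        return title.lower().split()

    def _without_articles(self, words):
        return [word for word in words if word not in {"a", "an", "the"}]

    def _joined(self, words):
        return "-".join(words)


class SluggerTest(unittest.TestCase):
    # The tests state the desired result of the public method in one place.
    def test_lowercases_and_hyphenates(self):
        self.assertEqual(Slugger().slug("Short Methods"), "short-methods")

    def test_drops_articles(self):
        self.assertEqual(Slugger().slug("The Rule of a Thumb"), "rule-of-thumb")
```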
Whenever I add a new feature to a program, I am likely to first write it as one or a few long methods, so that I can see it all at once and make changes in one place. Then, when it works, I refactor to move all the pieces where they belong (i.e., ConquerAndDivide). It is beneficial to move things to appropriate places, but I wonder about the "information loss" that comes from eliminating that initial cohesiveness.
I've come to the conclusion that using short methods is a trade-off. Short methods make programming go faster (with the same quality), but they also make it hard for people new to the project, or coming back after a break, to understand the code.
Once people are familiar (or re-familiar) with the code, the short methods allow them, too, to go faster.
I gladly make this trade-off, and just accept the fact that my code is harder to understand for new people (and me, after a vacation).
Short method understanding is also discussed in RavioliCode.
But this page is also asking what rule, or rules, one follows when breaking up the code. In other words, what rule tells you how short is short enough? It is not OnceAndOnlyOnce.
From above on this page:
The rationale? Breaking up methods makes them SelfDocumentingCode, without comments.
Never write a method that would require hard thinking to analyze. Factor out units of semantic meaning, even if they are only ever used once.
I suspect it comes down to ensuring that each method does exactly one thing and giving each method a good name.
The rule I use is SeparateTheWhatFromTheHow, one of my ThreeRules. It follows the XP principle that code should clearly communicate its intent.
argues for FewShortMethodsPerClass and against ManyShortMethodsPerClass.
When breaking a long method into several smaller methods, there is often state being carried from line to line in the large method. Splitting into smaller methods means that state must be transferred either via return results or via fields in the object. If there is only ever one piece of data being transferred between each of the smaller methods, then the return result suffices. When several pieces must be passed along, however, languages without multiple return values leave only the alternative of storing that state in fields, which may expose data to all methods that is relevant only to some.
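A small Python sketch of the two options, using a hypothetical averaging example (none of these class or method names come from this page). Python can return several values at once, here bundled in a NamedTuple, so the second class stands in for what a language with multiple return values allows; the first shows the field-based fallback.

```python
from typing import NamedTuple


class AverageViaFields:
    """Intermediate state lives in fields, visible to every method."""

    def average(self, numbers):
        self._tally(numbers)
        return self._mean()

    def _tally(self, numbers):
        # count and total stay on the object long after they are needed
        self.count = len(numbers)
        self.total = sum(numbers)

    def _mean(self):
        return self.total / self.count if self.count else 0.0


class Tally(NamedTuple):
    """Hypothetical value object carrying two results out of one method."""
    count: int
    total: float


class AverageViaReturn:
    """Intermediate state travels through return values and parameters."""

    def average(self, numbers):
        return self._mean(self._tally(numbers))

    def _tally(self, numbers):
        return Tally(count=len(numbers), total=sum(numbers))

    def _mean(self, tally):
        return tally.total / tally.count if tally.count else 0.0
```

In the first class any method can read or overwrite count and total. In the second, the state exists only between the two methods that need it.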
See also ContractiveDelegation