Gilb Measurability Principle

PeopleWare page 59 (second edition) provides the most helpful and concise version of this principle, which DeMarco and Lister call Gilb's Law:

Anything you need to quantify can be measured in some way that is superior to not measuring it at all.


TomGilb himself says (in Competitive Engineering, Chapter 4, Performance):

"Performance requirements must express quantitatively the stakeholders’ requirements. I have come to believe, through experience, that all the performance attributes we want to control in real systems are capable of being expressed measurably. I find it intolerable that critical performance ideas are expressed in mere non-quantified words. Expressions like “vastly increased productivity” annoy me! Not one of those three words has a precise and agreed, unambiguous interpretation. Yet, I have consistently encountered a world in multinational high-tech companies, amongst educated, intelligent, experienced people, where such vague expressions of performance, especially of quality, are tolerated; such expressions seem not even recognized as being dangerous and capable of improvement.

"Performance attributes are more than a collection of names like ‘reliability,’ ‘user friendliness’, ‘innovation’, ‘transaction time’ and ‘cost saving’. Each performance attribute needs to be precisely defined by a set of numeric, measurable, testable specifications. Each performance attribute specification will include different specified levels for different conditions [time, place and event]. Unless there is clear communication in terms of numeric requirements, there is every chance of the real requirements not being met; and we have no clear indication of the criteria for success and failure.

"Sometimes, it seems difficult to identify satisfactory scales of measure. Often, the only ones that can be found are indirect, imprecise, and have other problems associated with them. From my point of view, these problems can be tolerated. The specific scales of measure, and meters for measuring, can always be worked on, and improved over time. In all cases, an attempt at quantified specification is better than vague words."


See TwelveToughQuestions by KaiGilb.


See ExtremePlanning for discussion of an estimation technique which works fairly well in actual practice.


John, Gilb's 'metrics stuff' is obviously about a whole lot more than estimation as your quote shows - in fact he has little helpful to say about estimation except that it's easier to estimate small steps accurately - which is also by far the most important thing.

The measurable attributes point is much broader and should never become a BeancountersWetDream. We've found this emphasis in Gilb to be important, practical, scalable and helpful from the earliest, very chaotic stages of EvolutionaryDelivery if not used too rigidly as I say in TomGilb.

But I agree that there is a danger of breaking many of ED and XP's LowCeremonyMethod aspirations in immature application of both the metrics and inspection emphases of Gilb (though the man himself is a great antidote to this). See also MotherhoodStatement -- RichardDrake


Quote:

It sounds like a BeancountersWetDream, but in practice I can't even guess when the particular piece of code I am working on at the moment will be ready, and if I don't know, how will anyone be able to measure it?

I think this harkens back to what TomDeMarco said in ControllingSoftwareProjects. Everyone says, "I'm a lousy estimator." What you need to know is just how _bad_ an estimator you are, so you can adjust your estimates appropriately. To do this, you first need to start measuring your estimates versus reality. XP uses measures called LoadFactor and ProjectVelocity to do this. -- JohnBrewer
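As a rough illustration of "knowing how bad an estimator you are", here is a minimal Python sketch. It assumes you keep a history of (estimated, actual) effort pairs. The names calibration_factor and adjusted_estimate and the sample numbers are made up for illustration, and the ratio is only in the spirit of LoadFactor, not necessarily XP's exact definition.

```python
from statistics import mean


def calibration_factor(history):
    """Average ratio of actual to estimated effort over past tasks.

    history is a list of (estimated, actual) pairs in the same unit.
    A result of 2.5 means tasks have taken 2.5 times the estimate on average.
    """
    return mean(actual / estimated for estimated, actual in history)


def adjusted_estimate(raw_estimate, history):
    """Scale a fresh gut-feel estimate by the measured calibration factor."""
    return raw_estimate * calibration_factor(history)


# Hypothetical history: (estimated days, actual days)
history = [(2.0, 5.0), (4.0, 8.0), (1.0, 3.0)]

factor = calibration_factor(history)
new_estimate = adjusted_estimate(4.0, history)
print(f"factor={factor:.2f}, adjusted estimate={new_estimate:.1f} days")
```

The point is only that the correction comes from recorded measurements rather than from trying to guess better.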


I take it that Project Velocity is a measure of the speed and direction with which the project is going down the gurgler!!! PJB OZ

I have one problem with this principle. By Gilb, if I need to quantify attribute X0, there exists some way in which it can be measured which is superior to not measuring it at all.

If I understand this correctly, I may have to trade off accuracy of measurement against time (among other costs) expended performing the measurement. Now the accuracy of this measurement is an attribute, X1, of the system. In order to know whether I have hit on a way of measuring attribute X0 that is actually superior to not measuring at all, I must now, by Gilb, choose some way of measuring attribute X1.

As is now clear, by repeated application of Gilb I have a recursively infinite set of attributes which I need to measure. Unless there exists some way of measuring each successive attribute Xn such that the sum of the costs for the measurements of X1 to Xn does not grow without bound as n tends to infinity, the measurement process becomes infinitely expensive. Such a way of measurement cannot exist, because the costs would have to be vanishingly small for sufficiently large values of n.

(What kind of measurement can be performed in less than one picosecond?)
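The cost argument can be written out under one assumed cost model. The symbols c_n (cost of measuring Xn) and r (a constant shrink factor between successive measurements) are introduced here for illustration only and are not part of the original argument.

```latex
C_{\text{total}} \;=\; \sum_{n=1}^{\infty} c_n
\;=\; \sum_{n=1}^{\infty} c_1\, r^{\,n-1}
\;=\; \frac{c_1}{1-r},
\qquad 0 < r < 1 .
```

So a bounded total is mathematically possible, but only if c_n tends to zero, which is exactly the condition the picosecond remark calls physically implausible.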

An explanation of the flaw in the logic here would be greatly appreciated, because the principle is one that seems worth adopting. -- JoeChacko

You don't need to quantify the accuracy of the measured accuracy of your measurement. So you don't measure it. Read the quote: "Anything you need to quantify..." The fact that you could come up with a way to measure something does not mean that you must.

The point of the quote is that if you want to measure something, you can - so quit making excuses for why it "can't be done."


There seems to be some confusion here between measurement and estimation:
"... in practice I can't even guess when the particular piece of code I am working on at the moment will be ready, and if I don't know, how will anyone be able to measure it?"
A: Do you know when you started working on it? Do you know how many hours you've worked on it? When it is done, can you detect that it's done?

If at some point in the future you can determine that at point 'X' it was done, then you can measure how long it took to develop. (It's "X minus start_time" ;-) There; it's measured. Now, next time you work on something of similar size and/or complexity, you could reasonably guess that it will take about that long to do.

Measurements feed estimates. -- JeffGrigg
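The "X minus start_time" measurement can be sketched in a few lines of Python. The timestamps are hypothetical, and the sketch measures calendar time elapsed, not hours actually worked (that would need a log of work sessions).

```python
from datetime import datetime


def elapsed_hours(start_time, done_time):
    """Calendar hours between starting a task and detecting that it is done."""
    if done_time < start_time:
        raise ValueError("done_time must not be earlier than start_time")
    return (done_time - start_time).total_seconds() / 3600.0


# Hypothetical timestamps for one task
start_time = datetime(2005, 5, 2, 9, 0)
done_time = datetime(2005, 5, 4, 17, 0)

measured = elapsed_hours(start_time, done_time)
print(f"Task took {measured:.1f} calendar hours")
```

Once a few such numbers are on record, they become the starting guess for the next task of similar size.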

The sleight of hand here is, of course, that you have assumed a measure (implied by the terms SimilarSize and SimilarComplexity). But you give no hint as to its definition. -- HenryAndrew

These metrics can be defined very simply as "% size/complexity compared with project X". If you feel that your intuition is not accurate enough, you can define a more objective set of measurements for size and complexity. In most cases, though, I would suggest that your judgment "this task is slightly smaller than project X, say 80% the size" is sufficient because there are lots of other PeopleWare factors that affect development time. You're never going to achieve perfect accuracy. The idea of measuring things (even if the measurements are made intuitively) is that it gives you a record of the reasoning behind your estimations and decisions. Without this record it is much harder to improve. -- DavidPeterson
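A minimal sketch of the "% size compared with project X" idea follows, assuming you know how long the reference project actually took. The class name, field names and the 30-day figure are hypothetical. Only the 80% judgment comes from the text above.

```python
from dataclasses import dataclass


@dataclass
class RelativeEstimate:
    """One recorded judgment, kept so the reasoning can be reviewed later."""
    task: str
    reference: str                 # the project being compared against
    reference_actual_days: float   # measured duration of the reference
    relative_size: float           # intuitive judgment, e.g. 0.8 for "80% the size"
    rationale: str = ""

    def estimated_days(self):
        if self.relative_size <= 0:
            raise ValueError("relative_size must be positive")
        return self.reference_actual_days * self.relative_size


# Hypothetical record: project X is assumed to have taken 30 days
estimate = RelativeEstimate(
    task="new report module",
    reference="project X",
    reference_actual_days=30.0,
    relative_size=0.8,
    rationale="slightly smaller than project X",
)
print(f"{estimate.task}: about {estimate.estimated_days():.0f} days")
```

Keeping the rationale next to the number is the part that matters: when the task finishes, the record shows which judgment was off.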


CyclomaticComplexityMetric is a measure. Is it good?

Well, is it better than having nothing at all?

I think it would be more useful to ask whether it is more useful than subjective judgement of complexity. From what I can figure there are only two reasons why I'd want to figure out the complexity of a particular unit of software: Understandability and Testability. Given DoTheSimplestThingThatCouldPossiblyWork and TestFirstProgramming, I can't see that complexity measures are that important. -- JasonYip

And the other question is: Is it something you need to quantify? Gilb's principle may be right or wrong (my guess is that this just depends on how you interpret "need") but pointing to a bad metric doesn't prove anything. Maybe there are better ways to measure code complexity. Maybe it's not really code complexity as such that you need to measure, but something else like flexibility or clarity or what DickGabriel calls "habitability". Maybe there's no real benefit from measuring this kind of stuff at all and we should just rely on measured (aha!) ProjectVelocity. Lots of maybes. But the fact that there's one unappealing way to measure something to do with the complexity of code is obviously no evidence that there isn't a way to measure whatever kind of badness we really care about here that's better than not measuring. -- GarethMcCaughan


The good thing about well-defined metrics is that one can easily automate their measurement. So, having a large system (hundreds of thousands of lines), one can run a program to count the number of lines or methods in each class; then you can look at the classes that seem to have unusually many (or unusually few). Better yet - "how many bug fixes have required changes to this class?" Or, "how many times has this class been changed in the last year?"

Crude cheap measurements can give you a good idea of where to focus your attention. Then you can apply your knowledge and experience to determine what, if anything, should be done.

The fallacy of PointyHairedBosses is that they confuse the metrics with reality: Reality is complex; there are exceptions to every rule. -- JeffGrigg
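As a sketch of how cheaply such counting can be automated, the following Python uses the standard ast module to count methods per class and flag the unusually large ones. The "more than twice the median" rule and the sample classes are assumptions chosen for illustration. Classes that share a name would overwrite each other in this simple version.

```python
import ast
from statistics import median


def methods_per_class(source):
    """Map each class name in the source text to its number of direct methods."""
    counts = {}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.ClassDef):
            counts[node.name] = sum(
                isinstance(child, (ast.FunctionDef, ast.AsyncFunctionDef))
                for child in node.body
            )
    return counts


def unusually_large(counts, factor=2.0):
    """Names of classes with more than `factor` times the median method count."""
    if not counts:
        return []
    typical = median(counts.values())
    return sorted(name for name, n in counts.items() if n > factor * typical)


# Hypothetical code base, inlined as a string so the sketch is self-contained
SAMPLE = '''
class Small:
    def a(self): pass

class Medium:
    def a(self): pass
    def b(self): pass

class Big:
    def a(self): pass
    def b(self): pass
    def c(self): pass
    def d(self): pass
    def e(self): pass
    def f(self): pass
'''

counts = methods_per_class(SAMPLE)
print(counts)
print("Worth a look:", unusually_large(counts))
```

The output is only a list of places to look. Deciding whether a flagged class is actually a problem still takes judgment.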


Are there exceptions to the rule that there are exceptions to every rule? -- DanielEarwicker

Yes. StrangeLoops. That rule is its own exception. -- PaulHudson (off for a shave)


Had a StrangeLoops experience working as an academic writing Grid software. My performance metric is the number of peer-reviewed papers published. So I wrote a paper on the lunacy of this metric. (Unlike other academic artifacts, the software cannot readily be judged by reading it, only by using it.) My peers published the paper!


Saying that things could be measured better isn't that interesting. What is missing is the analysis of the expected costs and benefits of performing extra measurement. Typically the costs are: and the benefits are: I don't think that's sufficient grounds to exhort better measurement of everything.

You would prefer, perhaps, that things were not under control?

There are rather more benefits (and costs). See PerformanceIndicators


It's typical for us to see requirements that we must pass test X, which is often a load-related test. The test changes quite a bit. This sucks because you can't say "if I make my code do Y operations per second, then we are golden." Because the test is system-wide, everything has to work together, and your part is just one piece. If someone uses a mutex incorrectly, then it all fails.

To the system's stakeholders, it doesn't matter how many operations per second your code does if the system as a whole doesn't meet its performance targets for some other reason. In fact, YAGNI applies. You (or someone, anyway) should worry most (today) about the performance of those parts of the system that are most likely, in the short term, to affect the performance of the system as a whole.
CategoryMetrics

(last edited May 31, 2005)