Schedule Questions: Pair Programming and the PNR Curve
February 21, 2007
Converting effort estimates into project durations and team sizes is an important part of project planning. How this is done varies from project manager to project manager: to some it is an art, to others a science, and to many a case of living with everyday constraints. Today I will focus on the science and its implications for pair programming.
If your initial project estimates indicate 72 person months of effort, how do you best resource it?
1 person for 72 months?
72 people for 1 month?
6 people for 12 months?
Intuitively we know that 1 person for 72 months might work (provided they had all the right skills), but typically the business wants the benefits of a project as soon as possible. 72 people for 1 month is extremely unlikely to work, unless the project is simple and massively parallel, like cleaning oil off rocks on a beach. Usually, adding more resources beyond an optimal level provides diminishing rates of return. As the old saying goes, “You cannot make a baby in one month with nine women (although it might be fun to try).” Also, as Frederick Brooks stated in The Mythical Man-Month, “Adding manpower to a late software project makes it later.” So, given an effort estimate, how do we determine the optimal team size and schedule to deliver quickly and keep costs down?
While adding team members to a project increases costs in a linear fashion, the project timeline is not reduced in a corresponding linear way. Research by Peter Norden found that for projects that require communication and learning (like software projects), the effort-to-time curve follows a Rayleigh distribution. Lawrence Putnam confirmed that this curve applied to software projects in his article “A General Empirical Solution to the Macro Software Sizing and Estimating Problem”, IEEE Transactions on Software Engineering, July 1978, and the curve became known as the Putnam Norden Rayleigh curve, or PNR staffing curve, as shown below.
The curved blue line is the PNR staffing curve; the X axis shows time and the Y axis shows cost. As we move to the left (shortening the project timeline), the Adjusted Staff Months curve gets steeper and steeper, indicating sharply increasing costs in exchange for very little shortening of the timeline.
There are some important points on the curve to understand. The Optimal delivery time (shown as the lowest cost point of the curve, “To”) indicates the lowest cost at which the project could be delivered. However, it does not factor in delayed return on investment, inflation, or work in progress costs. Most companies are looking for the best compromise between low cost and short timelines, as indicated by the Nominal delivery time “Tn”.
Barry Boehm did an interesting study into attempts to shorten project schedules; he examined over 700 projects that attempted to deliver code in less time than the Nominal delivery point Tn. None of the projects were successful in reducing the schedule below 75% of the Nominal delivery point Tn, and he christened this area the “Impossible Region” to indicate that you cannot compress schedules beyond this point. (See Barry Boehm, “Software Engineering Economics”, Prentice Hall, 1981.)
As you approach the Impossible Region the line becomes vertical: adding resources increases the project spend, but does not shorten the timeframe. Following Fred Brooks’ observation, I guess the line starts to curve back, so adding more people not only adds cost, but also makes delivery take longer. I can see how this would be the case, as decisions and communications become more complex and the extra people actually slow things down.
So, how do you determine the best compromise between short timeline and low cost (Tn)? Barry Boehm provides the following formula in the COCOMO estimation engine:
Tn = F * Effort ^ 0.33
(The cube root of the effort, multiplied by a scaling factor; 0.33 is shorthand for 1/3)
Where F is a factor that varies by project type:
Project type             F
COCOMO II Default        3.67
Web Development          3.10
E-Commerce Dev.          3.20
Military Development     3.80
Embedded Dev.            4.00
So if our example 72 person months were a web application, this would be:
Tn = 3.1 * 72^(1/3)
= 3.1 * 4.16
= 12.9 months
This indicates that a 72 person month web development project would be optimally implemented in just under 13 months by a team of 5 to 6 people (72 / 12.9 is roughly 5.6). The actual formula used in the COCOMO model gets refined by various project and environmental scaling factors, but at its heart is this scheduling formula.
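For anyone who wants to play with the numbers, here is a minimal Python sketch of the simplified formula above. The function and dictionary names are my own for illustration, not part of any COCOMO tool. It assumes an exact cube root (which is what reproduces the 12.9 month figure), treats effort divided by Tn as an average team size, and uses the 75% figure from Boehm's study as the edge of the Impossible Region.

```python
# Scaling factors F from the table above (names are illustrative only).
SCALING_FACTORS = {
    "cocomo_ii_default": 3.67,
    "web": 3.10,
    "e_commerce": 3.20,
    "military": 3.80,
    "embedded": 4.00,
}

# Assumption: the Impossible Region starts at 75% of Tn, per Boehm's study.
IMPOSSIBLE_FRACTION = 0.75


def nominal_schedule(effort_pm: float, factor: float) -> float:
    """Nominal delivery time Tn in months: F times the cube root of effort."""
    return factor * effort_pm ** (1.0 / 3.0)


def plan(effort_pm: float, project_type: str) -> dict:
    """Return Tn, the implied average team size, and the 75% schedule floor."""
    tn = nominal_schedule(effort_pm, SCALING_FACTORS[project_type])
    return {
        "tn_months": tn,
        # Assumption: a flat average head count, effort spread evenly over Tn.
        "avg_team_size": effort_pm / tn,
        "impossible_below_months": IMPOSSIBLE_FRACTION * tn,
    }


if __name__ == "__main__":
    for name in SCALING_FACTORS:
        result = plan(72, name)
        print(
            f"{name:20s} Tn={result['tn_months']:.1f} months, "
            f"team={result['avg_team_size']:.1f}, "
            f"impossible below {result['impossible_below_months']:.1f} months"
        )
```

For the 72 person month web example, this gives a Tn of about 12.9 months, an average team of about 5.6 people, and a schedule floor a little under 10 months. It deliberately ignores the additional COCOMO scaling factors mentioned above.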
The Agile Angle
These observations and calculations were derived from experimentation and from analysis of statistics from thousands of real projects in the 1970s and 1980s, but are they still applicable today? The COCOMO model has been updated to COCOMO II to address modern developments like component based development, COTS, and reuse, but I wonder if project dynamics are now fundamentally different?
While Craig Larman traces agile practices back to the 1950s in his excellent book “Agile and Iterative Development: A Manager’s Guide”, I think they have only really been adopted by the masses in the last few years. Consider the following changes:
• A switch from single pass waterfall approaches to rapid, iterative development
• Increased business involvement throughout the project resulting in less rework
• Better communications via pair programming, co-location, daily Scrums, etc.
• Continuous integration tools, automated unit testing, IDEs, refactoring tools, etc.
Our processes and tools are so good now that it is tempting to think we can beat yesterday’s productivity figures and break any staffing curve models from bygone days. However, I remind you that software development is more about communication and decision making than processes and tools. In my post “Don’t (Just) Drink The Kool-Aid” we saw how people factors impact project performance ten times more significantly than process or tool factors. So if PNR curves are a good method of predicting cost/time ratios for workers who need to learn and communicate, I suspect they are still applicable. It is easy to get caught up in the hype of process and technology development, but communication is still the name of the game.
So should we still use the PNR curve formula to predict project sizes? I would suggest we do, as communications are still the main driver, but offer that perhaps the scaling factor (F) might need adjustment due to improved tools. Perhaps web development projects should now drop from F=3.1 to F=3.0, but this is just my conjecture; I have no hard data to support it (perhaps the folks at IFPUG do).
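To get a feel for how much a change in F would matter, here is a small Python sketch that compares two scaling factors for the same effort. The function names are hypothetical, and the F=3.0 value is only the conjecture above, not a calibrated figure. It assumes the simplified formula with an exact cube root.

```python
def nominal_schedule(effort_pm: float, factor: float) -> float:
    """Nominal delivery time Tn in months: F times the cube root of effort."""
    return factor * effort_pm ** (1.0 / 3.0)


def compare_factors(effort_pm: float, old_factor: float, new_factor: float) -> dict:
    """Compare Tn under two scaling factors for the same effort estimate."""
    old_tn = nominal_schedule(effort_pm, old_factor)
    new_tn = nominal_schedule(effort_pm, new_factor)
    return {
        "old_tn_months": old_tn,
        "new_tn_months": new_tn,
        "months_saved": old_tn - new_tn,
        # Tn is linear in F, so the relative saving does not depend on effort.
        "fraction_saved": (old_tn - new_tn) / old_tn,
    }


if __name__ == "__main__":
    # F=3.0 is the conjectured value from the text, not measured data.
    result = compare_factors(72, old_factor=3.1, new_factor=3.0)
    print(f"Tn at F=3.1: {result['old_tn_months']:.1f} months")
    print(f"Tn at F=3.0: {result['new_tn_months']:.1f} months")
    print(f"Saved: {result['months_saved']:.2f} months "
          f"({result['fraction_saved']:.1%})")
```

Because Tn scales linearly with F, dropping from 3.1 to 3.0 shortens the nominal schedule by about 3% whatever the project size, which for the 72 person month example is a little under half a month.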
What about pair programming? When using pairs on a team we could either:
1) Continue as is, use the model to determine optimal team size and then encourage pairing to increase efficiency.
2) Treat a pair as one hyper-effective person, so count pairs not individuals and increase team sizes accordingly.
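One rough way to compare the two options above is to count communication channels. The Python sketch below uses the common n(n-1)/2 rule of thumb for pairwise channels. The rule and the function names are my own assumptions for illustration, not part of the PNR or COCOMO models.

```python
def channels(people: int) -> int:
    """Pairwise communication channels among `people` individuals."""
    return people * (people - 1) // 2


def pairing_options(model_team_size: int) -> dict:
    """Head count and channel count under the two pairing options.

    Option 1: the model's team size counts individuals, who then pair up.
    Option 2: the model's team size counts pairs, so head count doubles.
    """
    option_1_people = model_team_size
    option_2_people = 2 * model_team_size
    return {
        "option_1": {"people": option_1_people, "channels": channels(option_1_people)},
        "option_2": {"people": option_2_people, "channels": channels(option_2_people)},
    }


if __name__ == "__main__":
    for option, values in pairing_options(6).items():
        print(f"{option}: {values['people']} people, {values['channels']} channels")
```

For a model team size of 6, option 1 gives 6 people and 15 channels, while option 2 gives 12 people and 66 channels.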
This second option seems counterintuitive for agile: we strive for small teams to reduce communication channels, so I’m not convinced by this idea. I would be very interested to hear people’s thoughts on agile staffing curves. So, lots more questions than answers in this post; please let me know if you have any research or thoughts on the matter.