Measuring Performance

I’ve been asked many times, by both peers and managers, a variation of the question, “Why does software cost so much?” This is an age-old question, as evidenced by Tom DeMarco’s essay of the same name, written in 1995.

This question is really a challenge (by the person posing it) that the organization’s software efforts are too expensive. The response - both earnest and flippant - is, “Compared to what?”

This then leads to a fairly simplistic analysis: can you obtain a similar software product or service at a lower cost? Smart organizations routinely do so, embracing third-party libraries, third-party services, and even whole applications - wherever doing so improves the customer experience.

In contrast, you should always develop the parts of your system that are your key market differentiators. Failing to do so means that your competitors can match your offering... just as easily as you acquired it.

Given this, it seems likely that - even while embracing integrations - you’ll continue to develop some portion of your offering in-house. And thus, the original question will still be asked.

Revisiting the Question

A less caustic and more useful variation is asking, “How do you know you’re building software as cost effectively as possible?”

To answer this, you need to know your cost compared to your output. But herein lies the trap: measuring software output is incredibly difficult because the act of measuring it introduces the wrong incentives - and consequently, the wrong outcomes.

This brings to mind the saying "They know the cost of everything, and the value of nothing."

The side effect of measuring lines of code, function points, or even story points is to incentivize producing more of them. Paradoxically, less is more in software: less code to manage means fewer defects and lower long-term maintenance costs.

Measuring (and thus rewarding) software production in isolation increases costs over time because it encourages the wrong activities: it emphasizes output over more useful considerations such as quality and customer satisfaction.

Reframing the Question

The difficulty in answering this question is why I'm so excited about the recently published book, Accelerate, by Dr. Nicole Forsgren, Jez Humble, and Gene Kim. In it, they reframe the question from one of productivity to one of performance. This is a breakthrough: it shifts from a question that produces unintended and unwanted consequences to one that creates positive change. The authors assert that four measures characterize high-performing software organizations:

  • Deployment Frequency

  • Lead Time

  • Mean Time to Restore

  • Change Fail Percentage

Deployment Frequency is how often software is deployed to production. This is also useful as a proxy for the batch size of the change delivered - with the understanding from Lean Production that smaller batch sizes are better.

Lead Time is the time it takes to go from a customer making a request to the request being satisfied. To reduce variability in the measurement, what's directly measured is the time it takes to go from code committed to code successfully running in production.

Mean Time to Restore is a measure of how quickly service can be restored after a service incident, such as an unplanned outage or impairment.

Change Fail Percentage is how often changes to Production fail; that is, how often a change requires remediation post-deployment.
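To make the four measures concrete, here is a minimal sketch of how they might be computed from deployment and incident records. The record format, field names, and data below are illustrative assumptions, not something prescribed by the book:

```python
from datetime import datetime, timedelta

# Hypothetical deployment records (illustrative data, not from the book).
# Each record: commit time, deploy time, and whether the change needed remediation.
deployments = [
    {"committed": datetime(2024, 5, 1, 9, 0),  "deployed": datetime(2024, 5, 1, 10, 30), "failed": False},
    {"committed": datetime(2024, 5, 1, 11, 0), "deployed": datetime(2024, 5, 1, 12, 0),  "failed": True},
    {"committed": datetime(2024, 5, 2, 9, 0),  "deployed": datetime(2024, 5, 2, 9, 45),  "failed": False},
    {"committed": datetime(2024, 5, 2, 14, 0), "deployed": datetime(2024, 5, 2, 15, 0),  "failed": False},
]

# Hypothetical incident records: (start, restored) pairs.
incidents = [
    (datetime(2024, 5, 1, 12, 0), datetime(2024, 5, 1, 12, 40)),
]

days_observed = 2

# Deployment Frequency: deploys per day over the observation window.
deployment_frequency = len(deployments) / days_observed

# Lead Time: time from code committed to code running in production.
lead_times = [d["deployed"] - d["committed"] for d in deployments]
mean_lead_time = sum(lead_times, timedelta()) / len(lead_times)

# Mean Time to Restore: average duration of service incidents.
restore_times = [restored - start for start, restored in incidents]
mttr = sum(restore_times, timedelta()) / len(restore_times)

# Change Fail Percentage: share of deployments that required remediation.
change_fail_pct = 100 * sum(d["failed"] for d in deployments) / len(deployments)

print(deployment_frequency)  # 2.0 deploys per day
print(mean_lead_time)        # 1:03:45 commit-to-deploy
print(mttr)                  # 0:40:00
print(change_fail_pct)       # 25.0
```

In practice these numbers would come from your deployment pipeline and incident tracker rather than hand-entered records, but the arithmetic is this simple.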

Creating Positive Change

The data for these four measures show that high-performing organizations:

  • Deploy multiple times per day,

  • Implement customer changes in less than one hour,

  • Have service interruptions that last less than one hour,

  • And have a failure rate of less than 15%.

Intuitively, the last three are hard to argue with: faster features with fewer issues!? Sign me up. The first, multiple deploys a day, is harder to argue directly and deserves further discussion. 

But first, a word of caution: while I've certainly used metrics in a variety of ways, I've always done so knowing that they can be expensive to collect, can be misleading, and can lead to bad behaviors. Accelerate shares this caution:

In pathological and bureaucratic organizational cultures, measurement is used as a form of control, and people hide information that challenges existing rules, strategies, and power structure. As Deming said, "wherever there is fear, you get the wrong numbers" (Humble et al. 2014, p. 56).

If you're not using the metrics to make decisions, then stop collecting them. And if they're creating unintended and unwanted consequences, then stop collecting them.

Deploy Multiple Times Per Day

Most people, especially non-technical ones, would equate more frequent changes with a greater likelihood of getting it wrong (that is, more frequent service failures).

The authors argue that the opposite is true: measuring deployment frequency is a proxy for measuring batch size, with more frequent deployments correlating to smaller batch sizes. And reducing batch size "reduces cycle times and variability in flow, accelerates feedback, reduces risk and overhead, improves efficiency, increases motivation and urgency, and reduces cost and schedule growth" (Reinertsen 2009, Chapter 5). Wow!

Of course, if changes are being introduced to the service offering more frequently, then they have to arrive without interrupting the service. That is, they have to be high-quality and low-friction for the customer. Otherwise, if you introduce failures frequently, your customers and stakeholders will force you to slow down.

Personally, I know how difficult it is to get organizations to agree to more frequent deployments. I also believe that this is the right set of activities to focus on, and that it can't be accomplished without implementing Continuous Integration, Continuous Testing, and Continuous Deployment. I would argue that one of the biggest advances in software over the past five years has been the clarification and adoption of these practices. And it's where we're heading, too.

Measuring Performance

While I don't have a direct answer to challenges on cost or productivity, reframing the question to one of performance can result in positive organizational change. And that is a healthier and more productive discussion to have.