Many software development teams use the Scrum framework, or some version of it, because it has proved to be a reliable way of working. Teams develop or update products in short, focused sprints to meet specific goals, analyzing their work performance through reviews, retrospectives, and other techniques.
Scrum processes produce multiple outputs, or metrics, that can be measured and tracked over time. These metrics provide important data about your team’s workflow and performance, and they can help you decide where to focus your efforts.
Your metrics can also be used to gauge important outcomes that are harder to quantify on their own. As an example, you might want to measure the effectiveness of a particular strategy or infrastructure change. By capturing and recording metrics relating to system stability, deployment frequency, and change failure rate, you can see how well your strategy is working, since an ineffective strategy or architecture is likely to lead to observable problems such as an increase in faults.
You might also want to get an idea of what value you are delivering to your customers. This is hard to quantify, as it is a fairly high-level concept. It might be reasonable to assume that the value you deliver is a function of the amount of work completed in a given timeframe, adjusted to account for the number of faults introduced in that period. By tracking these easily observable metrics, you can make deductions about less observable high-level outcomes.
In this article, you will learn about the metrics that you most need to track and why, so that you can get the best results from your team.
About Scrum Metrics
To choose the most effective metrics for your team, you should first know why you want to track them. Teams often want to know how they’re performing, especially compared to their past performance, and how they can use that data to plan for future workflows.
You can also use metrics to track targets for your team, but take care to consider the ramifications of this. Goodhart’s law states: “When a measure becomes a target, it ceases to be a good measure.” That’s because it’s possible to manipulate the things we measure. Sprint velocity is a good example. Because velocity is derived directly from the team’s estimates, assessing team members by their velocity all but invites them to inflate those estimates.
With that in mind, most of the following recommended metrics are harder to manipulate. This means you’ll get more reliable data and higher-quality outcomes.
Sprint Velocity

Although it’s unwise to treat velocity as a target, this is still a crucial metric to track for informational purposes. Described as “an indication of the average amount of Product Backlog turned into an Increment of product during a Sprint by a Scrum Team,” velocity is an important metric because it serves as a powerful input to other planning processes. Without velocity, you might not know how much work to put into your next sprint.
If your team members are experienced enough that their estimations are fairly accurate, your velocity can indicate how much work the team will get done in a given iteration. This provides predictability, which is crucial when trying to plan future sprints.
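For instance, a simple rolling average over recent sprints is a common way to compute velocity. The numbers below are hypothetical, and most project management tools will calculate this for you:

```python
# Hypothetical data: story points completed in each of the last five sprints.
completed_points = [21, 18, 24, 19, 23]

# Velocity as a rolling average of recent sprints.
velocity = sum(completed_points) / len(completed_points)

print(f"Average velocity: {velocity:.1f} points per sprint")
```

A team with this history might reasonably plan around 21 points of work for the next sprint.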
Sprint Burndown

A closely related metric is sprint burndown. It’s technically less of a metric and more of a visualization of other metrics, but your burndown for a given sprint is important to track. It gives you a glimpse into your “ideal future” for the sprint, showing how much work you need to complete at regular intervals in order to meet the projected goal.
If you use your burndown effectively, you’ll know well in advance if your sprint goal is unrealistic. If your team is completing three stories per day on average, and your burndown tells you that you need to complete seven per day, you know that you won’t meet the goal this sprint. While this will result in an adjusted velocity at the end of the iteration, you can also use this information throughout the sprint to more effectively manage stakeholders’ expectations. You can also allocate team effort to the pieces of work that really matter, knowing that you can push less crucial tasks to a future iteration as needed.
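To make that concrete, here is a minimal sketch (with invented numbers) of how you might compare actual remaining work against the ideal burndown line partway through a sprint:

```python
# Hypothetical sprint: 40 story points over a 10-day sprint.
total_points = 40
sprint_days = 10

# Ideal burndown: remaining work if points are burned at a constant rate.
ideal = [total_points * (1 - day / sprint_days) for day in range(sprint_days + 1)]

# Actual remaining work recorded at the end of each day so far (days 0-4).
actual = [40, 38, 37, 35, 34]

day = len(actual) - 1
gap = actual[day] - ideal[day]
required_rate = actual[day] / (sprint_days - day)
print(f"Behind ideal by {gap:.0f} points; need {required_rate:.1f} points/day to finish")
```

If the required rate is well above the team’s historical pace, that’s your early warning to adjust scope or expectations.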
Code Throughput

This isn’t a single metric but a collection of related metrics that describe how quickly you can turn an idea into a finished product for customers. Code throughput consists of the following metrics:
Time to Market
Time to market, or lead time, indicates how long it takes for a ticket to get from the backlog into production, including all of the phases in between. This metric takes into account manual quality gates or benchmarks, planning, refinement, and any other process that might slow down the timeline of delivering value to your customers.
There are a few different ways to tweak this metric. You could focus only on features, for instance, or on the time for bug fixes to take effect once they’re entered into the system.
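However you scope it, lead time can be derived from two timestamps per ticket: when it entered the backlog and when it reached production. The ticket IDs and dates below are hypothetical:

```python
from datetime import datetime

# Hypothetical tickets: (id, entered backlog, deployed to production).
tickets = [
    ("TICKET-101", "2023-03-01", "2023-03-15"),
    ("TICKET-102", "2023-03-03", "2023-03-10"),
    ("TICKET-103", "2023-03-05", "2023-03-26"),
]

def lead_time_days(created: str, deployed: str) -> int:
    fmt = "%Y-%m-%d"
    return (datetime.strptime(deployed, fmt) - datetime.strptime(created, fmt)).days

lead_times = [lead_time_days(created, deployed) for _, created, deployed in tickets]
average = sum(lead_times) / len(lead_times)
print(f"Lead times: {lead_times}, average: {average:.1f} days")
```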
Work Item Age
Work item age tracks how long things are in-flight—specifically, how long they’re in the hands of a particular party, usually the developers. The actual tracking of this metric can vary between methodologies and project management tools. Still, most mature tools will have a reporting functionality that lets you see how long pieces of work were at a particular stage.
There is no one-size-fits-all approach here because the ideal duration for a piece of work will vary between teams. If you don’t track this metric, though, a disconnect can easily form; for example, the team acts like tickets will only take one to three days and plans accordingly, but the reports later show that tickets take roughly five and a half days on average.
Generally, you only track the actual time it takes for a developer to work on the ticket. The periods the ticket spends in the backlog and in manual processes before and after the dev work (such as quality and compliance processes) are not tracked, although they are included in the “time to market” metric. If these individual processes are of interest, though, you can track them as well.
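If your tool doesn’t report this directly, the calculation itself is simple: subtract the date a ticket entered a given stage from today’s date. A minimal sketch with hypothetical tickets (a fixed “today” keeps the example reproducible):

```python
from datetime import datetime, timezone

# Hypothetical in-flight items and when they entered the "In Progress" column.
in_progress = {
    "TICKET-201": datetime(2023, 4, 1, tzinfo=timezone.utc),
    "TICKET-202": datetime(2023, 4, 5, tzinfo=timezone.utc),
}

# Fixed reference date; in practice, use datetime.now(timezone.utc).
today = datetime(2023, 4, 8, tzinfo=timezone.utc)

# Age of each in-flight item, in days.
ages = {ticket: (today - started).days for ticket, started in in_progress.items()}
print(ages)
```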
Deployment Frequency

Deployment frequency—as the name suggests—is a measure of how often your team releases new code, features, and fixes into a customer-facing production environment. Generally speaking, deploying more frequently is a good thing because it means you can deliver value and fix issues more often. Of course, your deployment model and product need to support this. For teams with a mature CI/CD practice, such frequency will come naturally. For those who are still working on scheduled quarterly releases of many features at once, this can be trickier.
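Deployment frequency is straightforward to derive from a deployment log. Here’s a minimal sketch that groups hypothetical deployment dates by ISO week:

```python
from collections import Counter
from datetime import date

# Hypothetical production deployment dates.
deployments = [
    date(2023, 5, 1), date(2023, 5, 3), date(2023, 5, 4),
    date(2023, 5, 9), date(2023, 5, 11),
    date(2023, 5, 16),
]

# Count deployments per ISO week number.
per_week = Counter(d.isocalendar()[1] for d in deployments)
print(dict(per_week))
```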
These metrics together give you an idea of your code throughput. Higher throughput is generally better because it means you can iterate and deliver value with greater frequency. However, this also increases the frequency with which you can break things or introduce new problems for your customers. That is where the next set of metrics comes in.
System Stability

Another category of worthwhile metrics to track is system stability. Instead of focusing on how frequently you can deliver value to your customers, you’re measuring the stability, quality, and maturity of your supporting processes and making sure that when—not if—something goes wrong, it’s handled appropriately.
Change Fail Rate
This metric measures the quality of the code you push out. Frequent deployments and short cycle times don’t matter if your team regularly releases faulty code. By measuring how often changes introduce faults into production, you can judge whether your quality practices are effective enough.
For example, you could have what you believe to be a thorough testing plan. Maybe you even have the fabled “100 percent code coverage” (though many developers will point out that bugs are possible with even the highest coverage). If your change failure rate is 20 percent, for example, this means that a fifth of your deployments or changes are causing new bugs to appear, indicating that your quality gates are not working as intended. This metric alone won’t solve your quality issues, but it will give you an idea of how pressing they are and where you need to focus your efforts.
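The calculation itself is just a ratio: changes that caused a failure divided by total changes. A sketch with a hypothetical deployment log:

```python
# Hypothetical log: whether each deployment later required a rollback,
# a hotfix, or caused an incident in production.
deployments = [
    {"id": "d1", "caused_failure": False},
    {"id": "d2", "caused_failure": True},
    {"id": "d3", "caused_failure": False},
    {"id": "d4", "caused_failure": False},
    {"id": "d5", "caused_failure": True},
]

failures = sum(1 for d in deployments if d["caused_failure"])
change_fail_rate = failures / len(deployments)
print(f"Change failure rate: {change_fail_rate:.0%}")
```

The hard part in practice is not the arithmetic but agreeing on what counts as a failure and recording it consistently.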
Time to Restore
Outages are inevitable—if it can happen to Facebook, it can happen to you too. Time to restore measures how long it takes for your team to resolve an outage or failure. If you routinely take too long to resolve outages, not only does it affect your customers (and potentially violate your SLAs), but it also indicates deeper quality issues (why are outages happening so often?) and potential architectural or infrastructure issues (why are outages so time-consuming to recover from?).
Of course, there would be no outages in an ideal world, but because they can and do happen, it’s important to gather data on them and do what you can to mitigate their impact.
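If you keep even a basic incident log, mean time to restore is easy to compute from outage start and resolution timestamps. The incidents below are hypothetical:

```python
from datetime import datetime

# Hypothetical incidents: (outage started, service restored).
incidents = [
    ("2023-06-01 10:00", "2023-06-01 11:30"),
    ("2023-06-10 22:15", "2023-06-11 00:15"),
    ("2023-06-20 09:00", "2023-06-20 09:45"),
]

fmt = "%Y-%m-%d %H:%M"
durations_hours = [
    (datetime.strptime(end, fmt) - datetime.strptime(start, fmt)).total_seconds() / 3600
    for start, end in incidents
]
mttr = sum(durations_hours) / len(durations_hours)
print(f"Mean time to restore: {mttr:.2f} hours")
```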
There are many effective metrics you can track to improve your Scrum practices. This list is by no means exhaustive, but it can serve as a starting point if you’ve been having trouble identifying the metrics that matter to you and your team.
Keep in mind the difference between a measure and a target. Measures and metrics help you plan for the future, while targets give you something to aim for. When these two concepts collide, the result is unreliable data. Be mindful when deciding on what you want to track and why.
The metrics outlined in this article can help you find ways to track how your team is doing so that you can optimize your workflow. That way, your team and your users will benefit.