Measuring Engineering Team Performance

Traditional metrics used to judge engineering team performance are easy to game, especially when they're focused on outputs vs. outcomes. Learn which metrics to use in order to measure the outcomes that actually indicate great performance.

March 8th 2022

by Jonathan Napitupulu

in Teamwork

The engineering team is one of the most important teams in a company, and their performance is one of the primary determinants of how well the company does. The performance of the engineering team needs to be measured so managers and other stakeholders can track their progress.

This article will share some of the metrics to measure engineering team performance and how you can use these metrics to identify opportunities for improvement in specific areas, assignments, and task-team fit, leading to better planning.

Why It’s Important

Performance assessments are important for managers to understand how well their team is working. While it’s easy to tell if the team is meeting deadlines, productivity level is harder to determine; without this information, it’s hard to see if adjustments to the team or their workflow are needed.

There are other reasons to assess performance, as well.

Assessments help engineers know if they’re meeting performance standards.
They allow the team to see how much progress they’ve made.
They help identify any problem spots or areas where people are struggling.
Information and solutions that result from the evaluations can be used to help plan for future projects.
Monitoring productivity helps ensure that people are consistently challenged, but not overburdened, leading to more productive and more satisfied employees.
Performance assessments can identify areas where the team is over- or understaffed.

Accurate evaluations will show managers which areas need to be improved.

Metrics You Need

You may already be aware of some traditional engineering metrics, such as the number of lines of code produced, or the number of bugs fixed.

In the book Accelerate: The Science of Lean Software and DevOps, Dr. Nicole Forsgren and her coauthors, building on their DevOps Research and Assessment (DORA) work, present four key metrics drawn from four years of research and more than 23,000 survey responses from companies all around the world. These metrics, designed to be more accurate and informative for measuring engineering productivity, are lead time, deploy frequency, time to restore, and the number of issues per change (which the book frames as change fail percentage).

Lead Time

Lead time, as used in this article, is how long it takes work to go from backlog status into production. (Forsgren's own definition is narrower: the time from code being committed to that code running in production.) This metric includes time spent developing new features, fixing bugs, and testing. Lead times need to be realistic and appropriate to the scope of the project.

For example, if you’re working on an urgent bug fix with no other work in the backlog, then your lead time will be very short—maybe as little as an hour or two. However, if you’re working on a new feature that needs to be integrated with multiple parts of your current application, your lead time might be weeks or months.
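As a rough sketch of how lead time could be tracked, assume each work item records when it entered the backlog and when it reached production. The sample items and function names below are hypothetical and only mirror the examples above (a two-hour bug fix, a feature that takes weeks).

```python
from datetime import datetime
from statistics import median

# Hypothetical work items: (entered backlog, reached production).
work_items = [
    (datetime(2022, 3, 1, 9, 0), datetime(2022, 3, 1, 11, 0)),   # urgent bug fix
    (datetime(2022, 2, 1, 9, 0), datetime(2022, 3, 1, 9, 0)),    # larger feature
    (datetime(2022, 2, 20, 9, 0), datetime(2022, 2, 23, 9, 0)),  # small change
]

def lead_times_hours(items):
    """Return the lead time of each item in hours."""
    return [(done - start).total_seconds() / 3600 for start, done in items]

def median_lead_time_hours(items):
    """Median lead time; less sensitive than the mean to one very long item."""
    return median(lead_times_hours(items))

print(lead_times_hours(work_items))
print(median_lead_time_hours(work_items))
```

Reporting the median alongside the individual values is one possible design choice: a single months-long feature would otherwise dominate an average.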

Lead time is a crucial factor in the software development lifecycle, but measuring lead time for any given aspect of development can be challenging. Adopting agile methodologies and continuous integration practices makes it possible to identify which areas are slowing down your release cycles.

To improve lead times for developers, try to break work into smaller chunks, which allows more people to work on the project simultaneously. This will also help your team focus on delivering code instead of spending their time sorting out technical issues.

Deploy Frequency

Measuring engineer performance isn’t easy, but having metrics like deploy frequency can give insight into how the team is doing and highlight areas that need improvement.

The idea behind this metric is to emphasize the importance of continuous deployment and rapid iteration. Ideally, everyone in the engineering organization should be deploying at least once a day, and preferably more often.
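A minimal sketch of how deploy frequency might be checked against the once-a-day target, assuming you can export a list of deployment timestamps from your pipeline. The timestamps and function names here are made up for illustration.

```python
from datetime import date, datetime, timedelta

# Hypothetical deployment timestamps for one team.
deploys = [
    datetime(2022, 3, 7, 10, 15),
    datetime(2022, 3, 7, 16, 40),
    datetime(2022, 3, 8, 11, 5),
    datetime(2022, 3, 10, 14, 30),
]

def deploys_per_day(timestamps, start, end):
    """Average deployments per calendar day from start to end, inclusive."""
    days = (end - start).days + 1
    count = sum(1 for t in timestamps if start <= t.date() <= end)
    return count / days

def days_without_deploy(timestamps, start, end):
    """Calendar days in the window on which nothing was deployed."""
    deployed = {t.date() for t in timestamps}
    days = (end - start).days + 1
    window = [start + timedelta(days=i) for i in range(days)]
    return [d for d in window if d not in deployed]

start, end = date(2022, 3, 7), date(2022, 3, 11)
print(deploys_per_day(deploys, start, end))
print(days_without_deploy(deploys, start, end))
```

Listing the days with no deployment, rather than only the average, can make it easier to spot where the flow stalled.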

You can also use this formula to determine the performance of an engineer:

Engineer Performance = (Deploy Frequency) / (Deploy Duration)

This is a quick formula, and shouldn’t be used to measure performance without considering other factors. For example, if you’re deploying every day, but each deployment takes two hours, your engineers have much less time to spend on actual development, which is itself a problem.
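The formula can be sketched in a few lines. The units are an assumption, since the formula itself does not state them: frequency is taken as deployments per day and duration as hours per deployment, so the result is only meaningful for comparing values computed the same way.

```python
def engineer_performance(deploys_per_day, deploy_duration_hours):
    """Deploy frequency divided by deploy duration (units are assumed)."""
    if deploy_duration_hours <= 0:
        raise ValueError("deploy duration must be positive")
    return deploys_per_day / deploy_duration_hours

# Same daily frequency, very different deployment cost.
slow = engineer_performance(1, 2.0)    # daily deploys, two hours each
fast = engineer_performance(1, 0.25)   # daily deploys, fifteen minutes each
print(slow, fast)
```

Both teams deploy once a day, but the score separates them because the two-hour deployment eats into development time.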

Deploy frequency can provide visibility into engineers’ productivity and help identify bottlenecks in their processes.

Time to Restore

This metric measures how many hours it takes to solve a production issue. For example, how many hours pass between when engineers are assigned to the issue and when the issue is resolved?

You may also want to relate this to story points. For example, if there is a crash log in production, you want to know the root cause of the error and which story introduced it. The QA team should be able to perform root-cause analysis quickly and inform the product manager, who then assigns engineers to the task.

Being able to solve production issues quickly gives a sense of ownership to engineers. If they’re empowered to fix it as soon as possible, they’re more likely to care about the customers affected by the error and the damage to the company’s reputation.

Of course, this might look different if the system is dependent on external factors. For example, if your system is hosted by a cloud service that’s experiencing an outage, you shouldn’t measure time to restore from the point at which the provider went down, but rather from when the cloud service returns.
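One way to sketch this measurement, under the assumption that the clock starts when engineers are assigned, or when an external provider recovers if that happens later. The function and parameter names are hypothetical, and your incident tracker may record these moments differently.

```python
from datetime import datetime

def time_to_restore_hours(assigned_at, resolved_at, provider_restored_at=None):
    """Hours from the start of the clock until the issue is resolved.

    The clock starts at assignment, unless an external provider outage
    ended after that point, in which case it starts at provider recovery.
    """
    start = assigned_at
    if provider_restored_at is not None and provider_restored_at > start:
        start = provider_restored_at
    return (resolved_at - start).total_seconds() / 3600

assigned = datetime(2022, 3, 8, 10, 0)
resolved = datetime(2022, 3, 8, 13, 0)
provider_back = datetime(2022, 3, 8, 12, 0)

print(time_to_restore_hours(assigned, resolved))
print(time_to_restore_hours(assigned, resolved, provider_back))
```

The second call excludes the two hours during which the team could not have restored service because the provider was still down.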

Ideally, this isn’t just a measure of the engineering team’s performance, but of the system design performance. Your system should have automated recovery to restore itself without manual intervention.

Number of Issues Per Change

You can also measure engineering performance through the number of issues that arise per change. Keep a record, for each team member, of how many of the problems raised by the QA team made it into production per change. If issues are making it into production every other deployment, there might be underlying problems you need to address, either with the team or with your expectations.

You need to look at the overall issue numbers, overtime hours, other known complications, and the system, to see what might be contributing to the low-quality code. On the other hand, if several of the issues are consistently coming from the same person, you can look closer at that person’s work to see where they’re struggling.
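A small sketch of that record-keeping, assuming each change is logged with its author and the number of issues that reached production. The names and numbers are invented for illustration.

```python
from collections import defaultdict

# Hypothetical records: (author, issues that reached production for that change).
changes = [("ana", 0), ("ana", 1), ("ben", 0), ("ben", 0), ("cy", 2), ("cy", 1)]

def issues_per_change(records):
    """Average production issues per change, grouped by author."""
    totals = defaultdict(lambda: [0, 0])  # author -> [issues, changes]
    for author, issues in records:
        totals[author][0] += issues
        totals[author][1] += 1
    return {author: i / c for author, (i, c) in totals.items()}

def team_issues_per_change(records):
    """Average production issues per change across the whole team."""
    return sum(issues for _, issues in records) / len(records)

print(issues_per_change(changes))
print(team_issues_per_change(changes))
```

Comparing the per-person figures with the team-wide figure helps separate a systemic problem from one person who may need support. A number like this is a starting point for a conversation, not a verdict.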

These Metrics vs. Traditional Scrum/Agile Metrics

Traditional scrum/agile metrics are easy to manipulate. The two most obvious examples are story points and bugs. One way to manipulate these metrics is by fixing bugs in your code, and introducing an unrequested new feature with it, thus changing the bug fix to a new feature.

Another way of manipulating metrics is inflating the number of story points. This is especially likely if an engineer has particular expertise on a story and can intentionally make it look more complex than usual or necessary. Similarly, when engineers encounter a bug, they have extra room to justify assigning more points to the fix.

Even people who don’t set out to manipulate these metrics might artificially inflate them. For example, some people argue that the number of lines of code written is a good measure of productivity. While this may have been true in the past, it’s no longer accurate. A developer can easily write a hundred lines in an hour if they’re just copying and pasting with minor changes. That can grow to an output of nearly a thousand lines of code in a day—a lot of output, but not necessarily a lot of productivity. In addition to artificially inflating the metrics, the more lines of code there are, the greater the chance of bugs. Writing high-quality code that takes as few lines as possible takes more time, but is cleaner and more robust.

Some programmers will write if-else conditions everywhere, which are easy to read but can quickly become bloated and repetitive. If an inexperienced programmer does this, they might just need more guidance. For an experienced programmer, though, it’s more likely they realized that lines of code, not quality, are how their productivity is measured.

In a microservices system, any engineer should be able to spend a day with the source code and understand what the system does. If your engineers are focusing on writing a lot of code, they won’t be focusing on maintaining the simplicity and readability of the project.

Conclusion

Traditional metrics are easy to game, but in this article, you’ve learned some new metrics to measure not the output but the outcomes that make for great engineering team performance. These metrics will help the manager identify engineers who are underdelivering, and can help inform decisions if the team as a whole isn’t meeting expectations.

Some of the metrics discussed can be adjusted as necessary to fit your specific situation. For example, if you want the engineering team to have a stronger sense of ownership, pay more attention to restore time. On the other hand, if you want your team to be lean, you might want to extend the lead time.

Modern development needs modern metrics, and by implementing these, you’ll have a better idea of how your team is really performing.
