A metric is, roughly speaking, a mechanism to collect data on something. For our purposes, a metric is a mechanism to gather data about how well some aspect of a software project is working, in order to provide feedback that can be used to improve the project or the management of the project.

Value of Metrics

An important thing to consider is the value of metrics - a metric implemented without a specific goal in mind just wastes time, and that wasted time will itself show up in your metrics as the project doing poorly. With this in mind, here are some important things that metrics can provide:

  • Clear feedback, allowing for targeted changes to improve the project
  • Quantified information for customers, leadership and developers about the development effort
  • Early identification of areas of concern, so they can be investigated before they become problematic

Two Major Types of Metrics

For the purposes of this discussion, I will be dividing metrics into two broad categories: Project Health Metrics, which operate at a high level to identify when your project might be running into trouble, and Focused Metrics, which provide more directed information. We will focus on the Code Health Metrics subset of Focused Metrics, given my background in that area.

Project Health Metrics

The goal of a Project Health Metric is to show the overall status of the project. It should reflect things like customer perception of the product, sustainability, productivity, team efficiency, and any other information relevant to identifying the overall well-being of the project. An important aspect, however, is that it blends these things together into a single value, or a handful of values, as the goal behind a Project Health Metric is to provide leadership with a simple indication of the overall state of the project.

A good example of a Project Health Metric would be HSDIMUL. HSDIMUL falls into this category since it blends customer perception and the product life-cycle stage in which bugs are found, indicating that something is wrong with the project if the score is high. Since it does not provide details about what causes the score - it could indicate issues with code quality, testing coverage, or even something as abstract as architecture - it cannot be considered a Focused Metric.

It should be obvious that, while potentially valuable for initiating investigation and remediation when a project is suffering, a Project Health Metric doesn’t provide much in the way of information for improving a specific problem area. To that end, it should be paired with Focused Metrics, our next area of discussion.

Focused Metrics

There is a huge potential space for Focused Metrics, as anything that can be improved can potentially have a metric applied to it. Example categories include code health metrics, team dynamics metrics, knowledge acquisition metrics, even team lunch metrics! To keep the discussion focused, we’ll look at some concrete examples from Code Health Metrics and take a theoretical look at some metrics for team dynamics.

Code Health Metrics

Code Health Metrics are Focused Metrics that provide feedback on specific things related to code quality.

One common example of a Code Health Metric is the output of a coverage tool. A coverage tool reports the percentage of lines of code exercised by your test suite - making it a metric related to test coverage (arguably this could belong under a Test Health Metric, but in the spirit of developers writing unit tests, we’ll keep it as part of Code Health). If the test coverage scores are too low, the chance of bugs being missed and shipping with the code gets higher. Notice that test coverage pairs nicely with HSDIMUL - if HSDIMUL scores are high and test coverage is low, that probably indicates you need to improve your testing requirements.
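As a toy sketch, the pass/fail check behind such a coverage metric might look like the following (the helper names and the 80% threshold are illustrative assumptions, not recommendations):

```python
def coverage_percent(lines_total: int, lines_hit: int) -> float:
    """Line coverage as a percentage (0-100)."""
    if lines_total == 0:
        return 100.0  # an empty module has nothing left uncovered
    return 100.0 * lines_hit / lines_total


def meets_threshold(lines_total: int, lines_hit: int, threshold: float = 80.0) -> bool:
    """True if the suite's line coverage meets the (hypothetical) project threshold."""
    return coverage_percent(lines_total, lines_hit) >= threshold
```

In practice you would feed this from your coverage tool’s report rather than hand-counted numbers, and run the check in your build so the feedback is automatic.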

Another example would be the output of a linting tool. This can be used as a metric indicating the number of style violations in the code. Assuming the style was chosen well, this metric shows how cleanly the code is written - a strong indication of how maintainable it will be over time.
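A real linter does far more, but a minimal sketch of the idea - counting a couple of simple style violations - might look like this (the two checks and the 79-character limit are illustrative assumptions):

```python
def style_violations(source: str, max_len: int = 79) -> int:
    """Count two simple style violations: overlong lines and trailing whitespace."""
    count = 0
    for line in source.splitlines():
        if len(line) > max_len:
            count += 1  # line exceeds the agreed maximum length
        if line != line.rstrip():
            count += 1  # trailing whitespace left on the line
    return count
```

The total across the codebase is the metric; watching it over time tells you whether style discipline is improving or eroding.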

As a final example, consider a standard that limits any given function to 15 lines of code. The metric here is obviously the number of lines of code, but by making it a standard we make the feedback cycle immediate - any time the metric’s value goes too high, the offending code is immediately sent back to be fixed.
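In Python, a minimal sketch of enforcing such a standard could walk the syntax tree and flag overlong functions (the 15-line limit comes from the example above; the helper name is my own, and the count includes the `def` line):

```python
import ast

MAX_FUNCTION_LINES = 15  # the limit from the standard above


def overlong_functions(source: str) -> list:
    """Return (name, line_count) for every function longer than the limit."""
    violations = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            length = node.end_lineno - node.lineno + 1
            if length > MAX_FUNCTION_LINES:
                violations.append((node.name, length))
    return violations
```

Wired into a pre-commit hook or CI step, a non-empty result blocks the change - which is exactly the immediate feedback the standard is meant to provide.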

Team Dynamic Metrics

Team Dynamic Metrics are Focused Metrics designed to reflect how well a group operates as a team - an area in which, I confess, I have no personal experience with metrics. The goal of metrics in this group is to provide feedback to leadership on the pain points within the team, so that leadership can work to improve the overall operation of the team.

As an example metric, consider the number of times a ticket has been reassigned. In an ideal case, a ticket would only need to be assigned once - with the ideal candidate for resolving it selected on the first try. In a less ideal case, you might see ten reassignments before the ticket finally lands with someone equipped to handle it. That churn suggests the team may have a poor understanding of the product and of who is knowledgeable in which areas.
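A sketch of computing this metric from a ticket system’s assignment history might look like the following (the flat `(ticket_id, assignee)` event format is a hypothetical simplification of what a real issue tracker would export):

```python
from collections import Counter


def reassignment_counts(events):
    """Given (ticket_id, assignee) events in time order, count how often
    each ticket changed hands after its first assignment."""
    last_assignee = {}
    counts = Counter()  # missing tickets read as zero reassignments
    for ticket_id, assignee in events:
        if ticket_id in last_assignee and last_assignee[ticket_id] != assignee:
            counts[ticket_id] += 1
        last_assignee[ticket_id] = assignee
    return counts
```

A high count on many tickets, rather than on one unlucky outlier, is the signal worth investigating.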

As a second example, one that is even fuzzier, a manager could track the number of heated arguments that take place on the team over a given time. A lot of arguments might indicate that the team isn’t meshing well together, while no arguments would probably indicate that the team isn’t really paying attention to what their coworkers are working on.

Recommendations for Metric Selection

Keep Feedback Loops Short

The whole point of a metric is that it provides quantified feedback. To make the best use of this feedback, the sooner it is provided to those it relates to, the better off the project will be. In some cases, this might mean looking at the metrics in your planning meetings, or even daily meetings. In certain cases, such as coding style standards, you may be able to directly convert the target value for the metric into a standard, and make the feedback happen automatically.

Anything that would lengthen the feedback loop for a metric should be eyed with suspicion and, if possible, removed entirely.

KISS - Keep It Simple Stupid

Just as in the development of your product, you should always apply the KISS principle to metrics. The more you require people to do, either in terms of analysing the metrics or inputting/collecting data for them, the higher the chances that the metrics will not really reflect reality.

As an important part of this, keep the actual calculations and data gathering for the metrics as transparent as possible. If you can’t look at the data and the metric and understand why the metric value is what it is, the metric is too complex and should probably be thrown out. This keeps the metrics useful for everyone involved, not just a handful of specialists who know how to make software jump through just the right hoops to produce a usable output.

Automate, Automate, Automate

People never like to do extra work, so automating as much of the data gathering of metrics as possible is essential. Code related metrics can often be gathered automatically as part of a continuous integration server, while issue trackers lend themselves to automating metrics related to tickets.

If you can’t fully automate it, try to make adding that extra information part of the bare minimum required. For example, add a validated field on an issue tracker to gather that information, so that the issue can’t be closed without entering it. Otherwise you run the risk of every issue being too important to take the time to correctly update the metrics - rendering your metrics worthless.
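As an illustrative sketch, that validation can be as simple as refusing to close a ticket until its metric fields are filled in (the field names here are hypothetical):

```python
def can_close(ticket: dict, required_fields=("root_cause", "time_spent")) -> bool:
    """Refuse to close a ticket while any required metric field is missing or empty.
    The field names are hypothetical examples, not a real tracker's schema."""
    return all(ticket.get(field) for field in required_fields)
```

Most issue trackers let you express this same rule as a required-field or workflow configuration, which is preferable to custom code.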

Don’t Over-Do It

With the current trendiness of “data-driven-anything” it might be tempting to throw every metric you can find at your project. But having too many metrics is as bad as having too few - if people are too overwhelmed to evaluate what the metrics mean, they’ll just ignore them, which means you’ve wasted a lot of time setting the metrics up for no reward. Remember, not everyone is a data scientist. And just as importantly, not everyone who thinks they are a data scientist really is one.

As an important related point, avoid metrics that are visible only to leadership. Once these come out, they will only serve to encourage an air of paranoia. And not the good kind that keeps code and information systems safe from attackers, but the kind that reduces productivity and quality through constant fear of what happens if the numbers don’t look right. Besides, a metric that isn’t public (to the team, at least) doesn’t provide feedback when and where it matters most.

Some Specific Recommendations

Finally, some specific guidance. Use only a handful of Project Health Metrics - maybe three or four - and use the investigation they drive to help you identify which Focused Metrics you need. I would strongly consider only adding new Focused Metrics when you can relate them back to problems exposed by a Project Health Metric, to help you avoid metric sprawl. To further limit metric proliferation, only retain metrics that are useful - if a metric never budges and its value is one you are happy with, there’s really no need to keep collecting it.