The SPACE framework for developer productivity
The DORA metrics are a perfect starting point. They are easy to implement and there is lots of data to compare. If you want to take it one step further and add more metrics, you can use the SPACE framework for developer productivity (Forsgren N., Storey M.A., Maddila C., Zimmermann T., Houck B. & Butler J. (2021)).
Developer productivity is the key ingredient to achieving a high engineering velocity and a high DVI. Developer productivity is highly correlated to the overall well-being and satisfaction of developers and is, therefore, one of the most important ingredients to thrive in the war of talents and attract good engineers.
But developer productivity is not just about activity. The opposite is often the case: in times of firefighting and meeting deadlines when activity is normally high, productivity decreases through frequent task switching and less creativity. That's why metrics that measure developer productivity should never be used in isolation, and never to penalize or reward developers.
Also, developer productivity is not solely about individual performance. As in team sports, individual performance is important, but only the team as a whole wins. Balancing measures of individual and team performance is crucial.
SPACE is a multidimensional framework that categorizes metrics for developer productivity into the following dimensions:
- Satisfaction and well-being
- Performance
- Activity
- Communication and collaboration
- Efficiency and flow
All the dimensions work for individuals, teams, and the system as a whole.
Satisfaction and well-being
Satisfaction and well-being are about how happy and fulfilled we are. Physical and mental health also fall into this dimension. Some example metrics are given here:
- Developer satisfaction
- Net promoter score (NPS) for a team (how likely it is that someone would recommend their team to others)
- Retention
- Satisfaction with the engineering system
Performance
Performance is the outcome of the system or process. The performance of individual developers is hard to measure. But for a team or system level, we could use measures such as LT, DLT, or MTTR. Other examples could be uptime or service health. Other good metrics are customer satisfaction or an NPS for the product (how likely it is that someone would recommend the product to others).
Activity
Activity can provide valuable insights into productivity, but it is hard to measure it correctly. A good measure for individual activity would be focus time: how much time is a developer not spending on meetings and communication? Other examples for metrics are the number of completed work items, issues, PRs, commits, or bugs.
Communication and collaboration
Communication and collaboration are key ingredients to developer productivity. Measuring them is hard, but looking at PRs and issues gives you a good impression of how the communication is going. Metrics in this dimension should focus on PR engagement, the quality of meetings, and knowledge sharing. Also, code reviews across the team level (cross-team or X-team) are a good measure to see what boundaries there are between teams.
Efficiency and flow
Efficiency and flow measure how many handoffs and delays increase your overall LT. Good metrics are the number of handoffs, blocked work items, and interruptions. For work items, you can measure total time, value-added time, and wait time.
How to use the SPACE framework
All the dimensions are valid for individuals, teams, groups, and on a system level, (see Figure 1.5).
It is important to not only look at the dimension but also at the scope. Some metrics are valid in multiple dimensions.
It is also very important to select carefully which metrics are being measured. Metrics shape behavior and certain metrics can have side effects you did not consider in the first place. The goal is to use only a few metrics but with the maximum positive impact.
You should select at least three metrics from three dimensions. You can mix the metrics for individual, team, and system scope. Be cautious with the individual metrics—they can have the most side effects that are hard to foresee.
To respect the privacy of the developers, the data should be anonymized, and you should only report aggregated results at a team or group level.