Measuring Developer Productivity: Real-World Examples
How 17 tech companies actually measure engineering productivity — and a framework for choosing your own metrics.
Source: Pragmatic Engineer — Gergely Orosz & Abi Noda
The Core Insight
Instead of debating how to measure developer productivity in the abstract, the authors studied what dedicated DevProd teams at 17 companies actually measure. These teams need real metrics to prioritize work, prove impact, and identify friction — so their choices are pragmatic rather than theoretical. The findings reveal broad consensus around a handful of patterns, while also showing that every company tailors its approach to its own goals and culture.
No single metric captures productivity. Instead, examine it through three dimensions (speed, ease, and quality) that exist in tension with one another, so tracking all three surfaces the tradeoffs.
Google’s Philosophy
Google’s Developer Intelligence team measures everything through three dimensions: speed, ease, and quality. The specific metrics change depending on what’s being measured (a tool, a process, a team), but the dimensions stay constant. They combine quantitative data from logs with qualitative signals from surveys, diary studies, and interviews — a mixed-methods approach that catches things pure metrics miss, like technical debt or perceived slowness that doesn’t show up in the numbers.
LinkedIn’s Three Channels
| Channel | Description |
|---|---|
| Quarterly Survey | ~30 questions assessing developer experience across tools, processes, and activities. Personalized per developer using data from real-time feedback systems. |
| Real-Time Feedback | Tracks developer actions in tooling and sends targeted micro-surveys based on triggers. Smart throttling prevents survey fatigue. |
| System Metrics | Build times, deployment frequency, CI determinism, and more — calculated from system data via the Developer Insights Hub (iHub). |
Key LinkedIn metrics include Developer NSAT (overall satisfaction), build time (P50/P90), code reviewer response time, post-commit CI speed, CI determinism (the opposite of flakiness), and deployment success rate. They compare objective measurements with subjective satisfaction — because even if the numbers look fine, if developers say they hate their builds, that matters.
A notable technique: LinkedIn uses winsorized means instead of pure medians. A median can mask real improvements in the tail: cutting an outlier 25-second build down to 3 seconds leaves the median unchanged. Winsorized means clip extreme values to a percentile boundary rather than discarding them, producing a metric that actually reflects tail improvements.
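A minimal sketch of the idea in Python (the 95th-percentile clip boundary and the build-time numbers are assumptions for illustration; LinkedIn's exact implementation isn't described):

```python
import numpy as np

def winsorized_mean(values, upper_pct=95):
    """Mean after clipping values above the given percentile to that
    percentile boundary, rather than discarding them."""
    values = np.asarray(values, dtype=float)
    cap = np.percentile(values, upper_pct)
    return float(np.mean(np.clip(values, None, cap)))

# Mostly fast builds plus one slow outlier.
before = [3, 3, 3, 3, 3, 3, 3, 3, 3, 25]
after  = [3, 3, 3, 3, 3, 3, 3, 3, 3, 3]   # outlier fixed: 25 s -> 3 s

# The median never moves (3.0 both times), but the winsorized mean
# drops from roughly 4.2 to 3.0, so the tail improvement is visible.
print(np.median(before), np.median(after))
print(winsorized_mean(before), winsorized_mean(after))
```

The clipped value still contributes to the mean, which is why the metric moves when only the slowest builds improve.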
Peloton’s Four Pillars
| Pillar | Metrics |
|---|---|
| Engagement | Developer Satisfaction Score. Captured via a twice-yearly survey sent to a random half of developers, so each person participates only once a year. |
| Velocity | Time to 1st and 10th PR for new hires, plus Lead Time and Deployment Frequency. |
| Quality | % of PRs under 250 lines, Line Coverage, and Change Failure Rate. |
| Stability | Time to Restore Services. How quickly the team recovers when something breaks. |
Scaleup Patterns
Companies in the 100–1,000 engineer range (Notion, Postman, Amplitude, GoodRx, Intercom, Lattice) converge on a few common themes. They emphasize moveable metrics — things their DevProd team can directly influence and use to demonstrate impact.
Common Metrics:
- Ease of Delivery — qualitative, used as a north star by many teams
- Engagement — counterbalances speed metrics to prevent burnout
- Time Loss — percentage of time lost to environmental friction, translatable to dollars
- Change Failure Rate — incidents divided by deployments
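Two of these metrics reduce to simple arithmetic. A hypothetical sketch (all numbers invented for illustration; the fully-loaded cost per engineer is an assumption, not a figure from the study):

```python
def change_failure_rate(incidents, deployments):
    """Incidents caused by changes, divided by total deployments."""
    return incidents / deployments

def time_loss_cost(pct_time_lost, engineers, fully_loaded_cost):
    """Translate % of engineering time lost to friction into $/year."""
    return pct_time_lost * engineers * fully_loaded_cost

# Hypothetical org: 400 deployments, 12 change-caused incidents,
# 150 engineers losing 8% of their time to environmental friction.
cfr = change_failure_rate(incidents=12, deployments=400)
cost = time_loss_cost(0.08, engineers=150, fully_loaded_cost=250_000)
print(f"{cfr:.1%}", f"${cost:,.0f}")  # 3.0% $3,000,000
```

Expressing time loss in dollars is what makes it such an effective metric for DevProd teams: a 2-point improvement translates directly into budget language.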
Unique Metrics:
- Adoption Rate (DoorDash, Spotify) — how many devs actively use a tool or standard
- Design Docs per Engineer (Uber) — tracks whether teams write design docs before building
- Experiment Velocity (Etsy) — measures learning speed, not shipping speed
Surprising Findings
DORA and SPACE aren’t used wholesale. Only Microsoft (which authored SPACE) adopted it as a framework. Others cherry-pick individual metrics from these programs as components of broader strategies.
Qualitative metrics are everywhere. Every company uses both qualitative and quantitative measures — a major shift from five years ago when most relied on quantitative data alone.
Focus time is a top-level metric. Stripe tracks “days with sufficient focus time” and Uber tracks “weekly focus time per engineer” — deep work has become a first-class measurement.
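Neither company's exact methodology is described, but one plausible way to compute a "day with sufficient focus time" is from calendar data: find the longest meeting-free gap in the workday and check it against a threshold. A sketch under those assumptions (the 9–5 workday and 2-hour threshold are invented):

```python
from datetime import time

def longest_free_block(meetings, day_start=time(9), day_end=time(17)):
    """Longest meeting-free gap (in minutes) in a workday.
    `meetings` is a list of (start, end) datetime.time pairs."""
    to_min = lambda t: t.hour * 60 + t.minute
    busy = sorted((to_min(s), to_min(e)) for s, e in meetings)
    longest, cursor = 0, to_min(day_start)
    for start, end in busy:
        longest = max(longest, start - cursor)  # gap before this meeting
        cursor = max(cursor, end)               # handle overlapping meetings
    return max(longest, to_min(day_end) - cursor)

def is_focus_day(meetings, threshold_min=120):
    """Count the day if it has at least one 2-hour uninterrupted block."""
    return longest_free_block(meetings) >= threshold_min

# A day with a 9:30 standup and a 13:00-14:00 sync:
day = [(time(9, 30), time(9, 45)), (time(13), time(14))]
print(longest_free_block(day), is_focus_day(day))  # 195 True
```

Aggregating `is_focus_day` per engineer per week yields a metric shaped like the ones Stripe and Uber report.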
Choosing Your Own Metrics
Don’t Start Here: Jumping straight to metrics before defining what you want to understand. Picking DORA or SPACE wholesale because they seem like an industry standard.
Start Here: Google’s Goals, Signals, Metrics (GSM) framework. Define goals first, identify signals that would indicate success, then work backwards to metrics.
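To make the backwards direction concrete, here is a hypothetical GSM worked example (the goal, signals, and metrics are invented; only the framework's structure comes from the source):

```python
# Goals -> Signals -> Metrics: pick metrics last, never first.
gsm = {
    "goal": "Engineers get fast feedback on code changes",
    "signals": [
        "Builds feel fast enough that engineers stay in flow",
        "CI results arrive before a reviewer opens the PR",
    ],
    "metrics": [
        "P50/P90 local build time",
        "Post-commit CI duration",
        "Survey: satisfaction with build speed",
    ],
}

# Each metric must trace back to a signal and ultimately the goal:
for metric in gsm["metrics"]:
    print(f"{metric} -> evidence for: {gsm['goal']}")
```

The discipline the framework enforces is the traceability: if a candidate metric can't be tied to a signal you already wrote down, it doesn't make the list.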
The Three Buckets for Engineering Leaders
If you’re a CTO, VPE, or Director reporting up to leadership, reframe the problem. What your CEO actually wants is confidence that engineering investment is well-stewarded. Organize metrics into three buckets:
- Business Impact — what are we building and why, is it on track
- System Performance — uptime, incidents, infrastructure health, user NPS
- Engineering Effectiveness — the speed/ease/quality metrics covered throughout this article
Together, these tell a complete story that non-technical stakeholders can follow.
The Bottom Line
There is no one-size-fits-all set of developer productivity metrics. Every company in this study measures at least 5–6 different things, blending qualitative and quantitative signals. Start with the problem you want to solve — frictionless shipping, developer retention, software quality — and work backwards. Choose metrics your team can actually control, pair objective data with subjective experience, and resist the temptation to collapse everything into a single score.