Measuring Developer Productivity: Real-World Examples
How 17 tech companies actually measure engineering productivity — and a framework for choosing your own metrics.
Source: Pragmatic Engineer — Gergely Orosz & Abi Noda
The Core Insight
Instead of debating how to measure developer productivity in the abstract, the authors studied what dedicated DevProd teams at 17 companies actually measure. These teams need real metrics to prioritize work, prove impact, and identify friction — so their choices are pragmatic rather than theoretical. The findings reveal broad consensus around a handful of patterns, while also showing that every company tailors its approach to its own goals and culture.
No single metric captures productivity. Instead, examine it through three dimensions (speed, ease, and quality) that exist in tension with one another, so tracking all three surfaces the tradeoffs.
Google’s Philosophy
Google’s Developer Intelligence team measures everything through three dimensions: speed, ease, and quality. The specific metrics change depending on what’s being measured (a tool, a process, a team), but the dimensions stay constant. They combine quantitative data from logs with qualitative signals from surveys, diary studies, and interviews — a mixed-methods approach that catches things pure metrics miss, like technical debt or perceived slowness that doesn’t show up in the numbers.
LinkedIn’s Three Channels
| Channel | Description |
|---|---|
| Quarterly Survey | ~30 questions assessing developer experience across tools, processes, and activities. Personalized per developer using data from real-time feedback systems. |
| Real-Time Feedback | Tracks developer actions in tooling and sends targeted micro-surveys based on triggers. Smart throttling prevents survey fatigue. |
| System Metrics | Build times, deployment frequency, CI determinism, and more — calculated from system data via the Developer Insights Hub (iHub). |
Key LinkedIn metrics include Developer NSAT (overall satisfaction), build time (P50/P90), code reviewer response time, post-commit CI speed, CI determinism (the opposite of flakiness), and deployment success rate. They compare objective measurements with subjective satisfaction — because even if the numbers look fine, if developers say they hate their builds, that matters.
A notable technique: LinkedIn uses winsorized means instead of pure medians. A median can mask real improvements in the tail: cutting an outlier 25-second build down to 3 seconds leaves the median unchanged. Winsorized means clip extreme values to a percentile boundary rather than discarding them, producing a metric that actually reflects tail improvements.
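A minimal sketch of the idea in Python (the 95th-percentile clip boundary and the build-time numbers are assumptions for illustration; LinkedIn's exact implementation isn't described):

```python
import numpy as np

def winsorized_mean(values, upper_pct=95):
    """Mean after clipping values above the given percentile to that
    percentile boundary, rather than discarding them."""
    values = np.asarray(values, dtype=float)
    cap = np.percentile(values, upper_pct)
    return float(np.mean(np.clip(values, None, cap)))

# Mostly fast builds plus one slow outlier.
before = [3, 3, 3, 3, 3, 3, 3, 3, 3, 25]
after  = [3, 3, 3, 3, 3, 3, 3, 3, 3, 3]   # outlier fixed: 25 s -> 3 s

# The median never moves (3.0 both times), but the winsorized mean
# drops from roughly 4.2 to 3.0, so the tail improvement is visible.
print(np.median(before), np.median(after))
print(winsorized_mean(before), winsorized_mean(after))
```

The clipped value still contributes to the mean, which is why the metric moves when only the slowest builds improve.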
Peloton’s Four Pillars
| Pillar | Metrics |
|---|---|
| Engagement | Developer Satisfaction Score. Captured via a twice-yearly survey sent to a random half of developers, so each person participates only once a year. |
| Velocity | Time to 1st and 10th PR for new hires, plus Lead Time and Deployment Frequency. |
| Quality | % of PRs under 250 lines, Line Coverage, and Change Failure Rate. |
| Stability | Time to Restore Services. How quickly the team recovers when something breaks. |
Scaleup Patterns
Companies in the 100–1,000 engineer range (Notion, Postman, Amplitude, GoodRx, Intercom, Lattice) converge on a few common themes. They emphasize moveable metrics — things their DevProd team can directly influence and use to demonstrate impact.
Common Metrics:
- Ease of Delivery — qualitative, used as a north star by many teams
- Engagement — counterbalances speed metrics to prevent burnout
- Time Loss — percentage of time lost to environmental friction, translatable to dollars
- Change Failure Rate — incidents divided by deployments
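Two of these metrics reduce to simple arithmetic. A hypothetical sketch (all numbers invented for illustration; the fully-loaded cost per engineer is an assumption, not a figure from the study):

```python
def change_failure_rate(incidents, deployments):
    """Incidents caused by changes, divided by total deployments."""
    return incidents / deployments

def time_loss_cost(pct_time_lost, engineers, fully_loaded_cost):
    """Translate % of engineering time lost to friction into $/year."""
    return pct_time_lost * engineers * fully_loaded_cost

# Hypothetical org: 400 deployments, 12 change-caused incidents,
# 150 engineers losing 8% of their time to environmental friction.
cfr = change_failure_rate(incidents=12, deployments=400)
cost = time_loss_cost(0.08, engineers=150, fully_loaded_cost=250_000)
print(f"{cfr:.1%}", f"${cost:,.0f}")  # 3.0% $3,000,000
```

Expressing time loss in dollars is what makes it such an effective metric for DevProd teams: a 2-point improvement translates directly into budget language.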
Unique Metrics:
- Adoption Rate (DoorDash, Spotify) — how many devs actively use a tool or standard
- Design Docs per Engineer (Uber) — tracks whether teams write design docs before building
- Experiment Velocity (Etsy) — measures learning speed, not shipping speed
Surprising Findings
DORA and SPACE aren’t used wholesale. Only Microsoft (which authored SPACE) adopted it as a framework. Others cherry-pick individual metrics from these programs as components of broader strategies.
Qualitative metrics are everywhere. Every company uses both qualitative and quantitative measures — a major shift from five years ago when most relied on quantitative data alone.
Focus time is a top-level metric. Stripe tracks “days with sufficient focus time” and Uber tracks “weekly focus time per engineer” — deep work has become a first-class measurement.
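Neither company's exact methodology is described, but one plausible way to compute a "day with sufficient focus time" is from calendar data: find the longest meeting-free gap in the workday and check it against a threshold. A sketch under those assumptions (the 9–5 workday and 2-hour threshold are invented):

```python
from datetime import time

def longest_free_block(meetings, day_start=time(9), day_end=time(17)):
    """Longest meeting-free gap (in minutes) in a workday.
    `meetings` is a list of (start, end) datetime.time pairs."""
    to_min = lambda t: t.hour * 60 + t.minute
    busy = sorted((to_min(s), to_min(e)) for s, e in meetings)
    longest, cursor = 0, to_min(day_start)
    for start, end in busy:
        longest = max(longest, start - cursor)  # gap before this meeting
        cursor = max(cursor, end)               # handle overlapping meetings
    return max(longest, to_min(day_end) - cursor)

def is_focus_day(meetings, threshold_min=120):
    """Count the day if it has at least one 2-hour uninterrupted block."""
    return longest_free_block(meetings) >= threshold_min

# A day with a 9:30 standup and a 13:00-14:00 sync:
day = [(time(9, 30), time(9, 45)), (time(13), time(14))]
print(longest_free_block(day), is_focus_day(day))  # 195 True
```

Aggregating `is_focus_day` per engineer per week yields a metric shaped like the ones Stripe and Uber report.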
Choosing Your Own Metrics
Don’t Start Here: Jumping straight to metrics before defining what you want to understand. Picking DORA or SPACE wholesale because they seem like an industry standard.
Start Here: Google’s Goals, Signals, Metrics (GSM) framework. Define goals first, identify signals that would indicate success, then work backwards to metrics.
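To make the backwards direction concrete, here is a hypothetical GSM worked example (the goal, signals, and metrics are invented; only the framework's structure comes from the source):

```python
# Goals -> Signals -> Metrics: pick metrics last, never first.
gsm = {
    "goal": "Engineers get fast feedback on code changes",
    "signals": [
        "Builds feel fast enough that engineers stay in flow",
        "CI results arrive before a reviewer opens the PR",
    ],
    "metrics": [
        "P50/P90 local build time",
        "Post-commit CI duration",
        "Survey: satisfaction with build speed",
    ],
}

# Each metric must trace back to a signal and ultimately the goal:
for metric in gsm["metrics"]:
    print(f"{metric} -> evidence for: {gsm['goal']}")
```

The discipline the framework enforces is the traceability: if a candidate metric can't be tied to a signal you already wrote down, it doesn't make the list.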
The Three Buckets for Engineering Leaders
If you’re a CTO, VPE, or Director reporting up to leadership, reframe the problem. What your CEO actually wants is confidence that engineering investment is well-stewarded. Organize metrics into three buckets:
- Business Impact — what are we building and why, is it on track
- System Performance — uptime, incidents, infrastructure health, user NPS
- Engineering Effectiveness — the speed/ease/quality metrics covered throughout this article
Together, these tell a complete story that non-technical stakeholders can follow.
The Bottom Line
There is no one-size-fits-all set of developer productivity metrics. Every company in this study measures at least 5–6 different things, blending qualitative and quantitative signals. Start with the problem you want to solve — frictionless shipping, developer retention, software quality — and work backwards. Choose metrics your team can actually control, pair objective data with subjective experience, and resist the temptation to collapse everything into a single score.