The Measurement Problem: AI Doesn’t Fit KPIs

Episode 21

Hi there,

AI has moved firmly into production. Across industries, it is embedded in workflows, augmenting teams and accelerating execution. Adoption is no longer the constraint. Investment is growing, use cases are expanding, and expectations are rising.

But one issue is becoming increasingly difficult to ignore:
organizations still cannot clearly measure what AI is actually delivering.

According to PwC, CEOs are prioritizing AI investments and expect them to drive efficiency and transformation. Yet only a limited share report tangible financial impact today, and many acknowledge that their organizations lack the capability to track and quantify that value properly. This is not a temporary lag between adoption and results. It reflects a deeper mismatch between how AI creates value and how organizations measure performance.

Inside the Issue

  • Why AI output doesn’t translate into measurable outcomes

  • The gap between perceived productivity and real business impact

  • Why traditional KPIs fail in probabilistic systems

  • What leading teams are starting to measure instead

Adoption Is Scaling Faster Than Measurement

Most organizations still rely on KPIs designed for deterministic systems, in which inputs, outputs, and outcomes are directly linked. More effort leads to more output, and more output leads to measurable value. AI breaks that chain. It introduces variability, context-dependence, and non-linear impact, which makes the connection between activity and outcome significantly harder to trace.

Data from McKinsey & Company continues to show that while AI adoption is widespread, only a smaller subset of companies reports meaningful financial returns, and even fewer are able to scale those returns across the organization. PwC reinforces this signal: executives expect AI to transform performance, but linking that transformation to clear financial metrics remains one of the biggest unresolved challenges.

This creates a structural gap. AI is being deployed into systems that were never designed to measure its effects.

The Productivity Illusion

One of the clearest manifestations of this gap is the divergence between perceived and measurable productivity. At the individual level, AI reduces friction. Tasks are completed faster, iteration cycles shorten, and teams report higher efficiency. Research from Microsoft shows that employees consistently feel more productive when using AI tools.

However, this improvement often fails to translate into system-level performance. Revenue, margins, and delivery timelines do not necessarily improve at the same rate. In its Strategic Predictions for 2026, Gartner highlights that many AI initiatives will struggle to demonstrate business value, not because the technology underperforms, but because organizations are unable to convert localized efficiency gains into measurable outcomes at scale.

The result is a misleading signal:
work feels faster, but the system is not necessarily performing better.

Why Traditional KPIs Break Down

The underlying issue is not execution; it is measurement design. Traditional KPIs assume stability, repeatability, and clear causality. AI systems operate differently. They are probabilistic, meaning that outputs vary depending on context, data quality, and usage patterns. Performance is not fixed, and quality is often subjective.

Research from Stanford HAI emphasizes that evaluating AI systems requires fundamentally different approaches, as outcomes cannot be consistently benchmarked in the same way as traditional software. The same system can deliver high-quality results in one context and fail in another, making aggregate metrics difficult to interpret.

This creates a silent failure mode for organizations. KPIs continue to report stable performance, but they do not capture where AI is actually creating or destroying value. As a result, decision-making becomes distorted: systems that appear effective may not be, while valuable improvements remain invisible.

What Gets Missed

A significant portion of AI’s impact does not show up in standard metrics at all. It manifests in how organizations operate rather than what they produce: faster decision cycles, reduced cognitive load, improved coordination between teams, and increased adaptability in complex workflows. These effects are real, but they are indirect, making them difficult to quantify using traditional frameworks.

PwC notes that while AI is expected to drive productivity and reshape business models, the inability to measure that impact clearly is becoming a growing concern at the executive level. This is reinforced by Gartner, which predicts that organizations that fail to adapt their measurement models will continue to invest in AI without achieving proportional returns.

In practice, this means that value is often present but not captured. The issue is not absence of impact, but lack of visibility.

Rethinking Measurement

Some organizations are beginning to adjust by shifting focus away from purely output-based KPIs toward system-level indicators. Instead of asking how much was produced, they are starting to examine how effectively the system operates with AI embedded in it — how quickly decisions are made, how reliably insights are acted upon, and how much friction is removed from critical workflows.

This shift is still early and far from standardized. However, it reflects a necessary transition. AI does not simply increase output; it changes how value is created. Measuring it requires moving closer to the dynamics of decision-making and execution, rather than relying solely on traditional performance metrics.

Closing

AI is not just challenging existing systems. It is challenging the assumptions behind how organizations define success.

The problem is no longer whether AI can deliver value.
It is whether that value can be correctly understood.

Because if the metrics are misaligned, the conclusions will be flawed, and organizations will continue to scale systems they do not fully understand.

Sources & Further Reading

Stanford HAI — AI Index Report
https://hai.stanford.edu/ai-index

Thank you for joining us for another edition of The Foundation.

If AI is already part of your workflows but its impact is still unclear, the issue is often not the model but how it's integrated into real systems.

We help teams design and deliver AI solutions that work in production.

P.S. We want to make sure this newsletter hits the mark. So reply to this email and let us know what you think.