If you ask a vendor for agentic AI ROI data, you’ll get forecasts. If you ask an analyst, you’ll get ranges. If you ask a CFO who’s already paid for a stalled pilot, you’ll get a raised eyebrow.
There’s a reason the eyebrow goes up. RAND’s 2025 study put the AI-project value-failure rate at 80.3%. Beam.ai’s analysis found that 42% of enterprise AI projects show zero ROI, and another 18.1% deliver value but cannot justify their cost.
Put differently: more than 60% of enterprise AI projects currently return less than they cost. That’s the base rate your business case is fighting.
This piece is for finance and strategy readers. What actually pays back. What realistically doesn’t. How I built the 300% ROI outcome on a real P&L, and how you should frame your own.
What the 300% number actually means
Let me define what “300% ROI” means at the Oman conglomerate where we ran the AI Factory from 2021 to 2025. It’s not a forecast. It’s a retrospective calculation agreed between the Group CIO office and Group Finance.
Inputs to the numerator (value):
- Fully-loaded hours saved per role, measured by SOP analysis before and after automation, audited against time-tracking and payroll data.
- Error-reduction value (re-work hours avoided, compliance penalties avoided, customer-refund delta).
- Revenue impact where an agent directly accelerated a sales or collections cycle, booked to the relevant functional P&L.
- Inventory-carrying and working-capital impact where supply-chain agents improved stock-turn.
Inputs to the denominator (cost):
- Build cost: AI team headcount (partially allocated), platform subscriptions, model API spend, one-off integration cost.
- Run cost: ongoing model API spend, infrastructure, MCP server operation, human oversight hours, governance overhead.
- Opportunity cost: what the same headcount would have delivered on non-AI work.
Time horizon: three fiscal years. Earlier than that and you’re front-loading benefits against capex; later, and model-layer drift starts distorting the run-cost line.
The 300% figure is (value / cost) × 100 − 100, measured across those three years. It is not a pre-deployment projection; it is a post-deployment measurement.
35% operational cost reduction is the other headline. That’s a function-weighted average across the 10 departments where agents were deployed, measured against the pre-AI operating cost baseline of those functions. Some functions (finance, shared services) hit 50%+. Others (CX, marketing) hit 15–20%. 35% is the blended number.
These are real numbers from a real enterprise. They are also upper-quartile. The median agentic deployment will not hit 300%. More on that below.
What actually pays back
Four categories, in order of return predictability.
1. Direct labour substitution (highest, most predictable)
Agents that replace well-defined, high-volume, low-variance human work. Typical targets:
- Invoice reconciliation and three-way matching.
- Bank and sub-ledger reconciliation.
- Standardised customer queries (tier-1 service).
- Routine procurement tasks (PO creation, vendor onboarding, contract extraction).
- HR ticket triage and basic policy Q&A.
Typical payback: 6–12 months. ROI range: 180–400% over three years.
These are the easiest wins and should be the first cohort. Not because they’re glamorous, but because the unit economics are legible. Hours saved per run × runs per year × fully-loaded cost. Minus agent build and run cost. You can defend the number at board level in two slides.
2. Exception handling and quality uplift (strong, but longer payback)
Agents that catch errors, route exceptions, or pre-screen human decisions. The value is avoided re-work and avoided downstream cost, not eliminated headcount.
- Anomaly detection in financial close.
- Contract compliance checks.
- Insurance-claims triage.
- Regulatory-submission pre-review.
Typical payback: 9–18 months. ROI range: 150–250% over three years.
These are harder to build a business case for because “errors avoided” is a counterfactual. The discipline is to measure before you deploy: what’s the current error rate, what’s the cost per error? Then measure the delta. If you don’t baseline the pre-state, you can’t claim the value.
3. Revenue acceleration (variable, but potentially enormous)
Agents that accelerate revenue: lead qualification, outbound cadence management, cross-sell recommendation, collections prioritisation.
- Lead-to-opportunity velocity improvements.
- Collections days-sales-outstanding reduction.
- Upsell/cross-sell attach rate increase.
Typical payback: 6–24 months (high variance). ROI range: 100–600% (very high variance).
These require hard attribution discipline. A/B control groups or matched-pairs analysis. Revenue outcomes are sensitive to too many other variables to claim the lift without controlled measurement. When it works, it’s the category with the largest absolute dollar impact. When it doesn’t, it’s the category where “it feels like it’s helping” survives longest past the point where the agent should be retired.
4. Capability leverage (indirect, hardest to claim)
Agents that unlock work the business wasn’t doing at all: structured analysis on unstructured data, compliance monitoring at scale, market-sensing, internal knowledge retrieval.
Typical payback: 12–36 months, often claimed as “new capability” rather than traditional ROI. ROI range: effectively unboundable, which is why auditors push back on it.
This is real value but hard to claim in a business case. I’d treat it as a bonus tier: don’t lead a CFO proposal with it; lead with categories 1 and 2 and let category 4 accumulate.
The unit economics of an agent
A concrete model. Assumptions typical of a mid-market Australian enterprise deploying its first five agents through a managed-service partner.
Single agent, direct labour substitution case:
- Process volume: 24,000 runs/year (~100/business day)
- Hours saved per run: 0.25 (15 min)
- Fully-loaded labour cost: A$85/hour
- Gross annual value: 24,000 × 0.25 × 85 = A$510,000
Costs:
- Build cost: A$35,000 one-off (discovery, integration, evaluation harness, deployment)
- Run cost: A$18,000/year (model API, platform share, observability, ~10% human-oversight hours in first year dropping to 4% thereafter)
- Governance overhead: ~A$6,000/year (periodic review, evaluation runs, audit contribution)
Year 1 net: A$510k − (A$35k + A$18k + A$6k) = A$451k
Year 2+ net: A$510k − (A$18k + A$6k) = A$486k
3-year cumulative net value: A$451k + A$486k + A$486k ≈ A$1.42M
3-year cumulative cost: A$35k + (A$24k × 3) = A$107k
3-year ROI: (A$1.42M / A$107k) × 100 ≈ 1,330%
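The worked example above fits in a few lines. A minimal sketch, using only the assumptions already stated (volume, hours saved, rates, costs); the variable names are mine, not a pricing model:

```python
# Single direct-labour-substitution agent, figures in A$.
RUNS_PER_YEAR = 24_000
HOURS_SAVED_PER_RUN = 0.25            # 15 minutes
LOADED_RATE = 85                      # fully-loaded A$/hour
BUILD_COST = 35_000                   # one-off
ANNUAL_RUN_COST = 18_000 + 6_000      # run cost + governance overhead

# Gross annual value: hours saved per run x runs per year x loaded cost.
gross_value = RUNS_PER_YEAR * HOURS_SAVED_PER_RUN * LOADED_RATE

year1_net = gross_value - (BUILD_COST + ANNUAL_RUN_COST)
year2_net = gross_value - ANNUAL_RUN_COST        # build cost is one-off

cumulative_net = year1_net + 2 * year2_net       # three fiscal years
cumulative_cost = BUILD_COST + 3 * ANNUAL_RUN_COST

# Net value / cost x 100 is the same as (gross / cost) x 100 - 100.
roi_pct = cumulative_net / cumulative_cost * 100
print(f"3-year ROI: {roi_pct:.0f}%")             # ≈ 1330%
```

Swapping any assumption (volume, rate, run cost) and re-running is the whole point: the business case should be this legible.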
That number is going to look suspicious. Three reasons it’s real:
- The build cost is genuinely one-off. Production agents on a platform that’s already stood up cost a fraction of the first one.
- The run cost assumes standard small-to-mid-model usage, not frontier-model-only runs. Right-sizing the model tier is 70% of the cost optimisation.
- The volume is realistic for a common mid-market back-office process. If the volume is 10× lower, the ROI drops to roughly 40% over three years (holding the cost lines flat), and the agent is borderline.
That sensitivity is the honest part. The unit economics collapse at low volume: below ~2,000 runs/year for a 15-minute task, most agents lose money net of build cost. A scored process inventory exists to keep you from building those agents.
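That break-even claim is checkable. A sketch that sweeps volume under the worked example’s cost structure, holding build and run cost flat (in practice API spend falls somewhat with volume, so treat the low-volume figures as conservative):

```python
def three_year_roi(runs_per_year: int,
                   hours_per_run: float = 0.25,
                   rate: float = 85.0,
                   build: float = 35_000,
                   annual_run: float = 24_000) -> float:
    """3-year ROI % for a labour-substitution agent, costs held flat."""
    gross = runs_per_year * hours_per_run * rate * 3
    cost = build + annual_run * 3
    return (gross - cost) / cost * 100

# Sweep from the worked-example volume down to the collapse point.
for volume in (24_000, 2_400, 2_000, 1_600):
    print(f"{volume:>6} runs/yr -> {three_year_roi(volume):6.0f}% ROI")
```

The sweep shows the cliff: ROI goes negative a little below 2,000 runs/year for a 15-minute task, which is exactly why low-volume processes get screened out of the build queue.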
What kills the ROI in practice
Five patterns I’ve watched destroy otherwise-sensible business cases.
1. Wrong-model default
Teams default to the most capable frontier model for everything. That’s ~10–30× the run cost of a small model that would have done the job. The fix: model tiering by task complexity, measured, revisited quarterly as the open-source models close in.
2. No evaluation harness
Without an eval harness, you can’t swap models confidently. So you stay on the expensive one. Eval infrastructure pays for itself in the first model-swap cycle.
3. Under-priced human oversight
Plans assume 2% human oversight. Reality is 15% in year one, dropping to 5% by year two, 2–3% from year three. Pricing the human-in-the-loop cost honestly in years one and two is what separates defensible business cases from surprises.
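The size of that gap is easy to put a number on. A sketch with illustrative assumptions (a review rate expressed as a fraction of runs, a hypothetical 6-minute human check per reviewed run, the worked example’s volume and rate):

```python
RUNS_PER_YEAR = 24_000     # worked-example volume
REVIEW_MINUTES = 6         # assumed human check per reviewed run
LOADED_RATE = 85           # fully-loaded A$/hour

def oversight_cost(review_fraction: float) -> float:
    """Annual A$ cost of human-in-the-loop review at a given sample rate."""
    return RUNS_PER_YEAR * review_fraction * (REVIEW_MINUTES / 60) * LOADED_RATE

planned = oversight_cost(0.02)   # the optimistic plan: 2%
actual = oversight_cost(0.15)    # typical year-one reality: 15%
print(f"planned A${planned:,.0f}/yr vs actual A${actual:,.0f}/yr")
```

A 7.5× gap on a single cost line. Multiply it across five agents and the surprise is material; price it at 15% for year one and the case survives contact with reality.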
4. Build-run imbalance
Organisations budget for the build, not the ongoing operations. They then cut the ops line, governance degrades, an incident costs more than the annual savings, and the programme gets pulled. The healthy ratio is roughly 40% build / 60% run over a three-year horizon.
5. Over-attribution to AI
The revenue-acceleration category is especially prone to this. “Sales is up 12%, we deployed an AI agent, therefore AI = 12% lift.” No. Controlled measurement or the number is fiction. CFOs who’ve been burned once by over-attribution never trust the next business case.
How to write a defensible business case
Five principles.
- Categorise every claim. Direct labour, exception handling, revenue acceleration, capability leverage. Each with its own confidence level.
- Baseline the pre-state. Current cost, current error rate, current volume. No baseline, no claim.
- Price the run cost honestly. Include human oversight, evaluation runs, governance, model drift budget.
- Use three-year cumulative, not year-one. Year one is skewed by build cost; year three reflects operating reality.
- Publish the sensitivities. Volume down 50%, model costs up 2×, oversight stays at 10%. If the case survives plausible stress, it’s defensible.
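Those three stresses can be applied to the worked example in one pass. A sketch, not a planning tool; oversight is modelled here as a fraction of the hours the agent handles, billed at the loaded rate, which is an assumption of mine:

```python
def three_year_roi(runs, hours=0.25, rate=85,
                   build=35_000, model_run=18_000, governance=6_000,
                   oversight_fraction=0.0):
    """3-year ROI % with an explicit human-oversight cost line."""
    gross = runs * hours * rate * 3
    oversight = runs * hours * rate * oversight_fraction * 3  # assumption
    cost = build + (model_run + governance) * 3 + oversight
    return (gross - cost) / cost * 100

base = three_year_roi(24_000)
stressed = three_year_roi(
    24_000 // 2,              # volume down 50%
    model_run=18_000 * 2,     # model costs up 2x
    oversight_fraction=0.10,  # oversight stays at 10%
)
print(f"base {base:.0f}%, stressed {stressed:.0f}%")
```

Under these assumptions the stressed case still clears 200% over three years: the kind of number that survives an audit conversation, which is the test that matters.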
If you build the business case this way, your own CFO becomes an advocate. If you build it the vendor way (pre-loaded benefits, under-estimated run cost, no baseline, no sensitivities), you’ll be in the 60% that fails.
The blunt version
Agentic AI has real, large, measurable ROI in the categories it was designed for. The 300% outcome I’ve written about is not unique; plenty of well-run programmes hit it. But the base rate is what RAND and Beam say: most projects return less than they cost, because most projects skip the unglamorous inputs (process inventory, scored backlog, honest unit economics, evaluation, governance) that generate the upside.
If your AI business case would survive an hour with your own internal audit team, it’s probably real. If it wouldn’t, it’s an optimism document.
If you’d like an operator’s read on your AI business case before it goes to the board, book a discovery call. Or see the full 300% ROI case study.