Category: Markets

Where signals emerge — innovation, startups, and market shifts worth noticing.

  • Inference cost has collapsed. Enterprise AI business cases haven’t caught up.

    GPT-4 class inference cost $20 per million tokens at launch in early 2023. In April 2026, equivalent performance runs $0.40. Most enterprise AI business cases were built somewhere in the middle — and haven’t been updated since.

    That gap is not a technology story. It is an arithmetic problem wearing a strategy hat.

    What moved

    Inference costs have declined faster than the bandwidth price collapse of the early internet era, faster than PC compute, and considerably faster than any enterprise finance model anticipated. Artificial Analysis tracks it live: the cheapest capable models today run under $0.50 per million tokens. A flagship model that cost $10 per million tokens eighteen months ago now costs $2–3. The price range between the cheapest and most expensive capable options has widened past a thousand-to-one.

    The driver is compounding. Better training efficiency produced more capable models at lower operating cost. Competition between providers accelerated the pass-through. Specialised chips entered the stack. The result: a cost curve that looks less like traditional software pricing and more like solar panel economics — each year’s curve is below where last year’s curve said it would be.
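
    To make the slope concrete, here is a back-of-envelope calculation from the two price points quoted above. The prices and dates are the ones cited in this piece; the elapsed-time figure and the arithmetic are mine.

    ```python
    # Back-of-envelope: what the quoted price points imply as an annualised decline rate.
    # $20 per million tokens (early 2023) and $0.40 (April 2026) are the figures cited above;
    # the ~3.25-year window is an approximation.

    launch_price = 20.00    # $ per million tokens, GPT-4 class at launch
    current_price = 0.40    # $ per million tokens, equivalent performance today
    years = 3.25            # approximate elapsed time

    total_drop = 1 - current_price / launch_price                   # ~98% cheaper
    annual_decline = 1 - (current_price / launch_price) ** (1 / years)

    print(f"Total decline: {total_drop:.0%}")                       # 98%
    print(f"Implied annualised decline: {annual_decline:.0%}")      # ~70% per year
    ```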

    What did not move

    Enterprise AI business cases.

    S&P Global found that 42% of companies abandoned most of their AI projects in 2025. Cost and unclear value were the top reasons cited. IBM put the share of AI initiatives delivering expected ROI at 25%. MIT found that 95% of AI pilots delivered zero measurable P&L impact (MIT NANDA, State of AI in Business, 2025).

    These numbers are real. But the interpretation of why projects fail is often imprecise.

    Projects approved in 2023 and 2024 were scoped against the pricing environment of 2023 and 2024. The cost models that informed the go/no-go decisions used token prices that no longer exist. The ROI denominators were anchored to infrastructure assumptions from a period when GPT-4 access cost $10–20 per million tokens. The business cases that were rejected on cost grounds — the ones that landed below the internal ROI hurdle by a thin margin — were rejected against a cost basis that is now a fraction of what it was.

    That is not a technology failure. It is a modeling lag.

    Andreas’s view

    My read on this: there are two different things getting conflated in the ROI conversation. One is genuinely poor outcomes — wrong use case, shallow integration, insufficient change management. That is real and deserves scrutiny. The other is a systematic understatement of AI’s economic potential because the cost assumptions in the business case never got refreshed. Those two phenomena look identical in the data.

    I don’t think the 42% abandonment rate or the 25% ROI hit rate tells us much about what AI can do at today’s prices. It tells us how enterprises perform against business cases built on 2023 assumptions. The projects that got killed for cost reasons in Q4 2024 would look different rerun against Q2 2026 pricing.

    My expectation is that the organisations getting ahead of this are running a specific exercise that most are not: taking the cost assumptions out of every AI initiative that was rejected or stalled in 2023–2025, replacing them with current market rates, and seeing which cases cross the ROI threshold now. Not all of them will. But some will — and the decision to revisit them is a spreadsheet exercise, not a technology project.
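
    For a sense of what that exercise looks like in miniature, a sketch with placeholder figures of my own (not any real initiative's numbers): the only input that changes between the two runs is the token price.

    ```python
    # Sketch of the remodel exercise: re-run a stalled initiative's ROI with current token
    # prices instead of the ones embedded in the original business case.
    # All figures are illustrative placeholders.

    def annual_inference_cost(millions_of_tokens_per_month: float, price_per_million: float) -> float:
        return millions_of_tokens_per_month * price_per_million * 12

    def roi(annual_benefit: float, annual_cost: float) -> float:
        return (annual_benefit - annual_cost) / annual_cost

    annual_benefit = 400_000       # expected annual value, unchanged between runs
    other_annual_cost = 250_000    # integration, licences, support (non-inference)
    monthly_volume = 2_000         # millions of tokens per month

    for label, price in [("2024 cost basis", 10.00), ("current market rate", 0.50)]:
        total_cost = other_annual_cost + annual_inference_cost(monthly_volume, price)
        print(f"{label}: ROI = {roi(annual_benefit, total_cost):.0%}")
    # 2024 cost basis: ROI = -18%   -> below any hurdle rate
    # current market rate: ROI = 53% -> potentially worth revisiting
    ```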

    Three things I’m watching:

    • Whether finance teams are treating inference cost as a stable input or a variable. Most enterprise budget models treat infrastructure cost as a constant. Inference cost is not a constant — it has been declining faster than almost any other enterprise input cost in the last three years.
    • The spread between unit cost and total spend. Per-token costs have collapsed, but total enterprise AI spend is forecast to jump 65% in 2026 — from roughly $7M average to over $11M (IDC). Volume is expanding faster than unit costs are falling; a directional decomposition is sketched after this list. The budget impact of AI is still growing, even as the underlying unit economics are dramatically more favourable than they were.
    • How capital allocation committees handle the remodel request. The institutional question: if a CFO approved a 2023 AI business case that underperformed, how does the organisation handle finance coming back and saying “the cost structure changed — the case should have worked, we just used the wrong numbers”? That conversation is coming.
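
    The decomposition referenced in the second bullet, using the IDC spend figures quoted above and a price-decline assumption of my own. Total spend is treated as if it were all inference, which it is not, so read it as directional only.

    ```python
    # Directional decomposition: if total spend rises ~65% while unit prices fall,
    # the implied volume growth is the ratio of the two. Spend figures are the IDC
    # numbers quoted above; the price-decline assumption is mine.

    spend_2025 = 7.0       # $M average enterprise AI spend
    spend_2026 = 11.5      # $M forecast (the ~65% jump quoted above)
    price_ratio = 0.5      # assumption: unit inference price halves over the same period

    spend_growth = spend_2026 / spend_2025                # ~1.64x
    implied_volume_growth = spend_growth / price_ratio    # ~3.3x

    print(f"Spend growth: {spend_growth:.2f}x")
    print(f"Implied volume growth under the price assumption: {implied_volume_growth:.1f}x")
    ```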

    What this reveals

    The collapse in inference cost is well-understood in developer circles. Engineers who run inference workloads reset their unit economics continuously — it is operational reality. The delay is in the enterprise business case layer, where cost assumptions travel up through approval chains, get embedded in multi-year plans, and calcify.

    The cost curve does not care about the approval cycle. It moved while the slide decks were in review.

    This is not an argument that all AI investments look better at current pricing — some of those failed pilots would have failed regardless, and the organisational conditions for AI success (clear scope, embedded workflows, meaningful accountability) have not gotten easier. But a non-trivial fraction of the projects that stalled on cost now live in territory where the math is different. Identifying them is a shorter path to AI ROI than starting new initiatives from scratch.

  • Vertical AI is winning the deployment race

    Gartner’s April read says 80% of enterprises will have adopted at least one vertical AI agent by year-end, and 30% of all enterprise AI deployments will be vertical-specific. Bessemer’s vertical AI report from this month is even more direct: vertical AI companies founded after 2019 are reaching 80% of traditional SaaS contract values while growing 400% year-over-year. This is not a minor adjustment to the deployment landscape. It is a structural redirection of where the value of agentic AI accrues.

    For boards in 2026, the implication is that the right framework for thinking about AI vendor strategy is no longer horizontal-versus-vertical. It is which verticals you bet on, and how early. Deployment speed defines advantage in this cycle, and the deployment race is now a vertical-by-vertical race.

    The shift: vertical specialization beats horizontal generality at the workflow layer

    Horizontal AI tools — the chat assistants, the general-purpose copilots, the broad productivity overlays — are still the largest category by usage. They are not the largest category by enterprise value. The reason is structural. A horizontal copilot is good at fifty things. A vertical agent is excellent at five things that are deeply embedded in a specific workflow.

    When the enterprise needs to extract value, depth wins over breadth. Abridge in clinical documentation. Harvey and EvenUp in legal. Hebbia in financial research. Specialized clinical-coding agents at major payers. The vertical players ship integrations into existing systems, understand the regulatory and accuracy constraints of the domain, and deliver outcomes that horizontal tools cannot match without significant configuration effort that customers refuse to undertake.

    The defensibility of vertical players is also higher than the market priced in 2024. The data flywheel inside a regulated vertical is genuinely hard to replicate. The customer relationships are stickier because switching costs include re-credentialing within the regulator’s expectations, not just re-implementing software.

    The role change: the chief AI buyer becomes a portfolio manager

    Inside enterprises, the executive responsible for AI vendor strategy is increasingly running a portfolio of vertical specialists alongside the foundation-model contracts. The horizontal tools form a substrate. The vertical agents form the high-value layer. The portfolio manager has to balance ROI realization against integration overhead, and has to decide which verticals to deepen versus which to defer.

    The skill set for this role is closer to portfolio investment management than to traditional procurement or IT leadership. The portfolio manager has to read product roadmaps, anticipate vendor consolidation, manage concentration risk, and time entry into emerging verticals where category leaders have not yet emerged. None of this is in the standard procurement or CIO playbook.

    Most large enterprises have not formally structured this role yet. The work is happening inside the CIO function or inside individual line-of-business AI initiatives, with no portfolio-level coordination. The result is double-procurement of overlapping vertical capability and missed early-mover advantage in verticals where the category leader will not stay reasonably priced for long.

    The strategic consequence reshapes acquisition strategy

    For enterprises in regulated industries — banks, insurers, hospital systems, large law firms, accounting firms — the vertical-AI thesis has a direct M&A implication. The category leaders in each vertical are trading at premium multiples now and will trade at higher multiples by 2027 once their data flywheels and customer concentrations are visible in audited financials. The window for acquisition at reasonable multiples is open in 2026 for most verticals. It will close.

    For incumbents who do not acquire, the implication is partnership at scale. The vertical specialists need distribution that incumbents already have. The incumbents need capability that the specialists already have. The deal terms will tilt toward the specialists as their growth rates remain visible. Incumbents that delay partnership decisions to 2027 will pay more for less favorable terms.

    For boards governing AI strategy, the directive question is whether the company is buying or building or partnering for vertical AI capability — and whether that decision is being made deliberately for each vertical, or by default by the absence of a decision. Default-by-absence is the mode most large enterprises are operating in. It is the most expensive mode.

    So what boards should do this quarter

    Map the AI vendor portfolio with horizontal versus vertical breakdown. If the breakdown is more than two-thirds horizontal, the company is missing the value-creating layer. If it is unmapped, that is a more urgent finding.

    Designate an executive owner for vertical AI portfolio strategy with explicit authority across line-of-business silos. The decisions are too consequential to be made silo by silo. The horizontal-tool decisions can stay with the CIO. The vertical-agent decisions need a portfolio view.

    For each major vertical relevant to the business, assign a clear posture: acquire, partner, build, or wait. Defaulting to wait by not deciding is the same as deciding to wait — and in most verticals it is the wrong decision in 2026. Execution speed will separate leaders from followers in this cycle.

  • What 47 unicorns in one quarter actually means

    What was announced

    In Q1 2026, 47 startups crossed the billion-dollar valuation threshold for the first time — the largest single-quarter cohort in over three years. The growth is concentrated at the seed and early-stage end. Global venture funding hit roughly $300 billion in the quarter, of which 80% — about $242 billion — flowed to AI companies. Four companies (OpenAI, Anthropic, xAI, Waymo) absorbed 65% of all capital deployed.

    What it means

    Two things become visible at the same time. First, the market is willing to underwrite billion-dollar valuations earlier in the company lifecycle than at any point since the late-2020 boom. The valuation framework is no longer derived from realized revenue. It is derived from deployed compute and team density. Second, capital concentration at the top has reached a level where four companies define the cost of capital for everyone else. A new AI startup raising in 2026 is competing for the same dollars that just priced OpenAI at $122 billion.

    The early-stage explosion and the late-stage concentration are two symptoms of the same conviction: capital has decided that AI is a winner-take-most market, and it is funding accordingly.

    Andreas’s view

    My read on this: the unicorn count is a lagging indicator of a much earlier decision. That decision was made — quietly, by capital allocators — when the consensus shifted to a single conviction: AI capability gaps will widen, not narrow, over the next decade. From that conviction two strategies follow logically: fund the few names that might dominate the frontier (concentration), and over-fund the early stage so that whatever the next breakthrough looks like, you own a piece of it (proliferation). The 47 new unicorns are the proliferation half.

    I don’t think this is a bubble in the conventional sense. A bubble is a price disconnect from fundamentals. What we’re seeing is a price connection to a forecast about fundamentals. If the forecast is right — capability gaps widen, AI returns accrue disproportionately to a few players — today’s valuations are conservative. If it’s wrong, half of these unicorns will not survive their next priced round.

    What I’d say to boards and CFOs reading these numbers: don’t take comfort from “the market is hot.” Take instruction. Capital is signaling where it expects the next moat to form. The companies absorbing the capital are absorbing optionality, not just dollars.

    Recommendation

    Three things for leaders watching this market:

    1. Treat unicorn-count reports as competitive intelligence, not social proof. Look at which companies are crossing the threshold and what they are building — that is the signal of where the market expects gaps to open.
    2. Reassess your own compute and talent allocation against the new benchmark. If AI startups can attract billion-dollar valuations on team and compute alone, your incumbent organization is competing for the same talent at a different cost basis.
    3. Stress-test your strategic plan against a scenario where capability concentration plays out. What does your business look like if three or four frontier labs control the compute infrastructure and all serious AI deployment runs through them?

  • MCP became infrastructure and Apple decided to rent cognition

    What was announced

    Two announcements in the week of March 2–8, 2026 redrew the agent landscape. Anthropic’s Model Context Protocol crossed 97 million installs, with every major AI provider now shipping MCP-compatible tooling — moving the protocol from experiment to default infrastructure for tool-calling agents. Apple confirmed that the redesigned, AI-powered Siri targeted for release alongside iOS 26.4 will be powered by Google’s Gemini model running on Apple’s Private Cloud Compute. In parallel, Anthropic rolled out memory features to all Claude users and deployed Opus 4.6 as an add-in inside Microsoft PowerPoint and Excel.

    What it means

    The MCP install count makes the connectivity layer between agents and tools a solved problem at the standards level. That is a meaningful shift. For two years, the friction in shipping agents was that every tool integration was bespoke; the integration debt scaled linearly with the number of tools and the number of agents. With MCP at default-infrastructure scale, the integration cost is closer to fixed than linear, and the bottleneck moves from connectivity to orchestration and governance.
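
    One way to see the linear-versus-fixed claim: with bespoke integrations, every agent-tool pair needs its own adapter, so the work scales roughly with N x M; with a shared protocol, each agent and each tool implements the protocol once, roughly N + M. The toy comparison below is my illustration of that scaling, not a statement about any particular MCP deployment.

    ```python
    # Toy comparison of integration effort: bespoke adapters scale with agent-tool pairs,
    # a shared protocol scales with the number of endpoints that implement it.

    def bespoke(agents: int, tools: int) -> int:
        return agents * tools      # one custom adapter per agent-tool pair

    def shared_protocol(agents: int, tools: int) -> int:
        return agents + tools      # each agent and each tool implements the protocol once

    for agents, tools in [(3, 10), (10, 50), (25, 200)]:
        print(f"{agents} agents x {tools} tools: "
              f"bespoke={bespoke(agents, tools)}, protocol={shared_protocol(agents, tools)}")
    ```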

    Apple’s decision to rent cognition from Google for Siri is the more strategically loaded story. It signals that even the most vertically integrated consumer-tech company in the world has concluded that building competitive frontier-model capability inside the company is not the right capital allocation. The Private Cloud Compute envelope handles the data-sovereignty argument. The Gemini choice handles the capability argument. The combination is an explicit acknowledgment that frontier-model capability has consolidated at a tier of providers most companies will rent from, not build alongside.

    Andreas’s view

    My read on this: the agent stack is settling into a recognizable shape. Standards layer (MCP, becoming generic). Frontier-model layer (a small number of providers — OpenAI, Anthropic, Google, with regional players underneath). Application layer (where most enterprise value is created). The interesting strategic action for the next 24 months is in the application layer, where the questions are which workflows to embed, which data to expose, and which orchestration logic to own.

    I don’t think Apple’s choice is anomalous. It is the start of a wave. Companies that have been building internal frontier-model capabilities will increasingly find that the math does not work — the capex is consumer-internet scale, the talent is concentrated at three or four employers, and the capability gap to “good enough internal model” widens every six months. The economically rational answer for almost everyone is: rent the cognition, own the integration and the data envelope around it. Apple has now made that a defensible board-level position.

    The way I see it: the most important architectural question right now is whether the cognition layer (rented, frontier-model, expensive but improving exponentially) is clearly distinguished from the integration layer (owned, workflow-specific, where the moat actually lives). Where those layers are blurred, I’d expect companies to find themselves overpaying on one side and under-investing on the other. The Apple-Google deal is the clean reference architecture for how that separation can look.
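
    For concreteness, a minimal sketch of what that separation can look like, with hypothetical names throughout. It illustrates the pattern of treating the cognition layer as a swappable dependency behind an owned integration layer; it is not any vendor’s actual API.

    ```python
    # Sketch of the cognition/integration split: the rented model is a swappable dependency,
    # the workflow logic and data envelope are owned. Names and signatures are hypothetical.

    import re
    from typing import Protocol

    class CognitionProvider(Protocol):
        """Rented frontier-model capability, replaceable per contract cycle."""
        def complete(self, prompt: str) -> str: ...

    class RentedFrontierModel:
        """Stand-in for whichever provider is under contract this cycle."""
        def complete(self, prompt: str) -> str:
            return "urgent"   # placeholder response

    class ClaimsTriageWorkflow:
        """Owned integration layer: pre-processing, redaction, and routing live here."""
        def __init__(self, cognition: CognitionProvider):
            self.cognition = cognition

        def triage(self, claim_text: str) -> str:
            # The data envelope stays in-house: strip SSN-like patterns before anything leaves.
            redacted = re.sub(r"\d{3}-\d{2}-\d{4}", "[redacted]", claim_text)
            return self.cognition.complete(f"Classify the urgency of this claim:\n{redacted}")

    workflow = ClaimsTriageWorkflow(RentedFrontierModel())
    print(workflow.triage("SSN: 000-00-0000. Water damage, ceiling collapsed."))
    ```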

    Three things I’m watching

    1. I’ll be watching whether companies architect the cognition layer and the integration layer separately — treating frontier-model providers as utilities while building proprietary infrastructure around workflow integration and the data envelope.
    2. The companies that preserve optionality will be the ones that default to MCP-compatible tooling for new agent integrations. The standards layer is no longer a strategic differentiator — the question is how quickly organizations stop treating it as one.
    3. I’ll be watching how internal frontier-model build efforts hold up against the Apple-Gemini reference case. Where differentiation rests on owning the model, I’m interested to see whether those bets come with a credible 36-month capex and capability projection — and what happens when they don’t.

    References and related signals

    • Crescendo AI: latest AI news and developments
    • Related signal: Anthropic’s Opus 4.6 PowerPoint and Excel integrations move frontier-model capability deeper into the enterprise default tooling, accelerating the rented-cognition pattern.
    • Related signal: NVIDIA GTC 2026 (March) emphasized agentic frameworks and Fortune 500 production deployments — the application layer is where the next wave of enterprise AI value is being created.
    • Related signal: 95% of generative AI pilots still fail to reach production. The connectivity layer being solved does not solve the operating-model layer.
    • Related signal: Apple choosing Gemini over OpenAI for Siri changes the competitive math for every enterprise still scoping a frontier-model partnership.
  • Humanoids crossed from demo to deployment in one week

    What was announced

    At CES 2026 in Las Vegas (Jan 5–9), a cluster of robotics announcements crossed the same threshold in a single week. Boston Dynamics unveiled the production-ready electric Atlas with Hyundai committing the first fleet to its Metaplant in Savannah, Georgia, and announced a partnership with Google DeepMind to integrate Gemini Robotics models into the platform. LG demonstrated CLOiD performing real household work — laundry, dishwasher loading, food preparation — in a staged living environment. EngineAI introduced the T800 with a $25,000 starting price and mid-2026 shipping. CES listed 40 companies referencing humanoids on the show floor.

    What it means

    For three years humanoids were a category of demo videos. CES 2026 is where the category became a category of contracts. Production is committed, factories are named, prices are listed, and the foundation-model layer (Gemini Robotics, comparable initiatives at other labs) supplies the cognitive component that previously made every demo brittle. The constraint is no longer “can it walk on stage.” The constraint is “what does the deployment workflow look like, and who owns the integration.”

    From this follows a second-order effect: industrial buyers now have a real procurement question to answer in 2026 — not in 2030. Hyundai’s timeline (Atlas at Metaplant, dedicated robotics factory targeting 30,000 units per year by 2028) is the explicit benchmark. Every competing automaker, every large logistics operator, and every contract manufacturer now sits with a known reference deployment to react to.

    Andreas’s view

    My read on this: the news is not that the robots are good enough. The news is that buyers have decided they are good enough to commit — and the price has moved into range. At $25,000, a humanoid sits below the annual cost of an industrial worker in most developed markets. That shifts the question from “is this technology real” to “where does it amortize fastest.”
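
    For a sense of where that amortization math lands, a rough sketch using the $25,000 price point quoted above. The labor-cost, coverage, and maintenance figures are assumptions of mine, not numbers from any of the announcements.

    ```python
    # Rough payback sketch for the quoted $25,000 price point. Labor cost, coverage,
    # and maintenance are illustrative assumptions, not vendor or analyst figures.

    unit_price = 25_000            # $, starting price quoted above
    annual_labor_cost = 45_000     # $, assumed fully loaded cost of one displaced role
    annual_maintenance = 5_000     # $, assumed service, parts, and power
    effective_coverage = 0.6       # assumed fraction of a human shift actually covered

    annual_net_saving = annual_labor_cost * effective_coverage - annual_maintenance
    payback_years = unit_price / annual_net_saving

    print(f"Annual net saving: ${annual_net_saving:,.0f}")    # $22,000
    print(f"Payback period: {payback_years:.1f} years")       # ~1.1 years
    ```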

    My three takeaways:

    1. The barrier that fell was cognitive, not mechanical. The hardware has been close to ready for years. What changed is that foundation models — think Atlas plus Gemini Robotics — absorbed the cognitive deficit that kept robots out of unstructured environments. CES 2026 looks different because the system is different, not just the chassis. I think anyone framing this as “better robots” is underestimating the speed of what comes next.

    2. The 2030 humanoid timeline is already stale. In my view, this is now a 2026 pilot conversation for any organization with manufacturing, warehousing, or fulfillment in its operations footprint — anywhere unit-level labor is the dominant cost driver. Not as a capex bet, but as a learning investment. The compounding advantage goes to whoever builds operational muscle around these systems first.

    3. The real cost of waiting isn’t hardware — it’s the operating model. Hardware will be available to everyone. What won’t be available off the shelf is three years of deployment experience. My expectation is that late movers won’t just be buying machines from competitors — they’ll be importing the playbook for how to use them.
