
Sam Altman Says OpenAI Is Now an AI Inference Company — What That Shift Means for Multi-Cloud Buyers

Sam Altman's 'inference company' framing signals OpenAI's intent to be everywhere. Here's what multi-cloud AI procurement looks like after the Microsoft deal.

MindStudio Team

Sam Altman Called It: OpenAI Is Now an Inference Company

Sam Altman said something in April that most people glossed over. “We have become an AI inference company now.” Not a research lab. Not a product company. An inference company — meaning the business is moving tokens to wherever compute exists.

That framing matters more than the deal terms that surrounded it. On April 27, OpenAI and Microsoft restructured their partnership to end Microsoft’s exclusive cloud arrangement. The next day — not the next week, the next day — OpenAI announced GPT-5.4 as a limited preview on AWS Bedrock, with GPT-5.5 coming within weeks and Codex available through AWS infrastructure. The speed of that move tells you the announcement was ready before the ink dried. This wasn’t opportunism. It was a strategy that had been waiting for the legal clearance to execute.

You should read that sequence carefully if you’re making decisions about where to run AI workloads.

The Deal Nobody Fully Understood

The Microsoft-OpenAI restructure generated a lot of coverage, most of it focused on who “won.” That framing misses the point.


Here’s what actually changed. Microsoft retains a non-exclusive license to OpenAI’s IP and models through 2032. Models still get released on Azure first, though the exclusivity window isn’t specified. Microsoft remains a 27% shareholder. OpenAI continues paying Microsoft a 20% revenue share through 2030 — a rate that, under the old terms, had been scheduled to step down to 8% by 2030. In exchange, Microsoft no longer pays OpenAI a revenue share for serving its models.

The AGI clause was removed entirely. This is underreported. The original deal had a provision where the entire arrangement dissolved if OpenAI declared it had achieved AGI — with no agreed definition of what AGI meant. That gave OpenAI’s board enormous unilateral power over Microsoft’s investment. After the Sam Altman board drama in 2023, that clause looked less like a quirky philosophical hedge and more like a loaded gun pointed at a $13 billion position. Removing it was probably worth more to Microsoft than the revenue share math.

The financial logic for Microsoft is actually strong. If OpenAI continues its growth trajectory, locking in 20% instead of watching it drop to 8% could represent tens of billions in additional revenue. The Information initially reported the restructure as a Microsoft win, then revised its framing to “win-win” after more sourcing — the reading that makes the most sense. Both sides avoided a protracted legal battle, and both got something they needed.

But the strategic story is OpenAI’s.

What “Inference Company” Actually Means

When Altman says inference company, he’s describing a specific business model. Research labs sell capabilities. Product companies sell experiences. Inference companies sell compute cycles — and their margin depends on serving those cycles as efficiently and as broadly as possible.

The constraint on that model, until April 27, was that OpenAI could only serve inference through Azure. That created real problems. AWS CEO Matt Garman explained it plainly: “This is what our customers have been asking for for a really long time. Their production applications run in AWS, their data is in AWS, they trust the security of AWS.”

That’s not a preference. That’s a procurement reality. Enterprise AI adoption has been partly gated by the friction of moving data across cloud boundaries. Companies that had already built on AWS were making model choices based on what was available in Bedrock — which is why Anthropic captured significant enterprise share simply by being there first. As one Signal analysis put it: “Many companies defaulted to Anthropic/Claude because they were already on Bedrock — this is huge for OpenAI model accessibility.”

Rezo’s read on the situation is the clearest: “OpenAI has grown too big for any single cloud to fully serve.” That’s the real story. The Microsoft deal wasn’t a breakup. It was an acknowledgment that the original structure was written for a smaller company.

The AWS Bedrock Announcement Is the Evidence

The specific details of the AWS launch matter here. GPT-5.4 went into limited preview on April 28. GPT-5.5 was announced as coming within weeks. Codex is available through AWS infrastructure. And Amazon Bedrock’s managed agents platform was rebranded as “powered by OpenAI” — meaning it uses OpenAI’s harnesses and models in the same way they’ve been available on Azure.

That last point is significant. This isn’t a stripped-down API integration. It’s the full stack: models, agents, managed infrastructure. No workarounds. OpenAI’s products running on Bedrock the same way they run on Azure.

For context on what’s actually available, GPT-5.4 and Claude Opus 4.6 have been competing for enterprise workloads across coding, writing, and agentic tasks. The difference now is that GPT-5.4 is accessible to teams whose production data never leaves AWS — which removes the primary reason many of them were defaulting to Claude.

The Bedrock managed agents rebrand is also a signal about where Amazon is placing its bets. They’re not trying to build the best model. They’re trying to be the best place to run any model. That’s a different and arguably more durable position.

What This Means for Multi-Cloud AI Procurement

If you’re building AI applications for enterprise, the practical implications are real.

First, the model availability gap is closing. For the past two years, the practical choice for AWS-native teams was Anthropic. That’s no longer true. GPT-5.5 and Codex will be on Bedrock within weeks of this writing. Anthropic’s compute shortage has already been tightening Claude quotas, which means teams that built on Claude for availability reasons may find themselves reconsidering.

Second, the “first on Azure” window creates a procurement timing question. OpenAI still releases models on Azure first. The exclusivity window isn’t specified, but it exists. If you’re building on the frontier — and for some agentic use cases, you need the latest model — Azure still has a structural advantage in the immediate post-release window. That advantage is temporary, but it’s real.

Third, the AGI clause removal changes the risk calculus for long-term enterprise commitments. The original deal had a theoretical scenario where Microsoft’s entire AI infrastructure arrangement could dissolve overnight based on an internal OpenAI declaration. That’s gone. Both the Azure and AWS arrangements now have defined end dates rather than philosophical tripwires. That makes multi-year AI infrastructure planning more tractable.

Fourth, and most importantly: the inference company framing means OpenAI’s incentive is now explicitly to be everywhere. They’re not trying to drive you to a specific cloud. They’re trying to serve tokens wherever you already are. That’s a different kind of vendor relationship than the one that existed before April 27.

The Agentic Layer Complicates This

The cloud availability story is straightforward. The agentic story is messier.

OpenAI’s Codex on AWS is notable because agentic workloads are where the real compute demand is. The AI Daily Brief’s analysis of the current moment is accurate: we’re in a transition from pre-agentic to agentic AI, and the token shortages are coming from work-based agents, not chat interfaces. The companies that lock in infrastructure for agentic workloads now are positioning for a different scale of demand than what exists today.

Amazon’s managed agents platform — now branded as powered by OpenAI — is a direct play for that market. It’s similar in structure to the managed agents OpenAI introduced with their Frontier platform in February. The question is whether enterprises want to run agents through a cloud provider’s managed layer or go direct to the model provider.


For teams building complex agent workflows, the orchestration layer matters as much as the model. Platforms like MindStudio handle this orchestration across 200+ models and 1,000+ integrations, which means the multi-cloud availability of OpenAI models translates directly into more routing options for production agent systems — you can now point the same workflow at GPT-5.5 on Bedrock without restructuring your infrastructure.

The broader point is that “which model” and “which cloud” are increasingly separate decisions. OpenAI’s inference company strategy is explicitly designed to decouple them.
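As a rough illustration of what that decoupling looks like in practice — the provider names, model IDs, and endpoint shapes below are hypothetical placeholders, not any vendor’s actual API — the model and the cloud can live in a config object while application code depends only on a route name:

```typescript
// Hypothetical sketch: model and cloud as configuration, not code.
// Provider names, model IDs, and endpoints are illustrative placeholders,
// not real vendor APIs.

type Provider = "azure-openai" | "aws-bedrock" | "direct-openai";

interface ModelRoute {
  provider: Provider; // where the tokens are served
  modelId: string;    // which model serves them
  endpoint: string;   // the inference endpoint for that provider
}

// The only thing that changes when you move clouds is this config object.
const routes: Record<string, ModelRoute> = {
  "agent-default": {
    provider: "aws-bedrock",
    modelId: "openai.gpt-5.5",      // placeholder ID
    endpoint: "https://bedrock.example/invoke",
  },
  "agent-frontier": {
    provider: "azure-openai",
    modelId: "gpt-5.5-preview",     // placeholder ID
    endpoint: "https://azure.example/openai",
  },
};

// Application code depends on the route name, never on the provider.
async function complete(routeName: string, prompt: string): Promise<string> {
  const route = routes[routeName];
  const res = await fetch(route.endpoint, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ model: route.modelId, input: prompt }),
  });
  const data = await res.json();
  return data.output; // response shape depends on the provider adapter you write
}
```

Under that kind of setup, pointing “agent-default” at a different cloud is a config change rather than a rewrite — which is exactly the flexibility the Bedrock availability unlocks.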

The Competitive Pressure This Creates

OpenAI’s AWS availability doesn’t just affect OpenAI’s market position. It changes the competitive dynamics for every model provider.

Anthropic’s enterprise advantage was partly structural: they were in Bedrock, OpenAI wasn’t. That structural advantage is gone. Anthropic still has strong model quality — Claude Mythos is expected to push the frontier further on coding and reasoning — but the distribution moat has narrowed. The companies that chose Claude because it was available in their existing cloud stack now have a real choice.

Google is in a different position. Gemini has consumer reach and deep integration with Google Workspace, but enterprise momentum has been weaker than Google’s overall infrastructure position would suggest. The AI Daily Brief’s power rankings gave Google a three out of ten on momentum — not because the models are bad, but because the agentic and coding use cases that are driving enterprise adoption in 2026 haven’t been Gemini’s strongest ground. Google I/O will be the test of whether that changes.

For OpenAI, the multi-cloud move is also a hedge against its own compute constraints. The company has been assembling compute deals aggressively, but owning infrastructure is different from having access to it. Being able to serve inference through both Azure and AWS means more capacity headroom, which matters when token demand is growing faster than anyone’s ability to build data centers.

The Spec Problem Nobody Talks About

There’s a subtler implication of the inference company framing that’s worth sitting with.

If OpenAI is an inference company, then the value they’re selling is increasingly commoditized by definition. Inference is a compute service. The differentiation comes from model quality, but model quality is a moving target — GPT-5.5 is already competing with Claude Opus 4.7, and the gap between frontier models is narrowing faster than anyone predicted two years ago.

The durable advantage in this environment isn’t which model you use. It’s how well your application is specified. Teams that have invested in clear, precise specifications for their AI workflows — what the system should do, what the edge cases are, what the outputs need to look like — can swap models as the frontier moves without rebuilding from scratch. Teams that have their logic tangled up in prompt strings and model-specific behaviors are going to find multi-cloud flexibility less useful than it sounds.
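A minimal sketch of that discipline, with hypothetical names rather than any real framework: keep the spec as a plain data structure and treat the model call as an injected dependency, so swapping models touches one function, not the application logic.

```typescript
// Hypothetical sketch: the spec is the source of truth; the model is a
// dependency injected at the edge. Names and types are illustrative only.

interface WorkflowSpec {
  goal: string;         // what the system should do
  edgeCases: string[];  // behaviors that must be handled explicitly
  outputSchema: string; // what the outputs need to look like
}

// Any backend that can complete text satisfies this; swap freely as the frontier moves.
type CompletionFn = (prompt: string) => Promise<string>;

function renderPrompt(spec: WorkflowSpec, input: string): string {
  return [
    `Task: ${spec.goal}`,
    `Handle these edge cases: ${spec.edgeCases.join("; ")}`,
    `Return output matching: ${spec.outputSchema}`,
    `Input: ${input}`,
  ].join("\n");
}

// Application logic depends only on the spec and the injected backend,
// never on a specific model's quirks.
async function runWorkflow(
  spec: WorkflowSpec,
  complete: CompletionFn,
  input: string
): Promise<string> {
  return complete(renderPrompt(spec, input));
}
```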

This is why tools like Remy are interesting in this context: you write your application as a spec — annotated markdown where the intent is explicit — and the full-stack application gets compiled from it. When the underlying model or infrastructure changes, you fix the spec and recompile. The source of truth is the specification, not the generated code.

The inference company era rewards teams that have separated their application logic from their model dependencies. That’s an architectural discipline, not a vendor choice.


What the Sequence Tells You

April 27: Microsoft exclusivity ends. April 28: OpenAI announces AWS availability.

That’s not a coincidence. That’s a company that had been planning this move for months, waiting for the legal structure to allow it. The inference company framing wasn’t a post-hoc description of what happened. It was the strategy that drove the negotiation.

The practical takeaway for builders is this: the cloud you’re on is no longer a constraint on which frontier models you can access. That’s new. It changes the procurement calculus, it changes the competitive dynamics for model providers, and it changes what “vendor lock-in” means in the context of AI infrastructure.

The teams that will benefit most from this shift are the ones who’ve been building with model-agnostic architectures — where the cloud provider and the model are configuration choices, not foundational dependencies. The teams that built tightly coupled to a specific model on a specific cloud are going to find the new flexibility harder to use than it looks.

Altman’s framing is precise. An inference company serves tokens wherever compute exists. The question for everyone building on top of that infrastructure is whether your application is designed to take advantage of that, or whether you’ve accidentally built a moat around yourself.

Presented by MindStudio
