The AI infrastructure landscape is fracturing in ways that will reshape hardware buying decisions for years to come.
Three recent developments signal a fundamental shift:
- Cerebras is reviving its IPO plans, targeting a Q2 2026 listing after raising $1.1 billion at an $8 billion valuation
- Amazon is in talks to invest $10 billion or more in OpenAI, potentially bundled with Trainium chip adoption
- OpenAI and Broadcom announced a collaboration to deploy 10 gigawatts of OpenAI-designed custom AI accelerators by 2029
These aren't isolated events. They represent a broader reconfiguration of who builds AI compute, who controls it, and what options remain for everyone else.
The Rise of Custom Silicon
For years, NVIDIA's dominance was absolute. The H100 became the de facto currency of AI capability—organizations measured themselves by how many they could secure.
That's changing.
The OpenAI-Broadcom Bet
OpenAI's partnership with Broadcom represents the most ambitious custom silicon play by an AI lab to date. The numbers are staggering: 10 gigawatts of compute capacity, deployed by 2029, using OpenAI-designed accelerators.
For context: OpenAI currently operates just over 2 gigawatts of compute. The Broadcom deal would roughly quintuple that footprint, using chips OpenAI designs itself.
"Partnering with Broadcom is a critical step in building the infrastructure needed to unlock AI's potential," Sam Altman stated. But the subtext is clear: reducing dependence on NVIDIA's pricing and availability constraints.
The technical specifications hint at what's coming:
- Systolic array architecture: Optimized for inference workloads
- High-bandwidth memory: Likely HBM3E or HBM4
- TSMC 3nm process: Competitive with NVIDIA's latest
Cerebras Goes Public
Cerebras is preparing to file for its US IPO as early as next week, targeting a Q2 2026 listing. After clearing CFIUS review and raising $1.1 billion, the wafer-scale computing pioneer is betting that the market is ready for alternatives to GPU-centric architectures.
The Wafer-Scale Engine 3 is genuinely different: 900,000 AI cores and four trillion transistors on a single silicon wafer. It's not just a different chip—it's a different computing paradigm.
What makes Cerebras interesting for enterprise buyers isn't just performance. It's optionality. A successful IPO validates non-NVIDIA approaches and signals sustained investment in alternative architectures.
What This Means for Hardware Strategy
The custom silicon trend creates a bifurcated market:
| Tier | Players | Access Model |
|---|---|---|
| Hyperscale | OpenAI, Google, Amazon, Meta | Custom silicon, proprietary architectures |
| Enterprise | Everyone else | Commercial GPUs, cloud APIs, licensed compute |
This isn't necessarily bad news for enterprises. More competition should eventually mean better pricing. But in the near term, it creates uncertainty about which hardware investments will retain value.
The Cloud Wars Intensify
Meanwhile, the hyperscalers are locked in aggressive competition for AI workloads—and the potential Amazon-OpenAI deal illustrates how tangled these relationships have become.
The Circular Deal Problem
Amazon has already invested at least $8 billion in Anthropic, OpenAI's primary competitor. Now they're reportedly in talks to invest $10 billion in OpenAI itself, potentially bundled with Trainium chip adoption.
These deals raise questions about circular economics. OpenAI signed a $38 billion cloud computing deal with Amazon in November. Now Amazon may invest $10 billion back, which OpenAI would largely spend on... Amazon infrastructure.
The strategic logic for Amazon is clearer than the financial logic:
- Chip validation: If OpenAI adopts Trainium, it legitimizes AWS's custom silicon against NVIDIA
- Ecosystem lock-in: Deeper infrastructure integration makes switching costs prohibitive
- Competitive positioning: Even without exclusive model rights (Microsoft holds those through the 2030s), Amazon gains AI credibility
GPU Pricing Across Clouds
For enterprises evaluating cloud options, 2025 pricing has become more competitive:
| Provider | H100 On-Demand | A100 On-Demand | Notes |
|---|---|---|---|
| AWS | ~$3.90/hr (after 44% cut) | ~$3.06/hr | June 2025 price reduction |
| Azure | $6.98/hr | $3.67/hr | Regional variation up to 30% |
| Google Cloud | $11.06/hr per GPU (sold in 8-GPU instances) | ~$3.00/hr | Granular pricing model |
| Lambda Labs | $2.49-$3.29/hr | $1.29/hr | ML-focused provider |
The spread is significant: AWS's H100 rate is now a little more than half of Azure's. But hidden costs matter: data transfer egress ($0.08-$0.12/GB), storage, and networking can add 20-40% to monthly bills.
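To see how quickly those add-ons compound, here is a back-of-the-envelope sketch. The egress range comes from the figures above; the 15% storage-and-networking overhead, the 8-GPU scale, and the 500 run-hours are illustrative assumptions, not provider quotes.

```python
# Back-of-the-envelope monthly bill: GPU-hours plus the "hidden" line items above.
# All rates are illustrative placeholders, not quotes from any provider.
def monthly_bill(gpu_rate_hr: float, gpus: int, hours: float,
                 egress_gb: float, egress_rate_gb: float = 0.09,
                 storage_network_overhead: float = 0.15) -> dict:
    compute = gpu_rate_hr * gpus * hours
    egress = egress_gb * egress_rate_gb
    overhead = compute * storage_network_overhead
    return {"compute": compute, "egress": egress,
            "storage/network": overhead,
            "total": compute + egress + overhead}

# Example: 8 GPUs at $3.90/hr running 500 hours, moving 20 TB out of the cloud
print(monthly_bill(3.90, 8, 500, egress_gb=20_000))
```

On those assumptions, the add-ons land at roughly a quarter of the compute line, squarely inside the 20-40% range cited above.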
Spot instances offer 60-90% discounts but with interruption risk:
- AWS: 2-minute warning before interruption
- Azure: 30-second warning
- Google Cloud: 30-second preemption notice; spot prices change at most once per 30 days
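Working around that interruption risk is mostly a matter of watching for the notice and checkpointing fast. Below is a minimal Python sketch for the AWS case, polling the documented spot instance-action metadata endpoint (IMDSv2); the `checkpoint()` function is a placeholder for whatever state save your workload needs.

```python
# Minimal sketch: poll the EC2 instance metadata service (IMDSv2) for a spot
# interruption notice and trigger a checkpoint before the 2-minute window closes.
import time
import urllib.request
import urllib.error

IMDS = "http://169.254.169.254/latest"

def imds_token() -> str:
    # IMDSv2 requires a session token before reading metadata
    req = urllib.request.Request(
        f"{IMDS}/api/token",
        method="PUT",
        headers={"X-aws-ec2-metadata-token-ttl-seconds": "21600"},
    )
    with urllib.request.urlopen(req, timeout=2) as resp:
        return resp.read().decode()

def interruption_pending(token: str) -> bool:
    req = urllib.request.Request(
        f"{IMDS}/meta-data/spot/instance-action",
        headers={"X-aws-ec2-metadata-token": token},
    )
    try:
        with urllib.request.urlopen(req, timeout=2):
            return True          # 200 means a stop/terminate is scheduled
    except urllib.error.HTTPError as e:
        if e.code == 404:
            return False         # no interruption scheduled yet
        raise

def checkpoint():
    # Placeholder: flush model weights / job state to durable storage here.
    print("Interruption notice received; checkpointing...")

if __name__ == "__main__":
    token = imds_token()
    while not interruption_pending(token):
        time.sleep(5)            # notice arrives roughly two minutes early
    checkpoint()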
Enterprise Strategy: Rent vs. Own
The macro trends are clear. Enterprise AI spending hit $37 billion in 2025—a 3.2x year-over-year increase from $11.5 billion in 2024. But where that money goes is shifting.
The Buy Over Build Trend
Enterprises are increasingly choosing purchased solutions over internal builds. In 2024, 47% of AI solutions were built internally. Today, 76% of AI use cases are purchased rather than built in-house.
This has infrastructure implications. If you're not building models, you may not need training clusters. Inference workloads have different economics—and different hardware requirements.
When On-Premises Makes Sense
Cloud dominates, but specific factors push toward on-premises investment:
- Data sovereignty: Regulatory requirements increasingly mandate geographic control
- Latency sensitivity: Applications requiring <10ms response times can't tolerate cloud round-trips
- Predictable workloads: Steady-state inference at scale often pencils out cheaper on owned hardware (a rough break-even sketch follows this list)
- Competitive sensitivity: Some organizations won't run proprietary models through cloud APIs
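Whether owned hardware actually pencils out cheaper is mostly a utilization question. The sketch below compares an assumed 8-GPU server against on-demand rental; every number in it (capex, opex, amortization period, the $3.90/hr rate) is an illustrative assumption to replace with your own quotes.

```python
# Rough rent-vs-own break-even sketch. All figures are illustrative assumptions.
CLOUD_RATE_PER_GPU_HR = 3.90        # e.g. on-demand H100, per GPU-hour
GPUS = 8
UTILIZATION = 0.70                   # fraction of hours the GPUs actually run

SERVER_CAPEX = 300_000.0             # assumed 8-GPU server, fully configured
ANNUAL_OPEX = 45_000.0               # assumed power, cooling, space, support
AMORTIZATION_YEARS = 3

def annual_cloud_cost() -> float:
    return CLOUD_RATE_PER_GPU_HR * GPUS * 8760 * UTILIZATION

def annual_owned_cost() -> float:
    return SERVER_CAPEX / AMORTIZATION_YEARS + ANNUAL_OPEX

if __name__ == "__main__":
    print(f"Cloud (rented):  ${annual_cloud_cost():,.0f}/yr")
    print(f"Owned hardware:  ${annual_owned_cost():,.0f}/yr")
    # Utilization at which renting and owning cost the same per year
    breakeven = annual_owned_cost() / (CLOUD_RATE_PER_GPU_HR * GPUS * 8760)
    print(f"Break-even utilization: {breakeven:.0%}")
```

On these assumptions the crossover sits just above 50% utilization: below that, renting wins; above it, owned hardware does.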
The Hybrid Reality
55% of organizations use public cloud and 51% rely on hybrid setups. That's not a contradiction: the categories overlap because most enterprises are running both.
A practical hybrid strategy might look like:
| Workload Type | Recommended Approach | Rationale |
|---|---|---|
| Experimentation | Cloud (spot instances) | Low commitment, easy scaling |
| Model training | Cloud (reserved) or specialized providers | Burst capacity, latest GPUs |
| Steady-state inference | On-premises or dedicated cloud | Predictable costs, data control |
| Edge inference | On-premises appliances | Latency, connectivity resilience |
Looking Ahead: 2026 and Beyond
Several trends will shape hardware decisions in the coming year.
Inference Overtakes Training
Inference hardware spend is projected to jump from $12 billion in 2025 to $21 billion in 2026, overtaking training expenditures within 12-18 months. This shifts the calculus:
- Inference workloads are more predictable, favoring owned hardware
- Inference-optimized silicon (like Google's TPUs or custom ASIC designs) becomes more relevant
- Edge deployment for inference gains momentum
GPU Rental Prices Stabilize
H100 rental prices have dropped to $2.85/hour on competitive platforms, with A100s at $0.66/hour. New GPU releases (Blackwell, MI350) may push older-generation prices down further.
For enterprises, this creates opportunity: the cost of experimentation is falling. What was a major capital commitment two years ago is now approachable as an operational expense.
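As a quick sanity check on "approachable as an operational expense", consider a week-long fine-tuning run on rented hardware. Only the $2.85/hr rate comes from above; the 8-GPU scale and one-week duration are assumptions.

```python
# Quick sanity check: a week on 8 rented H100s at the $2.85/hr rate cited above.
rate_per_gpu_hr = 2.85
gpus = 8
hours = 24 * 7
print(f"One-week, 8-GPU experiment: ${rate_per_gpu_hr * gpus * hours:,.0f}")
# -> roughly $3,800, versus a six-figure purchase to own equivalent hardware
```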
Skills Gap Drives Outsourcing
44% of executives cite lack of in-house AI expertise as the biggest barrier to AI deployment. This pushes toward managed services and cloud-based solutions where infrastructure complexity is abstracted away.
Practical Implications for Buyers
How should enterprise hardware buyers navigate this landscape?
- Don't over-commit to training infrastructure. Unless you're building foundation models, training needs are likely decreasing as fine-tuning and inference become the primary workloads.
- Watch the custom silicon space. Cerebras's IPO, OpenAI's Broadcom partnership, and Amazon's Trainium push all suggest alternatives to NVIDIA are gaining viability. Early evaluation positions you for future options.
- Negotiate aggressively on cloud pricing. The hyperscaler price war is real. AWS's 44% H100 price cut shows flexibility exists. Reserved instances and committed use discounts can match or beat on-premises TCO.
- Plan for hybrid. Pure cloud and pure on-premises are both limiting. Most organizations will need both—cloud for burst capacity and experimentation, on-premises for steady-state inference and sensitive workloads.
- Factor in the skills gap. Hardware is only valuable if you can operate it. Managed services and cloud abstraction may be worth the premium if internal expertise is limited.
The Bigger Picture
The AI infrastructure market is consolidating at the top and fragmenting in the middle.
Hyperscalers and leading AI labs are building custom silicon, creating vertically integrated stacks that may not be available to everyone else. Meanwhile, the commercial market is seeing more competition, falling prices, and increasing optionality.
For enterprise buyers, this is actually favorable. You're not going to build wafer-scale processors or design custom accelerators. But you benefit from the competition among those who do.
The key is staying flexible. The hardware you buy or rent today will operate in a different landscape by 2027. Plan for that uncertainty, and you'll be positioned to take advantage of whatever emerges.