AI Infrastructure in 2026: Custom Silicon, Cloud Wars, and What It Means for Buyers
Industry Insights


December 20, 2025
8 min read
ai-infrastructure, custom-silicon, cloud-computing, enterprise, cerebras, openai, trends

The AI infrastructure landscape is fracturing in ways that will reshape hardware buying decisions for years to come.

Three recent developments signal a fundamental shift:

  • OpenAI's partnership with Broadcom to deploy 10 gigawatts of custom-designed accelerators by 2029
  • Cerebras preparing its US IPO after clearing CFIUS review and raising $1.1 billion
  • Amazon's reported talks to invest $10 billion in OpenAI, alongside its existing $8 billion stake in Anthropic

These aren't isolated events. They represent a broader reconfiguration of who builds AI compute, who controls it, and what options remain for everyone else.

The Rise of Custom Silicon

For years, NVIDIA's dominance was absolute. The H100 became the de facto currency of AI capability—organizations measured themselves by how many they could secure.

That's changing.

The OpenAI-Broadcom Bet

OpenAI's partnership with Broadcom represents the most ambitious custom silicon play by an AI lab to date. The numbers are staggering: 10 gigawatts of compute capacity, deployed by 2029, using OpenAI-designed accelerators.

For context: OpenAI currently operates on just over 2 gigawatts. They're planning to 5x their compute footprint with chips they design themselves.

"Partnering with Broadcom is a critical step in building the infrastructure needed to unlock AI's potential," Sam Altman stated. But the subtext is clearer: reducing dependency on NVIDIA's pricing and availability constraints.

The technical specifications hint at what's coming:

  • Systolic array architecture: Optimized for inference workloads (see the sketch after this list)
  • High-bandwidth memory: Likely HBM3E or HBM4
  • TSMC 3nm process: Competitive with NVIDIA's latest
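
A "systolic array" is the dataflow pattern behind Google's TPUs and many inference ASICs: a grid of multiply-accumulate units that pass operands to their neighbors in lockstep instead of fetching each one from shared memory. The NumPy sketch below is a minimal, generic illustration of that idea, not a description of OpenAI's actual design; the array size and scheduling are arbitrary assumptions.

```python
import numpy as np

def systolic_matmul(A: np.ndarray, B: np.ndarray) -> np.ndarray:
    """Cycle-by-cycle simulation of an output-stationary systolic array
    computing C = A @ B. Each processing element (PE) holds one entry of C,
    multiplies the operands streaming past it, and accumulates locally.
    A-values flow rightward along rows; B-values flow downward along columns,
    skewed so matching k-indices meet at the right PE on the right cycle."""
    M, K = A.shape
    K2, N = B.shape
    assert K == K2, "inner dimensions must match"

    acc = np.zeros((M, N))      # per-PE accumulators (the output tile)
    a_reg = np.zeros((M, N))    # A operand currently held by each PE
    b_reg = np.zeros((M, N))    # B operand currently held by each PE

    for t in range(M + N + K - 2):            # cycles until the array drains
        a_reg = np.roll(a_reg, 1, axis=1)     # pass A one PE to the right
        b_reg = np.roll(b_reg, 1, axis=0)     # pass B one PE downward
        for i in range(M):                    # feed the left edge, skewed by row
            k = t - i
            a_reg[i, 0] = A[i, k] if 0 <= k < K else 0.0
        for j in range(N):                    # feed the top edge, skewed by column
            k = t - j
            b_reg[0, j] = B[k, j] if 0 <= k < K else 0.0
        acc += a_reg * b_reg                  # one multiply-accumulate per PE per cycle

    return acc

A, B = np.random.rand(4, 6), np.random.rand(6, 5)
assert np.allclose(systolic_matmul(A, B), A @ B)
```

The appeal for inference is that operands move only between adjacent cells, so memory bandwidth per multiply stays low once the weights are loaded.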

Cerebras Goes Public

Cerebras is preparing to file for its US IPO as early as next week, targeting a Q2 2026 listing. After clearing CFIUS review and raising $1.1 billion, the wafer-scale computing pioneer is betting that the market is ready for alternatives to GPU-centric architectures.

The Wafer-Scale Engine 3 is genuinely different: 900,000 AI cores and four trillion transistors on a single silicon wafer. It's not just a different chip—it's a different computing paradigm.

What makes Cerebras interesting for enterprise buyers isn't just performance. It's optionality. A successful IPO validates non-NVIDIA approaches and signals sustained investment in alternative architectures.

What This Means for Hardware Strategy

The custom silicon trend creates a bifurcated market:

| Tier | Players | Access Model |
|------|---------|--------------|
| Hyperscale | OpenAI, Google, Amazon, Meta | Custom silicon, proprietary architectures |
| Enterprise | Everyone else | Commercial GPUs, cloud APIs, licensed compute |

This isn't necessarily bad news for enterprises. More competition should eventually mean better pricing. But in the near term, it creates uncertainty about which hardware investments will retain value.

The Cloud Wars Intensify

Meanwhile, the hyperscalers are locked in aggressive competition for AI workloads—and the potential Amazon-OpenAI deal illustrates how tangled these relationships have become.

The Circular Deal Problem

Amazon has already invested at least $8 billion in Anthropic, OpenAI's primary competitor. Now they're reportedly in talks to invest $10 billion in OpenAI itself, potentially bundled with Trainium chip adoption.

These deals raise questions about circular economics. OpenAI signed a $38 billion cloud computing deal with Amazon in November. Now Amazon may invest $10 billion back, which OpenAI would largely spend on... Amazon infrastructure.

The strategic logic for Amazon is clearer than the financial logic:

  • Chip validation: If OpenAI adopts Trainium, it legitimizes AWS's custom silicon against NVIDIA
  • Ecosystem lock-in: Deeper infrastructure integration makes switching costs prohibitive
  • Competitive positioning: Even without exclusive model rights (Microsoft holds those through the 2030s), Amazon gains AI credibility

GPU Pricing Across Clouds

For enterprises evaluating cloud options, 2025 pricing has become more competitive:

| Provider | H100 On-Demand | A100 On-Demand | Notes |
|----------|----------------|----------------|-------|
| AWS | ~$3.90/hr (after 44% cut) | ~$3.06/hr | June 2025 price reduction |
| Azure | $6.98/hr | $3.67/hr | Regional variation up to 30% |
| Google Cloud | $11.06/hr (8-GPU instance) | ~$3.00/hr | Granular pricing model |
| Lambda Labs | $2.49-$3.29/hr | $1.29/hr | ML-focused provider |

The spread is significant—AWS's H100 pricing is now nearly half of Azure's. But hidden costs matter: data transfer egress ($0.08-$0.12/GB), storage, and networking can add 20-40% to monthly bills.
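
As a rough illustration of how those overheads change the bill, here is a minimal sketch that applies the figures above to a hypothetical single-H100 workload; the instance-hours, egress volume, and overhead percentage are assumed values, not quoted prices.

```python
def monthly_cloud_cost(gpu_rate_hr: float, gpu_hours: float,
                       egress_gb: float, egress_rate_gb: float,
                       storage_net_pct: float) -> dict:
    """Split a monthly bill into the GPU line item plus the 'hidden' parts:
    data egress, and storage/networking modeled as a share of compute spend."""
    compute = gpu_rate_hr * gpu_hours
    egress = egress_gb * egress_rate_gb
    storage_net = compute * storage_net_pct
    return {"compute": compute, "egress": egress,
            "storage_networking": storage_net,
            "total": compute + egress + storage_net}

# Assumptions: one H100 at ~$3.90/hr for 500 hours, 2 TB of egress at
# $0.09/GB, and storage/networking at 25% of the compute line item.
bill = monthly_cloud_cost(3.90, 500, 2_000, 0.09, 0.25)
print({item: f"${cost:,.0f}" for item, cost in bill.items()})
```

Under those assumptions the "hidden" items add roughly a third on top of the raw GPU charge, consistent with the 20-40% range cited above.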

Spot instances offer 60-90% discounts but carry interruption risk (a rough cost comparison follows the list):

  • AWS: 2-minute warning before interruption
  • Azure: 30-second warning
  • Google Cloud: 30-second notice; Spot prices change at most once per month
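
One way to sanity-check whether the spot discount survives those interruptions is to fold the recomputed work into an effective hourly rate. The discount and wasted-compute figures below are illustrative assumptions, not measured values.

```python
def effective_spot_rate(on_demand_hr: float, discount: float,
                        wasted_fraction: float) -> float:
    """Effective hourly cost of spot capacity when a fraction of compute is
    lost to interruptions (work since the last checkpoint is redone)."""
    spot_hr = on_demand_hr * (1 - discount)
    return spot_hr / (1 - wasted_fraction)   # you also pay for the wasted hours

# Assumptions: $3.90/hr on-demand H100, a 70% spot discount, and 10% of
# compute lost to preemptions and restarts.
print(f"${effective_spot_rate(3.90, 0.70, 0.10):.2f}/hr effective vs $3.90/hr on-demand")
```

Even with a tenth of the work thrown away, the effective rate stays well under on-demand pricing, which is why checkpoint-friendly workloads favor spot.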

Enterprise Strategy: Rent vs. Own

The macro trends are clear. Enterprise AI spending hit $37 billion in 2025—a 3.2x year-over-year increase from $11.5 billion in 2024. But where that money goes is shifting.

The Buy Over Build Trend

Enterprises are increasingly choosing purchased solutions over internal builds. In 2024, 47% of AI solutions were built internally. Today, 76% of AI use cases are purchased rather than built in-house.

This has infrastructure implications. If you're not building models, you may not need training clusters. Inference workloads have different economics—and different hardware requirements.

When On-Premises Makes Sense

Cloud dominates, but specific factors push toward on-premises investment:

  • Data sovereignty: Regulatory requirements increasingly mandate geographic control
  • Latency sensitivity: Applications requiring <10ms response times can't tolerate cloud round-trips
  • Predictable workloads: Steady-state inference at scale often pencils out cheaper on owned hardware (see the break-even sketch after this list)
  • Competitive sensitivity: Some organizations won't run proprietary models through cloud APIs
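
"Pencils out cheaper" is easy to test with a back-of-the-envelope break-even calculation. The sketch below compares rented GPU-hours against a fully loaded owned server; the purchase price, power draw, electricity rate, amortization window, opex percentage, and utilization are all assumptions to replace with your own figures.

```python
def owned_cost_per_busy_hour(purchase_price: float, years: float,
                             power_kw: float, usd_per_kwh: float,
                             annual_opex_pct: float, utilization: float) -> float:
    """Fully loaded cost per *utilized* hour of an owned GPU server:
    amortized purchase price + round-the-clock power + yearly opex
    (space, cooling, staff), divided by the hours spent doing useful work."""
    total_hours = years * 365 * 24
    power = power_kw * usd_per_kwh * total_hours      # assumes the box runs 24/7
    opex = purchase_price * annual_opex_pct * years
    return (purchase_price + power + opex) / (total_hours * utilization)

# All assumptions: a $250k 8-GPU H100-class server amortized over 4 years,
# 8 kW draw at $0.10/kWh, 10%/year opex, 70% utilization.
server_hr = owned_cost_per_busy_hour(250_000, 4, 8.0, 0.10, 0.10, 0.70)
print(f"~${server_hr / 8:.2f} per owned GPU-hour vs $2.49-$3.90/hr rented")
```

With those assumptions, owned capacity lands under $2 per GPU-hour; the conclusion flips quickly if utilization drops, which is exactly why the recommendation is limited to predictable, steady-state workloads.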

The Hybrid Reality

55% of organizations use public cloud, while 51% rely on hybrid setups. This isn't a contradiction: most enterprises are running both.

A practical hybrid strategy might look like:

| Workload Type | Recommended Approach | Rationale |
|---------------|----------------------|-----------|
| Experimentation | Cloud (spot instances) | Low commitment, easy scaling |
| Model training | Cloud (reserved) or specialized providers | Burst capacity, latest GPUs |
| Steady-state inference | On-premises or dedicated cloud | Predictable costs, data control |
| Edge inference | On-premises appliances | Latency, connectivity resilience |

Looking Ahead: 2026 and Beyond

Several trends will shape hardware decisions in the coming year.

Inference Overtakes Training

Inference hardware spend is projected to jump from $12 billion in 2025 to $21 billion in 2026, overtaking training expenditures within 12-18 months. This shifts the calculus:

  • Inference workloads are more predictable, favoring owned hardware
  • Inference-optimized silicon (like Google's TPUs or custom ASIC designs) becomes more relevant
  • Edge deployment for inference gains momentum

GPU Rental Prices Stabilize

H100 rental prices have dropped to $2.85/hour on competitive platforms, with A100s at $0.66/hour. New GPU releases (Blackwell, MI350) may push older-generation prices down further.

For enterprises, this creates opportunity: the cost of experimentation is falling. What was a major capital commitment two years ago is now approachable as an operational expense.
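
To put that in concrete terms, here is a back-of-the-envelope comparison for a hypothetical week-long fine-tuning experiment; the node count, duration, and server price are assumptions.

```python
# All assumptions: 8 rented H100s at $2.85/GPU-hour for a one-week run,
# versus buying a comparable 8-GPU server outright (~$250k class).
gpus, hours, rate = 8, 7 * 24, 2.85
print(f"Rental for the week: ${gpus * hours * rate:,.0f} vs ~$250,000+ to own")
```

A few thousand dollars of opex buys a week of serious experimentation that would have required a six-figure capital request two years ago.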

Skills Gap Drives Outsourcing

44% of executives cite lack of in-house AI expertise as the biggest barrier to AI deployment. This pushes toward managed services and cloud-based solutions where infrastructure complexity is abstracted away.

Practical Implications for Buyers

How should enterprise hardware buyers navigate this landscape?

  1. Don't over-commit to training infrastructure. Unless you're building foundation models, training needs are likely decreasing as fine-tuning and inference become the primary workloads.
  2. Watch the custom silicon space. Cerebras's IPO, OpenAI's Broadcom partnership, and Amazon's Trainium push all suggest alternatives to NVIDIA are gaining viability. Early evaluation positions you for future options.
  3. Negotiate aggressively on cloud pricing. The hyperscaler price war is real. AWS's 44% H100 price cut shows flexibility exists. Reserved instances and committed use discounts can match or beat on-premises TCO.
  4. Plan for hybrid. Pure cloud and pure on-premises are both limiting. Most organizations will need both—cloud for burst capacity and experimentation, on-premises for steady-state inference and sensitive workloads.
  5. Factor in the skills gap. Hardware is only valuable if you can operate it. Managed services and cloud abstraction may be worth the premium if internal expertise is limited.

The Bigger Picture

The AI infrastructure market is consolidating at the top and fragmenting in the middle.

Hyperscalers and leading AI labs are building custom silicon, creating vertically integrated stacks that may not be available to everyone else. Meanwhile, the commercial market is seeing more competition, falling prices, and increasing optionality.

For enterprise buyers, this is actually favorable. You're not going to build wafer-scale processors or design custom accelerators. But you benefit from the competition among those who do.

The key is staying flexible. The hardware you buy or rent today will operate in a different landscape by 2027. Plan for that uncertainty, and you'll be positioned to take advantage of whatever emerges.
