AI Infrastructure in 2026: Custom Silicon, Cloud Wars, and What It Means for Buyers
Industry Insights


December 20, 2025
8 min read
ai-infrastructure, custom-silicon, cloud-computing, enterprise, cerebras, openai, trends

The AI infrastructure landscape is fracturing in ways that will reshape hardware buying decisions for years to come.

Three recent developments signal a fundamental shift:

  • OpenAI's partnership with Broadcom to deploy 10 gigawatts of custom-designed accelerators by 2029
  • Cerebras preparing its US IPO after clearing CFIUS review and raising $1.1 billion
  • Amazon's reported talks to invest $10 billion in OpenAI, alongside its existing $8 billion stake in Anthropic

These aren't isolated events. They represent a broader reconfiguration of who builds AI compute, who controls it, and what options remain for everyone else.

The Rise of Custom Silicon

For years, NVIDIA's dominance was absolute. The H100 became the de facto currency of AI capability—organizations measured themselves by how many they could secure.

That's changing.

The OpenAI-Broadcom Bet

OpenAI's partnership with Broadcom represents the most ambitious custom silicon play by an AI lab to date. The numbers are staggering: 10 gigawatts of compute capacity, deployed by 2029, using OpenAI-designed accelerators.

For context: OpenAI currently operates on just over 2 gigawatts. They're planning to 5x their compute footprint with chips they design themselves.

"Partnering with Broadcom is a critical step in building the infrastructure needed to unlock AI's potential," Sam Altman stated. But the subtext is clearer: reducing dependency on NVIDIA's pricing and availability constraints.

The technical specifications hint at what's coming:

  • Systolic array architecture: Optimized for inference workloads (see the sketch after this list)
  • High-bandwidth memory: Likely HBM3E or HBM4
  • TSMC 3nm process: Competitive with NVIDIA's latest
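
A "systolic array" is the dataflow pattern behind Google's TPUs and many inference ASICs: a grid of multiply-accumulate units that pass operands to their neighbors in lockstep instead of fetching each one from shared memory. The NumPy sketch below is a minimal, generic illustration of that idea, not a description of OpenAI's actual design; the array size and scheduling are arbitrary assumptions.

```python
import numpy as np

def systolic_matmul(A: np.ndarray, B: np.ndarray) -> np.ndarray:
    """Cycle-by-cycle simulation of an output-stationary systolic array
    computing C = A @ B. Each processing element (PE) holds one entry of C,
    multiplies the operands streaming past it, and accumulates locally.
    A-values flow rightward along rows; B-values flow downward along columns,
    skewed so matching k-indices meet at the right PE on the right cycle."""
    M, K = A.shape
    K2, N = B.shape
    assert K == K2, "inner dimensions must match"

    acc = np.zeros((M, N))      # per-PE accumulators (the output tile)
    a_reg = np.zeros((M, N))    # A operand currently held by each PE
    b_reg = np.zeros((M, N))    # B operand currently held by each PE

    for t in range(M + N + K - 2):            # cycles until the array drains
        a_reg = np.roll(a_reg, 1, axis=1)     # pass A one PE to the right
        b_reg = np.roll(b_reg, 1, axis=0)     # pass B one PE downward
        for i in range(M):                    # feed the left edge, skewed by row
            k = t - i
            a_reg[i, 0] = A[i, k] if 0 <= k < K else 0.0
        for j in range(N):                    # feed the top edge, skewed by column
            k = t - j
            b_reg[0, j] = B[k, j] if 0 <= k < K else 0.0
        acc += a_reg * b_reg                  # one multiply-accumulate per PE per cycle

    return acc

A, B = np.random.rand(4, 6), np.random.rand(6, 5)
assert np.allclose(systolic_matmul(A, B), A @ B)
```

The appeal for inference is that operands move only between adjacent cells, so memory bandwidth per multiply stays low once the weights are loaded.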

Cerebras Goes Public

Cerebras is preparing to file for its US IPO as early as next week, targeting a Q2 2026 listing. After clearing CFIUS review and raising $1.1 billion, the wafer-scale computing pioneer is betting that the market is ready for alternatives to GPU-centric architectures.

The Wafer-Scale Engine 3 is genuinely different: 900,000 AI cores and four trillion transistors on a single silicon wafer. It's not just a different chip—it's a different computing paradigm.

What makes Cerebras interesting for enterprise buyers isn't just performance. It's optionality. A successful IPO validates non-NVIDIA approaches and signals sustained investment in alternative architectures.

What This Means for Hardware Strategy

The custom silicon trend creates a bifurcated market:

| Tier | Players | Access Model |
|------|---------|--------------|
| Hyperscale | OpenAI, Google, Amazon, Meta | Custom silicon, proprietary architectures |
| Enterprise | Everyone else | Commercial GPUs, cloud APIs, licensed compute |

This isn't necessarily bad news for enterprises. More competition should eventually mean better pricing. But in the near term, it creates uncertainty about which hardware investments will retain value.

The Cloud Wars Intensify

Meanwhile, the hyperscalers are locked in aggressive competition for AI workloads—and the potential Amazon-OpenAI deal illustrates how tangled these relationships have become.

The Circular Deal Problem

Amazon has already invested at least $8 billion in Anthropic, OpenAI's primary competitor. Now they're reportedly in talks to invest $10 billion in OpenAI itself, potentially bundled with Trainium chip adoption.

These deals raise questions about circular economics. OpenAI signed a $38 billion cloud computing deal with Amazon in November. Now Amazon may invest $10 billion back, which OpenAI would largely spend on... Amazon infrastructure.

The strategic logic for Amazon is clearer than the financial logic:

  • Chip validation: If OpenAI adopts Trainium, it legitimizes AWS's custom silicon against NVIDIA
  • Ecosystem lock-in: Deeper infrastructure integration makes switching costs prohibitive
  • Competitive positioning: Even without exclusive model rights (Microsoft holds those through the 2030s), Amazon gains AI credibility

GPU Pricing Across Clouds

For enterprises evaluating cloud options, 2025 pricing has become more competitive:

| Provider | H100 On-Demand | A100 On-Demand | Notes |
|----------|----------------|----------------|-------|
| AWS | ~$3.90/hr (after 44% cut) | ~$3.06/hr | June 2025 price reduction |
| Azure | $6.98/hr | $3.67/hr | Regional variation up to 30% |
| Google Cloud | $11.06/hr (8-GPU instance) | ~$3.00/hr | Granular pricing model |
| Lambda Labs | $2.49-$3.29/hr | $1.29/hr | ML-focused provider |

The spread is significant—AWS's H100 pricing is now nearly half of Azure's. But hidden costs matter: data transfer egress ($0.08-$0.12/GB), storage, and networking can add 20-40% to monthly bills.
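
As a rough illustration of how those overheads change the bill, here is a minimal sketch that applies the figures above to a hypothetical single-H100 workload; the instance-hours, egress volume, and overhead percentage are assumed values, not quoted prices.

```python
def monthly_cloud_cost(gpu_rate_hr: float, gpu_hours: float,
                       egress_gb: float, egress_rate_gb: float,
                       storage_net_pct: float) -> dict:
    """Split a monthly bill into the GPU line item plus the 'hidden' parts:
    data egress, and storage/networking modeled as a share of compute spend."""
    compute = gpu_rate_hr * gpu_hours
    egress = egress_gb * egress_rate_gb
    storage_net = compute * storage_net_pct
    return {"compute": compute, "egress": egress,
            "storage_networking": storage_net,
            "total": compute + egress + storage_net}

# Assumptions: one H100 at ~$3.90/hr for 500 hours, 2 TB of egress at
# $0.09/GB, and storage/networking at 25% of the compute line item.
bill = monthly_cloud_cost(3.90, 500, 2_000, 0.09, 0.25)
print({item: f"${cost:,.0f}" for item, cost in bill.items()})
```

Under those assumptions the "hidden" items add roughly a third on top of the raw GPU charge, consistent with the 20-40% range cited above.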

Spot instances offer 60-90% discounts but carry interruption risk (a rough cost comparison follows the list):

  • AWS: 2-minute warning before interruption
  • Azure: 30-second warning
  • Google Cloud: 30-second notice; Spot prices change at most once per month
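
One way to sanity-check whether the spot discount survives those interruptions is to fold the recomputed work into an effective hourly rate. The discount and wasted-compute figures below are illustrative assumptions, not measured values.

```python
def effective_spot_rate(on_demand_hr: float, discount: float,
                        wasted_fraction: float) -> float:
    """Effective hourly cost of spot capacity when a fraction of compute is
    lost to interruptions (work since the last checkpoint is redone)."""
    spot_hr = on_demand_hr * (1 - discount)
    return spot_hr / (1 - wasted_fraction)   # you also pay for the wasted hours

# Assumptions: $3.90/hr on-demand H100, a 70% spot discount, and 10% of
# compute lost to preemptions and restarts.
print(f"${effective_spot_rate(3.90, 0.70, 0.10):.2f}/hr effective vs $3.90/hr on-demand")
```

Even with a tenth of the work thrown away, the effective rate stays well under on-demand pricing, which is why checkpoint-friendly workloads favor spot.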

Enterprise Strategy: Rent vs. Own

The macro trends are clear. Enterprise AI spending hit $37 billion in 2025—a 3.2x year-over-year increase from $11.5 billion in 2024. But where that money goes is shifting.

The Buy Over Build Trend

Enterprises are increasingly choosing purchased solutions over internal builds. In 2024, 47% of AI solutions were built internally. Today, 76% of AI use cases are purchased rather than built in-house.

This has infrastructure implications. If you're not building models, you may not need training clusters. Inference workloads have different economics—and different hardware requirements.

When On-Premises Makes Sense

Cloud dominates, but specific factors push toward on-premises investment:

  • Data sovereignty: Regulatory requirements increasingly mandate geographic control
  • Latency sensitivity: Applications requiring <10ms response times can't tolerate cloud round-trips
  • Predictable workloads: Steady-state inference at scale often pencils out cheaper on owned hardware (see the break-even sketch after this list)
  • Competitive sensitivity: Some organizations won't run proprietary models through cloud APIs
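
"Pencils out cheaper" is easy to test with a back-of-the-envelope break-even calculation. The sketch below compares rented GPU-hours against a fully loaded owned server; the purchase price, power draw, electricity rate, amortization window, opex percentage, and utilization are all assumptions to replace with your own figures.

```python
def owned_cost_per_busy_hour(purchase_price: float, years: float,
                             power_kw: float, usd_per_kwh: float,
                             annual_opex_pct: float, utilization: float) -> float:
    """Fully loaded cost per *utilized* hour of an owned GPU server:
    amortized purchase price + round-the-clock power + yearly opex
    (space, cooling, staff), divided by the hours spent doing useful work."""
    total_hours = years * 365 * 24
    power = power_kw * usd_per_kwh * total_hours      # assumes the box runs 24/7
    opex = purchase_price * annual_opex_pct * years
    return (purchase_price + power + opex) / (total_hours * utilization)

# All assumptions: a $250k 8-GPU H100-class server amortized over 4 years,
# 8 kW draw at $0.10/kWh, 10%/year opex, 70% utilization.
server_hr = owned_cost_per_busy_hour(250_000, 4, 8.0, 0.10, 0.10, 0.70)
print(f"~${server_hr / 8:.2f} per owned GPU-hour vs $2.49-$3.90/hr rented")
```

With those assumptions, owned capacity lands under $2 per GPU-hour; the conclusion flips quickly if utilization drops, which is exactly why the recommendation is limited to predictable, steady-state workloads.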

The Hybrid Reality

55% of organizations use public cloud, while 51% rely on hybrid setups. This isn't a contradiction: most enterprises are running both.

A practical hybrid strategy might look like:

| Workload Type | Recommended Approach | Rationale |
|---------------|----------------------|-----------|
| Experimentation | Cloud (spot instances) | Low commitment, easy scaling |
| Model training | Cloud (reserved) or specialized providers | Burst capacity, latest GPUs |
| Steady-state inference | On-premises or dedicated cloud | Predictable costs, data control |
| Edge inference | On-premises appliances | Latency, connectivity resilience |

Looking Ahead: 2026 and Beyond

Several trends will shape hardware decisions in the coming year.

Inference Overtakes Training

Inference hardware spend is projected to jump from $12 billion in 2025 to $21 billion in 2026, overtaking training expenditures within 12-18 months. This shifts the calculus:

  • Inference workloads are more predictable, favoring owned hardware
  • Inference-optimized silicon (like Google's TPUs or custom ASIC designs) becomes more relevant
  • Edge deployment for inference gains momentum

GPU Rental Prices Stabilize

H100 rental prices have dropped to $2.85/hour on competitive platforms, with A100s at $0.66/hour. New GPU releases (Blackwell, MI350) may push older-generation prices down further.

For enterprises, this creates opportunity: the cost of experimentation is falling. What was a major capital commitment two years ago is now approachable as an operational expense.
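
To put that in concrete terms, here is a back-of-the-envelope comparison for a hypothetical week-long fine-tuning experiment; the node count, duration, and server price are assumptions.

```python
# All assumptions: 8 rented H100s at $2.85/GPU-hour for a one-week run,
# versus buying a comparable 8-GPU server outright (~$250k class).
gpus, hours, rate = 8, 7 * 24, 2.85
print(f"Rental for the week: ${gpus * hours * rate:,.0f} vs ~$250,000+ to own")
```

A few thousand dollars of opex buys a week of serious experimentation that would have required a six-figure capital request two years ago.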

Skills Gap Drives Outsourcing

44% of executives cite lack of in-house AI expertise as the biggest barrier to AI deployment. This pushes toward managed services and cloud-based solutions where infrastructure complexity is abstracted away.

Practical Implications for Buyers

How should enterprise hardware buyers navigate this landscape?

  1. Don't over-commit to training infrastructure. Unless you're building foundation models, training needs are likely decreasing as fine-tuning and inference become the primary workloads.
  2. Watch the custom silicon space. Cerebras's IPO, OpenAI's Broadcom partnership, and Amazon's Trainium push all suggest alternatives to NVIDIA are gaining viability. Early evaluation positions you for future options.
  3. Negotiate aggressively on cloud pricing. The hyperscaler price war is real. AWS's 44% H100 price cut shows flexibility exists. Reserved instances and committed use discounts can match or beat on-premises TCO.
  4. Plan for hybrid. Pure cloud and pure on-premises are both limiting. Most organizations will need both—cloud for burst capacity and experimentation, on-premises for steady-state inference and sensitive workloads.
  5. Factor in the skills gap. Hardware is only valuable if you can operate it. Managed services and cloud abstraction may be worth the premium if internal expertise is limited.

The Bigger Picture

The AI infrastructure market is consolidating at the top and fragmenting in the middle.

Hyperscalers and leading AI labs are building custom silicon, creating vertically integrated stacks that may not be available to everyone else. Meanwhile, the commercial market is seeing more competition, falling prices, and increasing optionality.

For enterprise buyers, this is actually favorable. You're not going to build wafer-scale processors or design custom accelerators. But you benefit from the competition among those who do.

The key is staying flexible. The hardware you buy or rent today will operate in a different landscape by 2027. Plan for that uncertainty, and you'll be positioned to take advantage of whatever emerges.
