What does the AI Compute Cost Outlook cover?

AI Compute Cost Outlook: Quarterly Briefing

This is a free summary. The full dataset for this section is members-only and blocked to crawlers — the AISA verify node returns 403 for .data.json / .full.md here, regardless of balance.

AI Compute Cost Outlook and GPU Pricing Trends (Next Quarters)

Executive Summary

AI compute costs are entering a bifurcated regime: hyperscaler capex and GPU supply are expanding rapidly, but inference demand and model complexity are growing even faster, keeping pressure on marginal prices. GPU rental rates have softened 20–25% year‑on‑year, yet hyperscaler AI‑focused capex is set to exceed $600 bn in 2026, implying continued tightness in high‑end capacity and limited downside for premium SKUs.

1. Hyperscaler Capex and AI Infrastructure Build‑out

The “big five” hyperscalers are on track to spend over $600 bn in capex in 2026, with roughly 75% ($450 bn) directly tied to AI infrastructure: servers, GPUs, and AI‑optimized datacenters. This represents a 36% year‑on‑year increase, funded increasingly via debt markets because capex, together with buybacks and dividends, now exceeds projected free cash flow. The result is a structural ramp in AI‑ready capacity, but with a heavy tilt toward next‑gen GPUs (H200/B200‑class and beyond), which remain constrained by packaging and power‑delivery bottlenecks.
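The capex figures above can be cross-checked with a few lines of arithmetic. The following is a minimal sketch, not the briefing's own model: it assumes the 36% year‑on‑year increase applies to total capex (the text does not say whether it refers to total or AI‑specific spend), and it treats "over $600 bn" as exactly $600 bn. The function name is illustrative.

```python
def capex_breakdown(total_capex_bn: float, ai_share: float, yoy_growth: float):
    """Return (AI-tied capex, implied prior-year total capex), both in $ bn."""
    ai_capex_bn = total_capex_bn * ai_share
    # ASSUMPTION: the growth rate applies to total capex, not only the AI portion.
    prior_year_total_bn = total_capex_bn / (1.0 + yoy_growth)
    return ai_capex_bn, prior_year_total_bn


# Inputs taken from the briefing: $600 bn total, 75% AI share, 36% growth.
ai_capex_bn, prior_year_total_bn = capex_breakdown(600.0, 0.75, 0.36)

print(f"AI-tied capex 2026: ${ai_capex_bn:.0f} bn")
print(f"Implied prior-year total capex: ${prior_year_total_bn:.0f} bn")
```

Under these assumptions the 75% share reproduces the $450 bn figure, and the implied prior-year total is roughly $441 bn.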

2. GPU Rental Rates and Compute Economics

Hourly GPU rental rates paid by neocloud and hyperscaler customers are down 20–25% over the past year, reflecting incremental supply and better utilization. However, only ~20 GW of the world’s ~125 GW of datacenter capacity (about 16%) is currently capable of running AI workloads, so the effective pool of high‑end GPUs remains tight. As a result, price declines are concentrated in mid‑tier SKUs; frontier‑class GPUs used for large‑scale training and high‑throughput inference are seeing only modest erosion, even though providers are still losing money on inference at scale for some workloads.
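To make the capacity and pricing figures concrete, here is a small sketch that computes the AI‑ready share of datacenter capacity and applies the 20–25% decline to a rental rate. The $2.00 per GPU‑hour starting rate is a hypothetical placeholder chosen for illustration; the briefing does not quote absolute rental prices.

```python
# Figures quoted in the briefing.
AI_READY_GW = 20.0
TOTAL_DATACENTER_GW = 125.0
DECLINE_RANGE = (0.20, 0.25)  # year-on-year decline in hourly rental rates


def ai_ready_share(ai_ready_gw: float, total_gw: float) -> float:
    """Fraction of datacenter capacity able to run AI workloads."""
    return ai_ready_gw / total_gw


def rate_after_decline(prior_rate: float, decline: float) -> float:
    """Hourly rate after applying a fractional year-on-year decline."""
    return prior_rate * (1.0 - decline)


# HYPOTHETICAL prior-year rate in USD per GPU-hour; not a figure from the briefing.
prior_rate = 2.00

share = ai_ready_share(AI_READY_GW, TOTAL_DATACENTER_GW)
low_rate, high_rate = sorted(rate_after_decline(prior_rate, d) for d in DECLINE_RANGE)

print(f"AI-ready share of capacity: {share:.0%}")
print(f"Rate after decline: ${low_rate:.2f} to ${high_rate:.2f} per GPU-hour")
```

With the placeholder rate, a 20–25% decline leaves the hourly price between $1.50 and $1.60, while only 16% of global capacity is AI‑ready.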

3. Inference Cost Curve and Model‑Level Efficiency

On the model side, frontier‑class inference is becoming markedly cheaper: Gartner forecasts that running a one‑trillion‑parameter model will cost providers over 90% less by 2030 than in 2025, with 2030‑era models up to 100× cheaper than 2022‑era equivalents. In practice, newer open‑weight models such as DeepSeek‑V4‑Pro are already delivering strong coding benchmarks at ~$0.44 per million input tokens and ~$0.87 per million output tokens, versus 10–20× higher for closed frontier models. This encourages “model‑level arbitrage,” where organizations route workloads to smaller, cheaper models, cutting costs by 40–85% versus using a single flagship model for all tasks.
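The routing arithmetic behind “model‑level arbitrage” can be sketched as follows. The open‑weight prices are the ones quoted above; the flagship price is an assumption set at 10× those prices (the low end of the 10–20× range), and the token volumes and the 70% routing fraction are hypothetical values chosen for illustration. The sketch also assumes routed requests have the same token mix as the rest of the workload.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelPrice:
    input_per_mtok: float   # USD per million input tokens
    output_per_mtok: float  # USD per million output tokens


def workload_cost(price: ModelPrice, input_tokens: float, output_tokens: float) -> float:
    """Cost in USD of serving the given token volumes at the given price."""
    return (input_tokens * price.input_per_mtok
            + output_tokens * price.output_per_mtok) / 1_000_000


def routed_cost(flagship: ModelPrice, cheap: ModelPrice,
                input_tokens: float, output_tokens: float,
                cheap_fraction: float) -> float:
    """Blended cost when `cheap_fraction` of traffic goes to the cheaper model."""
    cheap_part = workload_cost(cheap, input_tokens, output_tokens) * cheap_fraction
    flagship_part = workload_cost(flagship, input_tokens, output_tokens) * (1.0 - cheap_fraction)
    return cheap_part + flagship_part


# Prices quoted in the briefing for the open-weight model.
open_weight = ModelPrice(input_per_mtok=0.44, output_per_mtok=0.87)
# ASSUMPTION: flagship priced at 10x the open-weight model (low end of 10-20x).
flagship = ModelPrice(input_per_mtok=4.40, output_per_mtok=8.70)

# HYPOTHETICAL monthly workload and routing fraction.
input_tokens, output_tokens = 100_000_000, 20_000_000
baseline = workload_cost(flagship, input_tokens, output_tokens)
blended = routed_cost(flagship, open_weight, input_tokens, output_tokens, cheap_fraction=0.70)
savings = 1.0 - blended / baseline

print(f"Flagship only: ${baseline:.2f}")
print(f"Routed (70% to open-weight): ${blended:.2f}")
print(f"Savings: {savings:.0%}")
```

Under these assumptions, routing 70% of traffic to the cheaper model cuts spend by about 63%, which falls inside the 40–85% range cited above. A higher price multiple or a larger routed fraction moves the result toward the upper end of that range.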

4. What to Watch

GPU‑by‑SKU pricing divergence: Monitor spot and reserved‑instance pricing for H200/B200 versus mid‑tier SKUs; expect continued softness in the latter but resilience in the former as AI‑ready capacity remains constrained.
Hyperscaler debt‑funded capex: Track quarterly capex guidance and debt issuance from the big five; any slowdown in AI‑focused spend would tighten GPU availability and cap downside on rental rates.
Inference‑vs‑training demand mix: Watch compute‑demand forecasts from OpenAI and comparable labs; if training remains the dominant use case through 2030, premium GPU pricing will stay elevated despite a growing installed base.

Members-only briefing synthesized by the AISA LLM layer (AISA Perplexity API). 2026-06-23.