Compute Economics: Scarcity Now, Overbuild Risk Later
Sources: 1 • Confidence: Medium • Updated: 2026-04-11 17:57
Key takeaways
- The strongest overbuild analogy is disputed on the grounds that AI capex is led by blue-chip companies with substantial cash and debt capacity rather than fragile startups.
- Bots and cheap explosive drones are framed as creating an economic asymmetry where attacks are cheap but defense and verification are expensive, requiring new defensive technologies and approaches.
- An agent is defined as an LLM connected to a bash-like shell plus a filesystem for state, with a cron-like loop/heartbeat and markdown files as a common state format.
- Recent AI product shocks are framed as unlocking decades of accumulated research rather than being purely recent inventions.
- Open-source and edge inference become more important when centralized inference is capacity-constrained and when users want trust, privacy, latency, and price optimization from local models.
Sections
Compute Economics: Scarcity Now, Overbuild Risk Later
- The strongest overbuild analogy is disputed on the grounds that AI capex is led by blue-chip companies with substantial cash and debt capacity rather than fragile startups.
- Some users are reportedly spending on the order of $1,000 per day on Claude tokens to run agent-like workloads.
- AI has historically exhibited recurring boom-bust cycles described as 'summers' and 'winters'.
- A major driver of the dot-com crash is framed as leveraged telecom overbuilding based on incorrect traffic growth expectations.
- Compute capacity is framed as scarce such that incremental spending to deploy GPUs converts into revenue quickly.
- Current user-facing AI models are framed as 'sandbagged' due to supply constraints, implying more abundant compute would improve delivered capability even without algorithmic progress.
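The reported $1,000/day agent spend can be sanity-checked with back-of-envelope arithmetic. The sketch below uses an assumed blended price per million tokens; it is an illustration, not a quoted rate from the source.

```python
# Back-of-envelope: what daily token volume would a $1,000/day agent
# workload imply? The price below is an illustrative assumption,
# not a rate quoted in the source.
DAILY_SPEND_USD = 1_000.0
ASSUMED_USD_PER_MTOK = 10.0  # assumed blended input/output price per 1M tokens

implied_mtok_per_day = DAILY_SPEND_USD / ASSUMED_USD_PER_MTOK
implied_tok_per_second = implied_mtok_per_day * 1_000_000 / 86_400

print(f"Implied volume: {implied_mtok_per_day:.0f}M tokens/day")
print(f"Sustained rate: {implied_tok_per_second:,.0f} tokens/sec")
```

At that assumed price, $1,000/day implies roughly 100M tokens per day, a sustained rate on the order of a thousand tokens per second, which is plausible only for always-on agentic loops rather than interactive chat.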
Security, Identity, And Agent Commerce
- Bots and cheap explosive drones are framed as creating an economic asymmetry where attacks are cheap but defense and verification are expensive, requiring new defensive technologies and approaches.
- a16z is claimed to be a key participant in the World proof-of-human project and to view its approach as correct for addressing the bot problem.
- Models are claimed to be able to reverse-engineer complex software binaries to recover source-like representations where human reverse engineering would be prohibitively slow.
- A small cohort of users has reportedly given AI agents direct access to bank accounts and credit cards to enable autonomous spending.
- Permissive early-adopter usage patterns are framed as a way to discover both valuable capabilities and dangerous failure modes of agents.
- Some users are reportedly using AI agents to scan local networks, identify insecure IoT devices, and take over control of home systems including cameras and access controls.
Agent Architecture As OS Primitives And Portability
- An agent is defined as an LLM connected to a bash-like shell plus a filesystem for state, with a cron-like loop/heartbeat and markdown files as a common state format.
- The need for specialized tool-connection protocols is disputed in favor of exposing capabilities via command-line interfaces.
- Model-provider lock-in via proprietary internal representations is framed as potentially limited because competing models could learn or reverse-engineer what another model produced.
- Pi combined with OpenClaw is claimed to join an LLM paradigm with a Unix shell paradigm for building agents.
- If agent state is stored in files, the underlying LLM and even runtime environment can be swapped while preserving the agent's memories and capabilities.
- Because an agent can introspect and rewrite its own files, it can add new functions to itself with minimal human effort.
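The agent shape described above can be sketched in a few lines. This is a minimal illustration under stated assumptions: the `call_llm` stub, the `state.md` filename, and the loop cadence are all hypothetical, not the source's actual implementation.

```python
"""Minimal sketch of the agent definition above: an LLM wired to a shell,
markdown files as state, and a cron-like heartbeat loop."""
import pathlib
import subprocess
import time

STATE = pathlib.Path("state.md")  # hypothetical markdown state file


def call_llm(prompt: str) -> str:
    """Stub standing in for a model call; a real agent would hit a
    provider API here. Returns a shell command for the agent to run."""
    return "echo heartbeat"  # placeholder command


def heartbeat() -> str:
    """One tick: read state, ask the model for a command, run it,
    append the observation back to the markdown state file."""
    state = STATE.read_text() if STATE.exists() else ""
    command = call_llm(f"Current state:\n{state}\nNext shell command?")
    result = subprocess.run(command, shell=True, capture_output=True, text=True)
    # Persisting state in a plain file is what makes the underlying LLM
    # (or runtime) swappable: any model can read and extend this file.
    STATE.write_text(state + f"\n- ran `{command}` -> {result.stdout.strip()}")
    return result.stdout.strip()


if __name__ == "__main__":
    for _ in range(3):  # cron-like loop; real agents would sleep far longer
        heartbeat()
        time.sleep(0.1)
```

Because all state lives in `state.md`, swapping `call_llm` to point at a different model provider preserves the agent's accumulated memory, which is the portability claim the bullets above make.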
Capability Trajectory And Diffusion Framing
- Recent AI product shocks are framed as unlocking decades of accumulated research rather than being purely recent inventions.
- Open sourcing accelerates replication of AI breakthroughs by revealing implementation details (papers and code), enabling rapid diffusion of capabilities like reasoning.
- A recent 'reasoning breakthrough' materially addressed the critique that LLMs were only pattern completion and not suitable for high-stakes professional work.
- The AI field has converged on neural networks as the correct core architecture after decades of controversy.
- AI progress is framed as four breakthroughs: large language models, reasoning, agents, and recursive self-improvement (RSI).
- Scaling laws are framed as a self-fulfilling coordination target similar to Moore's law, helping keep progress on-curve.
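The "on-curve" framing refers to power-law loss curves of the Chinchilla form L = E + A/N^α + B/D^β. The sketch below uses constants approximating the published Hoffmann et al. fit, here purely for illustration of how loss falls predictably as parameters and data scale.

```python
# Illustrative power-law scaling curve of the kind the scaling-laws
# framing refers to. Constants approximate the Chinchilla fit
# (Hoffmann et al.); they are shown for illustration, not as a
# claim from the source.
def loss(n_params: float, n_tokens: float,
         e: float = 1.69, a: float = 406.4, b: float = 410.7,
         alpha: float = 0.34, beta: float = 0.28) -> float:
    """Chinchilla-style form: L = E + A / N**alpha + B / D**beta."""
    return e + a / n_params**alpha + b / n_tokens**beta

# Scaling parameters and data by 10x moves loss down the curve
# predictably, toward the irreducible term E.
small = loss(1e9, 2e10)   # ~1B params, ~20B tokens
big = loss(1e10, 2e11)    # ~10B params, ~200B tokens
print(f"{small:.3f} -> {big:.3f}")
```

The predictability is the point of the coordination-target claim: labs can budget compute against an expected loss before training, much as fabs once planned against Moore's law.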
Market Structure And Competitive Fragility At The App Layer
- Open-source and edge inference become more important when centralized inference is capacity-constrained and when users want trust, privacy, latency, and price optimization from local models.
- Some companies building on top of foundation models will be outcompeted when next-generation models absorb their differentiating features.
- The current U.S. administration is characterized as supportive of AI and open-source AI, contrasting with a prior administration characterized as hostile.
- Chinese AI firms may open-source models as a loss leader because they expect limited ability to sell commercial AI services in the U.S.
- AI2 is claimed to have collapsed, and U.S. open-source model labs are characterized as weaker near-term relative to firms like Mistral.
- The market for scaled foundation-model companies is predicted to consolidate from roughly a dozen across the U.S. and China to a small number of winners within three years.
Watchlist
- Agent-to-agent interaction across social networks could introduce alignment and control risks if agents are allowed to act autonomously.
- The source suggests there may be additional, not-yet-understood scaling laws ahead (e.g., for world models, robotics, and real-world data acquisition).
Unknowns
- Are 'reasoning breakthroughs' measurably improving correctness and reliability in high-stakes professional deployments relative to prior models?
- What is the actual duration and magnitude of compute scarcity (GPU and non-GPU) across clouds and enterprises, and does it match the predicted multi-year shortage horizon?
- To what extent are user-facing models 'sandbagged' by supply constraints versus limited by model quality, safety policy, or product design choices?
- How common is very high agent token spend (e.g., on the order of $1,000/day), and what workload mix drives it (tool calls, browsing, coding, long-context reasoning)?
- Will agentic systems built on file-backed state demonstrate practical cross-model portability without material regressions in performance, security, or maintainability?