Rosa Del Mar

Daily Brief

Issue 81 2026-03-22

Lossy Self-Improvement Vs Recursive Self-Improvement

General
Sources: 1 • Confidence: Medium • Updated: 2026-03-25 17:57

Key takeaways

  • AI progress will likely appear more linear than exponential in hindsight because development loops will exhibit 'lossy self-improvement' in which friction breaks key recursive self-improvement assumptions.
  • Current language models are already capable of performing many highly valuable knowledge-work tasks.
  • Automation can effectively optimize single metrics such as test loss, but improvements on those metrics often do not translate into increased user productivity.
  • For leading general models, post-training is extremely complex, and much of the difficulty is concentrated in achieving the last 1–3% of performance without overfitting or harming out-of-domain behavior.
  • Compute and research resource allocation inside organizations will remain politically mediated, creating friction that prevents an unconstrained self-improvement loop.

Sections

Lossy Self-Improvement Vs Recursive Self-Improvement

  • AI progress will likely appear more linear than exponential in hindsight because development loops will exhibit 'lossy self-improvement' in which friction breaks key recursive self-improvement assumptions.
  • Recursive self-improvement requires a closed loop in which models improve the process of building better models, gains compound from iteration to iteration, and friction is low enough to avoid a sigmoid-shaped progress curve.
  • Compute and research resource allocation inside organizations will remain politically mediated, creating friction that prevents an unconstrained self-improvement loop.
  • For the next few years, the industry will operate in a 'lossy self-improvement' regime where models are core to the development loop but do not change the overall approach enough to cause takeoff.
  • Improving a model on specific tasks does not necessarily improve its ability to improve itself, because self-improvement depends on experiment design and navigating multiple metrics rather than single-objective optimization.
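The contrast between the two regimes can be sketched as a toy growth model (purely illustrative; the `feedback` parameter and the functional form are assumptions made here, not anything from the source): when gains feed back fully, capability compounds into an exponential curve, while friction that blocks the feedback leaves only linear, "lossy" gains.

```python
def progress(iterations, base_gain=0.5, feedback=1.0):
    """Toy capability trajectory (illustrative assumption, not a real model).

    feedback=1.0: each gain is proportional to current capability, so
    gains compound -> exponential curve (recursive self-improvement).
    feedback=0.0: friction blocks the loop, so each gain is a fixed
    increment -> linear curve (lossy self-improvement).
    """
    capability = 1.0
    trajectory = [capability]
    for _ in range(iterations):
        # Blend of compounding (capability-proportional) and flat gains.
        gain = base_gain * (feedback * capability + (1.0 - feedback))
        capability += gain
        trajectory.append(capability)
    return trajectory

recursive = progress(10, feedback=1.0)  # grows like 1.5**k
lossy = progress(10, feedback=0.0)      # grows like 1 + 0.5*k
```

Intermediate friction values produce the in-between regime the brief describes: models contribute to the loop, but the curve stays far from takeoff.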

Near-Term Capability And Adoption Expectations With Uncertainty

  • Current language models are already capable of performing many highly valuable knowledge-work tasks.
  • In 2026, AI will likely feel like a huge step forward due to workflow polishing and major training-compute scaling, but it will not be a fundamental change that triggers takeoff.
  • Superhuman coding assistants and easier AI research workflows will drive at least a year of rapid progress at the cutting edge of AI.
  • Near-term capability gains beyond coding and CLI-based computer use are difficult to predict, and it is unclear which additional tasks models will master within a year.
  • A plausible near-term milestone is an 'AGI threshold' where AI becomes a drop-in replacement for most remote workers even if capabilities remain jagged and non-humanlike.

Automation Translation Gap And Limits To Agent Parallelism

  • Automation can effectively optimize single metrics such as test loss, but improvements on those metrics often do not translate into increased user productivity.
  • Scaling the number of AI agents in parallel will face steep diminishing returns because agents sample similar solutions and remain bottlenecked by human supervision.
  • Prior AutoML efforts did not substantially change the day-to-day work of top researchers, suggesting that automating narrow optimization rarely substitutes for core research intuition and complexity management.
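The diminishing-returns claim can be illustrated with a toy success-probability model (the sublinear "effective attempts" penalty is an assumed functional form for illustration, not a measured relationship): when parallel agents sample similar solutions, an extra agent adds far less than an independent one would.

```python
def success_prob(n_agents, p_single=0.2, correlation=0.0):
    """P(at least one of n agents solves the task) under a toy model.

    correlation=0.0 treats agents as independent samples; as correlation
    rises, the effective number of *distinct* attempts grows sublinearly
    (n ** (1 - correlation)), modeling agents that sample similar
    solutions. Purely illustrative functional form.
    """
    effective_attempts = n_agents ** (1.0 - correlation)
    return 1.0 - (1.0 - p_single) ** effective_attempts

# Marginal value of adding a second agent:
independent_gain = success_prob(2) - success_prob(1)  # ~0.16
correlated_gain = success_prob(2, correlation=0.8) - success_prob(1, correlation=0.8)
```

Under high correlation the second agent is worth a small fraction of an independent one, and human supervision cost (not modeled here) scales with the raw agent count, compounding the diminishing returns.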

Diminishing Returns And Complexity Concentration In Post-Training

  • For leading general models, post-training is extremely complex, and much of the difficulty is concentrated in achieving the last 1–3% of performance without overfitting or harming out-of-domain behavior.
  • As AI systems become more complex, additional progress becomes harder and exhibits diminishing returns, consistent with a 'complexity break' framing.

Industry Concentration And Internal Governance As Bottlenecks

  • Compute and research resource allocation inside organizations will remain politically mediated, creating friction that prevents an unconstrained self-improvement loop.
  • The frontier AI industry is consolidating toward an oligopoly of roughly two to three labs with disproportionate access to top models and the resources to build next-generation systems.

Unknowns

  • Is the frontier model ecosystem actually consolidating into two to three labs, and what concrete mechanisms (exclusive access, compute deals, hiring) drive or prevent that consolidation?
  • How large is the real-world productivity uplift from current models across high-value knowledge-work tasks, and how reliable are these gains across organizations and workflows?
  • What is the empirical correlation between improvements in training/post-training metrics (e.g., loss, reward, benchmark scores) and user-level productivity or quality outcomes?
  • What are the dominant sources of difficulty in the 'last 1–3%' of post-training performance, and how often do marginal gains cause regressions or out-of-domain harms?
  • To what extent can AI systems perform the meta-work of AI R&D (experiment design, multi-metric tradeoffs, debugging failure modes) rather than only narrow optimization tasks?

Investor overlay

Read-throughs

  • AI progress may look more linear than exponential due to technical and organizational friction, implying capability gains come in costly, uneven increments rather than compounding automatically.
  • Model metric improvements may not translate into user productivity, implying adoption and monetization depend on workflow integration, reliability, and supervision costs more than benchmark wins.
  • Frontier progress may concentrate in a few labs due to access and allocation bottlenecks, implying supply of leading capability and compute is gated by exclusive deals and internal governance.

What would confirm

  • Real-world studies show large, repeatable productivity gains from models across multiple knowledge-work workflows, not just coding, with low supervision overhead.
  • Repeated cases where benchmark or loss improvements fail to improve user outcomes, alongside rising post-training effort focused on robustness and avoiding regressions.
  • Evidence of consolidation into two to three frontier labs via exclusive compute access, hiring concentration, and restricted model availability affecting downstream access.

What would kill

  • Clear demonstrations that AI systems reliably perform meta-level AI R&D work such as experiment design, debugging, and multi-metric tradeoffs, accelerating frontier progress without major human bottlenecks.
  • Strong empirical linkage between training and post-training metrics and user productivity across organizations, with predictable translation from leaderboard gains to workflow outcomes.
  • Sustained rapid progress from many labs without access gating, showing compute allocation politics and exclusivity do not materially constrain frontier advancement.

Sources

  1. 2026-03-22 interconnects.ai