Agentic Coding Outputs And Scope Expansion
Sources: 1 • Confidence: Medium • Updated: 2026-04-13 03:43
Key takeaways
- Max Woolf describes a sequence of coding-agent projects that increase in ambition from simple YouTube metadata scrapers to substantially larger builds.
- Max Woolf states that he believes Opus 4.5 and later models are an order of magnitude better at coding than models from just months earlier, while acknowledging that saying so publicly can sound like hype.
- Max Woolf claims that, using agents, he is developing a Rust crate named "rustlearn" that implements fast versions of standard machine-learning algorithms including logistic regression and k-means.
- Max Woolf reports attempting to break Opus and Codex with complex tasks that would take him months alone, and reports that the models kept completing those tasks correctly.
- The post sits within a growing genre asserting that coding agents became notably effective around November, implying a perceived recent inflection in capability.
Sections
Agentic Coding Outputs And Scope Expansion
- Max Woolf describes a sequence of coding-agent projects that increase in ambition from simple YouTube metadata scrapers to substantially larger builds.
- Simon Willison reports that Claude Code successfully produced a Rust word-cloud CLI tool after he asked it to build one.
- Max Woolf frames porting scikit-learn to Rust with comparable features as an extremely ambitious task.
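The source does not include code for the word-cloud CLI, so the following is an illustrative sketch only: the core of any such tool is a word-frequency pass over input text, shown here in Rust with hypothetical function names (`word_counts`, `top_words`) that are not taken from the actual tool.

```rust
use std::collections::HashMap;

// Illustrative only: count case-folded word frequencies in a text.
// This is NOT the CLI described in the source, whose code is not shown.
fn word_counts(text: &str) -> HashMap<String, usize> {
    let mut counts = HashMap::new();
    for word in text
        .split(|c: char| !c.is_alphanumeric()) // split on any non-word character
        .filter(|w| !w.is_empty())
    {
        *counts.entry(word.to_lowercase()).or_insert(0) += 1;
    }
    counts
}

// Return the n most frequent words, breaking ties alphabetically,
// i.e. the data a word-cloud renderer would size its words from.
fn top_words(counts: &HashMap<String, usize>, n: usize) -> Vec<(String, usize)> {
    let mut v: Vec<_> = counts.iter().map(|(w, c)| (w.clone(), *c)).collect();
    v.sort_by(|a, b| b.1.cmp(&a.1).then(a.0.cmp(&b.0)));
    v.truncate(n);
    v
}
```

A real CLI would add argument parsing, stop-word filtering, and rendering on top of this frequency core; none of those details are given in the source.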
Perceived Recent Capability Inflection And Credibility Gap
- Max Woolf states that he believes Opus 4.5 and later models are an order of magnitude better at coding than models from just months earlier, while acknowledging that saying so publicly can sound like hype.
- The post sits within a growing genre asserting that coding agents became notably effective around November, implying a perceived recent inflection in capability.
Agent-Enabled Reimplementation/Porting Of Core ML Algorithms Into Rust With Performance Claims
- Max Woolf claims that, using agents, he is developing a Rust crate named "rustlearn" that implements fast versions of standard machine-learning algorithms including logistic regression and k-means.
- Max Woolf asserts that his described three-step pipeline can outperform scikit-learn implementations even for simpler algorithms.
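Neither the "rustlearn" code nor the three-step pipeline is shown in the source, so as an illustration of the kind of algorithm such a crate would implement, here is a minimal Lloyd's k-means on 2-D points in Rust. The function names and shapes are hypothetical, not taken from the crate.

```rust
// Illustrative only: plain Lloyd's k-means on 2-D points with a fixed
// iteration count. NOT the rustlearn implementation, which is not shown
// in the source.
fn dist2(a: &[f64; 2], b: &[f64; 2]) -> f64 {
    (a[0] - b[0]).powi(2) + (a[1] - b[1]).powi(2)
}

fn kmeans(
    points: &[[f64; 2]],
    mut centroids: Vec<[f64; 2]>,
    iters: usize,
) -> (Vec<[f64; 2]>, Vec<usize>) {
    let k = centroids.len();
    let mut labels = vec![0usize; points.len()];
    for _ in 0..iters {
        // Assignment step: label each point with its nearest centroid
        // by squared Euclidean distance.
        for (i, p) in points.iter().enumerate() {
            labels[i] = (0..k)
                .min_by(|&a, &b| {
                    dist2(p, &centroids[a])
                        .partial_cmp(&dist2(p, &centroids[b]))
                        .unwrap()
                })
                .unwrap();
        }
        // Update step: move each centroid to the mean of its members.
        for c in 0..k {
            let members: Vec<&[f64; 2]> = points
                .iter()
                .zip(&labels)
                .filter(|(_, &l)| l == c)
                .map(|(p, _)| p)
                .collect();
            if !members.is_empty() {
                let n = members.len() as f64;
                centroids[c] = [
                    members.iter().map(|p| p[0]).sum::<f64>() / n,
                    members.iter().map(|p| p[1]).sum::<f64>() / n,
                ];
            }
        }
    }
    (centroids, labels)
}
```

A production version, and any fair comparison against scikit-learn, would also need convergence checks, k-means++ initialization, and multi-dimensional/SIMD-friendly data layouts; the source gives no detail on how the described pipeline achieves its claimed performance.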
Long-Horizon Task Robustness (Anecdotal)
- Max Woolf reports attempting to break Opus and Codex with complex tasks that would take him months alone, and reports that the models kept completing those tasks correctly.
Watchlist
- The post sits within a growing genre asserting that coding agents became notably effective around November, implying a perceived recent inflection in capability.
Unknowns
- Are the referenced artifacts (the "rustlearn" crate and the Rust word-cloud CLI) publicly available with reproducible build steps, tests, and CI?
- What exactly is the described three-step pipeline, and under what conditions does it outperform scikit-learn (datasets, metrics, hardware, hyperparameters, preprocessing)?
- What are the acceptance criteria for "completing correctly" on the months-scale tasks, and how often do such tasks fail without human patching?
- Is there objective evidence for the claimed timing and magnitude of a coding-agent capability inflection (e.g., benchmark deltas over time), rather than a narrative impression?
- What operational constraints dominate these workflows (tooling setup, context management, rate limits, model costs, review practices), and how do they scale with project size?