Agentic Coding Applied To Non-Trivial Rust Builds
Sources: 1 • Confidence: Medium • Updated: 2026-03-02 19:33
Key takeaways
- Max Woolf describes a sequence of coding-agent projects that increase in ambition from simple scripts to substantially larger builds.
- Max Woolf says Opus 4.5 and later models are an order of magnitude better at coding than models from just months earlier, and that it is hard to make such a claim publicly without sounding like hype.
- Max Woolf reports that he tried to break Opus and Codex with complex tasks that would have taken him months on his own, but that the models kept completing them correctly.
- The post is presented within a broader narrative that coding agents became notably effective around November.
- Simon Willison reports that he asked Claude Code to build a Rust word-cloud CLI tool and that Claude Code successfully produced it.
Sections
Agentic Coding Applied To Non-Trivial Rust Builds
- Max Woolf describes a sequence of coding-agent projects that increase in ambition from simple scripts to substantially larger builds.
- Simon Willison reports that he asked Claude Code to build a Rust word-cloud CLI tool and that Claude Code successfully produced it.
- Max Woolf states that, using agents, he is developing a Rust crate called "rustlearn" that implements fast versions of standard machine-learning algorithms including logistic regression and k-means.
- Max Woolf frames porting scikit-learn to Rust with comparable features as an extremely ambitious task.
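The word-cloud CLI itself is not shown in the source. As a sketch of the kind of core computation such a tool would need before any sizing or layout, here is a minimal word-frequency counter in Rust; the function name and structure are hypothetical, not the actual tool's code:

```rust
use std::collections::HashMap;

/// Count case-insensitive word frequencies in a block of text.
/// Splits on any non-alphanumeric character and drops empty tokens.
fn word_frequencies(text: &str) -> HashMap<String, usize> {
    let mut counts = HashMap::new();
    for word in text
        .split(|c: char| !c.is_alphanumeric())
        .filter(|w| !w.is_empty())
    {
        *counts.entry(word.to_lowercase()).or_insert(0) += 1;
    }
    counts
}

fn main() {
    let counts = word_frequencies("The quick brown fox jumps over the lazy dog the");
    // Rank by descending count, then alphabetically for a stable order.
    let mut ranked: Vec<_> = counts.into_iter().collect();
    ranked.sort_by(|a, b| b.1.cmp(&a.1).then(a.0.cmp(&b.0)));
    for (word, n) in ranked.iter().take(3) {
        println!("{word}\t{n}"); // "the" ranks first with a count of 3
    }
}
```

A real word-cloud tool would feed these counts into font-size scaling and a placement algorithm; this only illustrates the frequency step.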
Perceived Recent Inflection And Skepticism/Credibility Gap
- Max Woolf says Opus 4.5 and later models are an order of magnitude better at coding than models from just months earlier, and that it is hard to make such a claim publicly without sounding like hype.
- The post is presented within a broader narrative that coding agents became notably effective around November.
Claims Of Performance And Long-Horizon Correctness
- Max Woolf reports that he tried to break Opus and Codex with complex tasks that would have taken him months on his own, but that the models kept completing them correctly.
- Max Woolf asserts that his described three-step pipeline can outperform scikit-learn implementations even for simpler algorithms.
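The three-step pipeline behind this claim is not described in the source (see Unknowns). For context on the "simpler algorithms" mentioned, here is a minimal Rust sketch of one piece of k-means, the assignment step; the function name and API are hypothetical and not the crate's actual interface:

```rust
/// One k-means assignment step: map each point to the index of its
/// nearest centroid by squared Euclidean distance. Assumes all points
/// and centroids share the same dimensionality.
fn assign_clusters(points: &[Vec<f64>], centroids: &[Vec<f64>]) -> Vec<usize> {
    points
        .iter()
        .map(|p| {
            centroids
                .iter()
                .enumerate()
                .map(|(i, c)| {
                    // Squared distance avoids an unnecessary sqrt.
                    let d: f64 = p.iter().zip(c).map(|(a, b)| (a - b).powi(2)).sum();
                    (i, d)
                })
                .min_by(|a, b| a.1.partial_cmp(&b.1).unwrap())
                .map(|(i, _)| i)
                .unwrap()
        })
        .collect()
}

fn main() {
    let points = vec![vec![0.0, 0.0], vec![10.0, 10.0], vec![0.5, 0.0]];
    let centroids = vec![vec![0.0, 0.0], vec![10.0, 10.0]];
    println!("{:?}", assign_clusters(&points, &centroids)); // prints "[0, 1, 0]"
}
```

A full k-means loop would alternate this step with a centroid-update step until assignments stabilize; performance comparisons against scikit-learn would also hinge on initialization and convergence criteria, which the source does not specify.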
Watchlist
- The post is presented within a broader narrative that coding agents became notably effective around November.
Unknowns
- Are the referenced artifacts (e.g., the "rustlearn" crate and the Rust word-cloud CLI) publicly available with reproducible build steps, tests, and CI?
- What were the exact task specifications, acceptance criteria, and observed failure rates for the reported long-horizon "months of work" tasks?
- What is the three-step pipeline referenced for the performance claim, and how does it control for fairness versus scikit-learn (same algorithmic variants, convergence criteria, and preprocessing)?
- What concrete measurements support the narrative that coding agents became notably effective around November (benchmarks, real-world task completion rates, or internal productivity metrics)?
- Which exact model versions and environments are being compared when claiming order-of-magnitude coding improvements, and are the comparisons controlled?