Rosa Del Mar

Daily Brief

Issue 58 2026-02-27

Agentic Coding Applied To Non-Trivial Rust Builds

Sources: 1 • Confidence: Medium • Updated: 2026-03-02 19:33

Key takeaways

  • Max Woolf describes a sequence of coding-agent projects that increase in ambition from simple scripts to substantially larger builds.
  • Max Woolf says he believes Opus 4.5 and later models are an order of magnitude better at coding than models from months earlier, and that making such a claim publicly is difficult without sounding like hype.
  • Max Woolf reports that he tried to break Opus and Codex with complex tasks that would each take him months on his own, but that the models kept completing them correctly.
  • The post is presented within a broader narrative that coding agents became notably effective around November.
  • Simon Willison reports that he asked Claude Code to build a Rust word-cloud CLI tool and that Claude Code successfully produced it.
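The word-cloud CLI itself is not shown in the brief, but its core is simple to picture: tokenize input, tally frequencies, and surface the most common terms. A minimal sketch of that core, in illustrative Rust (these function names are assumptions, not the actual tool Claude Code produced):

```rust
use std::collections::HashMap;

/// Lowercase each word, strip non-alphanumeric characters, and tally counts.
fn word_counts(text: &str) -> HashMap<String, usize> {
    let mut counts = HashMap::new();
    for word in text.split_whitespace() {
        let cleaned: String = word
            .chars()
            .filter(|c| c.is_alphanumeric())
            .flat_map(|c| c.to_lowercase())
            .collect();
        if !cleaned.is_empty() {
            *counts.entry(cleaned).or_insert(0) += 1;
        }
    }
    counts
}

/// Return the `n` most frequent words; ties broken alphabetically.
fn top_n(counts: &HashMap<String, usize>, n: usize) -> Vec<(String, usize)> {
    let mut pairs: Vec<_> = counts.iter().map(|(w, &c)| (w.clone(), c)).collect();
    pairs.sort_by(|a, b| b.1.cmp(&a.1).then(a.0.cmp(&b.0)));
    pairs.truncate(n);
    pairs
}

fn main() {
    let text = "the quick brown fox jumps over the lazy dog The END";
    for (word, count) in top_n(&word_counts(text), 3) {
        println!("{word}: {count}");
    }
}
```

A real CLI would read from stdin or a file path and render sized glyphs, but the frequency table above is the part every word-cloud tool shares.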

Sections

Agentic Coding Applied To Non-Trivial Rust Builds

  • Max Woolf describes a sequence of coding-agent projects that increase in ambition from simple scripts to substantially larger builds.
  • Simon Willison reports that he asked Claude Code to build a Rust word-cloud CLI tool and that Claude Code successfully produced it.
  • Max Woolf states that, using agents, he is developing a Rust crate called "rustlearn" that implements fast versions of standard machine-learning algorithms including logistic regression and k-means.
  • Max Woolf frames porting scikit-learn to Rust with comparable features as an extremely ambitious task.
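To ground what "fast versions of standard machine-learning algorithms" means here, a minimal sketch of k-means (Lloyd's algorithm) in Rust follows. This is an illustration under stated assumptions, not the actual "rustlearn" API: it uses naive first-k initialization, where a production crate would use something like k-means++ and SIMD-friendly layouts.

```rust
/// Squared Euclidean distance between two points of equal dimension.
fn squared_distance(a: &[f64], b: &[f64]) -> f64 {
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum()
}

/// Lloyd's algorithm: assign each point to its nearest centroid, then move
/// each centroid to the mean of its assigned points, until labels stabilize.
/// Returns (centroids, labels). Initialization is naive (first k points).
fn kmeans(points: &[Vec<f64>], k: usize, max_iter: usize) -> (Vec<Vec<f64>>, Vec<usize>) {
    let mut centroids: Vec<Vec<f64>> = points.iter().take(k).cloned().collect();
    let mut labels = vec![0usize; points.len()];
    for _ in 0..max_iter {
        // Assignment step.
        let mut changed = false;
        for (i, p) in points.iter().enumerate() {
            let best = (0..k)
                .min_by(|&a, &b| {
                    squared_distance(p, &centroids[a])
                        .partial_cmp(&squared_distance(p, &centroids[b]))
                        .unwrap()
                })
                .unwrap();
            if labels[i] != best {
                labels[i] = best;
                changed = true;
            }
        }
        // Update step: each centroid becomes the mean of its cluster.
        let dim = points[0].len();
        let mut sums = vec![vec![0.0; dim]; k];
        let mut counts = vec![0usize; k];
        for (p, &l) in points.iter().zip(&labels) {
            counts[l] += 1;
            for (s, x) in sums[l].iter_mut().zip(p) {
                *s += x;
            }
        }
        for (c, (s, &n)) in centroids.iter_mut().zip(sums.iter().zip(&counts)) {
            if n > 0 {
                for (cj, sj) in c.iter_mut().zip(s) {
                    *cj = sj / n as f64;
                }
            }
        }
        if !changed {
            break;
        }
    }
    (centroids, labels)
}

fn main() {
    // Two well-separated 2-D blobs; k-means should split them cleanly.
    let points = vec![
        vec![0.0, 0.0], vec![0.1, 0.2], vec![0.2, 0.1],
        vec![9.0, 9.0], vec![9.1, 9.2], vec![8.9, 9.1],
    ];
    let (centroids, labels) = kmeans(&points, 2, 100);
    println!("labels: {labels:?}");
    println!("centroids: {centroids:?}");
}
```

The point of a sketch like this is scale: one algorithm fits in a page, while matching scikit-learn's breadth of estimators, options, and numerical edge cases is the "extremely ambitious" part.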

Perceived Recent Inflection And Skepticism/Credibility Gap

  • Max Woolf says he believes Opus 4.5 and later models are an order of magnitude better at coding than models from months earlier, and that making such a claim publicly is difficult without sounding like hype.
  • The post is presented within a broader narrative that coding agents became notably effective around November.

Claims Of Performance And Long-Horizon Correctness

  • Max Woolf reports that he tried to break Opus and Codex with complex tasks that would each take him months on his own, but that the models kept completing them correctly.
  • Max Woolf asserts that his described three-step pipeline can outperform scikit-learn implementations even for simpler algorithms.

Watchlist

  • The post is presented within a broader narrative that coding agents became notably effective around November.

Unknowns

  • Are the referenced artifacts (e.g., the "rustlearn" crate and the Rust word-cloud CLI) publicly available with reproducible build steps, tests, and CI?
  • What were the exact task specifications, acceptance criteria, and observed failure rates for the reported long-horizon "months of work" tasks?
  • What is the three-step pipeline referenced for the performance claim, and how does it control for fairness versus scikit-learn (same algorithmic variants, convergence criteria, and preprocessing)?
  • What concrete measurements support the narrative that coding agents became notably effective around November (benchmarks, real-world task completion rates, or internal productivity metrics)?
  • Which exact model versions and environments are being compared when claiming order-of-magnitude coding improvements, and are the comparisons controlled?

Sources