Rosa Del Mar

Daily Brief

Issue 62 2026-03-03

Expert Belief Update / Legitimacy Signal

Not accepted General
Sources: 1 • Confidence: Medium • Updated: 2026-03-08 21:22

Key takeaways

  • Donald Knuth expressed joy that his conjecture has a nice solution and that the solution illustrates progress in automated reasoning.
  • Donald Knuth stated that Claude Opus 4.6, which he said had been released three weeks earlier, solved an open problem that he had worked on for several weeks.
  • Donald Knuth characterized the event as evidence of a dramatic advance in automatic deduction and creative problem solving.
  • Donald Knuth stated that he expects to revise his opinions about generative AI in light of this result.

Sections

Expert Belief Update / Legitimacy Signal

  • Donald Knuth expressed joy that his conjecture has a nice solution and that the solution illustrates progress in automated reasoning.
  • Donald Knuth stated that he expects to revise his opinions about generative AI in light of this result.

Single-Case Capability Demonstration (Open-Problem Solving)

  • Donald Knuth stated that Claude Opus 4.6, which he said had been released three weeks earlier, solved an open problem that he had worked on for several weeks.

Interpretation: Advance In Automatic Deduction And Creativity

  • Donald Knuth characterized the event as evidence of a dramatic advance in automatic deduction and creative problem solving.

Unknowns

  • What exactly was the problem/conjecture, and what is the full solution output attributed to the model?
  • Has the solution been independently checked and accepted by relevant experts or any formal venue, and if so, under what criteria?
  • Is the result reproducible across runs and across other leading systems, or is it idiosyncratic to a single model/version/prompting setup?
  • What were the inputs, constraints, and tooling used (e.g., any retrieval, code execution, theorem prover integration, or human feedback during the attempt)?
  • What specific opinions does Knuth expect to revise, and does he change any recommended practices for using generative AI in research work?

Investor overlay

Read-throughs

  • If independently validated, this anecdote could be interpreted as a legitimacy signal for automated reasoning progress, potentially increasing attention to frontier model capabilities in formal domains.
  • Knuth publicly revising his opinions could be read as a high-profile attitude shift that may influence broader openness to using generative AI in research workflows, contingent on verification and reproducibility.

What would confirm

  • Public release of the exact problem statement, full model output, and a clear verification trail showing the result is correct under agreed criteria.
  • Independent expert checking or formal acceptance in a credible venue, explicitly confirming novelty and correctness of the solution.
  • Reproducibility evidence: same result across reruns and across other leading systems, plus disclosure of inputs, constraints, and any tooling used.

What would kill

  • Independent reviewers find the solution incorrect, incomplete, not novel, or dependent on hidden assumptions that invalidate the claimed open-problem status.
  • Disclosure shows substantial human intervention or external tooling that materially drove the result, undermining attribution to the model alone.
  • Failure to reproduce the result across reruns or systems, or inability to provide the original prompt and outputs for inspection.

Sources

  1. 2026-03-03 simonwillison.net