Expert Belief Update / Legitimacy Signal

Issue 62 • Edition 2026-03-03 • 4 min read

Not accepted • General

Sources: 1 • Confidence: Medium • Updated: 2026-03-08 21:22

Key takeaways

Donald Knuth expressed joy that his conjecture has a nice solution and that the solution illustrates progress in automated reasoning.
Donald Knuth stated that Claude Opus 4.6, which he said had been released three weeks earlier, solved an open problem that he had worked on for several weeks.
Donald Knuth characterized the event as evidence of a dramatic advance in automatic deduction and creative problem solving.
Donald Knuth stated that he expects to revise his opinions about generative AI in light of this result.


Open questions

What exactly was the problem or conjecture, and what is the full solution output attributed to the model?
Has the solution been independently checked and accepted by relevant experts or any formal venue, and if so, under what criteria?
Is the result reproducible across runs and across other leading systems, or is it idiosyncratic to a single model/version/prompting setup?
What were the inputs, constraints, and tooling used (e.g., any retrieval, code execution, theorem prover integration, or human feedback during the attempt)?
What specific opinions does Knuth expect to revise, and does he change any recommended practices for using generative AI in research work?

Implications

If independently validated, this anecdote could be interpreted as a legitimacy signal for automated reasoning progress, potentially increasing attention to frontier model capabilities in formal domains.
A public revision of Knuth's opinions could be read as a high-profile attitude shift that may influence broader openness to using generative AI in research workflows, contingent on verification and reproducibility.

Confirming evidence

Public release of the exact problem statement, full model output, and a clear verification trail showing the result is correct under agreed criteria.
Independent expert checking or formal acceptance in a credible venue, explicitly confirming novelty and correctness of the solution.
Reproducibility evidence: same result across reruns and across other leading systems, plus disclosure of inputs, constraints, and any tooling used.

Disconfirming evidence

Independent reviewers find the solution incorrect, incomplete, not novel, or dependent on hidden assumptions that invalidate the claimed open-problem status.
Disclosure shows substantial human intervention or external tooling that materially drove the result, undermining attribution to the model alone.
Failure to reproduce the result across reruns or systems, or inability to provide the original prompt and outputs for inspection.