LLM-Mediated Reverse Engineering Workflow And Reproducibility Constraints
Sources: 1 • Confidence: High • Updated: 2026-03-25 17:54
Key takeaways
- The author used Codex CLI with GPT-5.4 xhigh to review the zip of intermediate files for obvious hallucinations and, seeing none, published the result.
- An assembler-knowledgeable reviewer argued the output was not a full disassembly, consisted of short snippets, and questioned whether the snippets were correct.
- A search of the binary for the byte sequence B0 E8 (the encoding of 'mov al,0xe8') found no match, which was enough to show that a presented snippet does not occur anywhere in the binary.
- Borland's 1985 Turbo Pascal 3.02 executable was 39,731 bytes and included a full text editor IDE and a Pascal compiler.
- A reviewer noted additional suspicious and impossible code in the artifact, including a 'ret 1' in a system call dispatcher that would misalign the stack.
Sections
LLM-Mediated Reverse Engineering Workflow And Reproducibility Constraints
- The author obtained the Turbo Pascal executable and used Claude, via a sequence of prompts, to interpret the binary and produce an interactive annotated artifact.
- The shared Claude link did not include the code actually executed during the session, so the author provided a zip of intermediate files instead.
- The author used Codex CLI with GPT-5.4 xhigh to review the zip for obvious hallucinations and, seeing none, published the result.
Failure Mode: Plausible Technical Artifacts Can Be Partially Fabricated
- An assembler-knowledgeable reviewer argued the output was not a full disassembly, consisted of short snippets, and questioned whether the snippets were correct.
- A later update states that the published decompiled/annotated result was hallucinated and inaccurate.
- After receiving the critique, Claude agreed that the artifact mixed real hex dumps and some correct disassembly with assembly and labels, covering roughly half the binary, that were fabricated and fail byte-level comparison.
Low-Cost Falsification Techniques For Claimed Disassembly Snippets
- A search of the binary for the byte sequence B0 E8 (the encoding of 'mov al,0xe8') found no match, which was enough to show that a presented snippet does not occur anywhere in the binary.
- A reviewer noted additional suspicious and impossible code in the artifact, including a 'ret 1' in a system call dispatcher that would misalign the stack.
- A reviewer identified an example where an 'EmitByte' routine in the artifact pointlessly pushed and popped AX and concluded those instructions do not appear in the actual binary.
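The byte-pattern check described above can be sketched in a few lines of Python. This is a minimal illustration, not the reviewer's actual tooling: the helper names find_all and snippet_present are hypothetical, and the only detail taken from the source is the B0 E8 pattern.

```python
def find_all(data: bytes, pattern: bytes) -> list[int]:
    """Return every offset at which pattern occurs in data (overlaps included)."""
    if not pattern:
        raise ValueError("pattern must not be empty")
    offsets = []
    start = data.find(pattern)
    while start != -1:
        offsets.append(start)
        # Advance by one byte so overlapping matches are also reported.
        start = data.find(pattern, start + 1)
    return offsets


def snippet_present(data: bytes, hex_bytes: str) -> bool:
    """True if the hex-encoded byte sequence occurs anywhere in data.

    hex_bytes is a string such as "B0 E8"; bytes.fromhex ignores the spaces.
    """
    return bool(find_all(data, bytes.fromhex(hex_bytes)))
```

With the executable loaded as bytes (for example via pathlib.Path(...).read_bytes() on whatever the local filename is), snippet_present(data, "B0 E8") returning False is enough to reject a snippet that claims to contain that instruction. The check is one-directional: a match only shows that the bytes exist somewhere, possibly as data or as part of another instruction's operand, so it cannot by itself confirm that a snippet is genuine.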
Extreme Compactness Of Historical Developer Tooling
- Borland's 1985 Turbo Pascal 3.02 executable was 39,731 bytes and included a full text editor IDE and a Pascal compiler.
Unknowns
- What is a fully reproducible, byte-addressed disassembly of the Turbo Pascal 3.02 executable that maps every claimed function/snippet to exact offsets and bytes?
- What fraction of the published artifact is verifiably correct versus fabricated when checked against the binary using deterministic disassembly tooling?
- Which specific verification gates were applied (or omitted) in the initial publication process, beyond a qualitative second-model review?
- Do the simple opcode-pattern searches and 'impossible code' heuristics generalize to reliably screening other AI-generated reverse-engineering writeups?
- Does the corpus state any direct implication for operator, product, or investor decisions beyond the general need for byte-level verification in AI-assisted reverse engineering?