Training-Data Representation vs. Agent-Harness Mitigation
Sources: 1 • Confidence: Medium • Updated: 2026-03-10 08:29
Key takeaways
- If a coding agent is prompted to run a new tool's "--help" and similar commands, the returned text can provide enough documentation within modern context windows for the agent to use brand-new tools effectively.
- In the author's experience, coding agents did not materially constrain their technology choices toward a "Choose Boring Technology" stack.
- A rapidly adopted "Skills" mechanism in coding-agent tools enables projects to publish official skills that help agents use their products.
- With the latest models running in strong coding-agent harnesses, training-data representation may no longer be the dominant determinant of how well models help with a given tool or language.
- In codebases that use private or very new libraries not present in training data, coding agents can still make progress by learning patterns from existing code examples and then iterating and testing to close gaps.
Sections
Training-Data Representation vs. Agent-Harness Mitigation
- If a coding agent is prompted to run a new tool's "--help" and similar commands, the returned text can provide enough documentation within modern context windows for the agent to use brand-new tools effectively.
- In codebases that use private or very new libraries not present in training data, coding agents can still make progress by learning patterns from existing code examples and then iterating and testing to close gaps.
- A couple of years ago, LLMs appeared to perform better on questions about Python or JavaScript than on questions about less widely used programming languages.
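The first bullet describes a concrete retrieval step: shell out to a tool's self-documentation flags and place whatever text comes back into the agent's context. A minimal sketch of that step, assuming a Python harness; the flag order, timeout, and function name are illustrative assumptions, not a documented agent-harness API:

```python
import subprocess

def gather_tool_docs(tool: str) -> str:
    """Collect local documentation for an unfamiliar CLI tool.

    Tries common self-documentation flags in order and returns the
    first non-empty text the tool emits, so a harness could inject it
    into the model's context window. `tool` is assumed to be on PATH.
    """
    for flag in ("--help", "-h", "help"):
        try:
            result = subprocess.run(
                [tool, flag],
                capture_output=True,
                text=True,
                timeout=10,
                stdin=subprocess.DEVNULL,  # never block on stdin
            )
        except (FileNotFoundError, subprocess.TimeoutExpired):
            continue
        # Some tools print help to stderr rather than stdout.
        text = (result.stdout or result.stderr).strip()
        if text:
            return text
    return ""  # tool missing or silent on all flags

# Most Unix tools answer at least one of these flags, e.g.:
docs = gather_tool_docs("grep")
```

The point of the takeaway is that this retrieved text, not pretrained knowledge, can be what makes a brand-new tool usable; a harness would typically run something like this before the model's first attempt at invoking the tool.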
Recommendation Bias vs. Execution Under Constraints
- In the author's experience, coding agents did not materially constrain their technology choices toward a "Choose Boring Technology" stack.
- A referenced study titled "What Claude Code Actually Chooses" reports that after over 2,000 prompts, Claude Code showed a strong build-over-buy bias and a preferred stack in which GitHub Actions, Stripe, and shadcn/ui had a near monopoly within their categories.
- Technology recommendation bias by LLMs and agent performance under a human-imposed technology choice are distinct questions and should be evaluated separately.
Emerging Distribution Channel Via Agent Skills
- A rapidly adopted "Skills" mechanism in coding-agent tools enables projects to publish official skills that help agents use their products.
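In the Skills mechanism as publicly described, a skill is a directory containing a SKILL.md file whose YAML frontmatter gives a name and description the agent uses to decide when to load the skill, with the instructions in the body. A minimal sketch of what a project-published skill could look like; the tool name `mytool` and all file contents here are hypothetical:

```markdown
---
name: mytool-usage
description: How to install, run, and troubleshoot mytool in a project
---

# Using mytool

1. Check the installed version with `mytool --version`.
2. Run `mytool --help` to see the current flag set before composing commands.
3. Prefer the project's config file over ad-hoc flags when both exist.
```

Publishing an official skill like this is the distribution channel the takeaway refers to: it lets a project ship agent-facing documentation alongside the product rather than relying on training-data representation.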
Unknowns
- Do modern coding agents still show systematically higher success rates on widely represented languages/tools compared to low-representation or novel tools, when evaluated under the same harness and workflow?
- How much of agent competence on brand-new tools can be explained by pulling local documentation into context (e.g., "--help" output or man pages) versus reliance on pretrained knowledge?
- For private or novel internal libraries, what iteration counts and test-pass rates do coding agents achieve in practice, and what failure modes dominate?
- How stable are tool/vendor recommendation biases (including build-over-buy bias and stack monopolies) across model versions, different coding-agent products, and different prompt/harness defaults?
- What is the measurable impact of publishing official "Skills" on agent success rates and on downstream tool/vendor selection frequency?