Training-Data Bias Vs Agent-Era Mitigation Mechanisms
Sources: 1 • Confidence: High • Updated: 2026-04-12 10:23
Key takeaways
- The author is no longer certain that training-data representation determines how well models can help with a given tool or language, at least when a strong coding-agent harness is in use.
- A referenced study titled "What Claude Code Actually Chooses" reports that, across more than 2,000 prompts, Claude Code showed a strong build-over-buy bias and a preferred stack in which GitHub Actions, Stripe, and shadcn/ui held a near-monopoly in their respective categories.
- A rapidly adopted "Skills" mechanism in coding-agent tools is enabling projects to publish official skills that help agents use their products.
- To the author's surprise, coding agents have not materially constrained their technology choices or pushed them toward a "Choose Boring Technology" approach.
- Prompting a coding agent to run a new tool's "--help" and similar commands yields documentation that fits comfortably within modern context windows and is often sufficient for the agent to use the tool effectively.
Sections
Training-Data Bias Vs Agent-Era Mitigation Mechanisms
- The author is no longer certain that training-data representation determines how well models can help with a given tool or language, at least when a strong coding-agent harness is in use.
- To the author's surprise, coding agents have not materially constrained their technology choices or pushed them toward a "Choose Boring Technology" approach.
- Prompting a coding agent to run a new tool's "--help" and similar commands yields documentation that fits comfortably within modern context windows and is often sufficient for the agent to use the tool effectively.
- In codebases using private or very new libraries absent from training data, coding agents can still work by learning patterns from existing examples, then iterating on and testing their output.
- LLM-assisted programming may steer technology choices toward tools that are well represented in training data, making adoption harder for newer tools.
- A couple of years ago, models appeared to perform better when asked about Python or JavaScript than when asked about less widely used languages.
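The "--help" bootstrap described above can be sketched in shell. This is a minimal illustration, assuming the tool is on PATH and follows the common convention of supporting `--help` and `--version` flags; `python3` stands in here for an unfamiliar new tool.

```shell
# Sketch: collect a CLI tool's self-documentation into one file that a
# coding agent can read in a single pass.
bootstrap_docs() {
  tool="$1"
  "$tool" --help 2>&1      # usage text and flag reference
  "$tool" --version 2>&1   # lets the agent match docs to the installed version
}

# python3 stands in for a "new" tool; the combined output is typically a
# few kilobytes, far below modern context windows.
bootstrap_docs python3 > /tmp/tool-docs.txt
wc -c < /tmp/tool-docs.txt
```

The point of the size check is the author's claim: built-in help text is small enough that the whole of it can sit in the agent's context without summarization.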
Systematic Tool And Vendor Recommendation Concentration In Agent Tooling
- A referenced study titled "What Claude Code Actually Chooses" reports that, across more than 2,000 prompts, Claude Code showed a strong build-over-buy bias and a preferred stack in which GitHub Actions, Stripe, and shadcn/ui held a near-monopoly in their respective categories.
- The author distinguishes between what technology LLMs recommend and how well agents perform when humans choose a different technology than the model or harness would prefer.
Skills And Official Integrations As A New Channel To Shape Agent Behavior
- A rapidly adopted "Skills" mechanism in coding-agent tools is enabling projects to publish official skills that help agents use their products.
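For concreteness: in the Skills mechanism as implemented in current coding-agent tools, a skill is a folder containing a SKILL.md file whose YAML frontmatter tells the agent when to load the full instructions. A hypothetical official skill for a product might look like the sketch below (the product name, commands, and flags are invented for illustration):

```
---
name: acme-deploy
description: Use when the user wants to deploy or manage an app with the Acme CLI.
---

# Deploying with Acme

1. Run `acme login` once before any deploy.
2. Deploy with `acme deploy --env <env>`.
3. Verify the rollout with `acme status`.
```

Publishing a file like this is the channel the takeaway refers to: the project, rather than the training corpus, supplies the agent's working knowledge of the tool.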
Unknowns
- Across model generations and harnesses, how strongly does training-data representation predict agent task success when forced to use low-representation or novel tools?
- How much do agent tool recommendations (preferred stacks, build-over-buy bias) actually influence real-world technology adoption, procurement, or architecture decisions?
- When agents are bootstrapped via CLI help/manpages alone, what are the observed error modes and task boundaries (e.g., configuration, authentication, edge-case flags)?
- In repositories with private or new libraries, what iteration count and test pass rates are typical for agent-generated changes, and how does this vary by codebase quality and test coverage?
- Does publishing official Skills measurably improve agent success rates and/or increase selection of the corresponding tool compared to alternatives without Skills?