Why Your AGENTS.md Files are Sabotaging AI Coding Performance

These articles are AI-generated summaries. Please check the original sources for full details.

New ETH Zurich Study Finds That Overly Detailed AGENTS.md Files Hurt AI Coding Agent Performance

Researchers at ETH Zurich analyzed coding agents like Sonnet-4.5 and GPT-5.2 to evaluate the impact of repository-level context files. The study found that automatically generated AGENTS.md files actually reduced success rates by 3% and increased inference costs by over 20%.

Why This Matters

Developers use context engineering to guide LLMs through complex codebases, but bloated or auto-generated documentation adds overhead for the agent. Agents tend to follow unnecessary instructions too faithfully, which leads to more reasoning steps and higher costs without improved outcomes. Large models often have enough parametric knowledge to make extensive directory trees redundant, so a few targeted instructions are more effective than comprehensive but noisy documentation.

Key Insights

  • Auto-generated context files reduced success rates by 3% on AGENTBENCH (2026).
  • Detailed directory trees are redundant as agents are proficient at autonomous file discovery.
  • The Multiplier Effect: Explicitly mentioning tools like uv increased usage 160x compared to instances where they were omitted.
  • Human-written context files provided only a marginal 4% performance gain over using no context at all.
  • Stronger models like GPT-5.2 do not necessarily produce better context files than smaller models.

Practical Applications

  • Use Case: Specify non-obvious tooling like uv or bun in AGENTS.md to ensure the agent uses high-performance package managers.
  • Pitfall: Including detailed style guides wastes tokens; use deterministic linters and formatters instead for cheaper and faster results.
  • Use Case: Maintain lean context files under 300 lines to minimize reasoning steps and inference overhead.
  • Pitfall: Relying on LLM-generated repository overviews without human review leads to redundant content and decreased task success.
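The guidance above can be turned into a simple automated check. The following is a minimal sketch, not something taken from the study: the function name `audit_context_file`, the tree-detection heuristic, and both sample files are hypothetical, and the 300-line budget is simply the figure recommended above.

```python
import re

# Line budget taken from the recommendation above; adjust to taste.
MAX_LINES = 300

# Heuristic (an assumption): lines that start with box-drawing characters
# are treated as part of a `tree`-style directory listing.
TREE_LINE = re.compile(r"^\s*(├──|└──|│)")


def audit_context_file(text: str, max_lines: int = MAX_LINES) -> list[str]:
    """Return human-readable warnings for an AGENTS.md-style context file."""
    lines = text.splitlines()
    warnings = []
    if len(lines) > max_lines:
        warnings.append(f"{len(lines)} lines exceeds the budget of {max_lines}")
    tree_lines = sum(1 for line in lines if TREE_LINE.match(line))
    if tree_lines:
        warnings.append(
            f"{tree_lines} directory-tree lines found; agents can discover files themselves"
        )
    return warnings


# Hypothetical lean file: only non-obvious tooling, no style guide, no tree.
LEAN_EXAMPLE = """\
# AGENTS.md
- Use `uv` for dependency management.
- Style is enforced by the project's linter and formatter; do not restate rules here.
"""

# Hypothetical bloated file: a long pasted directory tree.
BLOATED_EXAMPLE = "# AGENTS.md\n" + "├── src\n│   └── main.py\n" * 200

if __name__ == "__main__":
    print(audit_context_file(LEAN_EXAMPLE))
    print(audit_context_file(BLOATED_EXAMPLE))
```

A check like this could run in CI next to the linters and formatters, so the context file stays short without anyone having to police it by hand.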
