MCP vs. CLI: Measuring Token Overhead in Agent Search
These articles are AI-generated summaries. Please check the original sources for full details.
I measured MCP vs a CLI for agent search
Engineer Ary Rabelo compared the token costs of SerpApi’s official MCP server against a custom MIT-licensed CLI. The MCP server returned 6,047 tokens per call, compared to just 351 for the CLI using field projection.
Why This Matters
MCP provides a standardized transport, but it carries significant ‘standing costs’: tool schemas are injected into the context on every turn. For stateless operations like search, this overhead compounds as more tools are added and can consume thousands of tokens before the agent does any work. The leaner alternative is to pass only the necessary data to the LLM.
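The compounding effect can be made concrete with a little arithmetic. The sketch below is illustrative only: the helper name, the 20-turn session length, and the three-server scenario are assumptions, while 771 is the per-turn schema figure reported in the Key Insights. It also ignores prompt caching, which may lower the billed cost in practice.

```python
def standing_tokens(schema_tokens_per_server, turns):
    """Total schema tokens re-sent over a session (hypothetical helper, no caching modeled)."""
    return sum(schema_tokens_per_server) * turns


# One MCP server at the reported 771 schema tokens per turn,
# over an assumed 20-turn session.
one_server = standing_tokens([771], turns=20)

# Three servers of the same size (an assumed scenario): costs add up.
three_servers = standing_tokens([771, 771, 771], turns=20)

# A CLI on PATH injects no schema, modeled here as an empty list.
cli = standing_tokens([], turns=20)

print(one_server, three_servers, cli)
```

Under these assumptions, the schema overhead scales linearly with both the number of servers and the number of turns, while the CLI stays at zero.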
Key Insights
- Standing cost disparity: MCP injects 771 tokens per turn for tool schema, whereas a binary on PATH (CLI) incurs ~0 standing tokens (Rabelo, 2026).
- Field Projection over Full Payloads: Using --fields title,link reduces response size to 351 tokens versus 6,047 for default MCP output.
- Context Reduction via Code Execution: Anthropic’s research showed a Drive-to-Salesforce workflow dropping from 150,000 tokens to 2,000 when tools were called as code rather than loaded as definitions.
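Field projection itself is simple to sketch. The example below is not Rabelo's implementation: the function name and the sample result records are invented for illustration, and only the field list title,link comes from the article.

```python
import json


def project_fields(results, fields):
    """Keep only the requested keys from each result dict; missing keys are skipped."""
    return [{key: item[key] for key in fields if key in item} for item in results]


# Hypothetical search results; real payloads typically carry many more keys per item.
results = [
    {"position": 1, "title": "Example A", "link": "https://example.com/a",
     "snippet": "Long descriptive text...", "displayed_link": "example.com"},
    {"position": 2, "title": "Example B", "link": "https://example.com/b",
     "snippet": "More descriptive text...", "displayed_link": "example.com"},
]

# Mirrors a comma-separated field list such as "title,link".
projected = project_fields(results, "title,link".split(","))
print(json.dumps(projected))
```

The projected output contains only the two requested keys per result, so everything the agent does not need never enters the context.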
Practical Applications
- Stateless Search: Use a CLI with minified JSON output and field projection to minimize context bloat in coding loops.
- Governed Connections: Use MCP when requiring OAuth, multi-user auth, or server-side rate limiting across multiple clients.
- Tool Proliferation Pitfall: Loading multiple MCP servers simultaneously creates additive standing costs that can consume several thousand tokens of the context window.
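For the minified JSON recommendation, a rough sketch using Python's standard json module shows the difference between pretty-printed and compact output. The payload is a made-up example, and character counts are used only as a loose proxy for tokens, not as a measurement.

```python
import json

# Hypothetical projected result, as a CLI might emit after field projection.
payload = [{"title": "Example A", "link": "https://example.com/a"}]

# Pretty-printed output adds newlines and indentation the model does not need.
pretty = json.dumps(payload, indent=2)

# Compact separators drop the spaces after commas and colons.
compact = json.dumps(payload, separators=(",", ":"))

print(len(pretty), len(compact))
print(compact)
```

Both strings decode to the same data; the compact form simply omits whitespace that would otherwise occupy context.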