AI-Driven Development: Moving Beyond Vibe Coding to Agentic Engineering

The Orchestration Mindset

Andrew Stellman developed Octobatch, a production-grade batch orchestrator for Monte Carlo simulations. The system comprises 21,000 lines of Python and nearly 1,000 automated tests, all generated by AI.

Why This Matters

There is a critical gap between theoretical knowledge of AI tools and the practical ability to maintain architectural coherence across thousands of lines of generated code. Fully autonomous agents can produce massive outputs: in one Anthropic experiment, 16 Claude instances spent $20,000 to build a 100,000-line C compiler that still required human intervention to fix bugs. True reliability requires an ‘orchestration mindset’ in which humans own the architecture and verification.

Key Insights

The ‘Cognitive Shortcut Paradox’ indicates that developers who already know what good software looks like are the most effective at driving AI coding tools (Stellman, O’Reilly Radar).
LLM Batch APIs (released by OpenAI, Anthropic, and Google between April 2024 and July 2025) cost 50% less than real-time APIs and perform better at scale; they treat LLMs as processing infrastructure rather than chatbots.
AI exhibits a generative bias toward adding code rather than deleting it; experienced developers must override this instinct to prevent unnecessary complexity in the codebase.
Agentic engineering requires specific roles: one LLM for architecture planning, another for execution, a coding agent for implementation, and a human for vision and verification.
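
The 50% batch discount mentioned above can be turned into a quick cost comparison. The following is a minimal sketch, not any provider's actual pricing: the function name `estimate_costs`, the request volume, the token count, and the per-million-token price are all hypothetical values chosen for illustration. Only the 50% discount comes from the text.

```python
def estimate_costs(
    n_requests: int,
    tokens_per_request: int,
    price_per_million_tokens: float,
    batch_discount: float = 0.5,  # the 50% reduction cited in the text
) -> tuple[float, float]:
    """Return (real-time cost, batch cost) for the same workload.

    All inputs except the default discount are hypothetical; real prices
    vary by provider and model.
    """
    total_tokens = n_requests * tokens_per_request
    realtime_cost = total_tokens * price_per_million_tokens / 1_000_000
    batch_cost = realtime_cost * (1 - batch_discount)
    return realtime_cost, batch_cost


# Hypothetical workload: 10,000 requests of 2,000 tokens at $3 per million tokens.
realtime, batch = estimate_costs(10_000, 2_000, 3.0)
print(f"real-time: ${realtime:.2f}, batch: ${batch:.2f}")
```

At Monte Carlo scale, where a run may issue thousands of requests, the absolute saving grows linearly with the number of iterations.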

Practical Applications

[Octobatch / Monte Carlo Simulations] Use case: Running thousands of iterations with seeded randomness for reproducibility. Pitfall: Re-seeding RNGs at every iteration creates correlation bias, leading to incorrect statistical results (e.g., sailors falling in water at 77.5% vs the expected 50%).
[Multi-LLM Coordination] Use case: Using one model (Gemini) to validate the output or identify hallucinations produced by another (Claude). Pitfall: Relying on a single LLM’s estimate of complexity; models may overestimate implementation time due to lack of full architectural context.
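
The re-seeding pitfall in the Octobatch item can be illustrated with a small experiment. The source does not describe the exact bug, so this is a sketch under an assumption: it shows the most extreme form of the problem, where the generator is re-created with the same seed on every iteration. The function names and the fair-coin "falls in the water" model are hypothetical stand-ins for the real simulation.

```python
import random


def simulate_seeded_once(n_trials: int, seed: int) -> float:
    """Seed one generator up front, then draw from it for every trial."""
    rng = random.Random(seed)
    falls = sum(rng.random() < 0.5 for _ in range(n_trials))
    return falls / n_trials


def simulate_reseeded(n_trials: int, seed: int) -> float:
    """Pitfall: re-create the generator with the same seed in every trial.

    Each trial then sees the identical first draw, so the trials are
    perfectly correlated instead of independent.
    """
    falls = 0
    for _ in range(n_trials):
        rng = random.Random(seed)  # bug: resets the random stream each time
        falls += rng.random() < 0.5
    return falls / n_trials


good = simulate_seeded_once(10_000, seed=42)
bad = simulate_reseeded(10_000, seed=42)
print(f"seeded once: {good:.3f}")  # close to the expected 0.5
print(f"re-seeded:   {bad:.3f}")   # collapses to 0.0 or 1.0
```

Seeding once keeps the run reproducible (the same seed gives the same estimate) while leaving the individual trials independent. Subtler variants of the bug, such as deriving each iteration's seed from closely related values, can skew results without collapsing them completely.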

References:

https://stackoverflow.blog/2026/05/22/dispatches-from-o-reilly-the-accidental-orchestrator/
