Unrolling the Codex agent loop
These articles are AI-generated summaries. Please check the original sources for full details.
The Codex CLI is a cross-platform software agent designed to produce reliable software changes. Its core logic is the ‘agent loop’, a process that orchestrates interaction between the user, the model, and the tools the model invokes. The loop repeatedly queries the model and executes the tool calls it requests until the model returns a final assistant message. Because every iteration adds to the conversation history, the loop can exhaust the model’s context window if that history is not carefully managed.
Why This Matters
Developing AI agents often involves a gap between ideal, stateless models and the realities of stateful interactions, tool calls, and context window limitations. Unmanaged conversation history can quickly exhaust the context window of a large language model, leading to degraded performance or outright failure, which can waste compute and lengthen development cycles.
Key Insights
- Responses API configuration: Codex CLI can be configured to use various endpoints that implement the Responses API, including ChatGPT, OpenAI-hosted models, and local models via Ollama or LM Studio.
- Prompt Construction: The agent loop builds prompts by combining system instructions, tool definitions, and user input, prioritizing system and developer roles to influence the model’s behavior.
- Context Window Management: Codex employs prompt compaction, using the Responses API’s /responses/compact endpoint, to summarize conversation history, preserving context while staying within the model’s context window.
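The Prompt Construction insight can be illustrated with a small sketch. It assumes the prompt is a list of role-tagged messages and that system messages are placed before developer messages, which come before user input. The `Message` class, the `ROLE_PRIORITY` table, the `build_prompt` helper, and the tool names are all hypothetical and are not taken from the Codex source.

```python
from dataclasses import dataclass

# Hypothetical role ordering for illustration: lower numbers appear earlier in the prompt.
ROLE_PRIORITY = {"system": 0, "developer": 1, "user": 2}


@dataclass
class Message:
    role: str
    content: str


def build_prompt(instructions, tool_names, user_input):
    """Assemble role-tagged messages with higher-priority roles first."""
    messages = [
        Message("user", user_input),
        Message("developer", "Available tools: " + ", ".join(tool_names)),
        Message("system", instructions),
    ]
    # sorted() is stable, so messages sharing a role keep their relative order.
    return sorted(messages, key=lambda m: ROLE_PRIORITY[m.role])


# Tool names here are illustrative labels only.
prompt = build_prompt(
    "You are a coding agent.",
    ["shell", "apply_patch"],
    "Fix the failing test.",
)
print([m.role for m in prompt])  # ['system', 'developer', 'user']
```

Keeping roles explicit, rather than concatenating everything into one string, makes it easier to see which parts of the prompt are meant to carry more weight with the model.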
Working Example
# Example of a simplified agent loop interaction (conceptual)
def agent_loop(user_input, model, tools, prompt=None):
    # Build the prompt on the first call; recursive calls pass the updated prompt
    if prompt is None:
        prompt = construct_prompt(user_input, model, tools)
    response = model.inference(prompt)
    if response.tool_call:
        tool_output = execute_tool(response.tool_call, tools)
        updated_prompt = update_prompt(prompt, tool_output)
        return agent_loop(user_input, model, tools, updated_prompt)  # Recursive call
    else:
        return response.assistant_message


def construct_prompt(user_input, model, tools):
    # Combine instructions, tool names, and user input to create a prompt
    # (tools is assumed to map tool names to callables)
    return f"Instructions: {model.instructions}\nTools: {', '.join(tools)}\nUser Input: {user_input}"


def execute_tool(tool_call, tools):
    # Look up the tool by name and call it with the model-supplied arguments
    return tools[tool_call.tool_name](tool_call.arguments)


def update_prompt(prompt, tool_output):
    # Append tool output to the prompt for the next iteration
    return prompt + f"\nTool Output: {tool_output}"
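Because each tool call appends to the prompt, a loop like this grows its history on every iteration. The sketch below illustrates the general idea of compaction under stated assumptions: a crude four-characters-per-token estimate and a placeholder `summarize` function. It does not call the /responses/compact endpoint, and the names `estimate_tokens`, `summarize`, and `compact_history` are hypothetical.

```python
def estimate_tokens(text):
    # Crude assumption for illustration: roughly one token per four characters.
    return max(1, len(text) // 4)


def summarize(items):
    # Placeholder summarizer (hypothetical); a real agent would delegate this
    # to a model or a compaction endpoint.
    return f"[summary of {len(items)} earlier items]"


def compact_history(history, budget, keep_recent=2):
    """Replace older items with one summary when the history exceeds the budget."""
    total = sum(estimate_tokens(item) for item in history)
    if total <= budget or len(history) <= keep_recent:
        return history
    split = len(history) - keep_recent
    older, recent = history[:split], history[split:]
    return [summarize(older)] + recent


history = ["x" * 400, "y" * 400, "z" * 40, "w" * 40]
compacted = compact_history(history, budget=100)
print(compacted[0])    # [summary of 2 earlier items]
print(len(compacted))  # 3
```

Keeping the most recent items verbatim is one possible design choice; it preserves the details the model is most likely to need for its next step while shrinking the rest.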
Practical Applications
- Automated Code Modification (OpenAI Codex): The Codex CLI uses the agent loop to automatically write and modify code within a user’s environment.
- Prompt Injection Vulnerability: Improperly sandboxed tool calls can lead to prompt injection, allowing malicious input to override agent instructions.
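One common mitigation for the risk above is to restrict which tools the loop may run and to return tool output as labeled data instead of splicing it into the instructions. The sketch below is a minimal illustration and not a sandbox, nor a description of how Codex handles this. `ALLOWED_TOOLS`, `ToolNotAllowed`, `guarded_execute`, and the tool names are hypothetical.

```python
ALLOWED_TOOLS = {"read_file"}  # hypothetical allowlist for illustration


class ToolNotAllowed(Exception):
    """Raised when the model requests a tool outside the allowlist."""


def guarded_execute(tool_name, arguments, tools, allowed=ALLOWED_TOOLS):
    # Refuse anything not explicitly allowed, even if the tool is registered.
    if tool_name not in allowed:
        raise ToolNotAllowed(f"tool {tool_name!r} is not permitted")
    output = tools[tool_name](arguments)
    # Return the output as a labeled record so callers can treat it as data,
    # not as instructions.
    return {"role": "tool", "name": tool_name, "content": str(output)}


tools = {
    "read_file": lambda args: f"contents of {args['path']}",
    "delete_all": lambda args: "deleted",
}

result = guarded_execute("read_file", {"path": "README.md"}, tools)
print(result["content"])  # contents of README.md

try:
    guarded_execute("delete_all", {}, tools)
except ToolNotAllowed as exc:
    print(f"blocked: {exc}")
```

An allowlist only limits which tools can run. It does not stop a permitted tool from returning malicious text, so real deployments would still need isolation at the operating-system level.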