Building Multi-Agent Systems with SmolAgents: Code Execution and Dynamic Orchestration

These articles are AI-generated summaries. Please check the original sources for full details.

A Coding Implementation to Build Multi-Agent AI Systems with SmolAgents Using Code Execution, Tool Calling, and Dynamic Orchestration

SmolAgents by Hugging Face is a minimalist framework for building agentic systems that reason by writing and executing Python code. Starting with version 1.8.0, the framework simplified multi-agent orchestration by removing the ManagedAgent wrapper class in favor of passing sub-agents directly.

Why This Matters

Traditional LLM frameworks often rely on static tool definitions and rigid prompts, which struggle with complex multi-step reasoning. SmolAgents addresses this with its CodeAgent paradigm: the LLM writes actual Python logic that is executed in a sandbox, so mathematical and logical steps are computed rather than merely predicted.

The framework also allows runtime tool injection, because an agent's tools live in a standard Python dictionary. This matters in production environments where agents must pick up new data sources or specialized utilities without a full system restart or a reconfiguration of the orchestration layer.

Key Insights

  • ManagedAgent wrapper class removal (v1.8.0): SmolAgents simplified multi-agent patterns by allowing sub-agents to be passed directly via the managed_agents parameter.
  • CodeAgent Python Sandbox: Unlike agents that emit JSON tool calls, CodeAgents write and execute actual Python code to reach a final_answer, which improves accuracy on mathematical tasks.
  • Dynamic Tool Management: Tools are stored in a standard Python dictionary (agent.tools), enabling developers to inject new capabilities like a factorial tool during runtime.
  • LiteLLMModel Integration: The framework uses LiteLLMModel to support diverse backends, such as the gpt-4o-mini engine used in the examples below.
  • Stateful Memory via MemoTool: A custom class-based Tool can hold persistent key-value storage across multiple agent steps, which helps the agent retain facts between steps.
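
The stateful-memory idea in the last insight could be sketched roughly as follows. This is a minimal illustration, not the original article's MemoTool: the tool name memo_store, the set/get action schema, and the messages it returns are all assumptions made for the example. The fallback base class exists only so the sketch runs where smolagents is not installed.

```python
from typing import Optional

try:
    from smolagents import Tool
except ImportError:
    # Stand-in base class (assumption) so the sketch runs without smolagents.
    class Tool:
        def __init__(self, *args, **kwargs):
            pass


class MemoTool(Tool):
    # Hypothetical name, description, and input schema, chosen for illustration.
    name = "memo_store"
    description = (
        "Stores and retrieves short notes. Use action='set' with key and value "
        "to save a note, or action='get' with key to read it back."
    )
    inputs = {
        "action": {"type": "string", "description": "Either 'set' or 'get'."},
        "key": {"type": "string", "description": "Name of the note."},
        "value": {
            "type": "string",
            "description": "Text to store; only needed for 'set'.",
            "nullable": True,
        },
    }
    output_type = "string"

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        # The dictionary lives as long as this tool instance, so notes
        # written in one agent step are still readable in later steps.
        self._store = {}

    def forward(self, action: str, key: str, value: Optional[str] = None) -> str:
        if action == "set":
            self._store[key] = value or ""
            return f"Stored '{key}'."
        if action == "get":
            return self._store.get(key, f"No note stored under '{key}'.")
        return f"Unknown action '{action}'. Use 'set' or 'get'."
```

Because the state is held on the instance, the same MemoTool object has to be passed to the agent for the whole run; creating a fresh instance starts with an empty store.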

Working Examples

Implementation of a CodeAgent with dynamic tool injection for mathematical reasoning.

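import math
import os

from smolagents import CodeAgent, LiteLLMModel, Tool, tool

# Read the API key from the environment rather than hard-coding it.
OPENAI_API_KEY = os.environ["OPENAI_API_KEY"]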

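@tool
def factorial(n: int) -> str:
    """Computes the factorial of a non-negative integer.

    Args:
        n: Non-negative integer whose factorial is computed.
    """
    return f"{n}! = {math.factorial(n)}"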

class PrimeTool(Tool):
    name = "prime_checker"
    description = "Checks whether a positive integer is prime. If composite, returns the smallest prime factor."
    inputs = {"n": {"type": "integer", "description": "Positive integer to test."}}
    output_type = "string"
    def forward(self, n: int) -> str:
        if n < 2: return f"{n} is not prime."
        for i in range(2, math.isqrt(n) + 1):
            if n % i == 0: return f"{n} is NOT prime. Factor: {i}."
        return f"{n} IS prime!"

engine = LiteLLMModel(model_id="openai/gpt-4o-mini", api_key=OPENAI_API_KEY)
code_agent = CodeAgent(tools=[PrimeTool()], model=engine)
code_agent.tools["factorial"] = factorial
result = code_agent.run("Is 10! prime?")
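
As a sanity check on what the agent should conclude for this prompt, the same computation can be run directly in plain Python with no LLM involved. The helper smallest_factor is a hypothetical name introduced here; it mirrors the trial-division loop in PrimeTool.

```python
import math
from typing import Optional


def smallest_factor(n: int) -> Optional[int]:
    """Return the smallest divisor of n in [2, isqrt(n)], or None if there is none."""
    for i in range(2, math.isqrt(n) + 1):
        if n % i == 0:
            return i
    return None


value = math.factorial(10)
factor = smallest_factor(value)

print(value)   # 3628800
print(factor)  # 2, so 10! is not prime
```

Any factorial n! with n >= 3 is a product that includes 2 and is larger than 2, so a correct agent run should always report such a number as composite.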

Multi-agent orchestration using specialized sub-agents for research and calculation.

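from smolagents import DuckDuckGoSearchTool, ToolCallingAgent

# Each sub-agent needs a name and description so the manager knows when to delegate to it.
math_agent = CodeAgent(tools=[PrimeTool()], model=engine, name="math_specialist", description="Handles math and primality.")
research_agent = ToolCallingAgent(tools=[DuckDuckGoSearchTool()], model=engine, name="research_specialist", description="Searches the web.")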

manager_agent = CodeAgent(
    tools=[],
    model=engine,
    managed_agents=[math_agent, research_agent],
    max_steps=8
)

manager_agent.run("Find Python's release year and check if it is a prime number.")

Practical Applications

  • Use case: Mathematical verification systems that use a CodeAgent to compute factorials and run primality tests. Pitfall: Relying on the LLM's internal arithmetic without tool verification, which leads to calculation errors.
  • Use case: Automated research pipelines that use DuckDuckGoSearchTool and MemoTool to gather and store web facts. Pitfall: Running without a tight max_steps limit, which can lead to long search loops and excessive API costs.
