Poolside AI Launches Laguna XS.2 and M.1: High-Performance Agentic Coding via MoE


These articles are AI-generated summaries. Please check the original sources for full details.

Poolside AI Introduces Laguna XS.2 and M.1: Agentic Coding Models Reaching 68.2% and 72.5% on SWE-bench Verified

Poolside AI has unveiled the Laguna M.1 and XS.2 models, built on a Mixture-of-Experts architecture to optimize inference efficiency for coding tasks. The flagship M.1 model achieved 72.5% on the SWE-bench Verified benchmark after being trained from scratch on 30 trillion tokens.

Why This Matters

Scaling large language models for software engineering involves a trade-off between total parameter count and inference cost. By using a Mixture-of-Experts (MoE) design with only 3B to 23B activated parameters per token, Poolside cuts per-token compute while keeping the capacity of 33B to 225B total parameters. Its use of the Muon optimizer instead of standard AdamW also reduces training steps by 15%. Together, these choices suggest that architectural and algorithmic efficiency is what makes high-performance agentic coding feasible on consumer hardware such as a Mac with 36GB of RAM.
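The trade-off between activated and total parameters can be made concrete with some quick arithmetic. The sketch below is only illustrative: it assumes the quoted ranges pair up as XS.2 = 3B activated / 33B total and M.1 = 23B activated / 225B total, and the quantization levels are hypothetical values chosen to show why a 33B-parameter model might fit in 36GB of RAM. It counts weight memory only and ignores the KV cache and runtime overhead.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MoEModel:
    name: str
    total_params: float   # parameters across all experts
    active_params: float  # parameters actually used per token

    def active_fraction(self) -> float:
        """Share of the weights that take part in one forward pass."""
        return self.active_params / self.total_params

    def weight_gb(self, bits_per_weight: float) -> float:
        """Memory for the weights alone, in decimal gigabytes."""
        return self.total_params * bits_per_weight / 8 / 1e9


# Assumption: the smaller figures in the article belong to XS.2,
# the larger ones to M.1.
XS2 = MoEModel("Laguna XS.2", total_params=33e9, active_params=3e9)
M1 = MoEModel("Laguna M.1", total_params=225e9, active_params=23e9)

for model in (XS2, M1):
    print(f"{model.name}: {model.active_fraction():.1%} of weights active per token")

# Hypothetical precisions; the article does not say which one is used locally.
for bits in (16, 8, 4):
    print(f"XS.2 weights at {bits}-bit: {XS2.weight_gb(bits):.1f} GB")
```

Under these assumptions, roughly a tenth of each model's weights are active per token, and all 33B weights still have to be resident in memory, which is why the precision of the stored weights matters for local use.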

Key Insights

  • Laguna XS.2 scores 68.2% on SWE-bench Verified (2026) with only 3B activated parameters and can be run locally via Ollama.
  • The Muon optimizer stores only one state per parameter (AdamW stores two), which lowers optimizer memory, and it needed 15% fewer training steps than AdamW during Laguna’s 30T-token pre-training.
  • Poolside utilizes AutoMixer to optimize data curation, training 60 proxy models to map how different dataset proportions affect STEM and code performance.
  • The XS.2 architecture employs a 3:1 ratio of Sliding Window Attention (SWA) to global attention across 40 layers to minimize KV cache memory overhead.
  • Async On-Policy Agent RL uses GPUDirect RDMA to transfer hundreds of gigabytes of BF16 weights between nodes in under 5 seconds for training stability.
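Several of the figures above are easier to judge with back-of-the-envelope numbers. The following is a rough sketch, not Poolside's implementation: the position of the global-attention layer inside each group of four, the context length, window size, KV head count, head dimension, FP32 optimizer states, and the choice of a 225B-parameter model as the transferred payload are all assumptions made for illustration.

```python
def optimizer_state_gb(n_params, states_per_param, bytes_per_state=4):
    """Optimizer state memory in GB (assumes FP32 states on every parameter)."""
    return n_params * states_per_param * bytes_per_state / 1e9


def layer_pattern(n_layers=40, swa_per_global=3):
    """Label each layer 'swa' or 'global' for a 3:1 SWA-to-global ratio.

    Placing the global layer last in each group is an assumption.
    """
    period = swa_per_global + 1
    return ["global" if (i + 1) % period == 0 else "swa" for i in range(n_layers)]


def kv_cache_gb(pattern, context_len, window, n_kv_heads, head_dim, bytes_per_elem=2):
    """KV cache size in GB for one sequence.

    Global layers cache every token; SWA layers cache at most `window` tokens.
    """
    per_token_per_layer = 2 * n_kv_heads * head_dim * bytes_per_elem  # K and V
    cached_tokens = sum(
        context_len if kind == "global" else min(context_len, window)
        for kind in pattern
    )
    return cached_tokens * per_token_per_layer / 1e9


def min_bandwidth_gb_s(payload_gb, seconds):
    """Aggregate bandwidth needed to move payload_gb within the time budget."""
    return payload_gb / seconds


pattern = layer_pattern()
print(pattern.count("swa"), "SWA layers,", pattern.count("global"), "global layers")

# One state per parameter (Muon) versus two (AdamW) for a 33B-parameter model.
print("Muon :", optimizer_state_gb(33e9, 1), "GB")
print("AdamW:", optimizer_state_gb(33e9, 2), "GB")

# Hypothetical attention shapes: 128K context, 4K window, 8 KV heads, head_dim 128.
shapes = dict(context_len=131072, window=4096, n_kv_heads=8, head_dim=128)
print("Mixed SWA/global cache:", round(kv_cache_gb(pattern, **shapes), 2), "GB")
print("All-global cache      :", round(kv_cache_gb(["global"] * 40, **shapes), 2), "GB")

# 225B parameters at 2 bytes each (BF16) is 450 GB, moved in 5 seconds.
print("Required bandwidth:", min_bandwidth_gb_s(225e9 * 2 / 1e9, 5.0), "GB/s")
```

With a 3:1 ratio over 40 layers, 30 layers use sliding windows and 10 use global attention, so at long context most of the cache comes from the global layers. The bandwidth line shows why a 5-second weight sync at this scale implies many parallel high-speed links rather than a single connection.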

Practical Applications

  • Local Agentic Coding: XS.2 runs on 36GB RAM Macs for long-horizon tasks; pitfall: disabling interleaved thinking during tool calls can significantly reduce reasoning depth.
  • Large-Scale Repository Maintenance: Laguna M.1 handles SWE-bench Pro tasks at a 46.9% success rate; pitfall: global deduplication in data pipelines can disproportionately remove high-quality code samples.
