NVIDIA Nemotron-Cascade 2: High-Density 30B MoE with Gold Medal Reasoning

These articles are AI-generated summaries. Please check the original sources for full details.

NVIDIA Releases Nemotron-Cascade 2: An Open 30B MoE with 3B Active Parameters, Delivering Better Reasoning and Strong Agentic Capabilities

NVIDIA has launched Nemotron-Cascade 2, an open-weight 30B Mixture-of-Experts model with 3B activated parameters. It is the second open-weight LLM to achieve Gold Medal-level performance in the 2025 International Mathematical Olympiad and ICPC World Finals.

Why This Matters

Frontier-scale intelligence often requires massive parameter counts, leading to high inference costs and latency. Nemotron-Cascade 2 shifts the focus to ‘intelligence density,’ proving that domain-specific reinforcement learning and on-policy distillation can deliver state-of-the-art reasoning in math and coding at a fraction of the scale used by 100B+ parameter models.

Key Insights

  • Superior mathematical reasoning: Nemotron-Cascade 2 scored 92.4 on AIME 2025, outperforming Qwen3.5-35B-A3B’s score of 91.9.
  • Enhanced coding performance: The model achieved 439.28 on IOI 2025, significantly higher than Qwen3.5-35B-A3B’s 348.6.
  • Multi-Domain On-Policy Distillation (MOPD): This technique reached AIME25 teacher-level performance in 30 steps, proving more sample-efficient than GRPO.
  • Extended context training: NVIDIA utilized a curated dataset with sequences packed up to 256K tokens during the SFT phase, including 1.9M Python reasoning traces.
  • Instruction-following excellence: The model scored 83.5 on ArenaHard v2, surpassing the larger Nemotron-3-Super-120B-A12B on this alignment benchmark.
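
To make the "sequences packed up to 256K tokens" point concrete, here is a minimal sketch of sequence packing using a first-fit-decreasing heuristic. This is an illustration only: the summary does not describe NVIDIA's actual packing procedure, the function name pack_sequences is hypothetical, and reading 256K as 256 * 1024 tokens is an assumption.

```python
# Illustrative sketch only: the packing strategy NVIDIA used is not described
# in the source. First-fit-decreasing is a common heuristic, swapped in here.

MAX_TOKENS = 256 * 1024  # assumption: "256K" read as 262,144 tokens


def pack_sequences(lengths, max_tokens=MAX_TOKENS):
    """Group sequence indices into packs whose total length fits max_tokens."""
    packs = []  # each pack tracks its free capacity and the member indices
    # Place the longest sequences first; ties keep their original order.
    order = sorted(range(len(lengths)), key=lambda i: lengths[i], reverse=True)
    for i in order:
        n = lengths[i]
        if n > max_tokens:
            raise ValueError(f"sequence {i} has {n} tokens, above the limit")
        for pack in packs:
            if pack["free"] >= n:  # first pack with enough room wins
                pack["items"].append(i)
                pack["free"] -= n
                break
        else:
            # No existing pack had room, so open a new one.
            packs.append({"free": max_tokens - n, "items": [i]})
    return [pack["items"] for pack in packs]


if __name__ == "__main__":
    # Toy lengths with a small limit so the grouping is easy to follow.
    print(pack_sequences([6, 4, 5, 5], max_tokens=10))
```

Packing several short traces into one long training sequence reduces padding waste, which is the usual motivation for this step during SFT.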

Practical Applications

  • Competitive Programming and Math: Enable Thinking Mode for complex logic by starting the prompt with the model’s thinking token. Pitfall: Requesting direct responses for multi-step proofs may bypass the model’s specialized reasoning traces.
  • Agentic Tool Interaction: Implement structured tool-calling within <tool_call> tags for verifiable software engineering workflows. Pitfall: Failing to list the available tools inside the designated tags in the system prompt prevents the model from formatting requests correctly.
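
On the receiving side, an agent loop has to pull tool requests out of the model's reply. The sketch below extracts the contents of <tool_call> tags from a response string. The tag name comes from the text above; everything else is an assumption made for illustration: the JSON payload with "name" and "arguments" keys, the helper name extract_tool_calls, and the sample tool run_tests. Check the model card for the real payload format.

```python
import json
import re

# The <tool_call> tag is named in the article; the JSON payload shape
# ({"name": ..., "arguments": ...}) is an assumption for this sketch.
TOOL_CALL_RE = re.compile(r"<tool_call>\s*(.*?)\s*</tool_call>", re.DOTALL)


def extract_tool_calls(text):
    """Return the well-formed tool calls found in a model response."""
    calls = []
    for match in TOOL_CALL_RE.finditer(text):
        try:
            call = json.loads(match.group(1))
        except json.JSONDecodeError:
            continue  # skip payloads that are not valid JSON
        # Keep only objects that at least name the tool to invoke.
        if isinstance(call, dict) and "name" in call:
            calls.append(call)
    return calls


if __name__ == "__main__":
    # Hypothetical model output containing one request to a made-up tool.
    sample = (
        "Running the test suite first.\n"
        "<tool_call>\n"
        '{"name": "run_tests", "arguments": {"path": "tests/"}}\n'
        "</tool_call>"
    )
    for call in extract_tool_calls(sample):
        print(call["name"], call.get("arguments", {}))
```

Silently skipping malformed payloads keeps the loop running; a stricter harness might instead return the parse error to the model so it can retry.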
