Nous Research Releases NousCoder-14B: A Competitive Olympiad Programming Model

Nous Research has released NousCoder-14B, a competitive programming model built on Qwen3-14B and post-trained with reinforcement learning (RL) using verifiable rewards. On the LiveCodeBench v6 benchmark, covering problems from August 1, 2024, to May 1, 2025, the model attains a Pass@1 accuracy of 67.87%.
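For readers unfamiliar with the metric, Pass@1 is the probability that a single sampled solution passes all tests for a problem, averaged over the benchmark. The sketch below uses the widely used unbiased pass@k estimator; how many samples per problem Nous Research drew is not stated here, so the function names and the sample counts in the usage note are illustrative assumptions.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate for one problem.

    n: number of solutions sampled for the problem
    c: number of those samples that passed every test
    k: number of attempts the metric allows
    """
    if n - c < k:
        # Fewer than k failing samples: any k draws must include a pass.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


def benchmark_pass_at_1(results: list[tuple[int, int]]) -> float:
    """Average pass@1 over problems; each entry is (n_samples, n_correct)."""
    return sum(pass_at_k(n, c, 1) for n, c in results) / len(results)
```

With k = 1 the estimator reduces to the fraction of correct samples per problem, so `benchmark_pass_at_1([(4, 4), (4, 0), (4, 2)])` averages 1.0, 0.0, and 0.5.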

Why This Matters

Current large language models (LLMs) often struggle with complex reasoning and code execution required for competitive programming. While models like Qwen3-14B demonstrate strong general capabilities, they lack the specialized training needed to reliably solve algorithmic challenges under strict constraints. The cost of failure in these scenarios isn’t just incorrect output; it’s wasted compute and developer time, especially in automated code generation pipelines.

Key Insights

LiveCodeBench v6 Benchmark: A benchmark for evaluating competitive programming ability; the evaluation window used here contains 454 problems.
GRPO Objectives: DAPO, GSPO, and GSPO+ are GRPO-style RL objectives that were tested for long-context code generation; all three normalize rewards within a group of sampled solutions.
Modal for Scalable Execution: Used by Nous Research to safely execute untrusted code at scale during the RL training process.
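The group normalization mentioned above can be sketched as follows. This is a minimal illustration of the general GRPO-style idea, not the training code used by Nous Research: the function name, the `eps` term, and the +1/-1 reward values in the usage note are assumptions, and the individual objectives (DAPO, GSPO, GSPO+) differ in details that are not shown here.

```python
from statistics import mean, pstdev


def group_normalized_advantages(rewards: list[float], eps: float = 1e-6) -> list[float]:
    """Center and scale the rewards of one group of sampled solutions.

    Each reward is compared to the group's own mean, so a solution is
    credited only for doing better than its siblings on the same problem.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population standard deviation of the group
    # eps (hypothetical constant) avoids division by zero when all rewards match.
    return [(r - mu) / (sigma + eps) for r in rewards]
```

For example, a group with rewards `[1, -1, -1, 1]` yields advantages close to `[1, -1, -1, 1]`, while a group where every sample gets the same reward yields all zeros and therefore contributes no learning signal.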

Working Example

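# Example of a simple competitive programming problem and solution
# (Illustrative - not directly from the research, but representative)
# Problem: given n integers on the second line of input, print the smallest.

def solve():
    n = int(input())  # element count; read to consume the first input line
    a = list(map(int, input().split()))

    # min() scans the list once in O(n); sorting first would cost O(n log n)
    print(min(a))

solve()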

Practical Applications

Automated Code Generation: Coding assistants such as GitHub Copilot could integrate a model like NousCoder-14B to improve the accuracy of suggestions for competitive-programming-style tasks.
Pitfall: Relying solely on LLM-generated code without rigorous testing can lead to incorrect solutions, especially when strict time and memory limits are in place.
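One way to guard against this pitfall is to run every generated solution against known test cases under a time limit before accepting it. The sketch below is a local, hypothetical harness (the function name, default time limit, and whitespace-insensitive comparison are all assumptions). It is not a sandbox and does not enforce memory limits, so it should only be used on code you are willing to run on your own machine.

```python
import subprocess
import sys


def passes_all_tests(
    source: str,
    tests: list[tuple[str, str]],
    time_limit_s: float = 2.0,
) -> bool:
    """Run `source` as a Python program on each (stdin, expected_stdout) pair."""
    for stdin_text, expected in tests:
        try:
            proc = subprocess.run(
                [sys.executable, "-c", source],
                input=stdin_text,
                capture_output=True,
                text=True,
                timeout=time_limit_s,
            )
        except subprocess.TimeoutExpired:
            return False  # time limit exceeded
        if proc.returncode != 0:
            return False  # runtime error
        if proc.stdout.split() != expected.split():
            return False  # wrong answer (whitespace-insensitive comparison)
    return True
```

A binary pass/fail result like this is also the kind of signal a verifiable reward can be built on, although production training setups run such checks in an isolated environment rather than in a local subprocess.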

References:

https://www.marktechpost.com/2026/01/18/nous-research-releases-nouscoder-14b-a-competitive-olympiad-programming-model-post-trained-on-qwen3-14b-via-reinforcement-learning/
