Google DeepMind Introduces ATLAS Scaling Laws for Multilingual Language Models

Google DeepMind researchers have introduced ATLAS, a set of scaling laws for multilingual language models, based on 774 controlled training runs across models ranging from 10 million to 8 billion parameters. The ATLAS framework estimates how individual languages contribute to or interfere with performance in others during training, providing a quantitative foundation for exploring modular or specialized multilingual designs.

Why This Matters

ATLAS addresses a limitation of existing scaling laws: they are derived from English-only or single-language training regimes and therefore provide limited guidance for models trained on multiple languages. By explicitly modeling cross-lingual transfer and the efficiency trade-offs introduced by multilingual training, ATLAS gives practitioners a quantitative basis for choosing model size and training data when a model must cover many languages.

Key Insights

774 controlled training runs were conducted across models ranging from 10 million to 8 billion parameters, using multilingual data covering more than 400 languages: Google DeepMind, 2026
Cross-lingual transfer is strongly correlated with shared scripts and language families, with Scandinavian languages exhibiting mutual benefits: ATLAS study
Workflow orchestration tools such as Temporal can be used to schedule and track the large batches of training runs that multilingual model development involves: industry practice
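
To make the cross-lingual transfer insight more concrete, here is a minimal sketch of one way a pairwise transfer score could be computed. The function name `transfer_score`, its definition (relative change in target-language loss against a monolingual baseline), the language labels, and every loss value are illustrative assumptions. They are not the ATLAS definition and not results from the study.

```python
# Hypothetical sketch: score how much co-training with a source language
# helps (positive score) or interferes with (negative score) a target language.
# The definition and all loss values are illustrative assumptions,
# not the ATLAS method or its measured results.

def transfer_score(monolingual_loss: float, cotrained_loss: float) -> float:
    """Relative reduction in target-language loss from adding a source language."""
    if monolingual_loss <= 0:
        raise ValueError("monolingual_loss must be positive")
    return (monolingual_loss - cotrained_loss) / monolingual_loss

# Placeholder losses for one target language, keyed by the added source language.
baseline_loss = 2.50
cotrained_losses = {"source_a": 2.40, "source_b": 2.55}

scores = {
    src: transfer_score(baseline_loss, loss)
    for src, loss in cotrained_losses.items()
}

# Rank source languages from most helpful to most harmful.
for src, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    label = "helps" if score > 0 else "interferes"
    print(f"{src}: {score:+.3f} ({label})")
```

Under this assumed definition, a table of such scores over all source and target pairs would show which languages benefit from being trained together and which compete for capacity.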

Working Example

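# Illustrative sketch: planning model size and training data when adding languages.
# ASSUMPTION: 1.18 and 1.66 are treated as the model-size and training-data
# multipliers for each doubling of the number of languages. Verify these factors
# and their interpretation against the ATLAS paper before relying on them.
import math

MODEL_SIZE_FACTOR_PER_DOUBLING = 1.18
DATA_FACTOR_PER_DOUBLING = 1.66

def estimate_model_size(base_model_size, base_languages, target_languages):
    # Scale the baseline parameter count once per doubling of the language count
    doublings = math.log2(target_languages / base_languages)
    return base_model_size * MODEL_SIZE_FACTOR_PER_DOUBLING ** doublings

def estimate_training_data(base_tokens, base_languages, target_languages):
    # Scale the baseline token budget once per doubling of the language count
    doublings = math.log2(target_languages / base_languages)
    return base_tokens * DATA_FACTOR_PER_DOUBLING ** doublings

# Example usage (the baseline values are hypothetical):
base_languages = 10
target_languages = 20
model_size = estimate_model_size(1e9, base_languages, target_languages)
training_data = estimate_training_data(2e10, base_languages, target_languages)
print(f"Required model size: {model_size:.3g} parameters, Required training data: {training_data:.3g} tokens")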

Practical Applications

Use Case: ATLAS is intended to guide the development of more efficient multilingual language models, for example by informing how much model size and training data to add when a model must cover more languages.
Pitfall: Sizing a multilingual model with single-language scaling laws ignores cross-lingual transfer and the efficiency trade-offs introduced by multilingual training, which can lead to wasted compute and reduced per-language performance.
