Google DeepMind Introduces ATLAS Scaling Laws for Multilingual Language Models

These articles are AI-generated summaries. Please check the original sources for full details.

Google DeepMind researchers have introduced ATLAS, a set of scaling laws for multilingual language models, based on 774 controlled training runs across models ranging from 10 million to 8 billion parameters. The ATLAS framework estimates how individual languages contribute to or interfere with performance in others during training, providing a quantitative foundation for exploring modular or specialized multilingual designs.
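To make the idea of deriving a scaling law from controlled runs concrete, here is a minimal sketch that fits a generic power law, loss = a * size^(-b), by least squares in log-log space. This is an illustration under stated assumptions, not the ATLAS functional form: the power-law shape, the function name `fit_power_law`, and every number below are hypothetical, and the "runs" are synthetic and noise-free.

```python
import math


def fit_power_law(sizes, losses):
    """Fit loss = a * size**(-b) by ordinary least squares on log-transformed data."""
    xs = [math.log(s) for s in sizes]
    ys = [math.log(v) for v in losses]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx                      # equals -b in log-log space
    intercept = mean_y - slope * mean_x    # equals log(a)
    return math.exp(intercept), -slope


# Synthetic data (hypothetical constants, NOT values from the ATLAS study).
TRUE_A, TRUE_B = 12.0, 0.08
sizes = [1e7, 1e8, 1e9, 8e9]               # spans the 10M to 8B parameter range
losses = [TRUE_A * s ** (-TRUE_B) for s in sizes]

a, b = fit_power_law(sizes, losses)
print(f"fitted a = {a:.4f}, fitted b = {b:.4f}")
```

With real training runs the losses would be noisy and a multilingual law would need extra terms (for example, for the number of languages or the language mix), so the fit above should be read only as the simplest single-variable case.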

Why This Matters

ATLAS addresses a limitation of existing scaling laws: they are derived from English-only or single-language training regimes, so they offer limited guidance for models trained on many languages. By explicitly modeling cross-lingual transfer and the efficiency trade-offs of multilingual training, ATLAS gives practitioners a quantitative basis for deciding how model size, training data, and language mix should change as languages are added.

Key Insights

  • 774 controlled training runs were conducted across models ranging from 10 million to 8 billion parameters, using multilingual data covering more than 400 languages: Google DeepMind, 2026
  • Cross-lingual transfer is strongly correlated with shared scripts and language families, with Scandinavian languages exhibiting mutual benefits: ATLAS study
  • Workflow orchestration tools such as Temporal can help schedule and manage the large sets of training runs that multilingual experiments require: industry practice

Working Example

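# Illustrative sketch of applying ATLAS-style multipliers; this is not an official ATLAS API.
# Assumption: 1.18 and 1.66 are read as per-doubling factors, i.e. doubling the number
# of languages at roughly constant per-language quality needs about 1.18x the
# parameters and 1.66x the total training data.
import math

MODEL_SIZE_FACTOR_PER_DOUBLING = 1.18
DATA_FACTOR_PER_DOUBLING = 1.66

def estimate_model_size(base_model_size, base_languages, target_languages):
    # Scale the baseline parameter count once per doubling of the language count
    doublings = math.log2(target_languages / base_languages)
    return base_model_size * MODEL_SIZE_FACTOR_PER_DOUBLING ** doublings

def estimate_training_data(base_tokens, base_languages, target_languages):
    # Scale the baseline token budget once per doubling of the language count
    doublings = math.log2(target_languages / base_languages)
    return base_tokens * DATA_FACTOR_PER_DOUBLING ** doublings

# Example usage (the baseline numbers are hypothetical):
base_languages = 10
target_languages = 20
model_size = estimate_model_size(1e9, base_languages, target_languages)
training_data = estimate_training_data(20e9, base_languages, target_languages)
print(f"Required model size: {model_size:.3g} parameters, "
      f"Required training data: {training_data:.3g} tokens")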

Practical Applications

  • Use Case: ATLAS is intended to guide the development of more efficient multilingual language models, such as those used in machine translation systems like Google Translate.
  • Pitfall: Failing to account for cross-lingual transfer and the efficiency trade-offs introduced by multilingual training can result in inefficient model development and reduced performance.
