Skip to main content

On This Page

Fast Eventual Consistency: Inside Corrosion, the Distributed System Powering Fly.io

2 min read
Share

These articles are AI-generated summaries. Please check the original sources for full details.

Transcript

Fly.io developed Corrosion, a distributed system prioritizing speed over strict consistency, to manage global machine data and power its developer-focused cloud platform supporting deployments across 40 regions. Corrosion replicates SQLite data across nodes, employing CRDTs for conflict resolution and the SWIM protocol for cluster membership, achieving low-latency updates for services like the Fly Proxy.

Why This Matters

Traditional distributed databases often prioritize strong consistency (ACID properties) at the cost of latency and scalability. Corrosion opts for eventual consistency to achieve faster replication and improved performance—a trade-off essential for a global CDN-like application where responsiveness is paramount. Failure to achieve low-latency replication at Fly.io could result in suboptimal routing decisions for users and negatively impact application performance.

Key Insights

  • 800 physical servers: Corrosion operates at scale across Fly.io’s infrastructure (2026-01-09).
  • CRDTs over ACID: The system leverages Conflict-free Replicated Data Types (CRDTs) to handle data conflicts instead of traditional ACID transactions, optimizing for speed in an eventually consistent model.
  • QUIC Transport Protocol: Corrosion uses QUIC, aiming to reduce latency and improve reliability compared to TCP for data propagation.

Working Example

// Example showing a simple update to Corrosion via HTTP API
curl -X POST \
  -H "Content-Type: application/sql" \
  -d "UPDATE machines SET status = 'healthy' WHERE id = 'machine-123';" \
  http://corrosion-api:8080/sql

Practical Applications

  • Fly.io Proxy: Corrosion provides real-time machine health and location data to the Fly Proxy for optimal traffic routing.
  • Pitfall: Inconsistent data resulting from eventual consistency can lead to brief periods of incorrect routing decisions if applications aren’t designed to tolerate stale data.

References:

Continue reading

Next article

Russian APT28 Runs Credential-Stealing Campaign Targeting Energy and Policy Organizations

Related Content