Skip to main content
cracking the tech interview system design and algorithms in java 25

Design a Ride-Sharing Service

16 min read Chapter 4 of 75
Summary

Covers geospatial indexing with QuadTrees and GeoHash, real-time...

Covers geospatial indexing with QuadTrees and GeoHash, real-time matching algorithms, trip state machines, WebSocket-based location streaming, and surge pricing.

Design a Ride-Sharing Service

A ride-sharing platform like Uber or Lyft coordinates millions of drivers and passengers in real time. The system must track moving vehicles, match riders to the closest available driver within seconds, manage the full lifecycle of a trip, process payments, and handle surge pricing — all at massive scale with strict latency requirements. This problem tests your ability to reason about geospatial data structures, real-time streaming, state machines, and distributed consistency.

Requirements

Functional Requirements

  • Request a ride — a passenger specifies pickup and dropoff locations, selects a ride type (economy, premium, pool), and receives a fare estimate.
  • Match a driver — the system finds the nearest available driver, sends a ride offer, and handles acceptance or timeout.
  • Real-time tracking — both driver and passenger see live location updates on a map throughout the trip.
  • Trip lifecycle management — track trip state from request through completion or cancellation.
  • Payment processing — calculate fare based on distance, time, and surge multiplier; charge the passenger and pay the driver.
  • Ratings — both parties rate each other after trip completion.

Non-Functional Requirements

  • Matching latency — a driver must be matched within 1 second of ride request in 95% of cases.
  • Scale — support 1 million+ concurrent active drivers reporting locations simultaneously.
  • Availability99.99% uptime — ride requests must never fail due to infrastructure issues.
  • Location freshness — driver positions must be no more than 5 seconds stale on the passenger’s map.
  • Consistency — trip state must be consistent across driver and passenger views within 1 second.

Capacity Estimation

Active drivers: 1 million concurrent drivers at peak hours.

Location updates: Each driver sends a GPS update every 3 seconds via WebSocket.

  • 1M drivers × (1 update / 3 seconds) = ~333,000 location updates per second.

Rides per day: 15 million rides/day globally. At peak: ~500 ride requests per second.

Storage for location updates:

  • Each update: driver ID (16 bytes) + latitude (8 bytes) + longitude (8 bytes) + timestamp (8 bytes) + heading (4 bytes) + speed (4 bytes) = 48 bytes.
  • 333K updates/sec × 48 bytes = ~16 MB/second of raw location data.
  • Retain hot data (last 5 minutes) in memory: 333K × 100 × 48 bytes ≈ 1.6 GB — fits comfortably in a single server’s RAM, though we shard for throughput.

Matching computation: Each ride request searches for drivers within a radius. With spatial indexing, this query touches ~50-200 nearby drivers rather than scanning all 1M.

High-Level Design

┌──────────────┐                            ┌──────────────────┐
│   Passenger   │◄── WebSocket ──►          │   Driver App      │
│   App         │                           │                   │
└──────┬───────┘                            └────────┬──────────┘
       │ REST / WebSocket                            │ WebSocket (GPS)
       ▼                                             ▼
┌──────────────────────────────────────────────────────────────────┐
│                        API Gateway / Load Balancer                │
└──────┬──────────┬──────────────┬──────────────┬─────────────────┘
       │          │              │              │
       ▼          ▼              ▼              ▼
┌──────────┐ ┌──────────┐ ┌──────────────┐ ┌────────────────┐
│  Ride     │ │ Matching  │ │  Location    │ │  Notification  │
│  Service  │ │ Service   │ │  Service     │ │  Service       │
└─────┬────┘ └─────┬────┘ └──────┬───────┘ └────────────────┘
      │            │             │
      ▼            ▼             ▼
┌──────────┐ ┌──────────┐ ┌──────────────┐
│ Trip DB   │ │ Driver   │ │  Geo Index   │
│ (Postgres)│ │ Index    │ │  (Redis +    │
│           │ │ (Memory) │ │   QuadTree)  │
└──────────┘ └──────────┘ └──────────────┘


┌──────────────┐     ┌──────────────┐
│  Payment     │     │   Kafka      │
│  Service     │     │  (Events)    │
└──────────────┘     └──────────────┘

Service responsibilities:

  • Ride Service — handles ride requests, fare estimation, trip state management, and cancellation logic.
  • Matching Service — queries the geo index for nearby available drivers, ranks them, and dispatches offers.
  • Location Service — ingests GPS streams from drivers via WebSocket, updates the spatial index, and fans out location data to passengers tracking their driver.
  • Payment Service — calculates fares, processes charges, and handles driver payouts.
  • Notification Service — pushes ride offers to drivers, trip updates to passengers via push notifications and WebSocket.

Deep Dive

Location Tracking & Geospatial Indexing

Drivers send GPS coordinates every 3 seconds over a persistent WebSocket connection. HTTP polling would waste bandwidth and increase latency — WebSocket provides a full-duplex channel with minimal overhead per update.

GeoHash for Spatial Indexing:

GeoHash encodes a latitude/longitude pair into a string where shared prefixes indicate spatial proximity. A GeoHash of precision 6 (e.g., 9q8yyk) represents a cell approximately 1.2 km × 600 m. Precision 7 narrows to ~150 m × 150 m.

record Location(double latitude, double longitude, long timestampMs) {
    // Validate coordinates at construction
    Location {
        if (latitude < -90 || latitude > 90) throw new IllegalArgumentException("Invalid latitude: " + latitude);
        if (longitude < -180 || longitude > 180) throw new IllegalArgumentException("Invalid longitude: " + longitude);
    }
}

record GeoHashCell(String hash, int precision) {
    static GeoHashCell encode(Location loc, int precision) {
        double latMin = -90, latMax = 90;
        double lonMin = -180, lonMax = 180;
        var sb = new StringBuilder();
        String base32 = "0123456789bcdefghjkmnpqrstuvwxyz";
        boolean isLon = true;
        int bit = 0, charIndex = 0;

        while (sb.length() < precision) {
            if (isLon) {
                double mid = (lonMin + lonMax) / 2;
                if (loc.longitude() >= mid) { charIndex |= (1 << (4 - bit)); lonMin = mid; }
                else { lonMax = mid; }
            } else {
                double mid = (latMin + latMax) / 2;
                if (loc.latitude() >= mid) { charIndex |= (1 << (4 - bit)); latMin = mid; }
                else { latMax = mid; }
            }
            isLon = !isLon;
            if (++bit == 5) {
                sb.append(base32.charAt(charIndex));
                bit = 0;
                charIndex = 0;
            }
        }
        return new GeoHashCell(sb.toString(), precision);
    }

    // Return this cell and its 8 neighbors for boundary-safe queries
    List<String> withNeighbors() {
        // Compute adjacent geohash cells to handle edge cases
        // where a driver is near a cell boundary
        return List.of(hash /* + 8 neighbor hashes computed via bit manipulation */);
    }
}

Why not use GeoHash alone? GeoHash cells are fixed-size rectangles. In dense urban areas, a single cell at precision 6 might contain thousands of drivers, while a rural cell contains none. A QuadTree adapts to density.

QuadTree for Dynamic Partitioning:

A QuadTree recursively subdivides a 2D region into four quadrants. A node splits when it exceeds a threshold (e.g., 500 drivers), producing smaller cells in dense areas and larger cells in sparse areas.

sealed interface QuadTreeNode permits QuadTreeNode.Leaf, QuadTreeNode.Internal {

    record BoundingBox(double minLat, double maxLat, double minLon, double maxLon) {
        boolean contains(Location loc) {
            return loc.latitude() >= minLat && loc.latitude() <= maxLat
                && loc.longitude() >= minLon && loc.longitude() <= maxLon;
        }

        BoundingBox[] subdivide() {
            double midLat = (minLat + maxLat) / 2;
            double midLon = (minLon + maxLon) / 2;
            return new BoundingBox[]{
                new BoundingBox(midLat, maxLat, minLon, midLon),   // NW
                new BoundingBox(midLat, maxLat, midLon, maxLon),   // NE
                new BoundingBox(minLat, midLat, minLon, midLon),   // SW
                new BoundingBox(minLat, midLat, midLon, maxLon)    // SE
            };
        }
    }

    record Leaf(BoundingBox bounds, List<DriverLocation> drivers, int maxCapacity)
        implements QuadTreeNode {

        QuadTreeNode insert(DriverLocation driver) {
            if (!bounds.contains(driver.location())) return this;
            if (drivers.size() < maxCapacity) {
                var updated = new ArrayList<>(drivers);
                updated.add(driver);
                return new Leaf(bounds, List.copyOf(updated), maxCapacity);
            }
            // Split into Internal node
            var quadrants = bounds.subdivide();
            var children = new QuadTreeNode[4];
            for (int i = 0; i < 4; i++) {
                children[i] = new Leaf(quadrants[i], List.of(), maxCapacity);
            }
            var internal = new Internal(bounds, children);
            for (var d : drivers) internal = (Internal) internal.insert(d);
            return internal.insert(driver);
        }
    }

    record Internal(BoundingBox bounds, QuadTreeNode[] children) implements QuadTreeNode {
        QuadTreeNode insert(DriverLocation driver) {
            for (int i = 0; i < children.length; i++) {
                if (children[i] instanceof Leaf leaf && leaf.bounds().contains(driver.location())) {
                    var updated = children.clone();
                    updated[i] = leaf.insert(driver);
                    return new Internal(bounds, updated);
                }
            }
            return this;
        }
    }
}

record DriverLocation(String driverId, Location location, boolean available) {}

In practice, the Location Service uses Redis Geo (GEOADD, GEOSEARCH) for the persistent index and maintains an in-memory QuadTree for sub-millisecond nearest-neighbor queries in the Matching Service.

Driver-Passenger Matching

When a passenger requests a ride, the Matching Service:

  1. Queries the geo index for all available drivers within a radius (e.g., 3 km). If too few results, expand the radius incrementally.
  2. Computes estimated time of arrival (ETA) for each candidate by calling a routing service.
  3. Ranks drivers by ETA (primary) and rating (secondary).
  4. Sends a ride offer to the top-ranked driver with a 15-second acceptance timeout.
  5. If the driver declines or times out, offer to the next driver.
record RideRequest(String passengerId, Location pickup, Location dropoff, RideType rideType) {}

enum RideType { ECONOMY, PREMIUM, POOL }

record DriverCandidate(String driverId, Location location, double distanceKm, double etaSeconds, double rating) {}

class MatchingService {
    private final GeoIndex geoIndex;
    private final RoutingService routingService;

    List<DriverCandidate> findCandidates(RideRequest request, double radiusKm, int maxCandidates) {
        var nearbyDrivers = geoIndex.findNearby(request.pickup(), radiusKm);

        // Compute ETAs concurrently using virtual threads
        try (var scope = new StructuredTaskScope.ShutdownOnFailure()) {
            var tasks = nearbyDrivers.stream()
                .filter(DriverLocation::available)
                .limit(maxCandidates * 2) // fetch extra in case some are unavailable
                .map(driver -> scope.fork(() -> {
                    double eta = routingService.computeEta(driver.location(), request.pickup());
                    double distance = haversineDistance(driver.location(), request.pickup());
                    return new DriverCandidate(
                        driver.driverId(), driver.location(), distance, eta, getDriverRating(driver.driverId())
                    );
                }))
                .toList();

            scope.join().throwIfFailed();

            return tasks.stream()
                .map(StructuredTaskScope.Subtask::get)
                .sorted(Comparator.comparingDouble(DriverCandidate::etaSeconds)
                    .thenComparing(Comparator.comparingDouble(DriverCandidate::rating).reversed()))
                .limit(maxCandidates)
                .toList();
        }
    }

    static double haversineDistance(Location a, Location b) {
        double R = 6371; // Earth radius in km
        double dLat = Math.toRadians(b.latitude() - a.latitude());
        double dLon = Math.toRadians(b.longitude() - a.longitude());
        double sinLat = Math.sin(dLat / 2);
        double sinLon = Math.sin(dLon / 2);
        double c = 2 * Math.asin(Math.sqrt(
            sinLat * sinLat + Math.cos(Math.toRadians(a.latitude()))
                * Math.cos(Math.toRadians(b.latitude())) * sinLon * sinLon
        ));
        return R * c;
    }
}

The use of StructuredTaskScope with virtual threads enables concurrent ETA computation for all candidate drivers. Computing 50 ETAs sequentially at 20ms each would take 1 second — with virtual threads, all 50 execute concurrently, completing in ~20-40ms.

Trip State Machine

A trip progresses through a well-defined set of states. Modeling this as a state machine enforces valid transitions and prevents illegal state changes (e.g., jumping from REQUESTED to COMPLETED).

States and transitions:

REQUESTED ──► MATCHED ──► DRIVER_EN_ROUTE ──► ARRIVED ──► IN_PROGRESS ──► COMPLETED
    │             │              │                │            │
    ▼             ▼              ▼                ▼            ▼
 CANCELLED    CANCELLED     CANCELLED         CANCELLED    CANCELLED
sealed interface TripState permits
    TripState.Requested, TripState.Matched, TripState.DriverEnRoute,
    TripState.Arrived, TripState.InProgress, TripState.Completed,
    TripState.Cancelled {

    record Requested(String tripId, String passengerId, Location pickup, Location dropoff, long requestedAt)
        implements TripState {}

    record Matched(String tripId, String driverId, double etaSeconds, long matchedAt)
        implements TripState {}

    record DriverEnRoute(String tripId, String driverId, Location driverLocation, long updatedAt)
        implements TripState {}

    record Arrived(String tripId, String driverId, long arrivedAt)
        implements TripState {}

    record InProgress(String tripId, Location currentLocation, double distanceTraveledKm, long startedAt)
        implements TripState {}

    record Completed(String tripId, double totalDistanceKm, long durationSeconds, double fareAmount, long completedAt)
        implements TripState {}

    record Cancelled(String tripId, String reason, String cancelledBy, long cancelledAt)
        implements TripState {}
}

// State transition function — enforces valid transitions at compile time
sealed interface TripEvent permits
    TripEvent.DriverAssigned, TripEvent.DriverDeparted, TripEvent.DriverArrived,
    TripEvent.TripStarted, TripEvent.TripEnded, TripEvent.TripCancelled {

    record DriverAssigned(String driverId, double etaSeconds) implements TripEvent {}
    record DriverDeparted() implements TripEvent {}
    record DriverArrived() implements TripEvent {}
    record TripStarted() implements TripEvent {}
    record TripEnded(double distanceKm, long durationSeconds, double fare) implements TripEvent {}
    record TripCancelled(String reason, String cancelledBy) implements TripEvent {}
}

TripState transition(TripState current, TripEvent event) {
    return switch (current) {
        case TripState.Requested r -> switch (event) {
            case TripEvent.DriverAssigned e ->
                new TripState.Matched(r.tripId(), e.driverId(), e.etaSeconds(), System.currentTimeMillis());
            case TripEvent.TripCancelled e ->
                new TripState.Cancelled(r.tripId(), e.reason(), e.cancelledBy(), System.currentTimeMillis());
            default -> throw new IllegalStateException("Invalid transition from REQUESTED: " + event);
        };
        case TripState.Matched m -> switch (event) {
            case TripEvent.DriverDeparted _ ->
                new TripState.DriverEnRoute(m.tripId(), m.driverId(), null, System.currentTimeMillis());
            case TripEvent.TripCancelled e ->
                new TripState.Cancelled(m.tripId(), e.reason(), e.cancelledBy(), System.currentTimeMillis());
            default -> throw new IllegalStateException("Invalid transition from MATCHED: " + event);
        };
        case TripState.DriverEnRoute d -> switch (event) {
            case TripEvent.DriverArrived _ ->
                new TripState.Arrived(d.tripId(), d.driverId(), System.currentTimeMillis());
            case TripEvent.TripCancelled e ->
                new TripState.Cancelled(d.tripId(), e.reason(), e.cancelledBy(), System.currentTimeMillis());
            default -> throw new IllegalStateException("Invalid transition from DRIVER_EN_ROUTE: " + event);
        };
        case TripState.Arrived a -> switch (event) {
            case TripEvent.TripStarted _ ->
                new TripState.InProgress(a.tripId(), null, 0, System.currentTimeMillis());
            case TripEvent.TripCancelled e ->
                new TripState.Cancelled(a.tripId(), e.reason(), e.cancelledBy(), System.currentTimeMillis());
            default -> throw new IllegalStateException("Invalid transition from ARRIVED: " + event);
        };
        case TripState.InProgress ip -> switch (event) {
            case TripEvent.TripEnded e ->
                new TripState.Completed(ip.tripId(), e.distanceKm(), e.durationSeconds(), e.fare(), System.currentTimeMillis());
            case TripEvent.TripCancelled e ->
                new TripState.Cancelled(ip.tripId(), e.reason(), e.cancelledBy(), System.currentTimeMillis());
            default -> throw new IllegalStateException("Invalid transition from IN_PROGRESS: " + event);
        };
        case TripState.Completed _, TripState.Cancelled _ ->
            throw new IllegalStateException("Cannot transition from terminal state: " + current);
    };
}

Event sourcing stores every TripEvent in an append-only log (Kafka topic partitioned by trip ID). The current state is reconstructed by replaying events. This provides a complete audit trail and enables replaying history for debugging or analytics.

Surge Pricing

Surge pricing balances supply and demand by increasing fares when demand exceeds supply in a geographic area. The system divides the city into GeoHash cells (precision 5, ~5 km²) and computes a surge multiplier for each cell.

record SurgeData(String geohashCell, int activeDrivers, int pendingRequests) {
    double surgeMultiplier() {
        if (activeDrivers == 0) return 5.0; // cap at 5x
        double demandSupplyRatio = (double) pendingRequests / activeDrivers;
        return switch (demandSupplyRatio) {
            case double r when r <= 0.5 -> 1.0;       // surplus of drivers
            case double r when r <= 1.0 -> 1.0;       // balanced
            case double r when r <= 1.5 -> 1.2;       // slightly elevated
            case double r when r <= 2.0 -> 1.5;
            case double r when r <= 3.0 -> 2.0;
            case double r when r <= 5.0 -> 3.0;
            default -> 5.0;                           // hard cap
        };
    }
}

Surge multipliers are recalculated every 30 seconds and cached. The passenger sees the multiplier before confirming the ride. Transparent pricing requires showing the estimated fare including the surge.

Data Synchronization

Both the driver and passenger must see a consistent view of the trip state. This requires:

Event-driven architecture with Kafka:

Every state transition publishes a TripEvent to a Kafka topic. The Ride Service, the driver app, the passenger app, and the analytics pipeline all consume from this topic. Kafka guarantees ordering within a partition (partitioned by trip ID), ensuring events are processed in the correct sequence.

record TripEventEnvelope(
    String tripId,
    TripEvent event,
    TripState resultingState,
    long timestamp,
    int version
) {}

Optimistic locking for concurrent state updates:

Two services might attempt to update the same trip simultaneously (e.g., the driver cancels while the payment service completes). Use a version field on the trip record:

// In the database update
int rowsUpdated = db.execute("""
    UPDATE trips
    SET state = ?, version = version + 1, updated_at = NOW()
    WHERE trip_id = ? AND version = ?
    """, newState, tripId, expectedVersion);

if (rowsUpdated == 0) {
    // Concurrent modification detected — reload and retry or reject
    throw new OptimisticLockException("Trip " + tripId + " was modified concurrently");
}

WebSocket fan-out for real-time updates:

The passenger tracking their driver receives location updates through a WebSocket connection managed by the Notification Service. When the Location Service receives a driver’s GPS update, it publishes to a Kafka topic. The Notification Service consumes this and fans out the location to the connected passenger.

Bottlenecks & Scaling

Location service hotspots:

Dense urban areas generate disproportionate location update traffic. Shard the Location Service by GeoHash prefix — all drivers in 9q8y* (San Francisco) route to shard 1, while dr5r* (New York) routes to shard 2. This ensures even load distribution based on geographic density.

Matching latency:

Computing ETAs for nearby drivers is the bottleneck in the matching pipeline. Pre-compute a driver ETA index — for each GeoHash cell, maintain a sorted list of available drivers with pre-computed ETAs to the cell center. When a ride request arrives, look up the cell’s pre-computed list instead of computing ETAs on demand. Refresh the index every 10 seconds.

Payment failures — saga pattern:

A ride involves multiple transactional steps: charge the passenger, pay the driver, update the trip record. If any step fails, the saga pattern orchestrates compensating actions:

  1. Charge passenger → success.
  2. Update trip status → success.
  3. Pay driver → failure.
  4. Compensate: refund passenger, revert trip status, alert support.
sealed interface SagaStep permits SagaStep.Charge, SagaStep.UpdateTrip, SagaStep.PayDriver {
    void execute() throws SagaException;
    void compensate();

    record Charge(String passengerId, double amount) implements SagaStep {
        public void execute() { /* charge via payment gateway */ }
        public void compensate() { /* issue refund */ }
    }

    record UpdateTrip(String tripId) implements SagaStep {
        public void execute() { /* mark trip as paid */ }
        public void compensate() { /* revert to completed unpaid */ }
    }

    record PayDriver(String driverId, double amount) implements SagaStep {
        public void execute() { /* transfer to driver account */ }
        public void compensate() { /* no-op, driver was never paid */ }
    }
}

Database scaling:

Trip data grows linearly with rides. Partition the trips table by date range (monthly partitions) and archive completed trips older than 90 days to cold storage (S3 + Athena for analytics). Active trips stay in PostgreSQL with read replicas for the driver and passenger apps.

Interviewer Tips

Common follow-up questions:

  1. “How do you handle drivers going offline mid-trip?” — The Location Service detects a stale driver (no updates for 30+ seconds) and triggers an alert. The Ride Service can reassign the trip or notify the passenger. Stale detection uses a TTL on the driver’s location record in Redis.

  2. “How would you design ride pooling?” — Extend the Matching Service to consider detour time when inserting a new passenger into an existing pool trip. Use a constrained optimization algorithm that minimizes total detour across all passengers while respecting a maximum detour threshold (e.g., 10 minutes).

  3. “Why not use a relational database with PostGIS for geospatial queries?” — PostGIS works well for static or slowly-changing data. For 333K location updates/second with sub-millisecond query latency, an in-memory spatial index (QuadTree or Redis Geo) outperforms disk-backed PostGIS by orders of magnitude.

  4. “How do you prevent surge price manipulation?” — Lock the surge multiplier at the time of ride request confirmation. Even if surge changes between confirmation and pickup, the passenger pays the confirmed rate. This prevents gaming by delaying confirmation.

  5. “What about international expansion?” — Deploy independent regional clusters (US, EU, APAC) with separate databases and geo indexes. Trip data does not need cross-region replication since rides are inherently local. Share only user account data across regions via eventual consistency.

Red flags interviewers watch for:

  • Not addressing the 333K location updates/sec throughput challenge.
  • Using HTTP polling instead of WebSocket for real-time location streaming.
  • Ignoring the state machine — allowing arbitrary trip state transitions.
  • Not discussing what happens when the matching service cannot find a driver.
  • Overlooking the consistency requirements between driver and passenger views of a trip.