Capacity Planning and Tenant Growth Modeling

The Symptom

The documentation platform grows from 50 to 200 tenants in 6 months. Each new tenant onboards with 5,000 to 50,000 documents. The operations team adds nodes reactively, only when disk usage hits 80% or search latency exceeds the SLA. Every scaling event is an emergency: order hardware, configure nodes, and rebalance shards, all while the cluster is under pressure.

The Internals

Capacity planning for a multi-tenant search cluster requires modeling three resources:

Storage. Total data volume across all tenants, including replicas and segment overhead. OpenSearch stores data at roughly 1.1x to 1.15x the raw JSON size (overhead from the inverted index, doc values, and stored fields), and each replica adds a full copy on top of that. The estimator below uses 1.15.
Memory. Heap per node for caches, in-flight requests, and cluster state. Cluster state grows roughly linearly with shard count; cache usage grows with query diversity.
Compute. CPU for indexing and search. Indexing CPU scales with write throughput; search CPU scales with query volume × query complexity.

The Implementation

Per-Tenant Resource Estimation

public class TenantCapacityEstimator {

    // Storage estimation constants
    private static final double INDEX_OVERHEAD_FACTOR = 1.15;  // 15% overhead
    private static final int REPLICA_COUNT = 1;

    public record TenantCapacity(
        String tenantId,
        long documentCount,
        long estimatedStorageBytes,
        double estimatedStorageGB,
        double estimatedHeapMB,
        double estimatedDailyQueryLoad,
        String recommendedStrategy  // shared or dedicated
    ) {}

    public TenantCapacity estimate(String tenantId, long documentCount,
            long avgDocSizeBytes, double dailyQueryVolume) {

        // Storage = docs * avg_size * overhead * (1 + replicas)
        long storageBytes = (long) (documentCount * avgDocSizeBytes *
            INDEX_OVERHEAD_FACTOR * (1 + REPLICA_COUNT));
        double storageGB = storageBytes / (1024.0 * 1024.0 * 1024.0);

        // Heap: ~1 KB per document (about 1 MB per 1,000 documents)
        // for field data and caches
        double heapMB = documentCount / 1000.0 * 0.001 * 1024;

        // Dedicated index threshold
        String strategy = documentCount > 500_000 || storageGB > 10
            ? "dedicated" : "shared";

        return new TenantCapacity(
            tenantId, documentCount, storageBytes, storageGB,
            heapMB, dailyQueryVolume, strategy
        );
    }
}
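
As a worked example of the arithmetic above, the following sketch restates the storage, heap, and strategy expressions as standalone static methods and applies them to both ends of the onboarding range (5,000 and 50,000 documents). The 20 KB average document size is an assumption chosen purely for illustration, and the class name TenantEstimateExample is hypothetical.

```java
// Standalone restatement of the estimator arithmetic so it can run by itself.
// Assumption: 20 KB average document size (not a figure from the platform).
final class TenantEstimateExample {

    static final double INDEX_OVERHEAD_FACTOR = 1.15; // same constant as the estimator
    static final int REPLICA_COUNT = 1;               // same constant as the estimator

    // Storage in GB = docs * avg size * overhead * (1 + replicas) / 1024^3
    static double storageGB(long documentCount, long avgDocSizeBytes) {
        double bytes = (double) documentCount * avgDocSizeBytes
            * INDEX_OVERHEAD_FACTOR * (1 + REPLICA_COUNT);
        return bytes / (1024.0 * 1024.0 * 1024.0);
    }

    // Heap in MB, using the same expression as the estimator
    static double heapMB(long documentCount) {
        return documentCount / 1000.0 * 0.001 * 1024;
    }

    // Same thresholds as the estimator: large tenants get a dedicated index
    static String strategy(long documentCount, double storageGB) {
        return documentCount > 500_000 || storageGB > 10 ? "dedicated" : "shared";
    }

    public static void main(String[] args) {
        long avgDocSizeBytes = 20 * 1024; // assumed 20 KB per document
        for (long docs : new long[] {5_000, 50_000}) {
            double gb = storageGB(docs, avgDocSizeBytes);
            System.out.printf("%,d docs -> %.2f GB storage, %.1f MB heap, %s%n",
                docs, gb, heapMB(docs), strategy(docs, gb));
        }
    }
}
```

Under these assumptions both ends of the onboarding range stay well below the dedicated-index thresholds, so a newly onboarded tenant would start on the shared index.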

Cluster Capacity Model

import java.util.List;

public class ClusterCapacityModel {

    public record ClusterCapacity(
        int currentNodes,
        double totalStorageUsedGB,
        double totalStorageCapacityGB,
        double storageUtilization,
        int totalShards,
        double avgShardsPerNode,
        double heapUsedPercent,
        int tenantsSupported,
        int additionalTenantsBeforeScaling,
        String scalingRecommendation
    ) {}

    public ClusterCapacity evaluate(
            List<TenantCapacityEstimator.TenantCapacity> tenants,
            int nodeCount,
            double storagePerNodeGB,
            double heapPerNodeGB) {

        double totalStorageUsed = tenants.stream()
            .mapToDouble(t -> t.estimatedStorageGB())
            .sum();

        double totalCapacity = nodeCount * storagePerNodeGB;
        double utilization = totalStorageUsed / totalCapacity;

        // Shard count estimation: the shared index contributes a fixed
        // 20 shards (10 primary + 10 replica) regardless of how many
        // tenants share it; each dedicated tenant adds 4 shards.
        int dedicatedTenants = (int) tenants.stream()
            .filter(t -> t.recommendedStrategy().equals("dedicated"))
            .count();
        int totalShards = 20 + (dedicatedTenants * 4);

        double avgShardsPerNode = (double) totalShards / nodeCount;

        // Heap estimation
        double totalHeapNeeded = tenants.stream()
            .mapToDouble(t -> t.estimatedHeapMB())
            .sum();
        double totalHeapAvailable = nodeCount * heapPerNodeGB * 1024;
        double heapPercent = totalHeapNeeded / totalHeapAvailable * 100;

        // How many more tenants can we add?
        double avgTenantStorageGB = totalStorageUsed / tenants.size();
        double remainingStorageGB = (totalCapacity * 0.75) - totalStorageUsed;
        int additionalTenants = (int) (remainingStorageGB / avgTenantStorageGB);

        String recommendation;
        if (utilization > 0.75) {
            recommendation = "Scale now: storage utilization above 75%";
        } else if (avgShardsPerNode > 600) {
            recommendation = "Scale now: shard count per node too high";
        } else if (heapPercent > 70) {
            recommendation = "Scale now: heap pressure above 70%";
        } else if (additionalTenants < 10) {
            recommendation = "Plan scaling: fewer than 10 tenants until capacity";
        } else {
            recommendation = "Healthy: capacity for " + additionalTenants +
                " additional tenants";
        }

        return new ClusterCapacity(
            nodeCount, totalStorageUsed, totalCapacity, utilization,
            totalShards, avgShardsPerNode, heapPercent,
            tenants.size(), Math.max(0, additionalTenants), recommendation
        );
    }
}

Growth Projection

import java.util.ArrayList;
import java.util.List;

// Wrapper class added so the snippet compiles; the name is illustrative.
public class GrowthProjector {

    public record GrowthProjection(
        int month,
        int projectedTenants,
        double projectedStorageGB,
        int requiredNodes,
        boolean scalingRequired
    ) {}

    // Projects tenants and storage with compound monthly growth and derives
    // the node count needed to keep storage at or below 75% of capacity.
    public List<GrowthProjection> projectGrowth(
            int currentTenants, double currentStorageGB,
            double monthlyGrowthRate, int months,
            double storagePerNodeGB, int currentNodes) {

        List<GrowthProjection> projections = new ArrayList<>();

        for (int month = 1; month <= months; month++) {
            // Tenants and storage are assumed to grow at the same rate
            int projected = (int) (currentTenants *
                Math.pow(1 + monthlyGrowthRate, month));
            double projectedStorage = currentStorageGB *
                Math.pow(1 + monthlyGrowthRate, month);

            int requiredNodes = (int) Math.ceil(
                projectedStorage / (storagePerNodeGB * 0.75));

            projections.add(new GrowthProjection(
                month, projected, projectedStorage,
                requiredNodes, requiredNodes > currentNodes
            ));
        }

        return projections;
    }
}
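
One way to act on the projection is to find the first month in which the required node count exceeds the current one, then back off by the procurement lead time. The sketch below restates the node formula from projectGrowth as a standalone method. The 60 GB of storage per node, the one-month lead time, and the class name ScalingLeadTime are assumptions for illustration only.

```java
// Finds the first projected month that needs more nodes than the cluster has.
// Assumption (not from the platform): 60 GB usable storage per node.
final class ScalingLeadTime {

    // Same formula as projectGrowth: nodes needed to stay at or below 75% full
    static int requiredNodes(double storageGB, double storagePerNodeGB) {
        return (int) Math.ceil(storageGB / (storagePerNodeGB * 0.75));
    }

    // Returns the first month (1-based) where scaling is required, or -1 if none
    static int firstScalingMonth(double currentStorageGB, double monthlyGrowthRate,
            int months, double storagePerNodeGB, int currentNodes) {
        for (int month = 1; month <= months; month++) {
            double projected = currentStorageGB
                * Math.pow(1 + monthlyGrowthRate, month);
            if (requiredNodes(projected, storagePerNodeGB) > currentNodes) {
                return month;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        int month = firstScalingMonth(120, 0.10, 12, 60, 4);
        int leadTimeMonths = 1; // assumed: plan one month ahead
        System.out.println("Scaling needed in month " + month
            + "; start procurement by month " + Math.max(0, month - leadTimeMonths));
    }
}
```

With these assumed inputs (120 GB today, 10% monthly growth, 4 nodes) the method returns month 5, so procurement would need to start by month 4.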

The Measurement

12-month growth projection for the documentation platform:

Month	Tenants	Storage (GB)	Shards	Nodes Required	Action
0	50	120	220	4	Current
3	65	160	280	4	Monitor
6	85	210	360	5	Add 1 node
9	110	280	460	6	Add 1 node
12	145	370	600	7	Add 1 node

Scaling one node at a time, with each addition planned one month in advance, avoids emergency scaling events. The projection assumes roughly 10% monthly tenant growth, with data volume growing in proportion.
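
The Shards column can be cross-checked against the shard formula in ClusterCapacityModel (20 shards for the shared index plus 4 per dedicated tenant). The sketch below applies that formula to the tenant and node counts from the table. Treating every tenant as dedicated is an inference from the numbers rather than something the table states, and the class name ShardColumnCheck is hypothetical.

```java
// Cross-checks the table's Shards column against the cluster model's formula.
// Assumption: every tenant in the table is counted as a dedicated tenant.
final class ShardColumnCheck {

    // Same formula as ClusterCapacityModel: 20 shared-index shards
    // plus 4 shards per dedicated tenant
    static int totalShards(int dedicatedTenants) {
        return 20 + dedicatedTenants * 4;
    }

    // Average shards per node, the figure the model compares against 600
    static double shardsPerNode(int totalShards, int nodeCount) {
        return (double) totalShards / nodeCount;
    }

    public static void main(String[] args) {
        int[] months  = {0, 3, 6, 9, 12};
        int[] tenants = {50, 65, 85, 110, 145}; // from the table
        int[] nodes   = {4, 4, 5, 6, 7};        // from the table
        for (int i = 0; i < months.length; i++) {
            int shards = totalShards(tenants[i]);
            System.out.printf("month %d: %d shards, %.1f per node%n",
                months[i], shards, shardsPerNode(shards, nodes[i]));
        }
    }
}
```

At month 12 this gives 600 shards across 7 nodes, roughly 86 per node, which is far below the 600-per-node ceiling used in the cluster model. In this projection, shard count is therefore not the binding constraint.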

The Decision Rule

Model capacity 6 months ahead. Plan scaling events for when utilization reaches 75%, not 90%. The 75% threshold leaves room for traffic spikes, tenant onboarding bursts, and reindexing operations, which temporarily double the storage used by the index being rebuilt.
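
The headroom argument can be made concrete with a small check: during a reindex the old and new copies of an index coexist, so peak usage is current usage plus the size of that index. The sketch below is illustrative only. The 240 GB cluster capacity, the 20 GB index, the 0.85 operational ceiling, and the class name ReindexHeadroom are all assumptions, not values from the platform.

```java
// Checks whether a reindex fits: the old and new copies of an index coexist
// until the old one is deleted, so peak usage = current usage + index size.
final class ReindexHeadroom {

    // Peak utilization (0..1) while the index being rebuilt exists twice
    static double peakUtilization(double capacityGB, double usedGB,
            double indexSizeGB) {
        return (usedGB + indexSizeGB) / capacityGB;
    }

    // True if the peak stays at or below the given ceiling
    static boolean canReindex(double capacityGB, double usedGB,
            double indexSizeGB, double ceiling) {
        return peakUtilization(capacityGB, usedGB, indexSizeGB) <= ceiling;
    }

    public static void main(String[] args) {
        double capacityGB = 240;  // assumed: 4 nodes x 60 GB
        double indexSizeGB = 20;  // assumed size of the index being rebuilt
        double ceiling = 0.85;    // assumed operational ceiling
        for (double utilization : new double[] {0.50, 0.75, 0.90}) {
            double usedGB = capacityGB * utilization;
            System.out.printf("at %.0f%% used: peak %.0f%%, reindex fits: %b%n",
                utilization * 100,
                peakUtilization(capacityGB, usedGB, indexSizeGB) * 100,
                canReindex(capacityGB, usedGB, indexSizeGB, ceiling));
        }
    }
}
```

With these assumed numbers the reindex fits when the cluster is 75% full but not when it is 90% full, which is the gap the 75% planning threshold is meant to preserve.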

Track per-tenant resource consumption monthly. Tenants that grow faster than expected should be flagged for promotion to dedicated indices. A tenant that doubles its document count in one month will disproportionately impact the shared index.
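
A monthly tracking pass along these lines could be sketched as follows. The TenantSnapshot record, the growth factor of 2.0 (a doubling), and the class name TenantGrowthFlagger are hypothetical; the 500,000-document limit is the dedicated-index threshold from TenantCapacityEstimator.

```java
import java.util.ArrayList;
import java.util.List;

// Flags tenants for promotion to a dedicated index based on monthly growth.
final class TenantGrowthFlagger {

    // One month-over-month observation for a tenant (hypothetical shape)
    record TenantSnapshot(String tenantId, long previousDocs, long currentDocs) {}

    // Flags tenants whose month-over-month ratio meets growthFactor, or whose
    // document count exceeds the estimator's 500,000-document threshold
    static List<String> flagForPromotion(List<TenantSnapshot> snapshots,
            double growthFactor) {
        List<String> flagged = new ArrayList<>();
        for (TenantSnapshot s : snapshots) {
            boolean grewFast = s.previousDocs() > 0
                && (double) s.currentDocs() / s.previousDocs() >= growthFactor;
            boolean exceedsThreshold = s.currentDocs() > 500_000;
            if (grewFast || exceedsThreshold) {
                flagged.add(s.tenantId());
            }
        }
        return flagged;
    }

    public static void main(String[] args) {
        List<TenantSnapshot> month = List.of(
            new TenantSnapshot("tenant-a", 40_000, 44_000),    // +10%, normal
            new TenantSnapshot("tenant-b", 30_000, 61_000),    // more than doubled
            new TenantSnapshot("tenant-c", 480_000, 510_000)); // above 500,000
        System.out.println(flagForPromotion(month, 2.0)); // [tenant-b, tenant-c]
    }
}
```

Flagging is deliberately separate from promotion here: the list is a prompt for review, since a one-off bulk import and sustained growth call for different responses.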

Never rely on reactive scaling for search infrastructure. A search cluster under resource pressure degrades gracefully (slow responses) until it fails suddenly (circuit breakers, OOM, shard allocation failures). The graceful degradation phase feels like “it’s fine, just a bit slow,” masking the approaching cliff.