gRPC, HTTP/2 Multiplexing, and Connection Reuse

The Black Box

The route optimizer creates a new HTTP connection for each request to the package service. At 50 requests per route and 100 routes computed per minute, that is 5,000 TCP connections per minute. Each connection pays for a TCP handshake (1 round trip), a TLS handshake (1 more round trip for TLS 1.3, 2 for TLS 1.2), and HTTP negotiation before any request is sent. The route optimizer spends more time establishing connections than using them.
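To put a number on that setup cost, the sketch below counts connections and handshake time per minute. The class and method names are hypothetical, and the round-trip time and the number of handshake round trips are assumed inputs; substitute measured values.

```java
// Sketch: handshake overhead when every request opens a fresh connection.
final class HandshakeCost {
    // Connections opened per minute when nothing is reused.
    static long connectionsPerMinute(int requestsPerRoute, int routesPerMinute) {
        return (long) requestsPerRoute * routesPerMinute;
    }

    // Milliseconds per minute spent purely on connection setup.
    static double handshakeMillisPerMinute(long connections,
                                           int handshakeRoundTrips,
                                           double rttMillis) {
        return connections * handshakeRoundTrips * rttMillis;
    }

    public static void main(String[] args) {
        long connections = connectionsPerMinute(50, 100);
        // Assumed inputs: 2 setup round trips per connection (TCP plus TLS;
        // adjust to the TLS version in use) and a 0.5 ms round-trip time.
        double setupMillis = handshakeMillisPerMinute(connections, 2, 0.5);
        System.out.println(connections + " connections/min, "
                + setupMillis + " ms/min spent on handshakes");
    }
}
```

With these assumed inputs, setup alone costs several seconds of round-trip time every minute before a single request byte is sent.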

The Mechanism

HTTP/1.1 Connection Model

HTTP/1.1 allows connection reuse via keep-alive, but in practice each connection handles one request at a time (pipelining exists but is rarely enabled). To send 50 concurrent requests, 50 connections are needed. Each connection consumes:

A file descriptor on both client and server
A TCP send/receive buffer (typically 128KB per direction)
A TLS session (approximately 10KB of memory for session state)

Fifty connections therefore cost 50 file descriptors and roughly 14MB of buffer memory per side.

HTTP/2 Multiplexing

HTTP/2 uses a single TCP connection with multiple streams. Each stream carries one request/response pair. Streams are interleaved on the connection: bytes from stream 3 can be sent between bytes from stream 7. The connection is full-duplex.
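The interleaving described above can be illustrated with a toy model: frames tagged with a stream ID share one ordered "wire", and the receiver reassembles each stream by ID. This is a simplified sketch, not the HTTP/2 framing layer; the `Frame` type, the round-robin scheduling, and all names are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model of multiplexing: many streams, one ordered connection.
final class StreamInterleaving {
    // A frame carries a stream ID and one chunk of that stream's payload.
    static final class Frame {
        final int streamId;
        final String chunk;
        Frame(int streamId, String chunk) {
            this.streamId = streamId;
            this.chunk = chunk;
        }
    }

    // Round-robin the chunks of each stream onto a single wire.
    static List<Frame> interleave(Map<Integer, List<String>> chunksByStream) {
        List<Frame> wire = new ArrayList<>();
        boolean wroteAny = true;
        for (int i = 0; wroteAny; i++) {
            wroteAny = false;
            for (Map.Entry<Integer, List<String>> e : chunksByStream.entrySet()) {
                if (i < e.getValue().size()) {
                    wire.add(new Frame(e.getKey(), e.getValue().get(i)));
                    wroteAny = true;
                }
            }
        }
        return wire;
    }

    // The receiver rebuilds each stream from its own frames, in arrival order.
    static Map<Integer, String> reassemble(List<Frame> wire) {
        Map<Integer, StringBuilder> buffers = new LinkedHashMap<>();
        for (Frame f : wire) {
            buffers.computeIfAbsent(f.streamId, id -> new StringBuilder()).append(f.chunk);
        }
        Map<Integer, String> streams = new LinkedHashMap<>();
        buffers.forEach((id, sb) -> streams.put(id, sb.toString()));
        return streams;
    }
}
```

Because every frame names its stream, a slow response on one stream does not force the others to wait at the HTTP layer; they simply keep contributing frames to the same connection.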

HTTP/1.1 (50 requests, keep-alive, serial):
Conn 1: [req1 →] [← resp1] [req2 →] [← resp2] ... [req50 →] [← resp50]
Total: 50 round trips × 0.5ms (assumed round-trip time) = 25ms minimum

HTTP/2 (50 requests, multiplexed):
Conn 1: [req1,req2,...,req50 →→→] [←←← resp1,resp2,...,resp50]
Total: ~1 round trip × 0.5ms = 0.5ms minimum (plus server processing)
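The two totals above follow from a simple latency model, sketched here. It deliberately ignores bandwidth, server processing, and handshakes. The names are hypothetical, and the stream limit of 100 used in the usage note is an assumed value for the server's HTTP/2 `SETTINGS_MAX_CONCURRENT_STREAMS`, not a figure from this document.

```java
// Sketch: minimum network wait for a batch of requests on one connection.
final class RoundTripModel {
    // Serial keep-alive: each request waits for the previous response.
    static double serialMillis(int requests, double rttMillis) {
        return requests * rttMillis;
    }

    // Multiplexed: requests travel together, in "waves" bounded by the
    // number of streams the server allows to be open at once.
    static double multiplexedMillis(int requests, int maxConcurrentStreams, double rttMillis) {
        int waves = (requests + maxConcurrentStreams - 1) / maxConcurrentStreams; // ceiling division
        return waves * rttMillis;
    }

    public static void main(String[] args) {
        System.out.println("serial:      " + serialMillis(50, 0.5) + " ms");
        System.out.println("multiplexed: " + multiplexedMillis(50, 100, 0.5) + " ms");
    }
}
```

Under this model, 50 requests at a 0.5 ms round trip cost 25 ms serially and 0.5 ms multiplexed. A batch larger than the stream limit needs extra waves, so `multiplexedMillis(250, 100, 0.5)` comes to 1.5 ms.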

gRPC uses HTTP/2 natively. Every gRPC call is an HTTP/2 stream. A single ManagedChannel in the gRPC Java client maintains one or more HTTP/2 connections and multiplexes all RPCs across them.

gRPC Channel Lifecycle

// Concept: gRPC channel configuration for the route optimizer
// One channel per target service. Reused across all RPC calls.
// Do NOT create a new channel per request.

import io.grpc.ManagedChannel;
import io.grpc.ManagedChannelBuilder;
import java.util.concurrent.TimeUnit;

// BLACK BOX: creating a channel per request
PackageServiceGrpc.PackageServiceBlockingStub getPackage(String host) {
    ManagedChannel channel = ManagedChannelBuilder.forAddress(host, 9090).build();
    return PackageServiceGrpc.newBlockingStub(channel);
    // A new channel is created for every call and never shut down,
    // so its connection and threads leak.
    // Every call also pays for a fresh TCP handshake and HTTP/2 setup. Terrible.
}

// MECHANISM: shared channel, reused across calls
private final ManagedChannel channel = ManagedChannelBuilder
    .forAddress("package-service", 9090)
    .usePlaintext()
    .keepAliveTime(30, TimeUnit.SECONDS)     // Send keepalive pings
    .keepAliveTimeout(5, TimeUnit.SECONDS)    // Close if no ping response
    .maxInboundMessageSize(4 * 1024 * 1024)  // 4MB max message
    .build();

private final PackageServiceGrpc.PackageServiceBlockingStub stub =
    PackageServiceGrpc.newBlockingStub(channel);

// The channel manages the HTTP/2 connection(s).
// All RPC calls through 'stub' are multiplexed on the same connection.
// The connection is established lazily on first use and reused for the lifetime of the channel.
// Call channel.shutdown() once, when the application stops.
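When the optimizer talks to more than one service, the "one channel per target" rule can be enforced with a small registry. The sketch below is generic over the channel type so it runs on the JDK alone; `ChannelRegistry` and its methods are hypothetical names, not part of gRPC.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Sketch: hands out one shared channel-like object per target, created lazily.
final class ChannelRegistry<C> {
    private final Map<String, C> channels = new ConcurrentHashMap<>();
    private final Function<String, C> factory;

    ChannelRegistry(Function<String, C> factory) {
        this.factory = factory;
    }

    // Returns the existing channel for this target, or creates it exactly once,
    // even when several threads ask at the same time.
    C forTarget(String target) {
        return channels.computeIfAbsent(target, factory);
    }

    // Number of distinct targets seen so far.
    int size() {
        return channels.size();
    }
}
```

In a gRPC Java application, `C` would be `ManagedChannel` and the factory could be something like `target -> ManagedChannelBuilder.forTarget(target).build()`, with every stored channel shut down when the application stops.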

Deadline Propagation

gRPC deadlines prevent requests from waiting indefinitely. A deadline propagates from the client through intermediate services as a single shared budget: if the route optimizer sets a 500ms deadline, the package service and anything it calls downstream have 500ms in total, not 500ms per hop.

// Concept: gRPC deadline to prevent unbounded waiting
PackageInfo result = stub
    .withDeadlineAfter(200, TimeUnit.MILLISECONDS)  // 200ms deadline
    .getPackage(request);

// If the package service does not respond within 200ms:
// - Client receives StatusRuntimeException with Status.DEADLINE_EXCEEDED
// - Server is notified that the client has cancelled (server can stop processing)
// - The calling thread is released instead of waiting for a response that will be discarded
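The "total, not per hop" behaviour falls out of treating the deadline as a fixed point in time rather than a fresh timeout at each service. The sketch below models that idea with the JDK only. It is not gRPC's implementation, and `DeadlineBudget` and its methods are hypothetical names.

```java
import java.time.Duration;

// Sketch: a deadline is an absolute cutoff, so each hop sees only what is left.
final class DeadlineBudget {
    private final long deadlineNanos;

    private DeadlineBudget(long deadlineNanos) {
        this.deadlineNanos = deadlineNanos;
    }

    // The original caller fixes the cutoff once.
    static DeadlineBudget startingAt(long nowNanos, Duration timeout) {
        return new DeadlineBudget(nowNanos + timeout.toNanos());
    }

    // Any later hop asks how much of the budget remains (never negative).
    Duration remaining(long nowNanos) {
        return Duration.ofNanos(Math.max(0L, deadlineNanos - nowNanos));
    }

    boolean isExpired(long nowNanos) {
        return deadlineNanos - nowNanos <= 0L;
    }
}
```

If the optimizer starts a 500 ms budget and the package service begins a downstream call 120 ms later, `remaining` reports 380 ms for that call. In gRPC the remaining time travels with the request as a timeout, and in gRPC Java outgoing calls made within the server's call context inherit the deadline.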

The Observable Consequence

Connection costs for the route optimizer’s 50-request batch:

Metric	HTTP/1.1 (50 connections)	HTTP/1.1 (keep-alive, serial)	gRPC (1 connection)
TCP handshakes	50	1	1
Concurrent in-flight	50	1	50
Total round trips	50	50	1 (batch RPC)
Wall clock time	0.5ms + handshakes + processing	25ms + processing	0.5ms + processing
Memory (buffers)	14 MB	280 KB	280 KB
File descriptors	50	1	1

Relative to serial keep-alive, gRPC with a batch RPC cuts the network wait per batch from 25ms to 0.5ms. Relative to 50 parallel connections, it cuts buffer memory from 14MB to 280KB. For a route optimizer computing 100 routes/minute, that is 2,450ms of network latency saved per minute (100 × 24.5ms) and up to 1.4GB of connection buffer churn avoided per minute (100 × 14MB).

The Decision Rule

Use gRPC for internal service-to-service communication when you control both the client and server, request volume is high (hundreds+ RPCs/second), and you benefit from compile-time type safety via Protobuf schemas.

Do not use gRPC directly for browser-facing APIs: browser JavaScript APIs do not expose HTTP/2 trailers, which gRPC requires, so browser clients need gRPC-Web and a translating proxy. Do not use gRPC for services that are called fewer than 10 times per minute, because the tooling overhead (Protobuf codegen, gRPC stubs, channel management) is not justified for low-frequency calls. There, a plain HTTP/1.1 JSON endpoint is simpler to implement, debug, and monitor.

When adopting gRPC, create one ManagedChannel per target service and share it across all callers. Configure keepalive to detect dead connections. Set deadlines on every call to prevent unbounded waits.