Streaming HTML Architecture Patterns
The Symptom
The e-commerce product detail page uses SSR with streaming, but FCP is 600ms instead of the expected 200ms. The streaming implementation renders the entire page inside a single Suspense boundary, so the slowest data source (recommendations, 460ms) blocks the entire shell.
The Cause
A single Suspense boundary wrapping the whole page defeats streaming. The onShellReady callback fires only when all non-suspended content is rendered. If the shell itself depends on slow data, the shell is not ready until that data arrives.
// SLOW: Single Suspense boundary wraps everything
import { Suspense, use } from "react";

function ProductPage({ productId }: { productId: string }) {
  return (
    <Suspense fallback={<FullPageSkeleton />}>
      <ProductContent productId={productId} />
    </Suspense>
  );
}

// ProductContent needs ALL data before rendering anything.
// Assumes each fetcher returns the same (memoized) promise for a given productId.
function ProductContent({ productId }: { productId: string }) {
  // Start all three requests up front so they run in parallel, not as a waterfall
  const productPromise = fetchProduct(productId); // 40ms
  const reviewsPromise = fetchReviews(productId); // 280ms
  const recommendationsPromise = fetchRecommendations(productId); // 460ms
  const product = use(productPromise);
  const reviews = use(reviewsPromise);
  const recommendations = use(recommendationsPromise);
  // Nothing renders until all three resolve (460ms)
  return (
    <>
      <ProductHeader product={product} />
      <ProductReviews reviews={reviews} />
      <Recommendations items={recommendations} />
    </>
  );
}
The Baseline
Single-boundary streaming:
- Shell ready: 460ms (blocked by recommendations)
- FCP: 600ms (460ms server + 140ms network)
- LCP: 800ms
- Total server render time: 460ms
The Fix
Granular Suspense Boundaries
Split the page into independent streaming boundaries based on data source latency:
// FAST: Independent Suspense boundaries per data source
function ProductPage({ productId }: { productId: string }) {
  return (
    <Layout>
      {/* Shell: only depends on product data (40ms) */}
      <ProductShell productId={productId} />
      {/* Streams independently when reviews resolve (280ms) */}
      <Suspense fallback={<ReviewsSkeleton />}>
        <ProductReviews productId={productId} />
      </Suspense>
      {/* Streams independently when recommendations resolve (460ms) */}
      <Suspense fallback={<RecommendationsSkeleton />}>
        <Recommendations productId={productId} />
      </Suspense>
    </Layout>
  );
}

// Shell component: fast data only
function ProductShell({ productId }: { productId: string }) {
  const product = use(fetchProduct(productId)); // 40ms
  return (
    <>
      <ProductHeader product={product} />
      <ProductImages images={product.images} />
      <ProductPrice
        price={product.price}
        originalPrice={product.originalPrice}
      />
      <AddToCartButton productId={product.id} />
    </>
  );
}
The shell now depends only on fetchProduct (40ms). The onShellReady callback fires at 40ms, and HTML streaming begins immediately. Reviews and recommendations stream in when their data resolves, each replacing its skeleton placeholder.
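The examples call fetchers such as fetchProduct directly inside use(). That pattern relies on repeated calls with the same argument returning the same promise, because a component that suspends is rendered again once its data arrives. Below is a minimal sketch of one way to get that behavior. The memoizeByKey helper and the stand-in loader are hypothetical and not part of the original code, and a real server would scope the cache to a single request rather than keep it at module level.

```typescript
// Hypothetical helper: wraps a loader so the same key always yields the
// same promise, which means a re-render does not start a second request.
function memoizeByKey<T>(
  load: (key: string) => Promise<T>,
): (key: string) => Promise<T> {
  const inFlight = new Map<string, Promise<T>>();
  return (key) => {
    let promise = inFlight.get(key);
    if (promise === undefined) {
      promise = load(key);
      inFlight.set(key, promise);
    }
    return promise;
  };
}

// Stand-in loader for illustration only; it counts how often it runs.
let loadCount = 0;
const fetchProduct = memoizeByKey(async (productId: string) => {
  loadCount += 1;
  return { id: productId, name: `Product ${productId}` };
});

// Two calls with the same id share one promise and one underlying load.
const first = fetchProduct("42");
const second = fetchProduct("42");
console.log(first === second, loadCount);
```

A cache like this also lets a parent start a slow request early while a child component reads the same promise later.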
Nested Suspense for Progressive Disclosure
Some sections have sub-components with different data latencies. The reviews section has a summary (fast) and individual reviews (slower):
function ProductReviews({ productId }: { productId: string }) {
  // Start the slower request now so it overlaps with the summary fetch.
  // Assumes fetchReviews is memoized, so ReviewList reads this same promise;
  // the no-op catch only avoids an unhandled rejection before ReviewList reads it.
  fetchReviews(productId).catch(() => {});
  const summary = use(fetchReviewSummary(productId)); // 80ms
  return (
    <section>
      {/* Summary renders when this boundary streams */}
      <ReviewSummary
        averageRating={summary.averageRating}
        totalReviews={summary.totalReviews}
        distribution={summary.distribution}
      />
      {/* Individual reviews stream later */}
      <Suspense fallback={<ReviewListSkeleton count={5} />}>
        <ReviewList productId={productId} />
      </Suspense>
    </section>
  );
}

function ReviewList({ productId }: { productId: string }) {
  const reviews = use(fetchReviews(productId)); // 280ms
  return (
    <ul>
      {reviews.map((review) => (
        <ReviewCard key={review.id} review={review} />
      ))}
    </ul>
  );
}
The review summary (average rating, total count) streams at 80ms. Individual reviews stream at 280ms. The user sees meaningful content (the rating breakdown) 200ms before the full review list appears.
Error Boundaries in Streaming Context
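If a streamed section fails (the recommendation service is down), the error must not break the already-rendered page. During server rendering React does not run Error Boundaries: it streams the nearest Suspense fallback for the failed section, reports the error through onError, and retries the section on the client. An Error Boundary wrapping each Suspense boundary catches the failure if that client retry also throws: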
import { Component, Suspense, type ErrorInfo, type ReactNode } from "react";

interface ErrorBoundaryProps {
  fallback: ReactNode;
  children: ReactNode;
}

interface ErrorBoundaryState {
  hasError: boolean;
}

class StreamErrorBoundary extends Component<
  ErrorBoundaryProps,
  ErrorBoundaryState
> {
  state: ErrorBoundaryState = { hasError: false };

  static getDerivedStateFromError(): ErrorBoundaryState {
    return { hasError: true };
  }

  componentDidCatch(error: Error, info: ErrorInfo): void {
    console.error("Stream section failed:", error, info);
  }

  render(): ReactNode {
    if (this.state.hasError) {
      return this.props.fallback;
    }
    return this.props.children;
  }
}

// Usage: wrapping each streamed section
function ProductPage({ productId }: { productId: string }) {
  return (
    <Layout>
      <ProductShell productId={productId} />
      <StreamErrorBoundary fallback={<ReviewsUnavailable />}>
        <Suspense fallback={<ReviewsSkeleton />}>
          <ProductReviews productId={productId} />
        </Suspense>
      </StreamErrorBoundary>
      <StreamErrorBoundary fallback={<RecommendationsUnavailable />}>
        <Suspense fallback={<RecommendationsSkeleton />}>
          <Recommendations productId={productId} />
        </Suspense>
      </StreamErrorBoundary>
    </Layout>
  );
}
If recommendations fail, the user sees a “Recommendations unavailable” message instead of a broken page. The product header, images, price, and reviews remain intact.
Server Timing Headers
Measure streaming performance by exposing server-side timing data:
import { renderToPipeableStream } from 'react-dom/server';
import type { Request, Response } from 'express';

function handleProductRequest(req: Request, res: Response): void {
  const startTime = performance.now();
  const { pipe } = renderToPipeableStream(
    <ProductPage productId={req.params.id} />,
    {
      bootstrapScripts: ['/static/client.js'],
      onShellReady() {
        // Headers must be set before the first byte is piped
        const shellTime = performance.now() - startTime;
        res.statusCode = 200;
        res.setHeader('Content-Type', 'text/html; charset=utf-8');
        res.setHeader(
          'Server-Timing',
          `shell;dur=${shellTime.toFixed(1)};desc="Shell render"`
        );
        pipe(res);
      },
      onShellError(error: unknown) {
        // The shell itself failed, so nothing has been sent yet
        console.error('Shell render failed:', error);
        res.statusCode = 500;
        res.setHeader('Content-Type', 'text/html; charset=utf-8');
        res.send('<h1>Something went wrong</h1>');
      },
      onAllReady() {
        const totalTime = performance.now() - startTime;
        // Log total render time for monitoring
        console.log(`Full render: ${totalTime.toFixed(1)}ms`);
      },
      onError(error: unknown) {
        // Called for every render error, including ones inside streamed sections
        console.error('Render error:', error);
      },
    }
  );
}
The Server-Timing header appears in the Timing tab of the Chrome DevTools Network panel, making shell render time visible during development. The CI pipeline can parse this header to assert that shell render time stays below a threshold.
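A CI check of that kind could look roughly like the sketch below. The parseServerTiming and assertShellBudget functions and the 100ms budget are illustrative assumptions, not part of the original setup. The parser is deliberately simplified and assumes descriptions contain no commas or semicolons.

```typescript
interface ServerTimingEntry {
  name: string;
  durationMs?: number;
  description?: string;
}

// Parses a Server-Timing header value such as
//   shell;dur=41.3;desc="Shell render"
// Simplification: descriptions must not contain commas or semicolons.
function parseServerTiming(header: string): ServerTimingEntry[] {
  return header
    .split(",")
    .map((metric) => metric.trim())
    .filter((metric) => metric.length > 0)
    .map((metric) => {
      const [name = "", ...params] = metric
        .split(";")
        .map((part) => part.trim());
      const entry: ServerTimingEntry = { name };
      for (const param of params) {
        const separator = param.indexOf("=");
        if (separator === -1) continue;
        const key = param.slice(0, separator).trim().toLowerCase();
        const value = param
          .slice(separator + 1)
          .trim()
          .replace(/^"|"$/g, "");
        if (key === "dur") entry.durationMs = Number(value);
        if (key === "desc") entry.description = value;
      }
      return entry;
    });
}

// Hypothetical CI assertion: throws when the "shell" metric is missing,
// unparseable, or slower than the budget.
function assertShellBudget(header: string, budgetMs: number): void {
  const duration = parseServerTiming(header).find(
    (entry) => entry.name === "shell",
  )?.durationMs;
  if (duration === undefined || Number.isNaN(duration)) {
    throw new Error("Server-Timing header has no usable shell duration");
  }
  if (duration > budgetMs) {
    throw new Error(`Shell took ${duration}ms, budget is ${budgetMs}ms`);
  }
}

// Passes silently: 41.3ms is within the assumed 100ms budget.
assertShellBudget('shell;dur=41.3;desc="Shell render"', 100);
```

In a pipeline, the header value would come from a request against a test deployment; how that request is made depends on the CI setup.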
The Proof
| Metric | Single Boundary | Granular Boundaries | Delta |
|---|---|---|---|
| Shell ready (server) | 460ms | 40ms | -420ms |
| FCP | 600ms | 180ms | -420ms |
| LCP | 800ms | 380ms | -420ms |
| Reviews visible | 600ms | 420ms | -180ms |
| Recommendations visible | 600ms | 600ms | 0ms |
| TTI | 1,200ms | 1,100ms | -100ms |
The recommendations appear at the same time in both approaches (the data takes 460ms regardless). The difference is that everything else renders earlier: the shell by 420ms and the reviews by 180ms. The user sees the product header, images, and price at 180ms instead of 600ms.
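The visibility numbers follow from a simple model: a section becomes visible once its data has resolved and the HTML has crossed the network. Here is a rough sketch of that arithmetic, assuming all requests start in parallel at time zero and using the 140ms network time from the baseline. The type and function names are illustrative, not from the original code.

```typescript
interface Section {
  name: string;
  dataLatencyMs: number;
}

// Network time taken from the baseline: 460ms server + 140ms network = 600ms.
const NETWORK_MS = 140;

// One boundary: nothing is flushed until the slowest section has its data,
// so every section becomes visible at the same moment.
function singleBoundary(sections: Section[]): Map<string, number> {
  const slowest = Math.max(...sections.map((s) => s.dataLatencyMs));
  return new Map(
    sections.map((s): [string, number] => [s.name, slowest + NETWORK_MS]),
  );
}

// Independent boundaries: each section is flushed when its own data resolves.
function granularBoundaries(sections: Section[]): Map<string, number> {
  return new Map(
    sections.map((s): [string, number] => [
      s.name,
      s.dataLatencyMs + NETWORK_MS,
    ]),
  );
}

const sections: Section[] = [
  { name: "shell", dataLatencyMs: 40 },
  { name: "reviews", dataLatencyMs: 280 },
  { name: "recommendations", dataLatencyMs: 460 },
];

console.log(singleBoundary(sections));
console.log(granularBoundaries(sections));
```

The model ignores render cost and client-side work, so it accounts for the shell, reviews, and recommendations rows but not for LCP or TTI.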
The Trade-off
More Suspense boundaries mean more skeleton states for the user to see. If seven sections each stream at different times, the page “assembles” over 500ms with content popping in progressively. This can feel chaotic. The guideline: group data sources with similar latencies into a single Suspense boundary. The product page uses three groups: fast (product data, 40ms), medium (reviews, 80-280ms), and slow (recommendations, 460ms).
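The grouping guideline can be expressed as a small bucketing step. In this sketch the tier cut-offs (50ms and 300ms) are assumptions chosen to reproduce the three groups named above, not values from the original measurements, and the source names are illustrative.

```typescript
type Tier = "fast" | "medium" | "slow";

interface DataSource {
  name: string;
  latencyMs: number;
}

// Assumed cut-offs: up to 50ms is "fast", up to 300ms is "medium".
function tierFor(latencyMs: number): Tier {
  if (latencyMs <= 50) return "fast";
  if (latencyMs <= 300) return "medium";
  return "slow";
}

// Collects source names per tier; each tier maps to one Suspense boundary
// (the fast tier belongs in the shell, outside any boundary).
function groupByTier(sources: DataSource[]): Record<Tier, string[]> {
  const groups: Record<Tier, string[]> = { fast: [], medium: [], slow: [] };
  for (const source of sources) {
    groups[tierFor(source.latencyMs)].push(source.name);
  }
  return groups;
}

const groups = groupByTier([
  { name: "product", latencyMs: 40 },
  { name: "reviewSummary", latencyMs: 80 },
  { name: "reviews", latencyMs: 280 },
  { name: "recommendations", latencyMs: 460 },
]);

console.log(groups);
```

Measured p75 or p95 latencies are a more realistic input than single numbers, since a source that is usually fast but occasionally slow would otherwise hold back its whole group.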
Skeleton components must match the dimensions of the rendered content. If the review skeleton is 200px tall and the rendered reviews are 340px tall, everything below the section shifts down by 140px. Depending on the viewport size and how much visible content moves, that single shift can score a CLS of about 0.12, above the 0.1 threshold for a good score. Every skeleton in the streaming architecture must have an explicit height or aspect ratio matching the expected content size.
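A layout shift score is the product of the impact fraction (how much of the viewport the shifted content covers, before and after) and the distance fraction (how far it moved relative to the viewport's larger dimension). The sketch below works through one scenario with a 140px shift. The 360x800 viewport and the 400px of visible content below the reviews are assumed values for illustration, and the model is simplified to full-width content that stays inside the viewport.

```typescript
interface ShiftScenario {
  viewportWidthPx: number;
  viewportHeightPx: number;
  // Visible height of the content that gets pushed down
  shiftedRegionHeightPx: number;
  // How far that content moves
  shiftDistancePx: number;
}

// Simplified score: assumes the shifted content spans the full viewport
// width, so the impact fraction reduces to a ratio of heights.
function layoutShiftScore(scenario: ShiftScenario): number {
  const unionHeightPx = Math.min(
    scenario.shiftedRegionHeightPx + scenario.shiftDistancePx,
    scenario.viewportHeightPx,
  );
  const impactFraction = unionHeightPx / scenario.viewportHeightPx;
  const largestDimensionPx = Math.max(
    scenario.viewportWidthPx,
    scenario.viewportHeightPx,
  );
  const distanceFraction = scenario.shiftDistancePx / largestDimensionPx;
  return impactFraction * distanceFraction;
}

// Assumed scenario: the skeleton is 140px shorter than the real reviews.
const score = layoutShiftScore({
  viewportWidthPx: 360,
  viewportHeightPx: 800,
  shiftedRegionHeightPx: 400,
  shiftDistancePx: 140,
});

console.log(score.toFixed(2), score > 0.1);
```

With these assumptions the impact fraction is 540/800 and the distance fraction is 140/800, which gives a score of roughly 0.12. A taller viewport or less content below the reviews would lower it.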