Skip to main content

On This Page

Choosing the Right Database: The 5-Question Architectural Test

2 min read
Share

These articles are AI-generated summaries. Please check the original sources for full details.

Time-Series, Document, or Relational? The 5-Question Test Every New Project Needs

Gabriel Anhaia outlines a five-question framework to prevent database migration failures that cost developer-weeks. One team saw telemetry latency drop from 14 seconds to under 200ms by switching to ClickHouse, while others faced incident-heavy rollbacks after moving mutable data to document stores.

Why This Matters

Database selection often fails when teams ignore the fundamental design assumptions of their storage engines, such as using columnar stores for point lookups. Misalignment leads to severe operational overhead, where a self-hosted ClickHouse cluster may require up to 1.0 full-time engineer (FTE) compared to just 0.1 FTE for managed Postgres solutions.

Key Insights

  • ClickHouse ingestion reaches 2-3M points/sec compared to TimescaleDB’s 100k-500k range (sanj.dev, 2026).
  • Tag-indexed engines like InfluxDB experience performance degradation when active series cardinality exceeds millions.
  • Postgres effectively handles mixed mutable workloads until hot data crosses the low-TB range, where specialized partitioning is required.
  • Operational costs vary significantly; self-hosted Cassandra can require 2.0 FTEs for proper backup and monitoring.
  • Single-row lookup latency is higher in ClickHouse (15ms) than TimescaleDB (10ms), impacting high-concurrency operational endpoints.

Working Examples

Decision script for database selection based on workload attributes.

from dataclasses import dataclass
from typing import Literal
@dataclass
class Workload:
    write_shape: Literal["append_only", "mutable", "mixed"]
    read_pattern: Literal["range", "key", "fulltext", "mixed"]
    consistency: Literal["acid", "eventual"]
    series_cardinality: int
    ops_headcount: float
def recommend(w: Workload) -> str:
    if w.consistency == "acid":
        return "Postgres + TimescaleDB extension" if w.write_shape == "append_only" else "Postgres (managed)"
    if w.write_shape == "append_only":
        if w.series_cardinality > 10000000:
            return "ClickHouse" if w.ops_headcount >= 1 else "BigQuery"
        return "TimescaleDB on managed Postgres"
    return "Postgres"

Practical Applications

  • IoT Telemetry: Use ClickHouse for 50k devices to utilize 10-30x compression. Pitfall: High-cardinality tags in tag-indexed engines leading to hardware saturation.
  • E-commerce: Use Postgres for ACID-compliant orders and returns. Pitfall: Migrating to MongoDB for flexibility and losing cross-entity reporting efficiency.
  • SaaS Analytics: Use managed BigQuery or Snowflake for teams with less than 0.5 FTE budget. Pitfall: Adopting complex self-hosted clusters that consume the entire engineering team’s on-call capacity.

References:

Continue reading

Next article

Evaluating Agentic Reasoning: The 7 Benchmarks Defining Frontier LLM Performance

Related Content