ClickHouse Native JSON: 2,500x Faster Than MongoDB in 2026
These articles are AI-generated summaries. Please check the original sources for full details.
ClickHouse Native JSON Support in 2026: A PR-by-PR Analysis
ClickHouse reached full native JSON support in version 25.3, replacing legacy string-based parsing with a ground-up columnar implementation. The system now delivers 2,500x faster aggregations than MongoDB on the JSONBench 1-billion-document benchmark. This evolution is the result of over 80 merged pull requests spanning three years of development.
Why This Matters
Earlier criticism that ClickHouse has ‘no native support’ for JSON no longer holds: queries no longer have to scan and parse opaque string blobs in full. Each JSON path is stored as a separate Dynamic-typed subcolumn that preserves its native type, so a query reads only the paths it references and skips text parsing at query time. This matters most for observability and event-driven systems, where aggregating high-cardinality semi-structured data previously incurred heavy CPU and memory overhead.
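As a minimal sketch of the path-to-subcolumn model described above, the statements below create a table with a JSON column and read two paths together with their stored types. The table name events, the column name payload, and the sample documents are hypothetical and chosen only for illustration; releases older than the GA version mentioned in this article may require an experimental setting before the JSON type can be used.

```sql
-- Hypothetical table: one JSON column holding semi-structured event data.
CREATE TABLE events
(
    ts DateTime,
    payload JSON
)
ENGINE = MergeTree
ORDER BY ts;

-- Two sample documents; the second one omits the user.plan path.
INSERT INTO events VALUES
    ('2026-01-01 00:00:00', '{"user": {"id": 42, "plan": "pro"}, "latency_ms": 12.5}'),
    ('2026-01-01 00:00:01', '{"user": {"id": 43}, "latency_ms": 7}');

-- Each path is addressed as a subcolumn; dynamicType() reports the
-- type stored for that path in each row.
SELECT
    payload.user.id,
    dynamicType(payload.user.id) AS id_type,
    payload.latency_ms,
    dynamicType(payload.latency_ms) AS latency_type
FROM events;
```

Only payload.user.id and payload.latency_ms are referenced here, so the query should not need to touch payload.user.plan at all.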
Key Insights
- Advanced shared data serialization (PR #83777, 2025) introduced per-granule path indexes, resulting in 58x faster selective reads and a 3,300x reduction in memory usage.
- The foundational Variant (PR #58047) and Dynamic (PR #63058) types allow ClickHouse to store mixed types natively (e.g., UInt32, Float64) without collapsing data into Strings.
- The query planner (PR #68053, 2024) and FunctionToSubcolumnsPass (PR #96711) rewrite JSONExtract calls into direct subcolumn reads, bypassing text parsing entirely.
- Native JSON reached General Availability in v25.3 (PR #77785), while the legacy Object('json') type was fully removed in v25.11 (PR #85718) to ensure architectural consistency.
- JSONBench performance data confirms ClickHouse is 10x faster than Elasticsearch and 9,000x faster than DuckDB or PostgreSQL for analytics on 1 billion Bluesky documents.
- Primary key and data-skipping index support for JSON subcolumns (PR #72644, 2024) enables granule-level pruning for semi-structured fields.
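The Variant and Dynamic types listed above can be tried in isolation. The sketch below uses hypothetical table names (variant_demo, dynamic_demo) and arbitrary sample values. The Variant member types are deliberately dissimilar, on the assumption that default settings reject a Variant that mixes closely related numeric types. Older releases may also require experimental settings to be enabled for both types.

```sql
-- Variant: a closed set of allowed types, declared up front.
CREATE TABLE variant_demo
(
    v Variant(UInt64, String, Array(UInt64))
)
ENGINE = Memory;

INSERT INTO variant_demo VALUES (NULL), (42), ('Hello'), ([1, 2, 3]);

-- variantType() returns the name of the member type held by each row.
SELECT v, variantType(v) FROM variant_demo;

-- Dynamic: accepts values of any type without declaring them in advance.
CREATE TABLE dynamic_demo
(
    d Dynamic
)
ENGINE = Memory;

INSERT INTO dynamic_demo VALUES (NULL), (42), ('Hello'), ([1, 2, 3]);

-- dynamicType() returns the type stored for each row.
SELECT d, dynamicType(d) FROM dynamic_demo;
```

In both tables the number 42 stays numeric and the array stays an array; nothing is collapsed into a String.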
Working Examples
Convert an existing String, Map, or Tuple column to the native JSON type; the conversion is applied through background merges:
ALTER TABLE logs MODIFY COLUMN attributes JSON;
Access JSON paths with dot notation; the planner turns these into columnar subcolumn reads:
SELECT attributes.user.id, count() FROM logs GROUP BY attributes.user.id;
Practical Applications
- Use Case: SigNoz and ClickStack achieve 30% to 9x faster log queries by migrating OpenTelemetry attributes from Map(String, String) to the native JSON type.
- Pitfall: Upgrading past v25.11 without first migrating legacy Object('json') columns via ALTER TABLE will result in broken schemas, because the deprecated type was fully removed in PR #85718.
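- Use Case: Systems with high-cardinality JSON keys use max_dynamic_paths to cap how many paths get their own subcolumn, with the remaining paths overflowing into the optimized shared storage, and SKIP REGEXP to exclude noisy paths from the column entirely. Together these reserve columnar storage for high-value fields.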
- Pitfall: Using SELECT * in CTEs over JSON tables currently prevents subcolumn pruning (Issue #92455); engineers should explicitly name JSON paths in CTE definitions for maximum performance.
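The two tuning points above (limiting dynamic paths, and naming JSON paths inside CTEs) can be sketched as follows. The table otel_logs, the path names, the limit of 256, and the skip pattern are all hypothetical values picked for illustration, not recommendations from the source.

```sql
-- Hypothetical log table with a tuned JSON column.
CREATE TABLE otel_logs
(
    ts DateTime,
    attributes JSON(
        max_dynamic_paths = 256,        -- at most 256 paths get their own subcolumn
        http.status_code UInt16,        -- type hint for a known high-value path
        SKIP REGEXP '^debug\\..*'       -- paths starting with "debug." are not stored
    )
)
ENGINE = MergeTree
ORDER BY ts;

-- Name the JSON paths inside the CTE instead of using SELECT *,
-- so only the referenced subcolumn needs to be read.
WITH recent AS
(
    SELECT
        ts,
        attributes.service.name AS service
    FROM otel_logs
    WHERE ts >= now() - INTERVAL 1 HOUR
)
SELECT service, count() AS events
FROM recent
GROUP BY service;
```

Paths beyond the max_dynamic_paths limit remain queryable from shared storage, whereas paths matched by SKIP REGEXP are dropped during parsing, so the pattern should cover only data that is never queried.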