Building HitKeep: A Sovereign Web Analytics Engine in a Single 12MB Go Binary

These articles are AI-generated summaries. Please check the original sources for full details.

From HealthTech to Open Source: Building a sovereign web analytics engine in a single binary

Pascale Beier built HitKeep to meet strict HealthTech privacy requirements in an environment where traditional analytics stacks were operationally prohibitive. It packs a columnar OLAP database and a message broker into a single 12MB executable that has already ingested millions of hits in production.

Why This Matters

Traditional self-hosted analytics tools such as Plausible or Umami often require operating some combination of PostgreSQL, ClickHouse, Redis, and Node.js, which creates significant maintenance overhead for simple pageview tracking. HitKeep removes this operational burden by embedding its storage engine and message broker directly in the application process, showing that production-scale analytics can run with zero external database dependencies.

Key Insights

  • Embedded DuckDB serves as the storage engine, providing columnar OLAP performance with a storage density of approximately 1 million raw hits per 120MB of disk space.
  • Internal micro-batching via an embedded NSQ broker decouples HTTP ingestion from disk I/O, resolving lock contention issues common in columnar databases.
  • High Availability is enabled through native clustering using the HashiCorp Memberlist gossip protocol for leader election and node discovery.
  • The ‘Takeout API’ prevents vendor lock-in by allowing one-click exports of raw data into Parquet, CSV, JSON, or NDJSON formats for external analysis.
  • The system is fully air-gap compatible, making zero outbound third-party requests and proxying all external assets like favicons server-side.
  • Account security is built in: WebAuthn (Passkeys/YubiKey) and TOTP are supported without an external identity provider.

Practical Applications

  • HealthTech Compliance: Implementing privacy-first tracking in environments where GDPR and patient privacy laws prohibit third-party services like Google Analytics.
  • Air-Gapped Infrastructure: Deploying analytics on isolated networks where external dependencies or outbound telemetry are strictly forbidden by security policy.
  • Low-Maintenance Monitoring: Utilizing a single-binary deployment for developers who require enterprise analytics performance without the burden of managing a database cluster.
  • Pitfall: Avoid using HitKeep for cross-device identity stitching, as the system is intentionally designed to be cookie-less and privacy-centric.
