Building HitKeep: A Sovereign Web Analytics Engine in a Single 12MB Go Binary
From HealthTech to Open Source: Building a sovereign web analytics engine in a single binary
Pascale Beier built HitKeep to meet strict HealthTech privacy requirements, in a setting where the operational cost of a traditional analytics stack was prohibitive. It packs a columnar OLAP database and a message broker into a single 12MB executable that has already ingested millions of hits in production.
Why This Matters
Traditional self-hosted analytics tools like Plausible or Umami often require running a heavy stack of PostgreSQL, ClickHouse, Redis, and Node.js, which creates significant maintenance overhead for simple pageview tracking. HitKeep removes that operational burden by embedding the storage engine and message broker directly in the application process, showing that high-volume analytics processing is possible with zero external database dependencies.
Key Insights
- Embedded DuckDB serves as the storage engine, providing columnar OLAP performance with a storage density of approximately 1 million raw hits per 120MB of disk space.
- Internal micro-batching via an embedded NSQ broker decouples HTTP ingestion from disk I/O, resolving lock contention issues common in columnar databases.
- High availability comes from native clustering built on HashiCorp's Memberlist gossip library, which is used for leader election and node discovery.
- The ‘Takeout API’ prevents vendor lock-in by allowing one-click exports of raw data into Parquet, CSV, JSON, or NDJSON formats for external analysis.
- The system is fully air-gap compatible, making zero outbound third-party requests and proxying all external assets like favicons server-side.
- Security is handled natively with support for WebAuthn (Passkeys/YubiKey) and TOTP for account protection without external identity providers.
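The micro-batching idea from the list above can be sketched in a few lines of Go. This is a minimal illustration, not HitKeep's code: it swaps the embedded NSQ broker for a plain buffered channel, and the `Hit` type, `batcher`, `collectBatchSizes`, and the `flush` callback (standing in for a bulk write to the columnar store) are all hypothetical names. The point is only that HTTP handlers enqueue and return, while a single consumer writes in bulk.

```go
package main

import (
	"fmt"
	"time"
)

// Hit is a hypothetical pageview record; the real HitKeep schema is not given in the source.
type Hit struct {
	Path      string
	Timestamp time.Time
}

// batcher drains hits from in and calls flush when maxSize hits have
// accumulated or on every maxWait tick, whichever comes first.
// It flushes any remainder and returns once in is closed.
func batcher(in <-chan Hit, maxSize int, maxWait time.Duration, flush func([]Hit)) {
	batch := make([]Hit, 0, maxSize)
	ticker := time.NewTicker(maxWait)
	defer ticker.Stop()

	emit := func() {
		if len(batch) == 0 {
			return
		}
		flush(batch)
		batch = make([]Hit, 0, maxSize) // fresh slice so flush may keep the old one
	}

	for {
		select {
		case h, ok := <-in:
			if !ok {
				emit()
				return
			}
			batch = append(batch, h)
			if len(batch) >= maxSize {
				emit()
			}
		case <-ticker.C:
			emit()
		}
	}
}

// collectBatchSizes pushes n hits through a batcher and records the size of
// every flushed batch. The long maxWait keeps the result size-driven only.
func collectBatchSizes(n, maxSize int) []int {
	in := make(chan Hit, 1024) // stands in for the broker queue (assumption)
	done := make(chan struct{})
	var sizes []int

	go func() {
		defer close(done)
		batcher(in, maxSize, time.Hour, func(b []Hit) {
			// A real flush would be one bulk insert into the database.
			sizes = append(sizes, len(b))
		})
	}()

	for i := 0; i < n; i++ {
		in <- Hit{Path: "/", Timestamp: time.Now()}
	}
	close(in)
	<-done
	return sizes
}

func main() {
	fmt.Println(collectBatchSizes(250, 100)) // prints [100 100 50]
}
```

Because only the batcher goroutine touches the store, ingestion requests never wait on a database write lock; the trade-off is that hits still in the queue are lost on a crash unless the queue itself is durable.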
Practical Applications
- HealthTech Compliance: Implementing privacy-first tracking in environments where GDPR and patient privacy laws prohibit third-party services like Google Analytics.
- Air-Gapped Infrastructure: Deploying analytics on isolated networks where external dependencies or outbound telemetry are strictly forbidden by security policy.
- Low-Maintenance Monitoring: Utilizing a single-binary deployment for developers who require enterprise analytics performance without the burden of managing a database cluster.
- Pitfall: Avoid using HitKeep for cross-device identity stitching, as the system is intentionally designed to be cookie-less and privacy-centric.
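For sizing any of these deployments, the density figure quoted under Key Insights (about 1 million raw hits per 120MB) gives a rough disk budget. The sketch below simply scales that figure linearly, which is an assumption: real columnar compression varies with the data. It also assumes decimal megabytes, and the traffic numbers and function name are hypothetical.

```go
package main

import "fmt"

// bytesPerHit follows from the quoted density: 120 MB / 1,000,000 hits,
// taking 1 MB = 1,000,000 bytes (assumption).
const bytesPerHit = 120

// estimateDiskBytes scales the quoted density linearly to a hit count.
func estimateDiskBytes(hits int64) int64 {
	return hits * bytesPerHit
}

func main() {
	// Hypothetical site: 50,000 hits per day, retained for one year.
	hits := int64(50_000) * 365
	gb := float64(estimateDiskBytes(hits)) / 1e9
	fmt.Printf("%d hits: about %.2f GB\n", hits, gb) // prints 18250000 hits: about 2.19 GB
}
```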