Real-Time Analytics Dashboard

The Problem

An adtech company was tracking ad impressions, clicks, and conversions across 200M daily events. Their Postgres-based analytics dashboard had buckled — queries were timing out at 30+ seconds, the DB was maxing CPU, and the team was making decisions on day-old data.

They needed real-time analytics with sub-second query performance. Without breaking their $8K/month infrastructure budget.

Architecture

Ingestion — Kafka

All events flow through a Kafka cluster (3 brokers). Event producers (ad servers) write to Kafka topics; consumers write to ClickHouse in micro-batches every 5 seconds.

Storage — ClickHouse

ClickHouse is the right tool for this use case: columnar storage, vectorized execution, and purpose-built aggregation functions for time-series analytics. A query that took 32 seconds on Postgres runs in 180ms on ClickHouse — same data.

The schema uses a partitioning strategy by day with materialized views for the most common aggregations (hourly rollups, campaign summaries) pre-computed on write.

Caching — Redis

Frequently-accessed dashboard queries (last 7 days by campaign) are cached in Redis with 60-second TTL. This takes the query load off ClickHouse for 80% of dashboard page loads.

Frontend — Next.js + Recharts

Server-side rendering for the dashboard shell, client-side polling for live metrics (every 10s). Recharts for visualization with custom components matching their brand.

The Budget Constraint

ClickHouse Cloud would have been $15K/month at this scale. Instead:

Self-hosted ClickHouse on 3× AWS r6g.2xlarge instances: ~$1,400/month
Kafka on MSK (managed): ~$800/month
Redis on Elasticache: ~$200/month
Total infra: ~$2,400/month — 70% under budget

Results

P95 query latency: 32 seconds → 180ms
Data freshness: T-24h → T-5s (near real-time)
Uptime: 99.97% over the first 6 months
Infrastructure cost: $2,400/month (was $3,200/month on the old setup — actually cheaper)

What I’d Do Differently

Kafka was slightly over-engineered for 200M events/day. A simpler Kinesis + Lambda pipeline would have been easier to operate with similar performance. I chose Kafka because the team had Kafka experience — operator familiarity matters more than technical elegance.