Neural API Gateway

A high-throughput API gateway with intelligent rate limiting, semantic request routing, and real-time anomaly detection, built on a Rust core with Python ML inference workers.

Rust, Python, Redis, Kubernetes, PyTorch

Overview

The Neural API Gateway handles more than 2 million requests per day for a SaaS platform serving enterprise clients. It pairs a low-overhead Rust core with Python ML workers, so traffic management can adapt to observed behavior instead of relying only on static rules.

Architecture

Rust Core

The hot path — request parsing, header manipulation, connection pooling — runs entirely in Rust using the Tokio async runtime. This gives us sub-millisecond overhead per request.

ML Inference Layer

A fleet of Python workers (FastAPI + PyTorch) handles the AI features:

Anomaly detection — flags unusual traffic patterns in real time
Semantic routing — routes requests to the optimal backend based on payload semantics
Predictive rate limiting — adjusts limits based on predicted client behavior
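
To make the anomaly detection idea concrete, here is a minimal sketch that flags a traffic sample when it sits far outside the recent baseline. It deliberately swaps the PyTorch model for a rolling z-score so that it runs with the standard library alone. The class name, window size, and threshold are assumptions for illustration, not the gateway's actual implementation.

```python
from collections import deque
from statistics import mean, pstdev


class RollingZScoreDetector:
    """Flags a sample that deviates strongly from recent history (illustrative only)."""

    def __init__(self, window: int = 60, threshold: float = 3.0, min_samples: int = 10):
        self.samples = deque(maxlen=window)  # most recent "normal" observations
        self.threshold = threshold           # z-score above which a sample is flagged
        self.min_samples = min_samples       # warm-up period before flagging anything

    def observe(self, value: float) -> bool:
        """Return True if `value` is anomalous relative to the current window."""
        anomalous = False
        if len(self.samples) >= self.min_samples:
            mu = mean(self.samples)
            sigma = pstdev(self.samples)
            if sigma > 0:
                anomalous = abs(value - mu) / sigma > self.threshold
            else:
                # A perfectly flat baseline: any change at all is unusual.
                anomalous = value != mu
        # Learn only from normal samples so a burst does not become the new baseline.
        if not anomalous:
            self.samples.append(value)
        return anomalous


if __name__ == "__main__":
    detector = RollingZScoreDetector(window=20, threshold=3.0, min_samples=5)
    for rps in [100, 102, 98, 101, 99, 100, 103, 97, 500, 101]:
        print(rps, detector.observe(rps))
```

A learned model can capture seasonality and per-client patterns that a z-score cannot, but the interface stays the same: one observation in, one flag out.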

Redis Cluster

All shared state (rate limit counters, circuit breaker state, session data) lives in a Redis cluster with 3 replicas for high availability.
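
As an illustration of how a rate limit counter can live in shared state, the sketch below implements a fixed-window limiter on top of an increment-with-expiry operation, which is the pattern Redis supports through INCR and EXPIRE. The store here is a hypothetical in-memory stand-in so the example runs without a Redis server. The class names, the key format, and the fixed-window algorithm itself are assumptions, not the gateway's confirmed design.

```python
import time


class InMemoryCounterStore:
    """Hypothetical stand-in for a shared store: increment a key that expires after a TTL."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._data = {}  # key -> (count, expires_at)

    def incr(self, key: str, ttl_seconds: float) -> int:
        now = self._clock()
        entry = self._data.get(key)
        if entry is None or now >= entry[1]:
            # Key is missing or expired: start a fresh window.
            count, expires_at = 0, now + ttl_seconds
        else:
            count, expires_at = entry
        count += 1
        self._data[key] = (count, expires_at)
        return count


class FixedWindowRateLimiter:
    """Allows at most `limit` requests per client in each window."""

    def __init__(self, store, limit: int, window_seconds: float):
        self.store = store
        self.limit = limit
        self.window_seconds = window_seconds

    def allow(self, client_id: str) -> bool:
        # The key format is illustrative; any per-client key works.
        count = self.store.incr(f"ratelimit:{client_id}", self.window_seconds)
        return count <= self.limit


if __name__ == "__main__":
    limiter = FixedWindowRateLimiter(InMemoryCounterStore(), limit=3, window_seconds=1.0)
    print([limiter.allow("client-a") for _ in range(5)])
```

Because every gateway instance increments the same counter, the limit holds across the whole fleet rather than per process. A predictive limiter could adjust `limit` per client while leaving the counting logic unchanged.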

Performance

Metric	Before	After
P99 latency	45ms	3.2ms
Throughput	80k req/s	340k req/s
False positive blocks	12%	0.8%
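
The improvement factors implied by the table can be worked out directly. The short calculation below uses only the figures shown above and treats latency and false positives as lower-is-better. The helper name is an assumption for illustration.

```python
# Figures copied from the table above as (before, after, higher_is_better).
metrics = {
    "P99 latency (ms)": (45.0, 3.2, False),
    "Throughput (k req/s)": (80.0, 340.0, True),
    "False positive blocks (%)": (12.0, 0.8, False),
}


def improvement_factor(before: float, after: float, higher_is_better: bool) -> float:
    """Return how many times better the 'after' value is than the 'before' value."""
    return after / before if higher_is_better else before / after


factors = {
    name: improvement_factor(before, after, higher_is_better)
    for name, (before, after, higher_is_better) in metrics.items()
}

if __name__ == "__main__":
    for name, factor in factors.items():
        print(f"{name}: {factor:.2f}x")
```

That works out to roughly a 14x reduction in P99 latency, a 4.25x increase in throughput, and a 15x reduction in false positive blocks.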

Key Challenge: Zero-Downtime Deploys

Rolling updates to a stateful gateway are tricky. We solved this with a blue-green deployment on Kubernetes, shifting traffic gradually from the old version to the new one using weighted routing rules in NGINX Ingress.
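
The gradual shift can be simulated in a few lines. The sketch below is an assumption-laden model of weighted routing, not the ingress controller's own logic: it hashes a request key into one of 100 buckets and sends the request to the new ("green") version when the bucket falls below the current weight. The function names and the 25-point step size are hypothetical.

```python
import hashlib


def pick_backend(request_key: str, green_weight: int) -> str:
    """Route roughly `green_weight` percent of keys to 'green', the rest to 'blue'."""
    if not 0 <= green_weight <= 100:
        raise ValueError("green_weight must be between 0 and 100")
    digest = hashlib.sha256(request_key.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 100  # stable bucket in [0, 100)
    return "green" if bucket < green_weight else "blue"


def rollout_schedule(step: int = 25):
    """Yield green weights from 0 to 100 in fixed steps (hypothetical schedule)."""
    if step <= 0:
        raise ValueError("step must be positive")
    weight = 0
    while weight < 100:
        yield weight
        weight = min(100, weight + step)
    yield 100


if __name__ == "__main__":
    keys = [f"client-{i}" for i in range(10_000)]
    for weight in rollout_schedule():
        share = sum(pick_backend(k, weight) == "green" for k in keys) / len(keys)
        print(f"weight={weight:3d}  observed green share={share:.1%}")
```

Hashing on a stable key means a client that has moved to green stays on green as the weight rises, which matters when the gateway holds per-client state. Rolling back is a matter of setting the weight to 0 again.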
