Backend
Live
August 2024

Neural API Gateway

A high-throughput API gateway with intelligent rate limiting, semantic request routing, and real-time anomaly detection, built on a Rust core with Python ML inference workers.

Rust, Python, Redis, Kubernetes, PyTorch

Overview

The Neural API Gateway handles 2M+ requests per day for a SaaS platform serving enterprise clients. It pairs a low-latency Rust core with Python ML workers, so traffic management can adapt to observed behavior instead of relying only on static, rule-based configuration.

Architecture

Rust Core

The hot path (request parsing, header manipulation, connection pooling) runs entirely in Rust on the Tokio async runtime, which keeps the gateway's own overhead under a millisecond per request.

ML Inference Layer

A fleet of Python workers (FastAPI + PyTorch) handles the AI features:

  • Anomaly detection — flags unusual traffic patterns in real time
  • Semantic routing — routes requests to the optimal backend based on payload semantics
  • Predictive rate limiting — adjusts limits based on predicted client behavior
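To make the anomaly detection idea concrete, here is a minimal sketch that flags a per-interval request count when it deviates sharply from recent history. It is not the gateway's model: the production workers use PyTorch, and this example substitutes a plain rolling z-score so it runs with the standard library alone. The class name, window size, and threshold are all assumptions chosen for illustration.

```python
from collections import deque
from statistics import mean, pstdev


class RollingZScoreDetector:
    """Flags a request count that deviates sharply from recent history.

    Hypothetical stand-in for the gateway's PyTorch model, for illustration only.
    """

    def __init__(self, window=60, threshold=3.0, min_samples=10):
        self.history = deque(maxlen=window)  # most recent "normal" counts
        self.threshold = threshold           # z-score above which we flag
        self.min_samples = min_samples       # wait for a baseline before judging

    def observe(self, count):
        """Record one interval's request count; return True if it looks anomalous."""
        anomalous = False
        if len(self.history) >= self.min_samples:
            mu = mean(self.history)
            sigma = pstdev(self.history)
            if sigma > 0:
                anomalous = abs(count - mu) / sigma > self.threshold
            else:
                # Perfectly flat baseline: any change at all is a deviation.
                anomalous = count != mu
        # Keep flagged samples out of the baseline so a burst cannot mask itself.
        if not anomalous:
            self.history.append(count)
        return anomalous
```

A caller would feed it one count per client per interval (for example, requests per second) and treat a True result as a signal to inspect or throttle, not as proof of abuse.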

Redis Cluster

All shared state (rate limit counters, circuit breaker state, session data) lives in a Redis cluster with 3 replicas for high availability.
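As an illustration of how a rate limit counter can live in shared state, the sketch below implements a fixed-window limiter on top of two Redis-style operations, incr and expire. The source does not say which limiting algorithm the gateway uses, so the fixed-window scheme, the key format, and the function names are assumptions. An in-memory store stands in for Redis so the example runs without a server; a redis-py client exposes methods with the same names.

```python
import time


class InMemoryCounterStore:
    """Tiny stand-in exposing the two Redis-style operations used below."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._values = {}    # key -> current count
        self._deadline = {}  # key -> time at which the key expires

    def _evict_if_expired(self, key):
        deadline = self._deadline.get(key)
        if deadline is not None and self._clock() >= deadline:
            self._values.pop(key, None)
            self._deadline.pop(key, None)

    def incr(self, key):
        self._evict_if_expired(key)
        self._values[key] = self._values.get(key, 0) + 1
        return self._values[key]

    def expire(self, key, seconds):
        if key in self._values:
            self._deadline[key] = self._clock() + seconds


def allow_request(store, client_id, limit, window_seconds):
    """Fixed-window limiter: at most `limit` requests per client per window."""
    key = f"ratelimit:{client_id}"  # hypothetical key format
    count = store.incr(key)
    if count == 1:
        # The first hit in a window starts the countdown.
        store.expire(key, window_seconds)
    return count <= limit
```

Against a real Redis, the separate incr and expire calls are not atomic, so a production version would typically combine them in a Lua script or a transaction. Fixed windows also allow short bursts at window boundaries, which is one reason to consider sliding-window or token-bucket variants.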

Performance

Metric                 Before       After
P99 latency            45 ms        3.2 ms
Throughput             80k req/s    340k req/s
False positive blocks  12%          0.8%

Key Challenge: Zero-Downtime Deploys

Rolling updates to a stateful gateway are tricky. We solved this with a blue-green deployment on Kubernetes: the old and new versions run side by side, and we shift traffic to the new one gradually using weighted routing rules in NGINX Ingress.
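The gradual traffic shift can be sketched as a schedule of weights plus the Ingress patch applied at each step. The source does not say how the weights are configured; this example assumes the ingress-nginx canary annotations (canary and canary-weight) on the Ingress for the new environment. The function names and the step size are hypothetical.

```python
def weight_schedule(step=10):
    """Percentages of traffic sent to the new environment, always ending at 100."""
    if not 0 < step <= 100:
        raise ValueError("step must be in (0, 100]")
    weights = list(range(step, 100, step))
    weights.append(100)
    return weights


def canary_patch(weight):
    """Build a metadata patch that sets the ingress-nginx canary weight.

    Assumes the new environment is exposed through its own Ingress that
    carries the canary annotations.
    """
    if not 0 <= weight <= 100:
        raise ValueError("weight must be between 0 and 100")
    return {
        "metadata": {
            "annotations": {
                "nginx.ingress.kubernetes.io/canary": "true",
                "nginx.ingress.kubernetes.io/canary-weight": str(weight),
            }
        }
    }


if __name__ == "__main__":
    for weight in weight_schedule(step=25):
        patch = canary_patch(weight)
        # Applying the patch (kubectl or a Kubernetes client) and checking
        # health between steps is left out of this sketch.
        print(weight, patch["metadata"]["annotations"])
```

In practice each step would be gated on health signals such as error rate or P99 latency, with the weight set back to 0 to roll back if a check fails.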