Cold-start hybrid recommendation system for a media platform — combining collaborative filtering with content-based embeddings for a 24% CTR uplift.
A media platform (long-form written content) had a recommendation widget showing “you might also like” articles. It was powered by a simple tag-matching algorithm — if you just read a piece tagged “AI”, it showed other AI-tagged pieces. CTR was 2.1%.
Two problems:
A hybrid recommendation system combining two signals:
Matrix factorization on the user-article interaction matrix (implicit feedback: reads, scroll depth, shares). Trained weekly with ALS (Alternating Least Squares) via Implicit library. Produces a “readers who liked this also liked…” signal.
Article embeddings using a fine-tuned sentence transformer (all-mpnet-base-v2). Each article is embedded at publish time. Recommendations based on cosine similarity of embeddings. This solves cold start — new articles get recommendations immediately from their content, before any read data exists.
The final recommendation score is a weighted blend:
Weights tuned via offline A/B testing on held-out interaction data.
FastAPI service with Redis caching. Recommendations pre-computed nightly for the top 10K articles; real-time fallback to content-based for the long tail. P99 latency: 40ms.
Before: a new article published on Monday would get 0 recommendations for 72 hours (no read data). After: new articles are embedded at publish time and immediately surface in the recommendations of semantically similar content. New article CTR in the first 72 hours improved by 4×.
30 minutes, free, no deck. We'll figure out if I'm the right fit for your project.