Cut customer-support headcount 70% with a multi-language AI agent that deflects repetitive tickets at sub-2-second response time.
A YC-backed B2B SaaS company had a support team of 12 handling a flood of repetitive Tier-1 tickets — password resets, billing questions, feature how-tos. As they pushed toward a 3× user growth target, headcount was not an option.
65% of all tickets were answerable from the existing help-center docs. No judgment required. Just retrieval and a coherent reply.
A production LangChain agent sitting in front of the existing Zendesk queue. The agent:
Six languages supported via a detect-then-translate pipeline before and after retrieval.
Scale. 4M documents with sub-100ms retrieval at P99 required a managed vector store. Pinecone’s metadata filtering also let us scope searches to the customer’s specific plan tier.
Pure vector search missed exact-match queries (“what is my plan limit?”). I layered BM25 keyword search (via Elasticsearch) with a reciprocal rank fusion step. Accuracy on the eval set jumped from 84% to 92%.
The agent only auto-sends when cosine similarity > 0.87 AND the LLM’s self-reported confidence is “high”. Everything else gets a draft in the agent’s queue for one-click human approval. This kept the false-positive rate below 0.3%.
The eval harness was built after the agent — it should have come first. We spent a week manually reviewing edge cases that a proper golden-set eval would have caught in hours.
30 minutes, free, no deck. We'll figure out if I'm the right fit for your project.