Multi-Agent Patient Intake Workflow

The Problem

Patient onboarding at a multi-location healthcare provider required 47 minutes of clinical staff time per case. Three separate systems, manual copy-paste, eligibility checks via phone. Staff were processing 30 intakes per day — the bottleneck before any actual care was delivered.

The tasks were 90% mechanical. The 10% requiring judgment (flagging unusual eligibility situations) still needed a human — but didn’t need a human doing the other 90%.

What I Built

A CrewAI multi-agent system with 4 specialized agents orchestrated by a supervisor:

Agents

Form Agent — Extracts structured data from patient intake forms (PDF/web). Handles messy handwriting via a vision model fallback. Writes to the staging Postgres schema.

EHR Agent — Queries the existing EHR via HL7 FHIR API (custom MCP server). Checks for existing patient records, pulls prior visit history, flags duplicates.

Eligibility Agent — Calls the insurance eligibility API (Change Healthcare). Parses the 270/271 transaction response into a human-readable summary. Flags edge cases for human review.

Summary Agent — Synthesizes the above into a structured intake packet: demographics, insurance status, flagged issues, recommended next steps. Formatted for the clinical coordinator’s review.

Supervisor Agent — Runs the workflow, handles retries, escalates to human queue on any agent failure or confidence threshold miss.

The MCP Layer

The EHR integration used a custom MCP (Model Context Protocol) server I wrote to wrap the FHIR API. This let Claude call EHR tools via natural language (“look up patient by DOB and last name”) without hardcoding every query shape. The MCP server handled auth, rate limiting, and audit logging.

Accuracy vs. Human Baseline

I ran a 2-week parallel test: agents processed 300 intakes alongside humans. Clinical staff reviewed both outputs blind. Agent accuracy was 94% on structured data fields, 97% on eligibility determination. The 6% error rate was concentrated in name/DOB edge cases (typos, nicknames) — now caught by a dedicated fuzzy-match pre-check.

Results

47 minutes → 8 minutes per intake (83% reduction)
6× throughput — same staff now handles 180 intakes/day
94% accuracy vs. human baseline on structured fields
Staff feedback: clinical coordinators report lower stress and more time for patient-facing work

What I’d Change

The supervisor agent’s retry logic was too aggressive — on transient API failures, it would retry 3× before escalating, which created a 3-minute delay spike. I’d use exponential backoff with a 30-second cap and immediate escalation on specific error codes.