Fixing Therapist Note Generation
A production LLM ops rebuild that improved note quality, reduced therapist rejections, and delivered direct revenue impact.
The problem
Mentalyc's AI-generated therapy notes were inconsistent, and therapists were rejecting the outputs.
The system needed quality, reliability, and cost control, not just better prompts.
What I built / changed
- Tightened prompts and improved retrieval pipelines.
- Added guardrails and structured output constraints.
- Created an evaluation harness with real therapist feedback loops.
- Built curated datasets for regression testing.
- Implemented cost controls via intelligent model routing and token-aware truncation.
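To make the guardrail and cost-control items above more concrete, here is a minimal sketch in Python. It is not the production code. The model names, the routing threshold, the SOAP-style field list, and the whitespace-based token estimate are all assumptions made for illustration, and a real system would count tokens with the target model's own tokenizer.

```python
import json

# Hypothetical model names and threshold; not taken from the original system.
CHEAP_MODEL = "small-model"
STRONG_MODEL = "large-model"
ROUTING_THRESHOLD_TOKENS = 50

# Assumed SOAP-style note schema, used only for illustration.
REQUIRED_NOTE_FIELDS = ("subjective", "objective", "assessment", "plan")


def count_tokens(text: str) -> int:
    """Crude token estimate: the number of whitespace-separated words."""
    return len(text.split())


def truncate_to_budget(segments: list[str], budget: int) -> list[str]:
    """Keep the most recent segments whose combined token estimate fits the budget."""
    kept: list[str] = []
    used = 0
    for segment in reversed(segments):
        cost = count_tokens(segment)
        if used + cost > budget:
            break  # stop at the first segment that would exceed the budget
        kept.append(segment)
        used += cost
    kept.reverse()  # restore chronological order
    return kept


def route_model(segments: list[str], threshold: int = ROUTING_THRESHOLD_TOKENS) -> str:
    """Send short inputs to the cheaper model and long inputs to the stronger one."""
    total = sum(count_tokens(s) for s in segments)
    return STRONG_MODEL if total > threshold else CHEAP_MODEL


def validate_note(raw_output: str) -> list[str]:
    """Return a list of problems with a generated note; an empty list means it passes."""
    try:
        note = json.loads(raw_output)
    except json.JSONDecodeError:
        return ["output is not valid JSON"]
    if not isinstance(note, dict):
        return ["output is not a JSON object"]
    problems = []
    for field in REQUIRED_NOTE_FIELDS:
        value = note.get(field)
        if not isinstance(value, str) or not value.strip():
            problems.append(f"missing or empty field: {field}")
    return problems
```

In a pipeline like this, truncation would typically run before routing, so the routing decision reflects what is actually sent. A note that fails `validate_note` could be retried or flagged rather than shown to a therapist.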
Result
30% revenue impact through higher note acceptance rates.
Stack / concepts
LLM ops, Prompt engineering, Eval harnesses, Cost optimization