
Fixing Therapist Note Generation

A production LLM ops rebuild that improved quality, reduced rejection, and delivered direct revenue impact.

The problem

Mentalyc's AI-generated therapy notes were inconsistent, and therapists were rejecting the outputs.

The system needed quality, reliability, and cost control, not just better prompts.

What I built / changed

  • Tightened prompts and improved retrieval pipelines.
  • Added guardrails and structured output constraints.
  • Created an evaluation harness with real therapist feedback loops.
  • Built curated datasets for regression testing.
  • Implemented cost controls via intelligent model routing and token-aware truncation.
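
As a rough illustration of three of the items above (structured output guardrails, token-aware truncation, and model routing), here is a minimal Python sketch. It is not the production code: the section names, model names, token budgets, routing threshold, and the whitespace-based token count are all assumptions made for the example, as is the choice to keep the most recent transcript chunks when truncating.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical section names; the real note schema is not given in this write-up.
REQUIRED_SECTIONS = ("subjective", "objective", "assessment", "plan")


def missing_sections(note: dict) -> List[str]:
    """Guardrail: return the required sections that are absent or blank."""
    return [s for s in REQUIRED_SECTIONS if not str(note.get(s, "")).strip()]


def count_tokens(text: str) -> int:
    """Crude stand-in for a real tokenizer: counts whitespace-separated words."""
    return len(text.split())


def truncate_to_budget(chunks: List[str], budget: int) -> List[str]:
    """Keep the most recent chunks whose combined token count fits the budget.

    Assumption: later chunks matter more, so we walk backwards and stop at the
    first chunk that would exceed the budget. Original order is preserved.
    """
    kept: List[str] = []
    used = 0
    for chunk in reversed(chunks):
        cost = count_tokens(chunk)
        if used + cost > budget:
            break
        kept.append(chunk)
        used += cost
    kept.reverse()
    return kept


@dataclass(frozen=True)
class Route:
    model: str             # hypothetical model identifier
    max_input_tokens: int  # hypothetical input budget for that model


# Hypothetical tiers: a cheaper model for short sessions, a stronger one otherwise.
CHEAP = Route("small-model", 2000)
STRONG = Route("large-model", 8000)


def pick_route(transcript_tokens: int, threshold: int = 1500) -> Route:
    """Route short transcripts to the cheap tier and longer ones to the strong tier."""
    return CHEAP if transcript_tokens <= threshold else STRONG
```

In a pipeline shaped like this, the transcript would be measured with `count_tokens`, a tier chosen with `pick_route`, the input trimmed with `truncate_to_budget` to that tier's `max_input_tokens`, and the generated note checked with `missing_sections` before it is shown to a therapist.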

Result

30% revenue impact through higher note acceptance rates.

Stack / concepts

LLM ops, prompt engineering, eval harnesses, cost optimization
