Medical Compliance Engine
I’m Dustin Umphress, a cloud and automation engineer with deep experience in healthcare IT and medical billing systems. This project demonstrates how high-risk compliance logic can be implemented using deterministic data pipelines, with large language models used strictly for constrained explanation, not decision-making.
The goal was to design an AI-assisted auditing system that keeps hallucinated billing rules out of compliance decisions, even when analyzing complex clinical text.
Project Overview
The Medical Compliance Engine is a serverless Python application that audits medical claims logic against official CMS rule sets, including NCCI edits and MUE limits.
Unlike typical “AI wrappers,” this system separates fact determination from natural-language reasoning:
- Deterministic systems decide what is allowed
- The LLM only explains why, based on supplied facts
This architecture is designed for regulated environments where probabilistic answers are unacceptable.
Core Technical Principles
1. Deterministic First, AI Last
All compliance decisions are resolved using structured rule data queried from a local database. The language model is never asked to recall, infer, or invent billing rules.
2. Structured RAG (Not Vector Search)
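Medical billing rules are binary, not semantic: a code pair is either listed in an edit table or it is not, and a unit count either exceeds its limit or it does not. Retrieval is therefore performed via precise SQL queries against a normalized rules database rather than vector similarity search.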
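As a rough illustration of what exact-match retrieval could look like, the sketch below runs two lookups against an in-memory SQLite database. The table names, column names, and helper functions are hypothetical and chosen for this example only, and the inserted rows are placeholders rather than real CMS data.

```python
import sqlite3

# Hypothetical schema for illustration; the project's real tables may differ.
SCHEMA = """
CREATE TABLE ncci_edits (
    column1_code TEXT NOT NULL,
    column2_code TEXT NOT NULL,
    modifier_indicator TEXT NOT NULL,
    PRIMARY KEY (column1_code, column2_code)
);
CREATE TABLE mue_limits (
    cpt_code TEXT PRIMARY KEY,
    max_units INTEGER NOT NULL
);
"""


def find_ncci_conflict(conn, code_a, code_b):
    """Return the matching edit as a dict, or None if the pair is not listed."""
    row = conn.execute(
        "SELECT column1_code, column2_code, modifier_indicator "
        "FROM ncci_edits "
        "WHERE (column1_code = ? AND column2_code = ?) "
        "   OR (column1_code = ? AND column2_code = ?)",
        (code_a, code_b, code_b, code_a),  # check the pair in both orders
    ).fetchone()
    if row is None:
        return None
    return {"column1": row[0], "column2": row[1], "modifier_indicator": row[2]}


def exceeds_mue(conn, cpt_code, units):
    """Return True or False when a limit exists, or None when no row is found."""
    row = conn.execute(
        "SELECT max_units FROM mue_limits WHERE cpt_code = ?", (cpt_code,)
    ).fetchone()
    if row is None:
        return None
    return units > row[0]


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
# Placeholder rows, not real CMS data.
conn.execute("INSERT INTO ncci_edits VALUES ('AAAAA', 'BBBBB', '1')")
conn.execute("INSERT INTO mue_limits VALUES ('AAAAA', 2)")
```

Each lookup returns either an exact row or nothing, so there is no similarity threshold to tune and no "closest match" that could be mistaken for a rule.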
3. Privacy by Design
Clinical text is redacted locally before any downstream processing, so sensitive identifiers never leave the runtime environment.
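To make the redact-first idea concrete, here is a deliberately simplified sketch. It substitutes regular expressions for the NER-based redaction the project uses, so it only catches pattern-shaped identifiers (SSN-like numbers, phone numbers, slash-formatted dates) and would miss names and addresses. The pattern list and the `redact` function are assumptions made for this example.

```python
import re

# Regex stand-ins for illustration only. The project describes NER-based
# redaction; these patterns are a simplified substitute, not its method.
PATTERNS = [
    ("SSN", re.compile(r"\b\d{3}-\d{2}-\d{4}\b")),
    ("PHONE", re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b")),
    ("DATE", re.compile(r"\b\d{1,2}/\d{1,2}/\d{2,4}\b")),
]


def redact(text):
    """Replace each matched identifier with a bracketed label."""
    for label, pattern in PATTERNS:
        text = pattern.sub(f"[{label}]", text)
    return text


note = "Seen 03/14/2024, SSN 123-45-6789, callback 555-867-5309. Billed 99213 x1."
print(redact(note))
```

The five-digit procedure code passes through untouched, which matters because the later extraction step depends on it.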
Architecture Summary
Language: Python
Runtime: Docker on AWS Lambda
Rules Engine: SQLite (CMS NCCI + MUE datasets)
AI Layer: LLM via managed cloud inference
Privacy Controls: Local PHI redaction prior to processing
Request Flow:
1. Local PHI redaction using NER
2. CPT code and unit extraction
3. Deterministic rule evaluation via SQL
4. Constrained LLM explanation based only on retrieved facts
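The extraction and explanation steps of this flow might be sketched as follows. The note format (a five-digit code optionally followed by "x" and a unit count), the function names, and the prompt wording are all assumptions for illustration; the real extraction logic and prompt are likely more involved.

```python
import json
import re

# Assumed note format: "<5-digit code>" optionally followed by " x<units>".
CODE_UNITS = re.compile(r"\b(\d{5})\b(?:\s+[xX]\s*(\d+))?")


def extract_codes(text):
    """Return {cpt_code: units}, defaulting to 1 unit when none is stated."""
    found = {}
    for code, units in CODE_UNITS.findall(text):
        found[code] = found.get(code, 0) + int(units or 1)
    return found


def build_prompt(findings):
    """Build a prompt that confines the model to the supplied facts."""
    facts = json.dumps(findings, indent=2, sort_keys=True)
    return (
        "Explain the audit findings below in plain language.\n"
        "Use only the facts provided. Do not add, infer, or recall billing rules.\n"
        "If the facts are insufficient, say so.\n\n"
        f"FACTS:\n{facts}"
    )


codes = extract_codes("Billed 99213 x2 and 36415.")
# In the full flow, `findings` would come from the SQL rule evaluation step.
findings = [{"rule": "MUE", "cpt_code": code, "billed_units": units}
            for code, units in sorted(codes.items())]
prompt = build_prompt(findings)
```

The model receives the findings as serialized data and an instruction not to go beyond them, so the verdict is fixed before the model is ever called.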
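The system is explicitly designed so that the LLM cannot hallucinate a compliance decision: every outcome comes from the rules database, and the model only explains facts it was given.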
Serverless Engineering Challenge: Cold Starts
The rules database (~300 MB) initially caused unacceptable Lambda cold-start latency when loaded eagerly.
This was resolved using a lazy-loading strategy:
- Containers initialize without opening the rules database
- Database connections are created only on the first audit request
This reduced cold-start latency from over 12 seconds to under 1 second, while preserving a fully serverless deployment model.
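A minimal sketch of the lazy-loading pattern, assuming a module-level cache and a handler in the usual Lambda shape. The `RULES_DB_PATH` variable, the default path, and the function names are hypothetical; the repository may structure this differently.

```python
import os
import sqlite3

# Assumed location of the bundled rules file; the real path may differ.
DB_PATH = os.environ.get("RULES_DB_PATH", "/var/task/rules.db")

# Module scope: the value survives across warm invocations of one container.
_conn = None


def get_connection(db_path=DB_PATH):
    """Open the rules database on first use and reuse it afterwards."""
    global _conn
    if _conn is None:
        # Read-only URI mode, since the rules are never modified at runtime.
        _conn = sqlite3.connect(f"file:{db_path}?mode=ro", uri=True)
    return _conn


def handler(event, context):
    """Lambda entry point: nothing touches the database until this runs."""
    conn = get_connection()
    # Placeholder for the real audit logic.
    (ok,) = conn.execute("SELECT 1").fetchone()
    return {"statusCode": 200, "db_ready": ok == 1}
```

Because the import path does no database work, initialization cost is paid once by the first audit request rather than by every cold start, and later requests in the same container reuse the cached connection.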
Scope and Intended Use
This project is intended as:
- An internal audit assistance tool
- A technical architecture demonstration
- A reference implementation for deterministic AI systems
It is not a payer adjudication system and does not replace official CMS or insurer determinations.
Source Code
The full implementation, including:
- CMS data ingestion and normalization
- SQLite schema design
- Docker configuration
- Serverless deployment logic
is available on GitHub:
👉 https://github.com/dustinumphress/medical-compliance-engine