From f3e58dd50a1eb9ddea892b594cf0f1d95eeae52c Mon Sep 17 00:00:00 2001
From: laxmanhalaki
Date: Fri, 13 Mar 2026 19:19:53 +0530
Subject: [PATCH] documents added for the code generation context

---
 Markdown_Viewer.html                      |  944 ++++++++++
 Mermaid_Selector.html                     | 1053 +++++++++++
 docs/context/business_logic_flows.md      |   35 +
 docs/context/clinical_context.md          |   17 +
 docs/context/state_machine_and_badging.md |   19 +
 docs/context/technical_specs.md           |   17 +
 project_structure.md                      |  123 ++
 project_structure_detailed.md             |   90 +
 srs_document.md                           | 1999 +++++++++++++++++++++
 9 files changed, 4297 insertions(+)
 create mode 100644 Markdown_Viewer.html
 create mode 100644 Mermaid_Selector.html
 create mode 100644 docs/context/business_logic_flows.md
 create mode 100644 docs/context/clinical_context.md
 create mode 100644 docs/context/state_machine_and_badging.md
 create mode 100644 docs/context/technical_specs.md
 create mode 100644 project_structure.md
 create mode 100644 project_structure_detailed.md
 create mode 100644 srs_document.md

diff --git a/Markdown_Viewer.html b/Markdown_Viewer.html
new file mode 100644
index 0000000..87f89c1
--- /dev/null
+++ b/Markdown_Viewer.html
@@ -0,0 +1,944 @@
+Markdown Viewer
+📝 Markdown Viewer
+Upload a file or paste your Markdown to see beautiful rendered output
+📄 Input
+📁
+Click to upload or drag and drop
+Supports .md, .markdown, .txt files
+⚙️ Options
+👁️ Preview
+📄
+Your rendered Markdown will appear here
diff --git a/Mermaid_Selector.html b/Mermaid_Selector.html
new file mode 100644
index 0000000..53ce10b
--- /dev/null
+++ b/Mermaid_Selector.html
@@ -0,0 +1,1053 @@
+Mermaid Diagram Viewer
+🎨 Mermaid Diagram Viewer
+Upload a file or paste your Mermaid code to visualize beautiful diagrams
+📝 Input
+📁
+Click to upload or drag and drop
+Supports .mmd, .mermaid, .txt files
+🖼️ Preview
+📊
+Your diagram will appear here
\ No newline at end of file
diff --git a/docs/context/business_logic_flows.md b/docs/context/business_logic_flows.md
new file mode 100644
index 0000000..ea672fb
--- /dev/null
+++ b/docs/context/business_logic_flows.md
@@ -0,0 +1,35 @@
+# IamBilly Backend: Business Logic Flows
+
+Based on the documentation at `http://34.111.248.204/`, the system consists of 9 core flows.
+
+## Flow 1: Surgeon Records Dictation
+- **Patient Search:** Surgeon searches via MRN or Name.
+- **Failures:** If EMR connectivity fails, the system retries up to 3 times with exponential backoff (1s, 4s, 16s). If it still fails, cached data (<24h old) is used or manual entry is required.
+- **Audio Capture:** Recorded locally on mobile, encrypted with AES-256.
+- **Connectivity:** Auto-uploads via TLS 1.3 when internet is available.
+
+## Flow 2: AI Processing Pipeline (System)
+- **Transcription (Whisper):** Converts audio to text using medical vocabulary.
+- **Entity Extraction:** NLP extracts Diagnoses, Procedures, and Anatomy.
+- **Code Mapping:** AI maps to ICD-10 and CPT codes with modifiers.
+
+## Flow 3: Surgeon Status Check
+- Surgeon monitors claim progress through a "Today's Cases" dashboard.
+
+## Flow 4: Biller Review & Approval
+- **Interface:** Side-by-side view (Transcript vs. Coded Claim).
+- **Mandatory Human Review:** Human review is required before approval. AI-suggested fields are highlighted.
+
+## Flow 5: Export & Revocation
+- **5a (Export):** Claims are pushed to Athena or EMR via API or generated as CMS-1500 PDF.
+- **5b (Revoke):** Supervisors can revoke approved status only if the claim has not yet been exported.
+
+## Flow 6: Rejection Handling
+- Claims rejected by billers return to the Biller queue with specific rejection reasons for manual correction.
+
+## Flow 7 & 8: Management & Administration
+- **Supervisor:** Operational oversight (efficiency, accuracy).
+- **Admin:** System health monitoring, audit log reviews, and MFA management.
+
+## Flow 9: Security Safeguards
+- Session timeouts (15 mins), MFA requirements, and persistent audit logs.
diff --git a/docs/context/clinical_context.md b/docs/context/clinical_context.md
new file mode 100644
index 0000000..17743f0
--- /dev/null
+++ b/docs/context/clinical_context.md
@@ -0,0 +1,17 @@
+# IamBilly Backend: Clinical Context (Spine Surgery)
+
+The MVP is focused on high-volume spine surgery procedures.
+
+## Procedure Templates
+The system supports 6 core spine templates, with a focus on:
+1. **ACDF Single Level** (Anterior Cervical Discectomy and Fusion)
+2. **Lumbar Fusion**
+
+These templates pre-load standard CPT/ICD code pairs, which the surgeon can then modify via dictation if deviations occurred during the surgery.
+
+## Clinical Entities
+The extraction layer is optimized for:
+- **Diagnoses:** ICD-10 codes.
+- **Procedures:** CPT codes + Modifiers (-59, -LT/RT, -51).
+- **Laterality:** Left, Right, Bilateral.
+- **Anatomy:** Specific spinal levels (e.g., C3-C4, L4-L5).
diff --git a/docs/context/state_machine_and_badging.md b/docs/context/state_machine_and_badging.md
new file mode 100644
index 0000000..2cf7bd8
--- /dev/null
+++ b/docs/context/state_machine_and_badging.md
@@ -0,0 +1,19 @@
+# IamBilly Backend: State Machine & AI Confidence Logic
+
+## Claim State Machine
+Claims must transition through the following 7 states:
+
+1. **RECORDED:** Audio captured on mobile, upload confirmed.
+2. **TRANSCRIBING:** AI engine (Whisper) is converting audio to text.
+3. **CODING:** NLP engine is extracting entities and mapping ICD/CPT codes.
+4. **READY FOR REVIEW:** AI processing complete; sitting in Biller's inbox.
+5. **APPROVED:** Human biller/supervisor has verified the claim data.
+6. **REJECTED:** Human biller rejected the claim (requires reason code).
+7. **EXPORTED:** Claim successfully pushed to EMR or PDF generated. Final state.
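One way to enforce the seven states listed above is an explicit transition table. The sketch below is illustrative only: `ClaimState` mirrors the documented state names, but the `ALLOWED_TRANSITIONS` map, the `InvalidTransition` exception, and the `transition` helper are assumptions, because the source names the states without spelling out the legal edges. The APPROVED to READY FOR REVIEW edge stands in for the supervisor revoke of Flow 5b, and REJECTED to READY FOR REVIEW for the return to the Biller queue in Flow 6.

```python
from enum import Enum


class ClaimState(str, Enum):
    RECORDED = "RECORDED"
    TRANSCRIBING = "TRANSCRIBING"
    CODING = "CODING"
    READY_FOR_REVIEW = "READY FOR REVIEW"
    APPROVED = "APPROVED"
    REJECTED = "REJECTED"
    EXPORTED = "EXPORTED"


# Assumed edges: the source lists the states, not the transitions between them.
ALLOWED_TRANSITIONS = {
    ClaimState.RECORDED: {ClaimState.TRANSCRIBING},
    ClaimState.TRANSCRIBING: {ClaimState.CODING},
    ClaimState.CODING: {ClaimState.READY_FOR_REVIEW},
    ClaimState.READY_FOR_REVIEW: {ClaimState.APPROVED, ClaimState.REJECTED},
    # Revoke (Flow 5b) is only possible here, i.e. before export.
    ClaimState.APPROVED: {ClaimState.EXPORTED, ClaimState.READY_FOR_REVIEW},
    ClaimState.REJECTED: {ClaimState.READY_FOR_REVIEW},
    ClaimState.EXPORTED: set(),  # final state
}


class InvalidTransition(ValueError):
    """Raised when a requested state change is not in the table."""


def transition(current, target, reason_code=None):
    """Return the new state, or raise if the move is not allowed."""
    if target not in ALLOWED_TRANSITIONS[current]:
        raise InvalidTransition(f"{current.value} -> {target.value} is not allowed")
    if target is ClaimState.REJECTED and not reason_code:
        raise InvalidTransition("rejecting a claim requires a reason code")
    return target
```

Keeping the edges in one dictionary makes the "revoke only if not yet exported" rule fall out of the table itself: EXPORTED has no outgoing edges, so no special case is needed.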
+
+## AI Confidence Logic (Color Coding)
+The processing layer must assign a confidence score to each extracted item:
+
+- 🟢 **Green (>90%):** High confidence. Visual indicator for "Fast-Track" approval.
+- 🟠 **Orange (70-90%):** Moderate confidence. Triggers a mandatory review highlight for the biller. For scores between 70% and 80%, the system must also provide the top 3 alternative mapping suggestions.
+- 🔴 **Red (<70%):** Low confidence. Requires manual entry from scratch.
diff --git a/docs/context/technical_specs.md b/docs/context/technical_specs.md
new file mode 100644
index 0000000..8544f5d
--- /dev/null
+++ b/docs/context/technical_specs.md
@@ -0,0 +1,17 @@
+# IamBilly Backend: Technical Context & Specifications
+
+## Security & Encryption
+- **At Rest:** All PHI and Audio files must be encrypted with **AES-256 GCM**.
+- **In Transit:** All API communication and file uploads must use **TLS 1.3**.
+- **Audit Logs:** Immutable audit trail records retained for 7 years (per HIPAA).
+
+## Connectivity & Retries
+- **EMR Integration:** 3 retry attempts for connectivity (exponential backoff: 1s, 4s, 16s).
+- **Athena/EMR Export:** 3 retry attempts for data push if the downstream API is unavailable.
+- **Caching:**
+  - Patient data cached for 24 hours to mitigate EMR downtime.
+  - Clinical documents cached for 30 days for cross-session reference.
+
+## Identity Management
+- 15-minute global session idle timeout.
+- Mandatory Multi-Factor Authentication (MFA) for Administrative and Supervisor roles.
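The EMR retry and 24-hour patient cache rules under "Connectivity & Retries" can be sketched as follows. This is a minimal illustration, not the project's implementation: `fetch_patient`, `EMRUnavailable`, the `fetch` callable, and the in-memory `cache` dict are all hypothetical names, and the code assumes "3 retry attempts" means one initial call followed by up to three retries with waits of 1s, 4s, and 16s.

```python
import time

BACKOFF_DELAYS = (1, 4, 16)          # seconds, per the spec
CACHE_TTL_SECONDS = 24 * 60 * 60     # patient data stays usable for 24 hours


class EMRUnavailable(Exception):
    """EMR unreachable and no fresh cache entry; caller falls back to manual entry."""


def fetch_patient(mrn, fetch, cache, sleep=time.sleep, now=time.time):
    """Try the EMR, retry with backoff, then fall back to a fresh cache entry.

    fetch(mrn) returns patient data or raises ConnectionError.
    cache maps mrn -> (timestamp, data).
    """
    # Assumption: one initial attempt (delay 0) plus three retries.
    for delay in (0, *BACKOFF_DELAYS):
        if delay:
            sleep(delay)
        try:
            data = fetch(mrn)
        except ConnectionError:
            continue
        cache[mrn] = (now(), data)   # refresh the cache on success
        return data, "emr"

    cached = cache.get(mrn)
    if cached and now() - cached[0] < CACHE_TTL_SECONDS:
        return cached[1], "cache"
    raise EMRUnavailable("EMR unreachable and no cached data under 24h old")
```

Passing `sleep` and `now` as parameters keeps the function testable without real waiting; in production code the same shape would sit behind an async client and Redis rather than a dict.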
diff --git a/project_structure.md b/project_structure.md
new file mode 100644
index 0000000..a044a48
--- /dev/null
+++ b/project_structure.md
@@ -0,0 +1,123 @@
+### Technology Stack
+- **Framework:** FastAPI (Python)
+- **Database:** PostgreSQL + pgvector
+- **Cache:** Redis
+- **Task Queue:** Celery/RabbitMQ (for async STT and AI processing)
+- **AI Engines:** Whisper (Self-hosted), Llama/Mistral (Self-hosted)
+
+---
+
+## Production Readiness Overview
+This structure is built to **Production Grade** standards for healthcare applications:
+- **Scalability:** The Monolithic Modular approach allows modules to be extracted into microservices if needed, while maintaining shared logic in `shared/`.
+- **Security:** Built-in layers for JWT OAuth2, AES-256 encryption, and HIPAA-compliant data masking.
+- **Observability:** Centralized JSON logging and audit trails in `shared/utils/hipaa_logger.py`.
+- **Reliability:** Async DB sessions and background task workers (Celery) prevent request blocking during heavy AI processing.
+- **Maintainability:** Clear separation of concerns mapping directly to business domains (Claim, Billing, Patient).
+
+---
+
+## 1. Directory Tree Overview
+
+```text
+iam_billy_backend/
+├── app/                        # Main Application Package
+│   ├── main.py                 # App entry point & middleware
+│   ├── api/                    # API Experience Layer
+│   │   ├── dependencies.py     # Global FastAPI dependencies
+│   │   └── v1/                 # Versioned API routes
+│   │       └── api.py          # Main router inclusion
+│   ├── core/                   # Global System Core
+│   │   ├── config.py           # Settings (Pydantic)
+│   │   ├── security.py         # Auth (JWT/OAuth2) logic
+│   │   └── exceptions.py       # Global error handlers
+│   ├── database/               # Data Access Layer
+│   │   ├── base.py             # SQLAlchemy model registry
+│   │   ├── session.py          # DB Engine & Session factory
+│   │   └── migrations/         # Alembic migration scripts
+│   │       ├── env.py
+│   │       ├── script.py.mako
+│   │       └── versions/       # Individual migration files
+│   ├── modules/                # Domain-Driven Modules (Processing Layer)
+│   │   ├── auth/
+│   │   ├── patient/
+│   │   ├── audio/
+│   │   ├── ai_service/
+│   │   ├── billing/
+│   │   ├── claim/
+│   │   └── integration/        # EMR Adapter Implementations
+│   ├── shared/                 # Cross-Module Resources
+│   │   ├── models/             # Shared DB Mixins
+│   │   ├── schemas/            # Generic Pydantic models (Pagination, Error)
+│   │   └── utils/              # Global Utilities
+│   │       ├── hipaa_logger.py # HIPAA-compliant JSON logging
+│   │       ├── encryptor.py    # AES-256 Utility
+│   │       └── date_helpers.py # TZ-aware time logic
+│   └── ai_layer/               # AI Infrastructure (Local Inference)
+│       ├── models/             # Model loading scripts
+│       ├── rag/                # Vector store & RAG pipelines
+│       └── inference/          # Local Whisper/LLM runners
+├── scripts/                    # Operational Scripts (Data Migration, ETL)
+├── tests/                      # Pytest Suite
+├── alembic.ini                 # Migration config
+├── docker-compose.yml          # Local Dev Orchestration
+├── requirements.txt            # Dependency list
+└── README.md
+```
+
+---
+
+## 2. Granular Module Breakdown
+
+Every domain module (`app/modules//`) follows this structure:
+
+### `models.py`
+Defines SQLAlchemy ORM models.
+- *Example (Claim):* `Claim`, `ClaimHistory`, `AuditLog`.
+
+### `schemas.py`
+Defines Pydantic models for request validation and response serialization.
+- *Example (Patient):* `PatientCreate`, `PatientPublic`, `PatientSearch`.
+
+### `services/`
+Contains the business logic decoupled from the API layer.
+- `logic.py`: Primary business rules.
+- `validators.py`: Domain-specific validation logic.
+
+### `routers/`
+FastAPI route definitions.
+- `endpoints.py`: HTTP methods (GET, POST, etc.) and payload handling.
+
+### `tasks/`
+Celery background tasks.
+- `worker.py`: Async task definitions for the specific module.
+
+---
+
+## 3. Specialized Layouts
+
+### **Migrations (`app/database/migrations/`)**
+- `versions/`: Auto-generated migration files reflecting schema changes.
+- `env.py`: Connects Alembic to the FastAPI application metadata.
+
+### **Utilities (`app/shared/utils/`)**
+- `hipaa_logger.py`: Custom logging to ensure PHI isn't leaked in logs while satisfying audit trail requirements.
+- `encryptor.py`: Standardized AES-256 GCM logic for encrypting audio files at rest.
+- `emr_client.py`: Base HTTP client session for integration adapters.
+
+### **AI Layer (`app/ai_layer/`)**
+- `inference/whisper_runner.py`: Logic to feed audio blobs to the local Whisper engine.
+- `rag/vector_indexer.py`: Logic to ingest PDF manuals/cheat-sheets into `pgvector`.
+
+---
+
+## 4. Key Implementation Files
+
+- **`app/main.py`**: Initializes FastAPI, adds CORS, mounts `v1` router, and sets up startup/shutdown events (DB connection, Redis init).
+- **`app/core/config.py`**: Uses `pydantic-settings` to load `.env` variables for Database URL, Redis URL, JWT Secret, and AI model paths.
+- **`requirements.txt`**:
+  - `fastapi`, `uvicorn`, `sqlalchemy[asyncio]`, `asyncpg`
+  - `pydantic[email]`, `pydantic-settings`
+  - `alembic`, `celery`, `redis`
+  - `cryptography`, `python-jose[cryptography]`
+  - `pgv-sdk` (for vector search)
diff --git a/project_structure_detailed.md b/project_structure_detailed.md
new file mode 100644
index 0000000..0d3b5e1
--- /dev/null
+++ b/project_structure_detailed.md
@@ -0,0 +1,90 @@
+# Detailed File-Level Project Structure: IamBilly Backend
+
+This document provides an exhaustive file-level breakdown of the IamBilly backend, adhering to the monolithic modular pattern and the stack requirements (FastAPI, PostgreSQL, Redis, AI Layer).
+
+---
+
+## 1. Global Core Structure (`app/`)
+
+### `app/core/`
+- `config.py`: Centralized environment variable management (Pydantic Settings).
+- `security.py`: JWT token generation, password hashing, and OAuth2 scopes.
+- `encryption.py`: AES-256 logic for at-rest storage of audio and PHI.
+- `logging.py`: HIPAA-compliant JSON logging with audit trail metadata.
+- `exceptions.py`: Custom HTTP exception handlers for global error responses.
+
+### `app/api/`
+- `v1/api.py`: Main router that includes all module-level routers.
+- `dependencies.py`: Global FastAPI dependencies (e.g., `get_current_active_user`).
+
+### `app/database/`
+- `base.py`: Import all SQLAlchemy models here for Alembic detection.
+- `session.py`: Database engine and async session factory setup.
+- `crud.py`: Common CRUD operations for shared entities.
+
+---
+
+## 2. Domain Modules (`app/modules/`)
+
+Every module below follows this structure: `/[models.py, schemas.py, service.py, router.py, tasks.py]`.
+
+### `modules/auth/`
+- `models.py`: `User`, `Role`, `Permission` tables.
+- `schemas.py`: Login, Token, and User management schemas.
+- `service.py`: Authentication logic, role verification.
+- `router.py`: `/login`, `/refresh`, `/me` endpoints.
+
+### `modules/patient/`
+- `models.py`: `Patient`, `PatientEMRMap`.
+- `schemas.py`: Patient search results and metadata.
+- `service.py`: Real-time Lookup logic (caching EMR results).
+- `router.py`: `/patients/search`, `/patients/{id}`.
+
+### `modules/audio/`
+- `models.py`: `Recording`, `AudioSession`.
+- `schemas.py`: Upload validation and status updates.
+- `service.py`: Secure file handling, metadata extraction.
+- `router.py`: `/audio/upload`, `/audio/{id}/status`.
+- `tasks.py`: Background task to move files to encrypted object storage.
+
+### `modules/ai_service/`
+- `service.py`: Orchestrator for the "Voice-to-Claim" pipeline.
+- `extraction.py`: NLP logic for entity extraction (Diagnosis, Procedure, Laterality).
+- `transcription.py`: Integration with Whisper STT.
+- `tasks.py`: Celery worker for long-running STT/LLM inference.
+
+### `modules/billing/`
+- `models.py`: `PayerRule`, `NCCIEdit`, `ModifierMap`.
+- `service.py`: Rules engine logic, optimization strategies for 10-20 spine procedures.
+- `rag_service.py`: Logic to query the `ai_layer` vector store for scrubbing.
+- `router.py`: `/billing/payer-rules`.
+
+### `modules/claim/`
+- `models.py`: `Claim` (State Machine), `ClaimRevision`, `AuditTrail`.
+- `schemas.py`: Full CMS-1500 JSON representation.
+- `service.py`: Lifecycle management (Draft -> Human Review -> Exported).
+- `router.py`: `/claims/queue`, `/claims/{id}/approve`, `/claims/{id}/reject`.
+
+---
+
+## 3. Infrastructure & Integration Layer
+
+### `app/ai_layer/`
+- `models/`: Loader scripts for Llama/Mistral/Qwen models.
+- `rag/`: Vector DB initialization (pgvector), indexing policy docs/manuals.
+- `inference/`: `whisper_service.py` (STT extraction) and `llm_service.py` (Entity Extraction).
+
+### `app/integration/`
+- `base_adapter.py`: Abstract base class for EMR integrations.
+- `epic_fhir.py`: Epic R4 FHIR client (Patient, Encounter, DocRef resources).
+- `athena_api.py`: Custom client for Athena Centricity.
+- `curemd_api.py`: CureMD REST client.
+
+---
+
+## 4. Root Config Files
+- `main.py`: App initialization, Middleware (CORS/HIPAA logging), and API Mounting.
+- `celery_app.py`: Celery configuration for async tasks.
+- `alembic.ini`: Database migration config.
+- `docker-compose.yml`: Definition for `api`, `worker`, `db`, `redis`.
+- `requirements.txt`: To include: `fastapi`, `sqlalchemy[asyncio]`, `pgv-sdk`, `pydantic-settings`, `celery`, `redis`, `python-multipart`, `cryptography`.
diff --git a/srs_document.md b/srs_document.md
new file mode 100644
index 0000000..275b322
--- /dev/null
+++ b/srs_document.md
@@ -0,0 +1,1999 @@
+# Software Requirements Specification
+
+## AI-Powered Medical Billing Automation System
+
+#### For Out-of-Network Neurosurgical & Orthopedic Practices
+
+```
+Prepared by: Dextra Labs Pte Ltd
+Document Version: 2.
+Date: February 2026
+```
+**Document Information**
+
+**Client** Dr. Brian McHugh / McHugh Neurosurgery
+
+**Project Name** AI-Powered Medical Billing Automation
+
+**Implementation Partner** Dextra Labs
+
+**Development Partner** Tech4Biz Solutions Pvt Ltd
+
+**Document Status** v2.0 - Development-Ready
+
+
+## Table of Contents
+
+
+## 1. Introduction
+
+### 1.1 Purpose
+
+This Software Requirements Specification (SRS) document provides a comprehensive, development-ready description of the AI-Powered Medical Billing Automation System. It details functional and non-functional requirements, system architecture, interfaces, constraints, and milestone-aligned deliverables for building an intelligent billing automation platform for out-of-network neurosurgical and orthopedic practices.
+
+This document serves as the primary reference for developers, project managers, technical leads, QA personnel, and stakeholders throughout the development lifecycle. All requirements are mapped to SOW milestones to ensure traceability from specification to delivery.
+
+### 1.2 Scope
+
+The system is designed to transform the end-to-end claim generation process for specialty medical practices. **[UPDATED]** The MVP focuses on 5-10 common spine surgery types (covering approximately 98-99% of typical procedures) for a single practice with up to 5 providers.
+
+The system will:
+
+- Convert clinical audio dictation to text using Whisper ASR with medical-specialized vocabulary
+- Extract clinical entities (diagnoses, procedures, anatomical locations) using NLP
+- Map extracted entities to ICD-10 diagnosis codes and CPT procedure codes
+- Apply payer-specific billing optimization rules to maximize reimbursement
+- Perform RAG-powered claim scrubbing against NCCI edits and LCD/NCD coverage determinations
+- Provide human-in-the-loop review workflow for mandatory claim verification
+- Integrate with existing EMR systems (Epic, Athena Centricity, CureMD) via FHIR and REST APIs
+- Support a template-based fast-track workflow for standard procedures (e.g., standard 2-level ACDF)
+- Maintain HIPAA compliance through cloud-hosted, VPC-isolated AI infrastructure
+
+### 1.3 System Boundary
+
+#### 1.3.1 In Scope (Our System)
+
+- Audio capture via mobile application (smartphone-based, no dedicated hardware)
+- Speech-to-text conversion with medical terminology using Whisper STT
+- EMR data retrieval (patient demographics, insurance, clinical documentation) from Epic, Athena, CureMD
+- AI-powered entity extraction and ICD-10/CPT code mapping with confidence scoring
+- RAG-powered claim scrubbing against payer-specific rules, NCCI edits, and LCD/NCD
+- Payer-specific optimization and claim generation (CMS-1500 professional claims)
+- Template-based fast-track workflow for standard procedures
+- Human review workflow with mandatory approval before finalization
+- Export of approved claims for downstream processing
+- Audit trails, compliance reporting, and analytics dashboard
+- Provider interface supporting dictate → review → approve → submit workflow
+
+#### 1.3.2 Out of Scope (Existing Practice Workflow)
+
+- Claim submission to clearinghouse or insurance payers (EDI 837P/837I transmission)
+- Payment posting and ERA/835 processing
+- Patient billing and collections
+- Denial appeals and follow-up automation
+- Accounts receivable management
+- Operative report generation/amendment (future phase consideration)
+- ModMed integration (mentioned in original brief, formally excluded from MVP)
+- UB-04 institutional claims (MVP covers CMS-1500 professional claims only)
+
+_Note: The practice will continue using their existing systems (Athena Centricity) for claim submission and payment processing. Our system generates the approved claim and hands off to their existing workflow._
+
+#### [NEW] 1.3.3 Out-of-Network Billing Lifecycle Context
+
+Developers must understand the out-of-network (OON) billing lifecycle to build this system effectively. The typical OON claim lifecycle is: Procedure → Coding & Claim Generation (our system) → Submission to Payer → Initial Denial (common for OON) → Appeal with Documentation → Arbitration (if needed) → Payment. Our system's output directly impacts denial rates and appeal success by ensuring accurate coding, proper modifier application, and comprehensive documentation from the start.
+
+### 1.4 Definitions, Acronyms, and Abbreviations
+
+| Term | Definition |
+|------|------------|
+| ASR | Automatic Speech Recognition: converts spoken words to text |
+| CPT | Current Procedural Terminology: medical procedure codes maintained by AMA |
+| EMR/EHR | Electronic Medical/Health Record: digital patient medical history |
+| FHIR | Fast Healthcare Interoperability Resources: healthcare data exchange standard |
+| HIPAA | Health Insurance Portability and Accountability Act: US healthcare privacy law |
+| ICD-10 | International Classification of Diseases, 10th Revision: diagnosis coding system |
+| LCD/NCD | Local/National Coverage Determination: payer coverage policies |
+| LLM | Large Language Model: AI model trained on large text datasets |
+| LoRA | Low-Rank Adaptation: technique for efficient LLM fine-tuning |
+| MDM | Medical Decision Making: complexity level used for E/M coding |
+| MVP | Minimum Viable Product: initial product version with core features |
+| NCCI | National Correct Coding Initiative: CMS edits to prevent improper code pairs |
+| NLP | Natural Language Processing: AI techniques for understanding text |
+| OON | Out-of-Network: healthcare providers not contracted with insurance plans |
+| PHI | Protected Health Information: patient health data protected by HIPAA |
+| RAG | Retrieval-Augmented Generation: AI technique combining semantic search with LLM |
+| STT | Speech-to-Text: audio to text conversion technology |
+| WER | Word Error Rate: standard metric for measuring STT accuracy |
+
+### 1.5 References
+
+- SO070FY26: Real-Time Dictation and Automated Billing AI Platform (December 2025)
+- SOW: AI Dictation Application Development, Dextra Labs v
+- Product Vision Document - Dr. Brian McHugh Discovery Session Notes (1st & 2nd calls)
+- Project Kickoff Presentation (February 2026)
+- AI-Enabled Real-Time Dictation and Automated Billing Project Brief
+- HIPAA Security Rule (45 CFR Part 164)
+- HL7 FHIR R4 Specification
+- Epic FHIR API Documentation
+
+
+## 2. Overall Description
+
+### 2.1 Product Perspective
+
+The AI-Powered Medical Billing Automation System addresses a significant gap in the healthcare revenue cycle management market. While in-network billing has seen substantial automation advances, out-of-network specialty billing remains largely manual due to complex payer-specific coding requirements, high-value procedures requiring precise documentation, institutional knowledge locked within experienced human coders, and stringent HIPAA requirements limiting cloud-based AI adoption.
+
+This system serves as a standalone application integrating with existing EMR infrastructure while providing its own AI-powered processing layer.
+
+### 2.2 Why Historical Billing Data is Critical
+
+The system requires access to historical billing data from the practice (filtered for relevance). This data serves several critical purposes:
+
+- Pattern Recognition: Learning which CPT codes are most successful for specific procedures with specific payers
+- Denial Avoidance: Understanding which code combinations and documentation patterns lead to claim denials
+- Payer-Specific Optimization: Different insurers have different preferences (e.g., BCBS prefers certain code sequences while Cigna accepts others)
+- Institutional Knowledge Capture: Expert coders have accumulated decades of knowledge about what works
+
+**Important:** Historical data will be filtered to exclude outdated policy periods. For example, if United Healthcare restructured policies 3 years ago, denial patterns from before that date are excluded as no longer relevant. Data filtering criteria must be defined per-payer based on policy change dates.
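The per-payer filtering rule in section 2.2 can be illustrated with a short sketch. Everything here is hypothetical: the `DenialRecord` shape, the `filter_relevant` helper, and the `policy_cutoffs` mapping are assumptions made for the example, since the source only states that cutoffs are defined per payer from policy change dates. The sketch also assumes that a payer without a recorded cutoff keeps its full history.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class DenialRecord:
    """Hypothetical shape of one historical billing record."""
    payer: str
    service_date: date
    cpt_code: str
    denied: bool


def filter_relevant(records, policy_cutoffs):
    """Keep records dated on or after the payer's last policy change.

    policy_cutoffs maps payer name -> date of the most recent policy
    restructuring. Payers missing from the map are kept in full (assumption).
    """
    return [
        r for r in records
        if r.service_date >= policy_cutoffs.get(r.payer, date.min)
    ]


# Illustrative data only; the dates and outcomes are invented.
history = [
    DenialRecord("United Healthcare", date(2021, 5, 1), "22551", True),
    DenialRecord("United Healthcare", date(2024, 2, 10), "22551", False),
    DenialRecord("Cigna", date(2020, 1, 15), "22551", False),
]
cutoffs = {"United Healthcare": date(2023, 1, 1)}
relevant = filter_relevant(history, cutoffs)
```

With these inputs the 2021 United Healthcare record is dropped, while the Cigna record survives because no cutoff is configured for that payer.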
+
+### 2.3 Product Functions
+
+#### 2.3.1 Audio Capture and Transcription
+
+- Smartphone-based audio recording for clinical dictation (no dedicated hardware in MVP)
+- AI-powered speech-to-text conversion with medical vocabulary fine-tuning
+- Noise reduction optimized for hospital environments (post-procedure hallway, dictation room, OR)
+- Single speaker per dictation session (clinic claims may have PA voice, surgical claims have surgeon voice)
+
+#### 2.3.2 Clinical Entity Extraction
+
+- Automated extraction of diagnoses, procedures, anatomical locations, and laterality
+- Temporal relationship mapping for procedure sequencing
+- Confidence scoring for each extracted entity
+
+#### 2.3.3 Medical Code Mapping
+
+- Automatic mapping of diagnoses to ICD-10 codes and procedures to CPT codes
+- Modifier suggestions based on context (-59, -51, -LT/RT, etc.)
+- Confidence scoring with alternative code suggestions for edge cases
+
+#### [NEW] 2.3.4 Template-Based Fast-Track Workflow
+
+For standard, high-volume procedures (e.g., standard 2-level ACDF), the system provides a fast-track workflow where the surgeon selects a procedure template and associates it with a patient. The system auto-generates the claim with pre-mapped CPT/ICD codes, modifiers, and justification. The surgeon can add dictation notes for any deviations. This significantly reduces time-to-claim for routine procedures.
+
+#### 2.3.5 RAG-Powered Claim Scrubbing
+
+Automated claim review using Retrieval-Augmented Generation against payer-specific rules, NCCI edits, and Local/National Coverage Determinations (LCD/NCD). The RAG corpus includes policy documents per individual plan, coding manuals, billing guidelines, and the practice's laminated cheat sheets.
+
+#### 2.3.6 Payer-Specific Optimization
+
+- Business rules engine for payer-specific coding strategies
+- Historical denial pattern analysis for proactive optimization
+- Coverage for top 30 local insurance plans in MVP (starting with top 10 for initial validation)
+
+#### 2.3.7 Claim Generation and Review
+
+- Automated generation of submission-ready CMS-1500 professional claims
+- Mandatory human review interface with AI-generated content highlighting
+- Correction workflow with feedback captured for model improvement
+- Complete audit trails for compliance and quality tracking
+
+### 2.4 User Classes and Characteristics
+
+| User Class | Description | Technical Proficiency |
+|------------|-------------|-----------------------|
+| Surgeon/Physician | Primary users who record clinical dictation via mobile app. Up to 5 providers in MVP. | Low: requires simple, intuitive interface |
+| Medical Biller | Reviews AI-generated claims, makes corrections, approves submissions | Moderate: familiar with billing systems |
+| Billing Supervisor | Oversees billing operations, manages rules and policies | Moderate: dashboard and reporting focus |
+| System Administrator | Manages system configuration, users, and integrations | High: technical administration |
+
+### 2.5 Operating Environment
+
+- Web Application: Modern browsers (Chrome, Firefox, Safari, Edge) on Windows, macOS
+- Mobile Application: iOS 14+ and Android 11+ smartphones for audio capture
+- Server Infrastructure: Cloud-hosted (AWS/GCP/Azure) with HIPAA-compliant VPC isolation
+- AI Infrastructure: GPU-enabled cloud instances or Mac Mini M-series for LLM inference
+- Database: PostgreSQL for structured data; vector database for RAG/semantic search
+- Network: Always-online connectivity required (no offline mode in MVP)
+
+### 2.6 Design and Implementation Constraints
+
+- HIPAA Compliance: All PHI must remain within HIPAA-compliant, VPC-isolated infrastructure
+- Self-Hosted AI: No external API calls for patient data processing (LLM inference is self-hosted)
+- Human-in-the-Loop: Mandatory human review before any claim finalization
+- Single Practice: MVP limited to Dr. McHugh's practice only (up to 5 providers)
+- Surgical Focus: MVP focused on 5-10 common spine surgery types (CMS-1500 professional claims only)
+- English Only: Speech-to-text optimized for English medical dictation, single speaker per session
+- Open-Source LLM: Llama/Mistral/Mixtral/Qwen with LoRA fine-tuning
+
+### 2.7 Assumptions and Dependencies
+
+#### 2.7.1 Assumptions
+
+- Internet connectivity is always available at all practice locations
+- Practice will provide access to historical billing and denial data (filtered for relevance)
+- Payer policy documents, coding manuals, and cheat sheets will be provided for RAG corpus
+- Human billers will be available for review and correction during MVP
+- Epic FHIR API access will be granted through proper channels (timeline dependency)
+- No major scope changes during the execution of agreed phases
+- Client will provide list of 5-10 common spine surgeries with typical CPT/ICD code combinations
+- Client will provide prioritized list of top 30 insurance plans for payer rules configuration
+
+#### 2.7.2 Dependencies
+
+- Epic Systems: FHIR R4 API availability and access credentials (critical path; can take 3-6 months)
+- Athena Centricity: Legacy system data export capabilities and API availability
+- CureMD: REST API documentation and access
+- Open-source LLM: Availability of Llama 2/3 or Mistral/Mixtral models for self-hosting
+- Whisper ASR: Model availability for self-hosting with medical vocabulary fine-tuning
+- Client SMEs: Availability of billing and clinical subject matter experts for validation
+
+
+## 3. Functional Requirements
+
+All functional requirements are mapped to SOW milestones for traceability.
Priority levels: Must +(MVP-critical), Should (important but can be deferred), Could (nice-to-have). + +### 3.1 Audio Capture Module (Mobile App) β€” Milestone 2 + +``` +Req ID Requirement Description Milestone Priority +FR-AC- +001 +``` +``` +System shall allow users to record audio dictation using +smartphone microphone +``` +``` +M2 Must +``` +``` +FR-AC- +002 +``` +``` +System shall support audio recording in standard formats +(AAC, MP3, WAV) +``` +``` +M2 Must +``` +``` +FR-AC- +003 +``` +``` +System shall display recording duration and audio level +indicator +``` +``` +M2 Must +``` +``` +FR-AC- +004 +``` +``` +System shall allow pause and resume during recording +session +``` +``` +M2 Must^ +``` +``` +FR-AC- +005 +``` +``` +System shall require user to associate recording with patient +identifier (MRN or encounter ID from EMR lookup) +``` +``` +M2 Must^ +``` +``` +FR-AC- +006 +``` +``` +System shall encrypt audio files at rest on device before +upload (AES-256) +``` +``` +M2 Must^ +``` +``` +FR-AC- +007 +``` +``` +System shall automatically upload recordings when network is +available +``` +``` +M2 Must^ +``` +``` +FR-AC- +008 +``` +``` +System shall support multiple recording sessions per patient +encounter +``` +``` +M2 Must^ +``` +``` +FR-AC- +009 +``` +``` +System shall support template-based fast-track: surgeon +selects standard procedure template and patient, bypassing +full dictation +``` +``` +M2 Must^ +``` +### 3.2 Speech-to-Text Module β€” Milestones 2- 3 + +``` +Req ID Requirement Description Milestone Priority +``` +``` +FR-ST- 001 System shall convert audio dictation to text with β‰₯97% word +accuracy (WER) on a mutually agreed medical dictation test +corpus +``` +``` +M3 Must^ +``` +``` +FR-ST- 002 System shall use medical-specific vocabulary (ICD-10 terms, +CPT terms, drug names, anatomical terms) for recognition +``` +``` +M3 Must +``` +``` +FR-ST- 003 System shall recognize ICD-10 and CPT codes when spoken +directly +``` +``` +M3 Should +``` 
+``` +FR-ST-004 System shall handle medical drug names and dosages +accurately +``` +``` +M3 Must +``` +``` +FR-ST-005 System shall apply AI-powered noise reduction for hospital +environments (hallway, dictation room, OR) +``` +``` +M3 Must +``` +``` +FR-ST-006 System shall provide transcript with timestamps for review M2 Must +``` + +``` +FR-ST-007 System shall allow manual correction of transcription errors M2 Must +FR-ST-008 System shall mark low-confidence words/phrases for human +review +``` +``` +M3 Must +``` +### 3.3 Clinical Entity Extraction Module - Milestone 3 + +``` +Req ID Requirement Description Milestone Priority +``` +``` +FR-EE-001 System shall extract diagnoses from clinical documentation M3 Must +FR-EE-002 System shall identify procedures and treatments performed M3 Must +FR-EE-003 System shall recognize anatomical locations and laterality +(left/right) +``` +``` +M3 Must +``` +``` +FR-EE-004 System shall extract temporal relationships between +procedures +``` +``` +M3 Should +``` +``` +FR-EE-005 System shall identify patient demographics from EMR +integration +``` +``` +M2 Must +``` +``` +FR-EE-006 System shall extract insurance/payer information for the +encounter +``` +``` +M2 Must +``` +``` +FR-EE-007 System shall provide confidence scores for each extracted +entity +``` +``` +M3 Must +``` +### 3.4 Code Mapping Module - Milestone 3 + +``` +Req ID Requirement Description Milestone Priority +FR-CM- +001 +``` +``` +System shall map extracted diagnoses to ICD-10 codes M3 Must +``` +``` +FR-CM- +002 +``` +``` +System shall map procedures to appropriate CPT codes M3 Must +``` +``` +FR-CM- +003 +``` +``` +System shall suggest appropriate CPT modifiers based on +context (-59, -51, -LT/RT, etc.) 
+``` +``` +M3 Must +``` +``` +FR-CM- +004 +``` +``` +System shall provide confidence scores for each code +mapping +``` +``` +M3 Must +``` +``` +FR-CM- +005 +``` +``` +System shall suggest alternative codes for low-confidence +mappings (<80%) +``` +``` +M3 Must +``` +``` +FR-CM- +006 +``` +``` +System shall validate code combinations against NCCI/CCI +edits +``` +``` +M3 Must +``` +``` +FR-CM- +007 +``` +``` +System shall support neurosurgery-specific code sets (5-10 +common spine procedures) +``` +``` +M3 Must +``` +``` +FR-CM- +008 +``` +``` +System shall support orthopedic surgery-specific code sets M3 Should +``` +``` +FR-CM- +009 +``` +``` +System shall apply confidence thresholds: >90% auto- +suggest, 70-90% flag for review, <70% require manual coding +``` +``` +M3 Must +``` + +``` +FR-CM- +010 +``` +``` +System shall determine and assign Medical Decision Making +(MDM) level based on clinical documentation complexity +``` +``` +M3 Must +``` +``` +FR-CM- +011 +``` +``` +System shall generate medical necessity justification text to +support assigned codes +``` +``` +M3 Must +``` +### 3.5 RAG-Powered Claim Scrubbing Module - Milestone 3 + +``` +Req ID Requirement Description Milestone Priority +``` +``` +FR-RS- +001 +``` +``` +System shall scrub generated claims against payer-specific +rules using RAG +``` +``` +M3 Must +``` +``` +FR-RS- +002 +``` +``` +System shall validate claims against NCCI edits for improper +code pair detection +``` +``` +M3 Must +``` +``` +FR-RS- +003 +``` +``` +System shall check claims against Local Coverage +Determinations (LCD) +``` +``` +M3 Must +``` +``` +FR-RS- +004 +``` +``` +System shall check claims against National Coverage +Determinations (NCD) +``` +``` +M3 Must +``` +``` +FR-RS- +005 +``` +``` +System shall flag claims failing scrubbing rules with specific +failure reasons +``` +``` +M3 Must +``` +``` +FR-RS- +006 +``` +``` +System shall suggest corrective actions for failed scrubbing +checks +``` +``` +M3 
Should +``` +``` +FR-RS- +007 +``` +``` +RAG corpus shall include: policy docs per plan, coding +manuals, cheat sheets, billing guidelines +``` +``` +M3 Must +``` +### 3.6 Payer Rules Engine - Milestones 3-4 + +``` +Req ID Requirement Description Milestone Priority +FR-PR- +001 +``` +``` +System shall maintain payer-specific coding rules for top 30 +local plans +``` +``` +M3 Must +``` +``` +FR-PR- +002 +``` +``` +System shall apply different coding strategies based on payer M3 Must +``` +``` +FR-PR- +003 +``` +``` +System shall incorporate historical denial pattern analysis M3 Must +``` +``` +FR-PR- +004 +``` +``` +System shall flag claims at high risk of denial based on +historical patterns +``` +``` +M4 Must +``` +``` +FR-PR- +005 +``` +``` +System shall optimize code selection for maximum +reimbursement +``` +``` +M3 Must +``` +``` +FR-PR- +006 +``` +``` +System shall allow manual update of payer rules by billing +staff +``` +``` +M4 Must +``` +``` +FR-PR- +007 +``` +``` +System shall track payer rule changes with version history M4 Should +``` +### 3.7 Claim Generation Module - Milestone 4 + + +``` +Req ID Requirement Description Milestone Priority +FR-CG- +001 +``` +``` +System shall generate complete billing claims from processed +data +``` +``` +M4 Must +``` +``` +FR-CG- +002 +``` +``` +System shall populate all required CMS-1500 claim fields +automatically +``` +``` +M4 Must +``` +``` +FR-CG- +003 +``` +``` +System shall consolidate multi-session audio into single +patient claim +``` +``` +M4 Must +``` +``` +FR-CG- +004 +``` +``` +System shall validate claim completeness before presenting +for review +``` +``` +M4 Must +``` +``` +FR-CG- +005 +``` +``` +System shall generate claims in CMS-1500 professional +format +``` +``` +M4 Must +``` +``` +FR-CG- +006 +``` +``` +System shall support multi-session audio consolidation +including recordings spanning multiple days for a single +surgical encounter +``` +``` +M4 Must +``` +``` +FR-CG- +007 +``` +``` 
+System shall implement claim state machine with defined +states: Draft, Pending STT, Pending AI Review, Ready for +Human Review, Approved, Rejected, Exported +``` +``` +M4 Must +``` +### 3.8 Human Review Workflow - Milestone 4 + +``` +Req ID Requirement Description Milestone Priority +``` +``` +FR-HR- +001 +``` +``` +System shall present claims for mandatory human review +before finalization +``` +``` +M4 Must +``` +``` +FR-HR- +002 +``` +``` +System shall highlight AI-generated content for reviewer +attention +``` +``` +M4 Must +``` +``` +FR-HR- +003 +``` +``` +System shall display original audio/transcript alongside +generated claim (side-by-side view) +``` +``` +M4 Must +``` +``` +FR-HR- +004 +``` +``` +System shall allow reviewers to modify any claim field M4 Must +``` +``` +FR-HR- +005 +``` +``` +System shall track all reviewer corrections for model +retraining feedback loop +``` +``` +M4 Must +``` +``` +FR-HR- +006 +``` +``` +System shall require explicit approval before claim finalization M4 Must +``` +``` +FR-HR- +007 +``` +``` +System shall support claim rejection with reason capture and +re-routing to biller for manual correction +``` +``` +M4 Must +``` +``` +FR-HR- +008 +``` +``` +System shall maintain complete audit trail of all claim +modifications +``` +``` +M4 Must +``` +``` +FR-HR- +009 +``` +``` +System shall display EMR data (demographics, insurance) +alongside claim for cross-reference +``` +``` +M4 Must +``` +### 3.9 EMR Integration - Milestones 2-4 + + +``` +Req ID Requirement Description Milestone Priority +FR-EMR- +001 +``` +``` +System shall integrate with Epic via FHIR R4 API for hospital +records (operative reports, clinical notes) +``` +``` +M4 Must +``` +``` +FR-EMR- +002 +``` +``` +System shall integrate with Athena Centricity for billing data, +denied claims, patient records +``` +``` +M2 Must +``` +``` +FR-EMR- +003 +``` +``` +System shall integrate with CureMD via REST API for patient +demographics and clinical docs +``` 
+``` +M2 Must +``` +``` +FR-EMR- +004 +``` +``` +System shall retrieve patient demographics from EMR M2 Must +``` +``` +FR-EMR- +005 +``` +``` +System shall retrieve insurance/payer information from EMR M2 Must +``` +``` +FR-EMR- +006 +``` +``` +System shall retrieve clinical documentation and notes from +EMR +``` +``` +M2 Must +``` +``` +FR-EMR- +007 +``` +``` +System shall handle EMR connection failures gracefully with +retry logic +``` +``` +M4 Must +``` +### 3.10 Reporting and Analytics - Milestone 5 + +``` +Req ID Requirement Description Milestone Priority +FR-RA- +001 +``` +``` +System shall provide dashboard with claim processing status +(pending, approved, rejected, exported) +``` +``` +M5 Must +``` +``` +FR-RA- +002 +``` +``` +System shall display AI accuracy metrics (STT WER, code +mapping accuracy) +``` +``` +M5 Must +``` +``` +FR-RA- +003 +``` +``` +System shall track and report claim approval/rejection rates M5 Must +``` +``` +FR-RA- +004 +``` +``` +System shall provide human correction rate analytics (per +coder, per payer) +``` +``` +M5 Must +``` +``` +FR-RA- +005 +``` +``` +System shall generate audit reports for compliance review M5 Must +``` +``` +FR-RA- +006 +``` +``` +System shall support search/filter claims by patient, date +range, status, payer +``` +``` +M5 Must +``` +``` +FR-RA- +007 +``` +``` +System shall track same-day charge capture rate (target: +80%) +``` +``` +M5 Must +``` +### [NEW] 3.11 Notification System - Milestone 4 + +``` +Req ID Requirement Description Milestone Priority +FR-NT-001 System shall notify billers when a new claim is ready for +review +``` +``` +M4 Must +``` +``` +FR-NT-002 System shall notify providers when a claim is approved or +requires attention +``` +``` +M4 Should +``` +``` +FR-NT-003 System shall support in-app push notifications on mobile M4 Should +``` + +``` +FR-NT-004 System shall support email notifications for critical events M4 Should +``` +### [NEW] 3.12 Data Migration & Model 
Training Pipeline - Milestones 1, 3, 6 + +``` +Req ID Requirement Description Milestone Priority +``` +``` +FR-DM- +001 +``` +``` +System shall provide ETL pipeline to extract, clean, and load +historical billing data from Athena Centricity +``` +``` +M1 Must +``` +``` +FR-DM- +002 +``` +``` +System shall filter historical data per-payer based on policy +change dates to exclude outdated patterns +``` +``` +M1 Must +``` +``` +FR-DM- +003 +``` +``` +System shall ingest and index policy documents, coding +manuals, and cheat sheets into the RAG vector store +``` +``` +M1 Must +``` +``` +FR-DM- +004 +``` +``` +System shall capture human reviewer corrections in a +structured format suitable for model retraining +``` +``` +M4 Must +``` +``` +FR-DM- +005 +``` +``` +System shall support periodic model retraining/fine-tuning +using accumulated correction data (manual trigger in MVP) +``` +``` +M6 Should +``` + +## 4. Non-Functional Requirements + +### 4.1 Performance Requirements + +``` +Req ID Requirement Target Milestone +NFR-P- +001 +``` +``` +Speech-to-text processing time per minute of +audio +``` +``` +< 90 seconds M +``` +``` +NFR-P- +002 +``` +``` +Claim generation time from completed transcript < 90 seconds M +``` +``` +NFR-P- +003 +``` +``` +Web dashboard page load time < 3 seconds M +``` +``` +NFR-P- +004 +``` +``` +Mobile app recording start latency < 2 seconds M +``` +``` +NFR-P- +005 +``` +``` +Provider submission workflow completion +(dictate to approve) +``` +``` +< 1 minute M +``` +``` +NFR-P- +006 +``` +``` +Concurrent users supported 20+ simultaneous M +``` +### 4.2 Security Requirements + +``` +Req ID Requirement Milestone +``` +``` +NFR-S- +001 +``` +``` +All PHI shall be encrypted at rest using AES-256 encryption M +``` +``` +NFR-S- +002 +``` +``` +All data in transit shall be encrypted using TLS 1.3 M +``` +``` +NFR-S- +003 +``` +``` +User authentication shall use OAuth 2.0 with multi-factor authentication M +``` +``` +NFR-S- +004 +``` 
+``` +Role-based access control shall restrict data access by user role M +``` +``` +NFR-S- +005 +``` +``` +All user actions shall be logged in tamper-proof audit trail M +``` +``` +NFR-S- +006 +``` +``` +Session timeout shall occur after 15 minutes of inactivity M +``` +``` +NFR-S- +007 +``` +``` +AI inference shall occur entirely within VPC-isolated, HIPAA-compliant +infrastructure M +``` +``` +NFR-S- +008 +``` +``` +No PHI shall be transmitted to external cloud AI services M +``` +### 4.3 Reliability and Availability + +``` +Req ID Requirement Milestone +``` + +``` +NFR-R- +001 +``` +``` +System availability target: 99.5% during business hours (6 AM - 10 PM +ET) M +``` +``` +NFR-R- +002 +``` +``` +Mean time to recovery from failure: < 4 hours M +``` +``` +NFR-R- +003 +``` +``` +Database backup frequency: Daily with 30-day retention M +``` +``` +NFR-R- +004 +``` +``` +Audio file retention: 7 years (aligned with HIPAA audit trail requirements) M +``` +``` +NFR-R- +005 +``` +``` +Graceful degradation when EMR integration is unavailable M +``` +### 4.4 Compliance Requirements + +``` +Req ID Requirement Milestone +NFR-C- +001 +``` +``` +System shall comply with HIPAA Privacy Rule requirements M +``` +``` +NFR-C- +002 +``` +``` +System shall comply with HIPAA Security Rule requirements M +``` +``` +NFR-C- +003 +``` +``` +Business Associate Agreement (BAA) required with all cloud providers M +``` +``` +NFR-C- +004 +``` +``` +Complete audit trails shall be maintained for minimum 7 years M +``` +``` +NFR-C- +005 +``` +``` +System shall support compliance reporting for audits M +``` +### 4.5 Usability Requirements + +``` +Req ID Requirement Milestone +``` +``` +NFR-U- +001 +``` +``` +Mobile app shall require < 3 taps to start recording M +``` +``` +NFR-U- +002 +``` +``` +New billers shall be productive within 4 hours of training M +``` +``` +NFR-U- +003 +``` +``` +Web interface shall be accessible on screen sizes >= 1280px width M +``` +``` +NFR-U- +004 
+``` +``` +System shall provide clear error messages with suggested corrective +actions M +``` + + +## 5. System Architecture + +### 5.1 Architecture Overview + +The system follows a modular, layered architecture designed for scalability and future +expansion to agentic AI capabilities. The architecture separates concerns across four distinct +layers. + +#### 5.1.1 Experience Layer + +- Web Dashboard (React.js + TypeScript): Primary interface for billers to review claims + and view analytics +- Mobile App (React Native / Expo): Smartphone-based audio capture, template selection, + and claim status +- REST API Gateway: Secure API with OAuth 2.0 authentication, rate limiting, and routing + +#### 5.1.2 Processing Layer + +- Workflow Engine: Orchestrates the claim generation pipeline (audio → transcript → + entities → codes → claim) +- Business Rules Engine: Implements payer-specific billing rules and optimization logic +- Claim Scrubbing Engine: RAG-powered validation against NCCI, LCD/NCD, and payer + rules +- NLP Processor: Coordinates speech-to-text and clinical entity extraction + +#### 5.1.3 AI Layer + +- Open Source LLM: Self-hosted Llama/Mistral/Mixtral/Qwen with LoRA fine-tuning +- Speech-to-Text Engine: Whisper with medical vocabulary fine-tuning +- Vector Database: For semantic search and RAG (pgvector or Weaviate/Pinecone) + +#### 5.1.4 Data Layer + +- Knowledge Base: Vector database for semantic search across policy docs, coding + manuals, cheat sheets +- Historical Data: Billing and denial history (filtered per-payer by policy change dates) +- Operational Database: PostgreSQL for claims, users, transactions, and audit logs + +### 5.2 Technology Stack + +``` +Layer Technology Rationale +Frontend React.js + TypeScript Modern, maintainable UI +``` +``` +Mobile React Native / Expo Cross-platform iOS/Android +API Server FastAPI (Python) High performance, async, ML +ecosystem +``` +``` +LLM Llama/Mistral/Mixtral + LoRA Open source, fine-tunable, 
self- +hostable +``` +``` +Speech-to-Text Whisper (OpenAI) High accuracy, self-hostable, medical +fine-tuning +``` + +``` +Database PostgreSQL + pgvector Structured + vector search in single +DB +Cache Redis Session management, API caching +``` +``` +Hosting AWS/GCP/Azure (HIPAA BAA) Cloud-hosted, VPC isolation, scalable +``` +### 5.3 Deployment Architecture + +The MVP deployment is designed for cloud-hosted operation within a HIPAA-compliant VPC: + +- Application Server: Cloud VM (8+ cores, 32GB RAM) for API, workflow engine, and web + app +- AI Inference: GPU-enabled cloud instance or Mac Mini M-series for LLM and Whisper +- Database: PostgreSQL with pgvector extension on managed service (RDS/Cloud SQL) +- Storage: Encrypted object storage for audio files and documents +- Estimated monthly infrastructure cost: $1,100-1,600/month + + +## 6. EMR Integration Specifications + +### 6.1 Epic Integration (Hospital) + +``` +Attribute Specification +Integration Method FHIR R4 API +``` +``` +Authentication OAuth 2.0 with SMART on FHIR +Data Retrieved Patient demographics, encounters, operative reports, clinical notes, +insurance +``` +``` +Environment Hospital inpatient and surgical records +Risk Epic FHIR approval can take 3-6 months. Timeline contingency required. +``` +### 6.2 Athena Centricity Integration (Outpatient Legacy) + +``` +Attribute Specification +``` +``` +Integration Method Custom API Connector (legacy Centricity version; confirm actual version +and API limitations) +``` +``` +Data Retrieved Historical billing data, denied claims with reason codes, patient records +Note Existing connector available, which reduces integration time. Emily has flagged +potential roadblocks with older Centricity version vs modern Athena API. 
+``` +### 6.3 CureMD Integration (Outpatient) + +``` +Attribute Specification +Integration Method REST API +``` +``` +Data Retrieved Patient demographics, appointments, clinical documentation +Note Existing connector available, which reduces integration time +``` +### [NEW] 6.4 ModMed - Formally Excluded from MVP + +ModMed was listed in the original project brief but has been dropped from all subsequent +documents. It is formally excluded from MVP scope. If needed in a future phase, a separate +integration SOW will be required. + + +## 7. MVP Scope Definition + +### 7.1 In Scope (MVP) + +1. Integration with Epic (hospital) and Athena Centricity/CureMD (outpatient) +2. Speech-to-text with medical dictionary (smartphone-based capture via Expo/React + Native) +3. AI-powered claim generation from clinical documentation (5-10 common spine surgery + types) +4. RAG-powered claim scrubbing against NCCI edits and LCD/NCD coverage + determinations +5. Payer-specific CPT code optimization for top 30 local plans (starting with top 10) +6. Template-based fast-track workflow for standard procedures +7. Human review interface with mandatory approval workflow +8. Complete audit trails for compliance +9. Manual policy updates by billing staff +10. Single-practice deployment (Dr. McHugh’s practice, up to 5 providers) +11. CMS-1500 professional claims (surgical billing focus) +12. Provider interface supporting dictate → review → approve → submit workflow +13. 
Analytics dashboard with claim status, AI accuracy, and correction rates + +### 7.2 Out of Scope (Future Phases) + +- Self-learning/auto-updating rules engine +- Agentic AI capabilities +- Dedicated hardware dictation devices +- Multi-accent speech recognition optimization +- Offline/mobile-only processing +- Multi-practice/multi-tenant deployment +- Full automation without human review +- Outpatient clinic visit (E/M) billing automation +- Direct clearinghouse/EDI submission integration (837P/837I) +- Denial appeals automation +- Payment posting and ERA/835 processing +- Patient billing and collections +- UB-04 institutional claims +- ModMed integration +- Operative report generation/amendment + + +## 8. Project Timeline - SOW Milestone Aligned + +### 8.1 Milestone Schedule + +``` +Phase Deliverable Target Date Duration SOW Milestone +Kick +Off +``` +``` +Scope & Team +Walkthrough +``` +``` +04 Feb 2026 - - +``` +``` +Phase +1 +``` +``` +Foundation & Architecture 11 Mar 2026 5 weeks M1: Infrastructure +``` +``` +Phase +2 +``` +``` +Core Platform +Development +``` +``` +22 Apr 2026 6 weeks M2: Core Platform +``` +``` +Phase +3 +``` +``` +AI Engine Development 24 Jun 2026 9 weeks M3: AI Engine +``` +``` +Phase +4 +``` +``` +Integration & Workflow 12 Aug 2026 7 weeks M4: Integration +``` +``` +Phase +5 +``` +``` +Testing & Go-Live 23 Sep 2026 6 weeks M5: Go-Live +``` +``` +Phase +6 +``` +``` +Support & Maintenance Post go-live Ongoing M6: Support +``` +### 8.2 Phase Details with Deliverables + +#### Milestone 1 - Foundation & Architecture (Weeks 1-5) + +- HIPAA-compliant cloud infrastructure setup (VPC, encryption, IAM) +- Database schema design and deployment (PostgreSQL + pgvector) +- Authentication and RBAC framework (OAuth 2.0 + MFA) +- API gateway and security layer +- CI/CD pipeline configuration +- Historical data extraction from Athena Centricity (ETL pipeline) +- Data cleaning and filtering (per-payer policy change date filtering) +- Policy document 
collection and RAG corpus preparation + +**Acceptance: Infrastructure operational, data pipeline running, security audit passed.** + +#### Milestone 2 - Core Platform Development (Weeks 6-11) + +- Web dashboard foundation (React.js + TypeScript) +- Mobile app with audio capture (Expo/React Native) +- Speech-to-text module integration (Whisper base) +- Athena Centricity connector implementation +- CureMD API connector implementation +- Patient lookup and encounter linking via EMR +- Template-based fast-track UI for standard procedures + +**Acceptance: Audio capture working end-to-end, EMR data retrieval functional, basic +transcript generation.** + + +#### Milestone 3 - AI Engine Development (Weeks 8-17) + +- LLM fine-tuning on billing data using LoRA (orthopedic/neurosurgery domain) +- Medical vocabulary fine-tuning for Whisper STT +- Clinical entity extraction pipeline (NLP) +- ICD-10 and CPT code mapping engine with confidence scoring +- RAG pipeline for claim scrubbing (NCCI, LCD/NCD, payer rules) +- Business rules engine with payer-specific optimization +- Modifier logic implementation + +**Acceptance: AI generates claims for test cases with ≥90% code mapping accuracy, STT +≥97% word accuracy (WER ≤3%) on test corpus.** + +#### Milestone 4 - Integration & Workflow (Weeks 15-21) + +- End-to-end pipeline integration (audio → transcript → entities → codes → claim) +- Human review interface with side-by-side transcript/claim/EMR view +- Claim correction workflow with model feedback loop +- Epic FHIR integration (contingent on API access approval) +- Audit trail and compliance logging +- Notification system for claim events +- Claim export mechanism for downstream processing + +**Acceptance: Full workflow functional, human review working, audit trails complete.** + +#### Milestone 5 - Testing & Go-Live (Weeks 19-24) + +- User acceptance testing with billing staff (real claim scenarios) +- Bug fixes and performance optimization +- Analytics dashboard and reporting +- 
Documentation and training materials +- Go-live preparation and cutover support +- Pilot with 1-3 surgeons, then rapid expansion + +**Acceptance: UAT sign-off, 80% same-day charge capture target met, system stable in +production.** + +#### Milestone 6 - Support & Maintenance (Post Go-Live) + +- Ongoing bug fixes and performance monitoring +- Model retraining based on correction feedback +- Payer rule updates as policies change +- System monitoring and infrastructure management + + +## 9. Success Metrics + +``` +Metric Target Measured At +Same-day charge capture rate 80% M5 Go-Live +``` +``` +Claim denial reduction 10-25% reduction 3 months post go-live +A/R cycle improvement 5-10 days faster 3 months post go-live +``` +``` +Provider submission time (dictate to approve) < 1 minute M5 UAT +``` +``` +STT word accuracy (1 - Word Error Rate) ≥97% on test corpus M3 acceptance +Code mapping accuracy ≥90% on test cases M3 acceptance +``` + +## 10. Risk Assessment + +``` +Risk Impact Detail Mitigation +AI Hallucination HIGH Incorrect codes generated and +submitted +``` +``` +Mandatory human review for all +claims; confidence thresholds +with escalation +``` +``` +Epic API Delay HIGH FHIR approval can take 3-6 +months; blocks hospital +integration +``` +``` +Begin Epic approval process +immediately; build with mock +data; Epic integration is M4 +deliverable allowing parallel work +``` +``` +STT Accuracy HIGH 99% target unrealistic; industry +standard is 95-98% +``` +``` +Revised to ≥97% accuracy (WER ≤3%); medical +vocabulary fine-tuning; noise +reduction; iterative improvement +via feedback loop +``` +``` +Policy Staleness MED Outdated rules cause denials Manual updates by billing staff in +MVP; per-payer relevance +filtering; future auto-learning +``` +``` +Data Quality MED Historical data has gaps or +errors +``` +``` +Data validation phase in M1; filter +obsolete records per-payer; SME +review of training data +Adoption Friction MED Users resist workflow changes 
Smartphone-only capture (no +hardware); template fast-track for +routine procedures; < 1 min +submission target +Security Breach HIGH PHI exposure Self-hosted LLM within VPC; +AES-256 encryption; HIPAA +controls; no external AI API calls +``` +``` +Integration Delays MED EMR API access issues +(especially Athena legacy) +``` +``` +Existing connectors for +Athena/CureMD; confirm +Centricity version; early API +testing in M1 +``` +``` +Scope Creep MED Requirements grow beyond +agreed MVP +``` +``` +Strict scope freeze after M1; +change requests require written +approval with cost/timeline +impact +``` + +## 11. Open Questions Requiring Resolution + +The following items must be resolved before or during Phase 1. Items marked CRITICAL block +development if unresolved. + +### 11.1 Development-Critical Questions + +``` +# Question Priority Owner +``` +``` +Q1 What are the 5-10 common spine surgery types for +MVP with their typical CPT + ICD-10 code +combinations? +``` +**CRITICAL** Dr. McHugh / Billing +team +Q2 What is the Athena Centricity version? What APIs +are available? (Emily flagged potential legacy +roadblocks) +**CRITICAL** Emily / Tech team +Q3 Is Epic FHIR API access already granted or is +approval still pending? What is the expected +timeline? +**CRITICAL** Emily +Q4 What is the exact claim handoff mechanism to +Athena? (API push, file import, manual re-entry?) +**CRITICAL** Emily +Q5 What is the timeline for receiving historical billing +data and payer policy documents? +**CRITICAL** Emily +Q6 What is the prioritized list of top 30 insurance +plans? (Start with top 10 for initial validation) +**HIGH** Emily / Billing team +Q7 What patient identifier links audio to EMR? (MRN, +encounter ID, manual entry, or EMR lookup?) +**HIGH** Dr. McHugh / Emily +Q8 Mobile platforms: iOS only, Android only, or both +required for MVP? +**HIGH** Dr. 
McHugh +Q9 What specific data fields should be pulled from +Epic vs Athena vs CureMD for a given claim? +**HIGH** Emily / Billing team +Q10 Expected daily volume: how many audio +recordings per day, average length per recording? +**HIGH** Dr. McHugh +Q11 Total concurrent users (providers + billers + +admin)? +**HIGH** Emily +Q12 What format should approved claims be exported +in? (PDF CMS-1500, data file, API push to Athena) +**HIGH** Emily +Q13 Cloud provider preference: AWS vs GCP vs +Azure? (GCP recommended in cost analysis) +**MEDIUM** Emily +Q14 Has infrastructure budget (~$1,100-1,600/month) +been confirmed? +**MEDIUM** Emily +Q15 What notification methods are required? (Email, +SMS, in-app push, or combination) +**MEDIUM** Emily + + +_Note: CRITICAL questions must be resolved before Phase 1 can be completed. HIGH questions +must be resolved before Phase 2 begins._ + + +## 12. Stakeholders and Team Structure + +### 12.1 Client Stakeholders + +``` +Role Name Contact +Executive Sponsor Dr. Brian McHugh DrMcHugh@mchughneurosurgery.com +``` +``` +Project Manager Emily Clifford emily@farmtotablehealth.com / +1 (716) 983-2572 +``` +### 12.2 Implementation Partner (Dextra Labs) + +``` +Role Name Email +Engagement +Partner +``` +``` +Vijay Agarwal vijay@dextralabs.com +``` +``` +Project Manager Gaurang Ghadigaonkar gaurang.ghadigaonkar@dextralabs.com +Solution Architect Yasha Khandelwal yasha@dextralabs.com +``` +``` +Technical Lead TBD - +``` +### 12.3 Development Partner (Tech4Biz Solutions Pvt Ltd) + +Tech4Biz Solutions serves as the development partner, providing technical execution and +existing EMR connector expertise (Athena Centricity, CureMD). + + +## 13. Document Approval + +This Software Requirements Specification requires approval from the following stakeholders +before development proceeds: + +``` +Role Name Signature Date +``` +``` +Executive Sponsor Dr. 
Brian McHugh +``` +Client PM Emily Clifford +Engagement Partner Vijay Agarwal (Dextra) +Solution Architect Yasha Khandelwal (Dextra) + +#### Document Version History: + +``` +Version Date Author Changes +``` +``` +1.0 03 Feb 2026 Yasha Initial draft based on discovery documents +1.1 03 Feb 2026 Yasha Added open questions (Sec 10), clarified system +boundary (Sec 1.3) +2.0 16 Feb 2026 Yasha Major revision: SOW milestone alignment, template fast- +track workflow, RAG claim scrubbing, OON billing +context, scope narrowing (5-10 spine procedures, 5 +providers, CMS-1500 only), STT accuracy revised to 97% +(WER ≤3%), ModMed/UB-04 formally excluded, confidence +thresholds, notification system, success metrics, MDM +level generation, medical necessity justification, claim +state machine, data migration ETL pipeline, model +retraining pipeline, all requirements mapped to +milestones. Removed all references to specific AI model +names; SRS is now technology-agnostic on AI model +selection. +``` +``` +- End of Document - +```