# Software Requirements Specification
## AI-Powered Medical Billing Automation System
#### For Out-of-Network Neurosurgical & Orthopedic Practices
```
Prepared by: Dextra Labs Pte Ltd
Document Version: 2.
Date: February 2026
```
**Document Information**

| Field | Value |
|---|---|
| Client | Dr. Brian McHugh / McHugh Neurosurgery |
| Project Name | AI-Powered Medical Billing Automation |
| Implementation Partner | Dextra Labs |
| Development Partner | Tech4Biz Solutions Pvt Ltd |
| Document Status | v2.0, Development-Ready |

## 1. Introduction
### 1.1 Purpose
This Software Requirements Specification (SRS) document provides a comprehensive,
development-ready description of the AI-Powered Medical Billing Automation System. It details
functional and non-functional requirements, system architecture, interfaces, constraints, and
milestone-aligned deliverables for building an intelligent billing automation platform for
out-of-network neurosurgical and orthopedic practices.

This document serves as the primary reference for developers, project managers, technical
leads, QA personnel, and stakeholders throughout the development lifecycle. All requirements
are mapped to SOW milestones to ensure traceability from specification to delivery.
### 1.2 Scope
The system is designed to transform the end-to-end claim generation process for specialty
medical practices. **[UPDATED]** The MVP focuses on 5-10 common spine surgery types
(covering approximately 98-99% of typical procedures) for a single practice with up to 5
providers.
The system will:
- Convert clinical audio dictation to text using Whisper ASR with medical-specialized
vocabulary
- Extract clinical entities (diagnoses, procedures, anatomical locations) using NLP
- Map extracted entities to ICD-10 diagnosis codes and CPT procedure codes
- Apply payer-specific billing optimization rules to maximize reimbursement
- Perform RAG-powered claim scrubbing against NCCI edits and LCD/NCD coverage
determinations
- Provide human-in-the-loop review workflow for mandatory claim verification
- Integrate with existing EMR systems (Epic, Athena Centricity, CureMD) via FHIR and
REST APIs
- Support a template-based fast-track workflow for standard procedures (e.g., standard
2-level ACDF)
- Maintain HIPAA compliance through cloud-hosted, VPC-isolated AI infrastructure
### 1.3 System Boundary
#### 1.3.1 In Scope (Our System)
- Audio capture via mobile application (smartphone-based, no dedicated hardware)
- Speech-to-text conversion with medical terminology using Whisper STT
- EMR data retrieval (patient demographics, insurance, clinical documentation) from Epic,
Athena, CureMD
- AI-powered entity extraction and ICD-10/CPT code mapping with confidence scoring
- RAG-powered claim scrubbing against payer-specific rules, NCCI edits, and LCD/NCD
- Payer-specific optimization and claim generation (CMS-1500 professional claims)
- Template-based fast-track workflow for standard procedures
- Human review workflow with mandatory approval before finalization
- Export of approved claims for downstream processing
- Audit trails, compliance reporting, and analytics dashboard
- Provider interface supporting dictate → review → approve → submit workflow
#### 1.3.2 Out of Scope (Existing Practice Workflow)
- Claim submission to clearinghouse or insurance payers (EDI 837P/837I transmission)
- Payment posting and ERA/835 processing
- Patient billing and collections
- Denial appeals and follow-up automation
- Accounts receivable management
- Operative report generation/amendment (future phase consideration)
- ModMed integration (mentioned in original brief, formally excluded from MVP)
- UB-04 institutional claims (MVP covers CMS-1500 professional claims only)

_Note: The practice will continue using its existing systems (Athena Centricity) for claim
submission and payment processing. Our system generates the approved claim and hands off
to the practice's existing workflow._
#### [NEW] 1.3.3 Out-of-Network Billing Lifecycle Context
Developers must understand the out-of-network (OON) billing lifecycle to build this system
effectively. The typical OON claim lifecycle is: Procedure → Coding & Claim Generation (our
system) → Submission to Payer → Initial Denial (common for OON) → Appeal with
Documentation → Arbitration (if needed) → Payment. Our system's output directly impacts
denial rates and appeal success by ensuring accurate coding, proper modifier application, and
comprehensive documentation from the start.
### 1.4 Definitions, Acronyms, and Abbreviations
| Term | Definition |
|---|---|
| ASR | Automatic Speech Recognition: converts spoken words to text |
| CPT | Current Procedural Terminology: medical procedure codes maintained by the AMA |
| EMR/EHR | Electronic Medical/Health Record: digital patient medical history |
| FHIR | Fast Healthcare Interoperability Resources: healthcare data exchange standard |
| HIPAA | Health Insurance Portability and Accountability Act: US healthcare privacy law |
| ICD-10 | International Classification of Diseases, 10th Revision: diagnosis coding system |
| LCD/NCD | Local/National Coverage Determination: payer coverage policies |
| LLM | Large Language Model: AI model trained on large text datasets |
| LoRA | Low-Rank Adaptation: technique for efficient LLM fine-tuning |
| MDM | Medical Decision Making: complexity level used for E/M coding |
| MVP | Minimum Viable Product: initial product version with core features |
| NCCI | National Correct Coding Initiative: CMS edits to prevent improper code pairs |
| NLP | Natural Language Processing: AI techniques for understanding text |
| OON | Out-of-Network: healthcare providers not contracted with insurance plans |
| PHI | Protected Health Information: patient health data protected by HIPAA |
| RAG | Retrieval-Augmented Generation: AI technique combining semantic search with an LLM |
| STT | Speech-to-Text: audio-to-text conversion technology |
| WER | Word Error Rate: standard metric for measuring STT accuracy |

### 1.5 References
- SO070FY26: Real-Time Dictation and Automated Billing AI Platform (December 2025)
- SOW: AI Dictation Application Development, Dextra Labs v
- Product Vision Document — Dr. Brian McHugh Discovery Session Notes (1st & 2nd
calls)
- Project Kickoff Presentation (February 2026)
- AI-Enabled Real-Time Dictation and Automated Billing Project Brief
- HIPAA Security Rule (45 CFR Part 164)
- HL7 FHIR R4 Specification
- Epic FHIR API Documentation
## 2. Overall Description
### 2.1 Product Perspective
The AI-Powered Medical Billing Automation System addresses a significant gap in the
healthcare revenue cycle management market. While in-network billing has seen substantial
automation advances, out-of-network specialty billing remains largely manual due to complex
payer-specific coding requirements, high-value procedures requiring precise documentation,
institutional knowledge locked within experienced human coders, and stringent HIPAA
requirements limiting cloud-based AI adoption.
This system serves as a standalone application integrating with existing EMR infrastructure
while providing its own AI-powered processing layer.
### 2.2 Why Historical Billing Data is Critical
The system requires access to historical billing data from the practice (filtered for relevance).
This data serves several critical purposes:
- Pattern Recognition: Learning which CPT codes are most successful for specific
procedures with specific payers
- Denial Avoidance: Understanding which code combinations and documentation patterns
lead to claim denials
- Payer-Specific Optimization: Different insurers have different preferences (e.g., BCBS
prefers certain code sequences while Cigna accepts others)
- Institutional Knowledge Capture: Expert coders have accumulated decades of
knowledge about what works

**Important:** Historical data will be filtered to exclude outdated policy periods. For example, if
United Healthcare restructured policies 3 years ago, denial patterns from before that date are
excluded as no longer relevant. Data filtering criteria must be defined per-payer based on policy
change dates.
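
The per-payer filtering rule described above can be sketched as follows. This is a minimal illustration, not the project's ETL design: the `HistoricalClaim` record shape, the `POLICY_CUTOFFS` mapping, the cutoff date, and the sample CPT codes are all hypothetical values chosen for the example.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class HistoricalClaim:
    # Hypothetical record shape; the real export format is not defined here.
    payer: str
    service_date: date
    cpt_code: str
    denied: bool


# Hypothetical cutoffs: date of the most recent policy restructuring per payer.
POLICY_CUTOFFS = {
    "United Healthcare": date(2023, 1, 1),
}


def filter_relevant(claims, cutoffs):
    """Keep claims dated on or after their payer's policy cutoff.

    Payers without a configured cutoff are kept unfiltered.
    """
    kept = []
    for claim in claims:
        cutoff = cutoffs.get(claim.payer)
        if cutoff is None or claim.service_date >= cutoff:
            kept.append(claim)
    return kept


claims = [
    HistoricalClaim("United Healthcare", date(2021, 6, 1), "22551", True),
    HistoricalClaim("United Healthcare", date(2024, 3, 15), "22551", False),
    HistoricalClaim("Cigna", date(2020, 2, 10), "63030", False),
]
relevant = filter_relevant(claims, POLICY_CUTOFFS)
```

Keeping the cutoffs in a per-payer mapping means a policy change is a data update rather than a code change, which fits the requirement that filtering criteria be defined per payer.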
### 2.3 Product Functions
#### 2.3.1 Audio Capture and Transcription
- Smartphone-based audio recording for clinical dictation (no dedicated hardware in MVP)
- AI-powered speech-to-text conversion with medical vocabulary fine-tuning
- Noise reduction optimized for hospital environments (post-procedure hallway, dictation
room, OR)
- Single speaker per dictation session (clinic claims may have PA voice, surgical claims
have surgeon voice)
#### 2.3.2 Clinical Entity Extraction
- Automated extraction of diagnoses, procedures, anatomical locations, and laterality
- Temporal relationship mapping for procedure sequencing
- Confidence scoring for each extracted entity
#### 2.3.3 Medical Code Mapping
- Automatic mapping of diagnoses to ICD-10 codes and procedures to CPT codes
- Modifier suggestions based on context (-59, -51, -LT/RT, etc.)
- Confidence scoring with alternative code suggestions for edge cases
#### [NEW] 2.3.4 Template-Based Fast-Track Workflow
For standard, high-volume procedures (e.g., standard 2-level ACDF), the system provides a
fast-track workflow where the surgeon selects a procedure template and associates it with a
patient. The system auto-generates the claim with pre-mapped CPT/ICD codes, modifiers, and
justification. The surgeon can add dictation notes for any deviations. This significantly reduces
time-to-claim for routine procedures.
#### 2.3.5 RAG-Powered Claim Scrubbing
Automated claim review using Retrieval-Augmented Generation against payer-specific rules,
NCCI edits, and Local/National Coverage Determinations (LCD/NCD). The RAG corpus
includes policy documents per individual plan, coding manuals, billing guidelines, and the
practice's laminated cheat sheets.
#### 2.3.6 Payer-Specific Optimization
- Business rules engine for payer-specific coding strategies
- Historical denial pattern analysis for proactive optimization
- Coverage for top 30 local insurance plans in MVP (starting with top 10 for initial
validation)
#### 2.3.7 Claim Generation and Review
- Automated generation of submission-ready CMS-1500 professional claims
- Mandatory human review interface with AI-generated content highlighting
- Correction workflow with feedback captured for model improvement
- Complete audit trails for compliance and quality tracking
### 2.4 User Classes and Characteristics
| User Class | Description | Technical Proficiency |
|---|---|---|
| Surgeon/Physician | Primary users who record clinical dictation via mobile app. Up to 5 providers in MVP. | Low: requires simple, intuitive interface |
| Medical Biller | Reviews AI-generated claims, makes corrections, approves submissions | Moderate: familiar with billing systems |
| Billing Supervisor | Oversees billing operations, manages rules and policies | Moderate: dashboard and reporting focus |
| System Administrator | Manages system configuration, users, and integrations | High: technical administration |

### 2.5 Operating Environment
- Web Application: Modern browsers (Chrome, Firefox, Safari, Edge) on Windows and
macOS
- Mobile Application: iOS 14+ and Android 11+ smartphones for audio capture
- Server Infrastructure: Cloud-hosted (AWS/GCP/Azure) with HIPAA-compliant VPC
isolation
- AI Infrastructure: GPU-enabled cloud instances or Mac Mini M-series for LLM inference
- Database: PostgreSQL for structured data; vector database for RAG/semantic search
- Network: Always-online connectivity required (no offline mode in MVP)
### 2.6 Design and Implementation Constraints
- HIPAA Compliance: All PHI must remain within HIPAA-compliant, VPC-isolated
infrastructure
- Self-Hosted AI: No external API calls for patient data processing (LLM inference is
self-hosted)
- Human-in-the-Loop: Mandatory human review before any claim finalization
- Single Practice: MVP limited to Dr. McHugh's practice only (up to 5 providers)
- Surgical Focus: MVP focused on 5-10 common spine surgery types (CMS-1500
professional claims only)
- English Only: Speech-to-text optimized for English medical dictation, single speaker per
session
- Open-Source LLM: Llama/Mistral/Mixtral/Qwen with LoRA fine-tuning
### 2.7 Assumptions and Dependencies
#### 2.7.1 Assumptions
- Internet connectivity is always available at all practice locations
- Practice will provide access to historical billing and denial data (filtered for relevance)
- Payer policy documents, coding manuals, and cheat sheets will be provided for RAG
corpus
- Human billers will be available for review and correction during MVP
- Epic FHIR API access will be granted through proper channels (timeline dependency)
- No major scope changes during the execution of agreed phases
- Client will provide list of 5-10 common spine surgeries with typical CPT/ICD code
combinations
- Client will provide prioritized list of top 30 insurance plans for payer rules configuration
#### 2.7.2 Dependencies
- Epic Systems: FHIR R4 API availability and access credentials (critical path; can take
3-6 months)
- Athena Centricity: Legacy system data export capabilities and API availability
- CureMD: REST API documentation and access
- Open-source LLM: Availability of Llama 2/3 or Mistral/Mixtral models for self-hosting
- Whisper ASR: Model availability for self-hosting with medical vocabulary fine-tuning
- Client SMEs: Availability of billing and clinical subject matter experts for validation
## 3. Functional Requirements
All functional requirements are mapped to SOW milestones for traceability. Priority levels: Must
(MVP-critical), Should (important but can be deferred), Could (nice-to-have).
### 3.1 Audio Capture Module (Mobile App) — Milestone 2
| Req ID | Requirement Description | Milestone | Priority |
|---|---|---|---|
| FR-AC-001 | System shall allow users to record audio dictation using smartphone microphone | M2 | Must |
| FR-AC-002 | System shall support audio recording in standard formats (AAC, MP3, WAV) | M2 | Must |
| FR-AC-003 | System shall display recording duration and audio level indicator | M2 | Must |
| FR-AC-004 | System shall allow pause and resume during recording session | M2 | Must |
| FR-AC-005 | System shall require user to associate recording with patient identifier (MRN or encounter ID from EMR lookup) | M2 | Must |
| FR-AC-006 | System shall encrypt audio files at rest on device before upload (AES-256) | M2 | Must |
| FR-AC-007 | System shall automatically upload recordings when network is available | M2 | Must |
| FR-AC-008 | System shall support multiple recording sessions per patient encounter | M2 | Must |
| FR-AC-009 | System shall support template-based fast-track: surgeon selects standard procedure template and patient, bypassing full dictation | M2 | Must |

### 3.2 Speech-to-Text Module — Milestones 2-3
| Req ID | Requirement Description | Milestone | Priority |
|---|---|---|---|
| FR-ST-001 | System shall convert audio dictation to text with ≥97% word accuracy (i.e., WER ≤ 3%) on a mutually agreed medical dictation test corpus | M3 | Must |
| FR-ST-002 | System shall use medical-specific vocabulary (ICD-10 terms, CPT terms, drug names, anatomical terms) for recognition | M3 | Must |
| FR-ST-003 | System shall recognize ICD-10 and CPT codes when spoken directly | M3 | Should |
| FR-ST-004 | System shall handle medical drug names and dosages accurately | M3 | Must |
| FR-ST-005 | System shall apply AI-powered noise reduction for hospital environments (hallway, dictation room, OR) | M3 | Must |
| FR-ST-006 | System shall provide transcript with timestamps for review | M2 | Must |
| FR-ST-007 | System shall allow manual correction of transcription errors | M2 | Must |
| FR-ST-008 | System shall mark low-confidence words/phrases for human review | M3 | Must |

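
To make the FR-ST-001 accuracy target measurable, WER can be computed as the word-level edit distance between a reference transcript and the STT output, divided by the number of reference words. The sketch below assumes word accuracy is defined as 1 - WER and uses simple whitespace tokenization with lowercasing; the actual normalization rules and test corpus are to be agreed with the client, and the function names are illustrative.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER = (substitutions + deletions + insertions) / reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must not be empty")

    # Levenshtein distance over words, computed one row at a time.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


def meets_accuracy_target(wer: float, min_accuracy: float = 0.97) -> bool:
    """Word accuracy is taken here as 1 - WER (an assumption of this sketch)."""
    return (1.0 - wer) >= min_accuracy


# One substituted word out of eight reference words.
wer = word_error_rate(
    "anterior cervical discectomy and fusion at c5 c6",
    "anterior cervical discectomy and fusion at c5 c7",
)
```

In this example a single wrong vertebral level yields a WER of 1/8, which fails the target; it also shows why level and laterality terms deserve extra attention in the test corpus.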
### 3.3 Clinical Entity Extraction Module — Milestone 3
| Req ID | Requirement Description | Milestone | Priority |
|---|---|---|---|
| FR-EE-001 | System shall extract diagnoses from clinical documentation | M3 | Must |
| FR-EE-002 | System shall identify procedures and treatments performed | M3 | Must |
| FR-EE-003 | System shall recognize anatomical locations and laterality (left/right) | M3 | Must |
| FR-EE-004 | System shall extract temporal relationships between procedures | M3 | Should |
| FR-EE-005 | System shall identify patient demographics from EMR integration | M2 | Must |
| FR-EE-006 | System shall extract insurance/payer information for the encounter | M2 | Must |
| FR-EE-007 | System shall provide confidence scores for each extracted entity | M3 | Must |

### 3.4 Code Mapping Module — Milestone 3
| Req ID | Requirement Description | Milestone | Priority |
|---|---|---|---|
| FR-CM-001 | System shall map extracted diagnoses to ICD-10 codes | M3 | Must |
| FR-CM-002 | System shall map procedures to appropriate CPT codes | M3 | Must |
| FR-CM-003 | System shall suggest appropriate CPT modifiers based on context (-59, -51, -LT/RT, etc.) | M3 | Must |
| FR-CM-004 | System shall provide confidence scores for each code mapping | M3 | Must |
| FR-CM-005 | System shall suggest alternative codes for low-confidence mappings (<80%) | M3 | Must |
| FR-CM-006 | System shall validate code combinations against NCCI/CCI edits | M3 | Must |
| FR-CM-007 | System shall support neurosurgery-specific code sets (5-10 common spine procedures) | M3 | Must |
| FR-CM-008 | System shall support orthopedic surgery-specific code sets | M3 | Should |
| FR-CM-009 | System shall apply confidence thresholds: >90% auto-suggest, 70-90% flag for review, <70% require manual coding | M3 | Must |
| FR-CM-010 | System shall determine and assign Medical Decision Making (MDM) level based on clinical documentation complexity | M3 | Must |
| FR-CM-011 | System shall generate medical necessity justification text to support assigned codes | M3 | Must |

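
The confidence thresholds in FR-CM-009 and the alternative-code trigger in FR-CM-005 can be expressed as a small routing function. This is a sketch only: the `Routing` enum and function names are hypothetical, confidence is assumed to be a value between 0 and 1, and the boundary values (exactly 90% and exactly 70%) are assumed to fall in the review band because the requirement uses strict ">90%" and "<70%".

```python
from enum import Enum


class Routing(Enum):
    AUTO_SUGGEST = "auto_suggest"
    FLAG_FOR_REVIEW = "flag_for_review"
    MANUAL_CODING = "manual_coding"


def route_by_confidence(confidence: float) -> Routing:
    """Map a code-mapping confidence score to a review path (FR-CM-009)."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0 and 1")
    if confidence > 0.90:
        return Routing.AUTO_SUGGEST
    if confidence >= 0.70:
        return Routing.FLAG_FOR_REVIEW
    return Routing.MANUAL_CODING


def needs_alternatives(confidence: float) -> bool:
    """Alternative codes are suggested below 80% confidence (FR-CM-005)."""
    return confidence < 0.80
```

Note that under these two rules a mapping at 85% is flagged for review but does not trigger alternative suggestions, while one at 75% does both. Every path still ends in mandatory human review before finalization.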
### 3.5 RAG-Powered Claim Scrubbing Module — Milestone 3
| Req ID | Requirement Description | Milestone | Priority |
|---|---|---|---|
| FR-RS-001 | System shall scrub generated claims against payer-specific rules using RAG | M3 | Must |
| FR-RS-002 | System shall validate claims against NCCI edits for improper code pair detection | M3 | Must |
| FR-RS-003 | System shall check claims against Local Coverage Determinations (LCD) | M3 | Must |
| FR-RS-004 | System shall check claims against National Coverage Determinations (NCD) | M3 | Must |
| FR-RS-005 | System shall flag claims failing scrubbing rules with specific failure reasons | M3 | Must |
| FR-RS-006 | System shall suggest corrective actions for failed scrubbing checks | M3 | Should |
| FR-RS-007 | RAG corpus shall include: policy docs per plan, coding manuals, cheat sheets, billing guidelines | M3 | Must |

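
The NCCI code-pair check in FR-RS-002, with the failure reasons required by FR-RS-005, is deterministic and can be illustrated without the RAG component. The sketch below assumes edits are loaded as column 1 / column 2 pairs with a modifier indicator (0 = never billable together, 1 = billable together with an appropriate modifier). The edit entries use placeholder codes rather than real CMS data, and treating only modifier 59 as a bypass modifier is a simplification.

```python
from dataclasses import dataclass
from itertools import combinations

# Simplification: only modifier 59 is treated as a bypass modifier here.
BYPASS_MODIFIERS = frozenset({"59"})


@dataclass(frozen=True)
class PtpEdit:
    column1: str
    column2: str
    modifier_indicator: int  # 0 = never together, 1 = allowed with modifier


# Placeholder entries for illustration; not actual NCCI edits.
PLACEHOLDER_EDITS = [
    PtpEdit("CODE_A", "CODE_B", 0),
    PtpEdit("CODE_A", "CODE_C", 1),
]


def scrub_code_pairs(claim_lines, edits):
    """Return failure reasons for code pairs that violate an edit.

    claim_lines maps each billed code to the list of modifiers on that line.
    """
    index = {frozenset((e.column1, e.column2)): e for e in edits}
    failures = []
    for code_a, code_b in combinations(sorted(claim_lines), 2):
        edit = index.get(frozenset((code_a, code_b)))
        if edit is None:
            continue
        pair = f"{edit.column1} + {edit.column2}"
        if edit.modifier_indicator == 0:
            failures.append(f"{pair}: codes may not be billed together")
        elif edit.modifier_indicator == 1:
            if not (set(claim_lines[edit.column2]) & BYPASS_MODIFIERS):
                failures.append(f"{pair}: {edit.column2} needs a bypass modifier")
    return failures


clean = scrub_code_pairs({"CODE_A": [], "CODE_C": ["59"]}, PLACEHOLDER_EDITS)
flagged = scrub_code_pairs(
    {"CODE_A": [], "CODE_B": [], "CODE_C": []}, PLACEHOLDER_EDITS
)
```

Keeping this check rule-based means its result does not depend on retrieval quality; the RAG layer would then cover payer policy and LCD/NCD text, where no structured edit table exists.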
### 3.6 Payer Rules Engine — Milestones 3-4
| Req ID | Requirement Description | Milestone | Priority |
|---|---|---|---|
| FR-PR-001 | System shall maintain payer-specific coding rules for top 30 local plans | M3 | Must |
| FR-PR-002 | System shall apply different coding strategies based on payer | M3 | Must |
| FR-PR-003 | System shall incorporate historical denial pattern analysis | M3 | Must |
| FR-PR-004 | System shall flag claims at high risk of denial based on historical patterns | M4 | Must |
| FR-PR-005 | System shall optimize code selection for maximum reimbursement | M3 | Must |
| FR-PR-006 | System shall allow manual update of payer rules by billing staff | M4 | Must |
| FR-PR-007 | System shall track payer rule changes with version history | M4 | Should |

### 3.7 Claim Generation Module — Milestone 4
| Req ID | Requirement Description | Milestone | Priority |
|---|---|---|---|
| FR-CG-001 | System shall generate complete billing claims from processed data | M4 | Must |
| FR-CG-002 | System shall populate all required CMS-1500 claim fields automatically | M4 | Must |
| FR-CG-003 | System shall consolidate multi-session audio into single patient claim | M4 | Must |
| FR-CG-004 | System shall validate claim completeness before presenting for review | M4 | Must |
| FR-CG-005 | System shall generate claims in CMS-1500 professional format | M4 | Must |
| FR-CG-006 | System shall support multi-session audio consolidation including recordings spanning multiple days for a single surgical encounter | M4 | Must |
| FR-CG-007 | System shall implement claim state machine with defined states: Draft, Pending STT, Pending AI Review, Ready for Human Review, Approved, Rejected, Exported | M4 | Must |

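
The claim state machine in FR-CG-007 names the states but not the allowed transitions. The sketch below uses the seven states from the requirement and an assumed transition table: a linear path through processing, approve or reject at human review, and rejected claims returning to review after manual correction. The transition table and the `InvalidTransition` exception are assumptions to be confirmed during design.

```python
from enum import Enum


class ClaimState(Enum):
    DRAFT = "Draft"
    PENDING_STT = "Pending STT"
    PENDING_AI_REVIEW = "Pending AI Review"
    READY_FOR_HUMAN_REVIEW = "Ready for Human Review"
    APPROVED = "Approved"
    REJECTED = "Rejected"
    EXPORTED = "Exported"


# Assumed transitions; the SRS defines the states only.
ALLOWED_TRANSITIONS = {
    ClaimState.DRAFT: {ClaimState.PENDING_STT},
    ClaimState.PENDING_STT: {ClaimState.PENDING_AI_REVIEW},
    ClaimState.PENDING_AI_REVIEW: {ClaimState.READY_FOR_HUMAN_REVIEW},
    ClaimState.READY_FOR_HUMAN_REVIEW: {ClaimState.APPROVED, ClaimState.REJECTED},
    # Rejected claims are re-routed to the biller, then reviewed again.
    ClaimState.REJECTED: {ClaimState.READY_FOR_HUMAN_REVIEW},
    ClaimState.APPROVED: {ClaimState.EXPORTED},
    ClaimState.EXPORTED: set(),
}


class InvalidTransition(Exception):
    """Raised when a claim is moved to a state it cannot reach directly."""


def transition(current: ClaimState, target: ClaimState) -> ClaimState:
    """Return the new state, or raise if the move is not allowed."""
    if target not in ALLOWED_TRANSITIONS[current]:
        raise InvalidTransition(f"{current.value} -> {target.value}")
    return target
```

Because `EXPORTED` is reachable only from `APPROVED`, the table itself enforces the mandatory human approval step: no claim can be exported without passing through review.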
### 3.8 Human Review Workflow — Milestone 4
| Req ID | Requirement Description | Milestone | Priority |
|---|---|---|---|
| FR-HR-001 | System shall present claims for mandatory human review before finalization | M4 | Must |
| FR-HR-002 | System shall highlight AI-generated content for reviewer attention | M4 | Must |
| FR-HR-003 | System shall display original audio/transcript alongside generated claim (side-by-side view) | M4 | Must |
| FR-HR-004 | System shall allow reviewers to modify any claim field | M4 | Must |
| FR-HR-005 | System shall track all reviewer corrections for model retraining feedback loop | M4 | Must |
| FR-HR-006 | System shall require explicit approval before claim finalization | M4 | Must |
| FR-HR-007 | System shall support claim rejection with reason capture and re-routing to biller for manual correction | M4 | Must |
| FR-HR-008 | System shall maintain complete audit trail of all claim modifications | M4 | Must |
| FR-HR-009 | System shall display EMR data (demographics, insurance) alongside claim for cross-reference | M4 | Must |

### 3.9 EMR Integration — Milestones 2-4
| Req ID | Requirement Description | Milestone | Priority |
|---|---|---|---|
| FR-EMR-001 | System shall integrate with Epic via FHIR R4 API for hospital records (operative reports, clinical notes) | M4 | Must |
| FR-EMR-002 | System shall integrate with Athena Centricity for billing data, denied claims, patient records | M2 | Must |
| FR-EMR-003 | System shall integrate with CureMD via REST API for patient demographics and clinical docs | M2 | Must |
| FR-EMR-004 | System shall retrieve patient demographics from EMR | M2 | Must |
| FR-EMR-005 | System shall retrieve insurance/payer information from EMR | M2 | Must |
| FR-EMR-006 | System shall retrieve clinical documentation and notes from EMR | M2 | Must |
| FR-EMR-007 | System shall handle EMR connection failures gracefully with retry logic | M4 | Must |

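
One common way to meet the retry requirement in FR-EMR-007 is exponential backoff around each EMR call. The sketch below is an assumption about approach, not a prescribed design: `EmrUnavailable`, `call_with_retry`, the attempt count, and the delays are all illustrative, and the sleep function is injectable so the behaviour can be tested without waiting.

```python
import time


class EmrUnavailable(Exception):
    """Hypothetical error raised by an EMR connector on a transient failure."""


def call_with_retry(fetch, max_attempts=4, base_delay=0.5, sleep=time.sleep):
    """Call fetch(), retrying transient failures with exponential backoff.

    Delays double on each retry: base_delay, 2 * base_delay, 4 * base_delay, ...
    The last error is re-raised once max_attempts is exhausted.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    last_error = None
    for attempt in range(max_attempts):
        try:
            return fetch()
        except EmrUnavailable as exc:
            last_error = exc
            if attempt < max_attempts - 1:
                sleep(base_delay * (2 ** attempt))
    raise last_error


# Usage: a connector stub that fails twice before succeeding.
calls = {"count": 0}


def flaky_lookup():
    calls["count"] += 1
    if calls["count"] < 3:
        raise EmrUnavailable("timeout")
    return {"mrn": "TEST-001"}


delays = []
result = call_with_retry(flaky_lookup, sleep=delays.append)
```

Only the transient error type is retried; other exceptions propagate immediately, so authentication or validation failures are not masked by repeated attempts.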
### 3.10 Reporting and Analytics — Milestone 5
| Req ID | Requirement Description | Milestone | Priority |
|---|---|---|---|
| FR-RA-001 | System shall provide dashboard with claim processing status (pending, approved, rejected, exported) | M5 | Must |
| FR-RA-002 | System shall display AI accuracy metrics (STT WER, code mapping accuracy) | M5 | Must |
| FR-RA-003 | System shall track and report claim approval/rejection rates | M5 | Must |
| FR-RA-004 | System shall provide human correction rate analytics (per coder, per payer) | M5 | Must |
| FR-RA-005 | System shall generate audit reports for compliance review | M5 | Must |
| FR-RA-006 | System shall support search/filter claims by patient, date range, status, payer | M5 | Must |
| FR-RA-007 | System shall track same-day charge capture rate (target: 80%) | M5 | Must |

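
The same-day charge capture rate in FR-RA-007 needs a precise definition before it can be reported. The sketch below assumes it is the share of procedures whose claim was created on the same calendar date as the procedure, with procedures that have no claim yet counted as not captured; that definition and the function name are assumptions to confirm with the practice.

```python
from datetime import date


def same_day_capture_rate(encounters):
    """Return the fraction of procedures captured on the procedure date.

    encounters is an iterable of (procedure_date, claim_created_date) pairs;
    claim_created_date is None when no claim exists yet.
    """
    encounters = list(encounters)
    if not encounters:
        return 0.0
    same_day = sum(
        1 for procedure_date, created in encounters if created == procedure_date
    )
    return same_day / len(encounters)


sample = [
    (date(2026, 2, 2), date(2026, 2, 2)),
    (date(2026, 2, 2), date(2026, 2, 2)),
    (date(2026, 2, 3), date(2026, 2, 3)),
    (date(2026, 2, 3), date(2026, 2, 4)),  # captured a day late
    (date(2026, 2, 4), date(2026, 2, 4)),
]
rate = same_day_capture_rate(sample)
meets_target = rate >= 0.80
```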
### [NEW] 3.11 Notification System — Milestone 4
| Req ID | Requirement Description | Milestone | Priority |
|---|---|---|---|
| FR-NT-001 | System shall notify billers when a new claim is ready for review | M4 | Must |
| FR-NT-002 | System shall notify providers when a claim is approved or requires attention | M4 | Should |
| FR-NT-003 | System shall support in-app push notifications on mobile | M4 | Should |
| FR-NT-004 | System shall support email notifications for critical events | M4 | Should |

### [NEW] 3.12 Data Migration & Model Training Pipeline — Milestones 1, 3, 6
| Req ID | Requirement Description | Milestone | Priority |
|---|---|---|---|
| FR-DM-001 | System shall provide ETL pipeline to extract, clean, and load historical billing data from Athena Centricity | M1 | Must |
| FR-DM-002 | System shall filter historical data per-payer based on policy change dates to exclude outdated patterns | M1 | Must |
| FR-DM-003 | System shall ingest and index policy documents, coding manuals, and cheat sheets into the RAG vector store | M1 | Must |
| FR-DM-004 | System shall capture human reviewer corrections in a structured format suitable for model retraining | M4 | Must |
| FR-DM-005 | System shall support periodic model retraining/fine-tuning using accumulated correction data (manual trigger in MVP) | M6 | Should |

## 4. Non-Functional Requirements
### 4.1 Performance Requirements
| Req ID | Requirement | Target | Milestone |
|---|---|---|---|
| NFR-P-001 | Speech-to-text processing time per minute of audio | < 90 seconds | M |
| NFR-P-002 | Claim generation time from completed transcript | < 90 seconds | M |
| NFR-P-003 | Web dashboard page load time | < 3 seconds | M |
| NFR-P-004 | Mobile app recording start latency | < 2 seconds | M |
| NFR-P-005 | Provider submission workflow completion (dictate to approve) | < 1 minute | M |
| NFR-P-006 | Concurrent users supported | 20+ simultaneous | M |

### 4.2 Security Requirements
| Req ID | Requirement | Milestone |
|---|---|---|
| NFR-S-001 | All PHI shall be encrypted at rest using AES-256 encryption | M |
| NFR-S-002 | All data in transit shall be encrypted using TLS 1.3 | M |
| NFR-S-003 | User authentication shall use OAuth 2.0 with multi-factor authentication | M |
| NFR-S-004 | Role-based access control shall restrict data access by user role | M |
| NFR-S-005 | All user actions shall be logged in tamper-proof audit trail | M |
| NFR-S-006 | Session timeout shall occur after 15 minutes of inactivity | M |
| NFR-S-007 | AI inference shall occur entirely within VPC-isolated, HIPAA-compliant infrastructure | M |
| NFR-S-008 | No PHI shall be transmitted to external cloud AI services | M |

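
NFR-S-005 does not say how the audit trail is made tamper-proof. One widely used technique is a hash chain, in which each record stores a hash of its content combined with the previous record's hash, so that any later edit breaks verification. The sketch below illustrates that idea with an in-memory list; the record layout and function names are hypothetical, and a hash chain makes tampering detectable rather than impossible, so it would complement storage-level controls rather than replace them.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # Placeholder "previous hash" for the first record.


def _digest(prev_hash: str, entry: dict) -> str:
    """Hash the previous record's hash together with this entry's content."""
    payload = json.dumps(entry, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append_entry(log: list, entry: dict) -> None:
    """Append an audit entry linked to the record before it."""
    prev_hash = log[-1]["hash"] if log else GENESIS_HASH
    log.append(
        {"entry": entry, "prev_hash": prev_hash, "hash": _digest(prev_hash, entry)}
    )


def verify_chain(log: list) -> bool:
    """Return True only if every record still matches its stored hashes."""
    prev_hash = GENESIS_HASH
    for record in log:
        if record["prev_hash"] != prev_hash:
            return False
        if record["hash"] != _digest(prev_hash, record["entry"]):
            return False
        prev_hash = record["hash"]
    return True


audit_log = []
append_entry(audit_log, {"user": "biller-1", "action": "edit_field", "claim": "C-1"})
append_entry(audit_log, {"user": "biller-1", "action": "approve", "claim": "C-1"})
intact = verify_chain(audit_log)
```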
### 4.3 Reliability and Availability
| Req ID | Requirement | Milestone |
|---|---|---|
| NFR-R-001 | System availability target: 99.5% during business hours (6 AM - 10 PM ET) | M |
| NFR-R-002 | Mean time to recovery from failure: < 4 hours | M |
| NFR-R-003 | Database backup frequency: Daily with 30-day retention | M |
| NFR-R-004 | Audio file retention: 7 years (aligned with HIPAA audit trail requirements) | M |
| NFR-R-005 | Graceful degradation when EMR integration is unavailable | M |

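
The availability target in NFR-R-001 translates into a concrete downtime budget. The calculation below is a sketch that assumes availability is measured only over the stated business hours (6 AM to 10 PM ET, 16 hours per day), seven days a week, over a 30-day month; the measurement window is not defined in this document.

```python
def downtime_budget_hours(availability: float, hours_per_day: float, days: int) -> float:
    """Allowed downtime, in hours, inside the measured window."""
    if not 0.0 <= availability <= 1.0:
        raise ValueError("availability must be between 0 and 1")
    return (1.0 - availability) * hours_per_day * days


BUSINESS_HOURS_PER_DAY = 16  # 6 AM to 10 PM ET

monthly_budget_hours = downtime_budget_hours(0.995, BUSINESS_HOURS_PER_DAY, days=30)
daily_budget_minutes = downtime_budget_hours(0.995, BUSINESS_HOURS_PER_DAY, days=1) * 60
```

Under these assumptions the budget is about 2.4 hours per month, or roughly 4.8 minutes per day. That is less than the 4-hour recovery target in NFR-R-002, so a single worst-case outage during business hours could exhaust the monthly budget; the two targets may need to be reconciled.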
### 4.4 Compliance Requirements
| Req ID | Requirement | Milestone |
|---|---|---|
| NFR-C-001 | System shall comply with HIPAA Privacy Rule requirements | M |
| NFR-C-002 | System shall comply with HIPAA Security Rule requirements | M |
| NFR-C-003 | Business Associate Agreement (BAA) required with all cloud providers | M |
| NFR-C-004 | Complete audit trails shall be maintained for minimum 7 years | M |
| NFR-C-005 | System shall support compliance reporting for audits | M |

### 4.5 Usability Requirements
| Req ID | Requirement | Milestone |
|---|---|---|
| NFR-U-001 | Mobile app shall require < 3 taps to start recording | M |
| NFR-U-002 | New billers shall be productive within 4 hours of training | M |
| NFR-U-003 | Web interface shall be accessible on screen sizes >= 1280px width | M |
| NFR-U-004 | System shall provide clear error messages with suggested corrective actions | M |

## 5. System Architecture
### 5.1 Architecture Overview
The system follows a modular, layered architecture designed for scalability and future
expansion to agentic AI capabilities. The architecture separates concerns across four distinct
layers.
#### 5.1.1 Experience Layer
- Web Dashboard (React.js + TypeScript): Primary interface for billers to review claims
and view analytics
- Mobile App (React Native / Expo): Smartphone-based audio capture, template selection,
and claim status
- REST API Gateway: Secure API with OAuth 2.0 authentication, rate limiting, and routing
#### 5.1.2 Processing Layer
- Workflow Engine: Orchestrates the claim generation pipeline (audio → transcript →
entities → codes → claim)
- Business Rules Engine: Implements payer-specific billing rules and optimization logic
- Claim Scrubbing Engine: RAG-powered validation against NCCI, LCD/NCD, and payer
rules
- NLP Processor: Coordinates speech-to-text and clinical entity extraction
#### 5.1.3 AI Layer
- Open Source LLM: Self-hosted Llama/Mistral/Mixtral/Qwen with LoRA fine-tuning
- Speech-to-Text Engine: Whisper with medical vocabulary fine-tuning
- Vector Database: For semantic search and RAG (pgvector or Weaviate/Pinecone)
#### 5.1.4 Data Layer
- Knowledge Base: Vector database for semantic search across policy docs, coding
manuals, cheat sheets
- Historical Data: Billing and denial history (filtered per-payer by policy change dates)
- Operational Database: PostgreSQL for claims, users, transactions, and audit logs
### 5.2 Technology Stack
| Layer | Technology | Rationale |
|---|---|---|
| Frontend | React.js + TypeScript | Modern, maintainable UI |
| Mobile | React Native / Expo | Cross-platform iOS/Android |
| API Server | FastAPI (Python) | High performance, async, ML ecosystem |
| LLM | Llama/Mistral/Mixtral + LoRA | Open source, fine-tunable, self-hostable |
| Speech-to-Text | Whisper (OpenAI) | High accuracy, self-hostable, medical fine-tuning |
| Database | PostgreSQL + pgvector | Structured + vector search in single DB |
| Cache | Redis | Session management, API caching |
| Hosting | AWS/GCP/Azure (HIPAA BAA) | Cloud-hosted, VPC isolation, scalable |

### 5.3 Deployment Architecture
The MVP deployment is designed for cloud-hosted operation within a HIPAA-compliant VPC:
- Application Server: Cloud VM (8+ cores, 32GB RAM) for API, workflow engine, and web
app
- AI Inference: GPU-enabled cloud instance or Mac Mini M-series for LLM and Whisper
- Database: PostgreSQL with pgvector extension on managed service (RDS/Cloud SQL)
- Storage: Encrypted object storage for audio files and documents
- Estimated infrastructure cost: $1,100-1,600/month
## 6. EMR Integration Specifications
### 6.1 Epic Integration (Hospital)
| Attribute | Specification |
|---|---|
| Integration Method | FHIR R4 API |
| Authentication | OAuth 2.0 with SMART on FHIR |
| Data Retrieved | Patient demographics, encounters, operative reports, clinical notes, insurance |
| Environment | Hospital inpatient and surgical records |
| Risk | Epic FHIR approval can take 3-6 months. Timeline contingency required. |

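
As an illustration of the demographics retrieval listed above, the sketch below extracts a few fields from a FHIR R4 `Patient` resource that has already been fetched and parsed as JSON. It uses standard `Patient` elements (`name`, `birthDate`, `gender`, `identifier`) and assumes the MRN is carried as an identifier typed with code `MR`; the output keys, the function name, and the sample data are hypothetical, and Epic-specific identifier conventions would need to be confirmed against the Epic FHIR documentation.

```python
def extract_demographics(patient: dict) -> dict:
    """Pull basic demographics out of a parsed FHIR R4 Patient resource."""
    if patient.get("resourceType") != "Patient":
        raise ValueError("expected a FHIR Patient resource")

    # Prefer the name marked "official"; otherwise fall back to the first one.
    names = patient.get("name") or [{}]
    official = next((n for n in names if n.get("use") == "official"), names[0])

    # Assumption: the MRN is the identifier whose type coding has code "MR".
    mrn = next(
        (
            ident.get("value")
            for ident in patient.get("identifier", [])
            if any(
                coding.get("code") == "MR"
                for coding in ident.get("type", {}).get("coding", [])
            )
        ),
        None,
    )
    return {
        "family_name": official.get("family"),
        "given_names": official.get("given", []),
        "birth_date": patient.get("birthDate"),
        "gender": patient.get("gender"),
        "mrn": mrn,
    }


# Synthetic sample resource; contains no real patient data.
sample_patient = {
    "resourceType": "Patient",
    "identifier": [
        {
            "type": {
                "coding": [
                    {
                        "system": "http://terminology.hl7.org/CodeSystem/v2-0203",
                        "code": "MR",
                    }
                ]
            },
            "value": "TEST-0001",
        }
    ],
    "name": [{"use": "official", "family": "Example", "given": ["Pat", "Q"]}],
    "gender": "female",
    "birthDate": "1970-01-01",
}
demographics = extract_demographics(sample_patient)
```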
### 6.2 Athena Centricity Integration (Outpatient Legacy)
| Attribute | Specification |
|---|---|
| Integration Method | Custom API Connector (legacy Centricity version; confirm actual version and API limitations) |
| Data Retrieved | Historical billing data, denied claims with reason codes, patient records |
| Note | Existing connector available, which reduces integration time. Emily has flagged potential roadblocks with older Centricity version vs modern Athena API. |

### 6.3 CureMD Integration (Outpatient)
| Attribute | Specification |
|---|---|
| Integration Method | REST API |
| Data Retrieved | Patient demographics, appointments, clinical documentation |
| Note | Existing connector available, which reduces integration time |
### [NEW] 6.4 ModMed — Formally Excluded from MVP
ModMed was listed in the original project brief but has been dropped from all subsequent
documents. It is formally excluded from MVP scope. If needed in a future phase, a separate
integration SOW will be required.
## 7. MVP Scope Definition
### 7.1 In Scope (MVP)
1. Integration with Epic (hospital) and Athena Centricity/CureMD (outpatient)
2. Speech-to-text with medical dictionary (smartphone-based capture via Expo/React
Native)
3. AI-powered claim generation from clinical documentation (5-10 common spine surgery
types)
4. RAG-powered claim scrubbing against NCCI edits and LCD/NCD coverage
determinations
5. Payer-specific CPT code optimization for top 30 local plans (starting with top 10)
6. Template-based fast-track workflow for standard procedures
7. Human review interface with mandatory approval workflow
8. Complete audit trails for compliance
9. Manual policy updates by billing staff
10. Single-practice deployment (Dr. McHugh's practice, up to 5 providers)
11. CMS-1500 professional claims (surgical billing focus)
12. Provider interface supporting dictate → review → approve → submit workflow
13. Analytics dashboard with claim status, AI accuracy, and correction rates
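
Items 7, 8, and 12 above (mandatory approval, audit trails, and the dictate → review → approve → submit workflow) can be pictured as a small state machine. The following is a minimal sketch only: the state names, the `Claim` class, and the send-back transition from review to dictation are assumptions made for illustration, not the specified claim state model.

```python
from enum import Enum


class ClaimState(Enum):
    DICTATED = "dictated"
    IN_REVIEW = "in_review"
    APPROVED = "approved"
    SUBMITTED = "submitted"


# Allowed transitions. There is deliberately no path that skips human approval.
TRANSITIONS = {
    ClaimState.DICTATED: {ClaimState.IN_REVIEW},
    ClaimState.IN_REVIEW: {ClaimState.APPROVED, ClaimState.DICTATED},  # approve or send back
    ClaimState.APPROVED: {ClaimState.SUBMITTED},
    ClaimState.SUBMITTED: set(),
}


class Claim:
    def __init__(self, claim_id: str):
        self.claim_id = claim_id
        self.state = ClaimState.DICTATED
        self.audit_trail = []  # list of (actor, from_state, to_state)

    def transition(self, new_state: ClaimState, actor: str) -> None:
        """Move to new_state if allowed, recording the actor for the audit trail."""
        if new_state not in TRANSITIONS[self.state]:
            raise ValueError(f"{self.state.value} -> {new_state.value} is not allowed")
        self.audit_trail.append((actor, self.state.value, new_state.value))
        self.state = new_state


claim = Claim("CLM-0001")  # hypothetical claim ID
claim.transition(ClaimState.IN_REVIEW, actor="system")
claim.transition(ClaimState.APPROVED, actor="provider")
claim.transition(ClaimState.SUBMITTED, actor="provider")
```

Encoding the allowed transitions as data means an attempt to jump from dictation straight to submission fails loudly instead of silently bypassing review.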
### 7.2 Out of Scope (Future Phases)
- Self-learning/auto-updating rules engine
- Agentic AI capabilities
- Dedicated hardware dictation devices
- Multi-accent speech recognition optimization
- Offline/mobile-only processing
- Multi-practice/multi-tenant deployment
- Full automation without human review
- Outpatient clinic visit (E/M) billing automation
- Direct clearinghouse/EDI submission integration (837P/837I)
- Denial appeals automation
- Payment posting and ERA/835 processing
- Patient billing and collections
- UB-04 institutional claims
- ModMed integration
- Operative report generation/amendment
## 8. Project Timeline — SOW Milestone Aligned
### 8.1 Milestone Schedule
| Phase | Deliverable | Target Date | Duration | SOW Milestone |
|---|---|---|---|---|
| Kick Off | Scope & Team Walkthrough | 04 Feb 2026 | - | - |
| Phase 1 | Foundation & Architecture | 11 Mar 2026 | 5 weeks | M1: Infrastructure |
| Phase 2 | Core Platform Development | 22 Apr 2026 | 6 weeks | M2: Core Platform |
| Phase 3 | AI Engine Development | 24 Jun 2026 | 9 weeks | M3: AI Engine |
| Phase 4 | Integration & Workflow | 12 Aug 2026 | 7 weeks | M4: Integration |
| Phase 5 | Testing & Go-Live | 23 Sep 2026 | 6 weeks | M5: Go-Live |
| Phase 6 | Support & Maintenance | Post go-live | Ongoing | M6: Support |
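
The target dates in the schedule follow from the kick-off date plus the cumulative phase durations. A short sketch of that arithmetic, assuming the phases run back to back with no gaps (the `rollup` helper is illustrative only):

```python
from datetime import date, timedelta

KICKOFF = date(2026, 2, 4)

# (phase, duration in weeks, target date stated in the schedule)
PHASES = [
    ("Phase 1", 5, date(2026, 3, 11)),
    ("Phase 2", 6, date(2026, 4, 22)),
    ("Phase 3", 9, date(2026, 6, 24)),
    ("Phase 4", 7, date(2026, 8, 12)),
    ("Phase 5", 6, date(2026, 9, 23)),
]


def rollup(kickoff, phases):
    """Yield (phase, computed end date, stated target), phases run back to back."""
    end = kickoff
    for name, weeks, target in phases:
        end += timedelta(weeks=weeks)
        yield name, end, target


for name, computed, target in rollup(KICKOFF, PHASES):
    status = "ok" if computed == target else "MISMATCH"
    print(f"{name}: computed {computed.isoformat()}, stated {target.isoformat()} [{status}]")

total_weeks = sum(weeks for _, weeks, _ in PHASES)
```

Under that assumption every computed date matches the stated target, and the five phase durations sum to 33 weeks from kick-off to go-live.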
### 8.2 Phase Details with Deliverables
#### Milestone 1 — Foundation & Architecture (Weeks 1-5)
- HIPAA-compliant cloud infrastructure setup (VPC, encryption, IAM)
- Database schema design and deployment (PostgreSQL + pgvector)
- Authentication and RBAC framework (OAuth 2.0 + MFA)
- API gateway and security layer
- CI/CD pipeline configuration
- Historical data extraction from Athena Centricity (ETL pipeline)
- Data cleaning and filtering (per-payer policy change date filtering)
- Policy document collection and RAG corpus preparation
**Acceptance: Infrastructure operational, data pipeline running, security audit passed.**
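
The per-payer policy change date filtering listed above could look roughly like the sketch below. The payer names, dates, record fields, and the `filter_current_policy` helper are all hypothetical, and keeping claims for payers with no recorded change date is an assumption rather than a stated requirement.

```python
from datetime import date

# Hypothetical: the date each payer's current policy took effect.
POLICY_EFFECTIVE = {
    "PAYER_A": date(2025, 1, 1),
    "PAYER_B": date(2024, 7, 1),
}


def filter_current_policy(claims, policy_effective):
    """Keep claims whose service date is on or after the payer's latest policy change."""
    kept = []
    for claim in claims:
        effective = policy_effective.get(claim["payer"])
        # Assumption: payers with no recorded change date are kept unfiltered.
        if effective is None or claim["service_date"] >= effective:
            kept.append(claim)
    return kept


historical = [
    {"id": 1, "payer": "PAYER_A", "service_date": date(2024, 11, 3)},  # predates policy
    {"id": 2, "payer": "PAYER_A", "service_date": date(2025, 2, 14)},
    {"id": 3, "payer": "PAYER_B", "service_date": date(2024, 7, 1)},   # on the change date
    {"id": 4, "payer": "PAYER_C", "service_date": date(2023, 5, 9)},   # no date on record
]
training_set = filter_current_policy(historical, POLICY_EFFECTIVE)
```

Filtering before fine-tuning keeps the model from learning coding patterns that a payer has since stopped accepting.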
#### Milestone 2 — Core Platform Development (Weeks 6-11)
- Web dashboard foundation (React.js + TypeScript)
- Mobile app with audio capture (Expo/React Native)
- Speech-to-text module integration (Whisper base)
- Athena Centricity connector implementation
- CureMD API connector implementation
- Patient lookup and encounter linking via EMR
- Template-based fast-track UI for standard procedures
**Acceptance: Audio capture working end-to-end, EMR data retrieval functional, basic
transcript generation.**
#### Milestone 3 — AI Engine Development (Weeks 8-17)
- LLM fine-tuning on billing data using LoRA (orthopedic/neurosurgery domain)
- Medical vocabulary fine-tuning for Whisper STT
- Clinical entity extraction pipeline (NLP)
- ICD-10 and CPT code mapping engine with confidence scoring
- RAG pipeline for claim scrubbing (NCCI, LCD/NCD, payer rules)
- Business rules engine with payer-specific optimization
- Modifier logic implementation
**Acceptance: AI generates claims for test cases with ≥90% code mapping accuracy, STT
≥97% word accuracy (WER ≤3%) on test corpus.**
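
For reference, Word Error Rate is the word-level edit distance between the reference transcript and the STT output, divided by the number of reference words; word accuracy is 1 minus WER. The sketch below shows one way the M3 check could be computed. The function names and the sample sentences are illustrative, and lowercasing with whitespace splitting stands in for whatever text normalization the test corpus actually uses.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / number of reference words."""
    ref, hyp = reference.lower().split(), hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must not be empty")
    # prev[j] holds the edit distance between the reference words consumed so far and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference word missing from hypothesis
                curr[j - 1] + 1,     # insertion: extra word in hypothesis
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


def meets_stt_target(wer: float, max_wer: float = 0.03) -> bool:
    """True when WER is at or below 3%, i.e. word accuracy is at least 97%."""
    return wer <= max_wer


reference = "posterior lumbar interbody fusion at l4 l5 with pedicle screws"
hypothesis = "posterior lumbar interbody fusion at l4 l5 with pedical screws"
wer = word_error_rate(reference, hypothesis)  # one substitution in ten words
```

In practice the rate would be computed over the whole test corpus (total errors divided by total reference words) rather than averaged per utterance, so short recordings do not dominate the result.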
#### Milestone 4 — Integration & Workflow (Weeks 15-21)
- End-to-end pipeline integration (audio → transcript → entities → codes → claim)
- Human review interface with side-by-side transcript/claim/EMR view
- Claim correction workflow with model feedback loop
- Epic FHIR integration (contingent on API access approval)
- Audit trail and compliance logging
- Notification system for claim events
- Claim export mechanism for downstream processing
**Acceptance: Full workflow functional, human review working, audit trails complete.**
#### Milestone 5 — Testing & Go-Live (Weeks 19-24)
- User acceptance testing with billing staff (real claim scenarios)
- Bug fixes and performance optimization
- Analytics dashboard and reporting
- Documentation and training materials
- Go-live preparation and cutover support
- Pilot with 1-3 surgeons, then rapid expansion
**Acceptance: UAT sign-off, 80% same-day charge capture target met, system stable in
production.**
#### Milestone 6 — Support & Maintenance (Post Go-Live)
- Ongoing bug fixes and performance monitoring
- Model retraining based on correction feedback
- Payer rule updates as policies change
- System monitoring and infrastructure management
## 9. Success Metrics
| Metric | Target | Measured At |
|---|---|---|
| Same-day charge capture rate | 80% | M5 Go-Live |
| Claim denial reduction | 10-25% reduction | 3 months post go-live |
| A/R cycle improvement | 5-10 days faster | 3 months post go-live |
| Provider submission time (dictate to approve) | < 1 minute | M5 UAT |
| STT word accuracy (1 - Word Error Rate) | ≥97% on test corpus (WER ≤3%) | M3 acceptance |
| Code mapping accuracy | ≥90% on test cases | M3 acceptance |
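
As an example of how the first metric might be measured, the sketch below treats a charge as captured same-day when the capture date equals the date of service. That definition, the record layout, and the `same_day_capture_rate` helper are assumptions for illustration; the SRS does not define the calculation in this section.

```python
from datetime import date
from typing import List, Optional, Tuple


def same_day_capture_rate(records: List[Tuple[date, Optional[date]]]) -> float:
    """records: (date_of_service, date_charge_captured or None if not yet captured)."""
    if not records:
        raise ValueError("no encounters to measure")
    same_day = sum(1 for service, captured in records if captured == service)
    return same_day / len(records)


# Five hypothetical encounters: four captured on the day of service, one a day late.
encounters = [
    (date(2026, 9, 21), date(2026, 9, 21)),
    (date(2026, 9, 21), date(2026, 9, 21)),
    (date(2026, 9, 22), date(2026, 9, 22)),
    (date(2026, 9, 22), date(2026, 9, 23)),
    (date(2026, 9, 23), date(2026, 9, 23)),
]
rate = same_day_capture_rate(encounters)
target_met = rate >= 0.80
```

Encounters with no capture date count against the rate, so uncaptured charges cannot inflate the metric.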
## 10. Risk Assessment
| Risk | Impact | Detail | Mitigation |
|---|---|---|---|
| AI Hallucination | HIGH | Incorrect codes generated and submitted | Mandatory human review for all claims; confidence thresholds with escalation |
| Epic API Delay | HIGH | FHIR approval can take 3-6 months; blocks hospital integration | Begin Epic approval process immediately; build with mock data; Epic integration is an M4 deliverable, allowing parallel work |
| STT Accuracy | HIGH | 99% target unrealistic; industry standard is 95-98% | Revised to ≥97% word accuracy (WER ≤3%); medical vocabulary fine-tuning; noise reduction; iterative improvement via feedback loop |
| Policy Staleness | MED | Outdated rules cause denials | Manual updates by billing staff in MVP; per-payer relevance filtering; future auto-learning |
| Data Quality | MED | Historical data has gaps or errors | Data validation phase in M1; filter obsolete records per-payer; SME review of training data |
| Adoption Friction | MED | Users resist workflow changes | Smartphone-only capture (no hardware); template fast-track for routine procedures; < 1 min submission target |
| Security Breach | HIGH | PHI exposure | Self-hosted LLM within VPC; AES-256 encryption; HIPAA controls; no external AI API calls |
| Integration Delays | MED | EMR API access issues (especially Athena legacy) | Existing connectors for Athena/CureMD; confirm Centricity version; early API testing in M1 |
| Scope Creep | MED | Requirements grow beyond agreed MVP | Strict scope freeze after M1; change requests require written approval with cost/timeline impact |
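
The AI Hallucination mitigation (mandatory human review plus confidence thresholds with escalation) might be wired up along these lines. The threshold value, queue names, and the `CodeSuggestion` and `review_queue` names are hypothetical, and the codes in the examples are illustrative data only.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical threshold; this section of the SRS does not fix a value.
ESCALATION_THRESHOLD = 0.85


@dataclass
class CodeSuggestion:
    code: str          # CPT or ICD-10 code proposed by the model
    confidence: float  # model confidence in [0, 1]


def review_queue(suggestions: List[CodeSuggestion],
                 threshold: float = ESCALATION_THRESHOLD) -> str:
    """Every claim goes to a human; low-confidence claims go to an escalated queue."""
    if not suggestions:
        return "escalated_review"  # nothing extracted is itself a red flag
    lowest = min(s.confidence for s in suggestions)
    return "standard_review" if lowest >= threshold else "escalated_review"


confident = [CodeSuggestion("22612", 0.95), CodeSuggestion("M48.061", 0.91)]
uncertain = [CodeSuggestion("22612", 0.95), CodeSuggestion("63047", 0.62)]
print(review_queue(confident))
print(review_queue(uncertain))
```

There is no auto-approve branch: the threshold only decides which human queue a claim lands in, consistent with full automation being out of scope for the MVP.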
## 11. Open Questions Requiring Resolution
The following items must be resolved before or during Phase 1. Items marked CRITICAL block
development if unresolved.
### 11.1 Development-Critical Questions
| # | Question | Priority | Owner |
|---|---|---|---|
| Q1 | What are the 5-10 common spine surgery types for MVP with their typical CPT + ICD-10 code combinations? | **CRITICAL** | Dr. McHugh / Billing team |
| Q2 | What is the Athena Centricity version? What APIs are available? (Emily flagged potential legacy roadblocks) | **CRITICAL** | Emily / Tech team |
| Q3 | Is Epic FHIR API access already granted or is approval still pending? What is the expected timeline? | **CRITICAL** | Emily |
| Q4 | What is the exact claim handoff mechanism to Athena? (API push, file import, manual re-entry?) | **CRITICAL** | Emily |
| Q5 | What is the timeline for receiving historical billing data and payer policy documents? | **CRITICAL** | Emily |
| Q6 | What is the prioritized list of top 30 insurance plans? (Start with top 10 for initial validation) | **HIGH** | Emily / Billing team |
| Q7 | What patient identifier links audio to EMR? (MRN, encounter ID, manual entry, or EMR lookup?) | **HIGH** | Dr. McHugh / Emily |
| Q8 | Mobile platforms: iOS only, Android only, or both required for MVP? | **HIGH** | Dr. McHugh |
| Q9 | What specific data fields should be pulled from Epic vs Athena vs CureMD for a given claim? | **HIGH** | Emily / Billing team |
| Q10 | Expected daily volume: how many audio recordings per day, average length per recording? | **HIGH** | Dr. McHugh |
| Q11 | Total concurrent users (providers + billers + admin)? | **HIGH** | Emily |
| Q12 | What format should approved claims be exported in? (PDF CMS-1500, data file, API push to Athena) | **HIGH** | Emily |
| Q13 | Cloud provider preference: AWS vs GCP vs Azure? (GCP recommended in cost analysis) | **MEDIUM** | Emily |
| Q14 | Has infrastructure budget (~$1,100-1,600/month) been confirmed? | **MEDIUM** | Emily |
| Q15 | What notification methods are required? (Email, SMS, in-app push, or combination) | **MEDIUM** | Emily |

_Note: CRITICAL questions must be resolved before Phase 1 can be completed. HIGH questions
must be resolved before Phase 2 begins._
## 12. Stakeholders and Team Structure
### 12.1 Client Stakeholders
| Role | Name | Contact |
|---|---|---|
| Executive Sponsor | Dr. Brian McHugh | DrMcHugh@mchughneurosurgery.com |
| Project Manager | Emily Clifford | emily@farmtotablehealth.com / +1 (716) 983-2572 |
### 12.2 Implementation Partner (Dextra Labs)
| Role | Name | Email |
|---|---|---|
| Engagement Partner | Vijay Agarwal | vijay@dextralabs.com |
| Project Manager | Gaurang Ghadigaonkar | gaurang.ghadigaonkar@dextralabs.com |
| Solution Architect | Yasha Khandelwal | yasha@dextralabs.com |
| Technical Lead | TBD | - |
### 12.3 Development Partner (Tech4Biz Solutions Pvt Ltd)
Tech4Biz Solutions serves as the development partner, providing technical execution and
existing EMR connector expertise (Athena Centricity, CureMD).
## 13. Document Approval
This Software Requirements Specification requires approval from the following stakeholders
before development proceeds:
| Role | Name | Signature | Date |
|---|---|---|---|
| Executive Sponsor | Dr. Brian McHugh | | |
| Client PM | Emily Clifford | | |
| Engagement Partner | Vijay Agarwal (Dextra) | | |
| Solution Architect | Yasha Khandelwal (Dextra) | | |
#### Document Version History:
| Version | Date | Author | Changes |
|---|---|---|---|
| 1.0 | 03 Feb 2026 | Yasha | Initial draft based on discovery documents |
| 1.1 | 03 Feb 2026 | Yasha | Added open questions (Sec 10), clarified system boundary (Sec 1.3) |
| 2.0 | 16 Feb 2026 | Yasha | Major revision: SOW milestone alignment, template fast-track workflow, RAG claim scrubbing, OON billing context, scope narrowing (5-10 spine procedures, 5 providers, CMS-1500 only), STT accuracy target revised to 97% (WER ≤3%), ModMed/UB-04 formally excluded, confidence thresholds, notification system, success metrics, MDM level generation, medical necessity justification, claim state machine, data migration ETL pipeline, model retraining pipeline, all requirements mapped to milestones. Removed all references to specific AI model names; SRS is now technology-agnostic on AI model selection. |
```
— End of Document —
```