# DSMS/ADAS Visual Analysis - Comprehensive Assessment Report
## Executive Summary
This report provides a systematic evaluation of the current Streamlit-based Driver State Monitoring System (DSMS) and Advanced Driver Assistance System (ADAS) implementation, with a focus on optimizing for low-specification CPUs while maintaining high accuracy.
**Current Status**: ⚠️ **Non-Functional** - Missing 9/11 critical dependencies, multiple code bugs, and significant performance bottlenecks.
---
## 1. Assessment of Current Implementation
### 1.1 Code Structure Analysis
**Strengths:**
- ✅ Modular class-based design (`RealTimePredictor`)
- ✅ Streamlit caching enabled (`@st.cache_resource`)
- ✅ Frame skipping mechanism (`inference_skip: 3`)
- ✅ Logging infrastructure in place
- ✅ ONNX optimization mentioned for YOLO
**Critical Issues Identified:**
#### 🔴 **CRITICAL BUG #1: Incorrect Optical Flow API Usage**
```125:131:track_drive.py
def optical_flow(self, prev_frame, curr_frame):
"""OpenCV flow for speed, braking, accel."""
prev_gray = cv2.cvtColor(prev_frame, cv2.COLOR_BGR2GRAY)
curr_gray = cv2.cvtColor(curr_frame, cv2.COLOR_BGR2GRAY)
flow = cv2.calcOpticalFlowPyrLK(prev_gray, curr_gray, None, None)
magnitude = np.mean(np.sqrt(flow[0]**2 + flow[1]**2))
return magnitude
```
**Problem**: `calcOpticalFlowPyrLK` implements sparse Lucas-Kanade tracking and expects feature points (e.g., from `cv2.goodFeaturesToTrack`) as input, not full images; passing `None` for the points raises a runtime error.
**Impact**: ⚠️ **CRITICAL** - Will crash on execution
#### 🔴 **CRITICAL BUG #2: VideoMAE JIT Scripting Failure**
```48:53:track_drive.py
processor = VideoMAEImageProcessor.from_pretrained(CONFIG['videomae_model'])
videomae = VideoMAEForVideoClassification.from_pretrained(CONFIG['videomae_model'])
videomae = torch.jit.script(videomae)
torch.jit.save(videomae, 'videomae_ts.pt')
videomae = torch.jit.load('videomae_ts.pt')
```
**Problem**: Hugging Face transformer models generally cannot be passed to `torch.jit.script` as-is; the scripting call will fail at model-load time.
**Impact**: ⚠️ **CRITICAL** - Model loading will crash
#### 🔴 **CRITICAL BUG #3: ONNX Export on Every Load**
```39:41:track_drive.py
yolo_base = YOLO(CONFIG['yolo_base'])
yolo_base.export(format='onnx', int8=True) # Quantize once
yolo_session = ort.InferenceSession('yolov8n.onnx')
```
**Problem**: The ONNX export re-runs on every cold start of `load_models()` (`@st.cache_resource` does not persist across process restarts), re-exporting a file that may already exist. It should be guarded by a file-existence check.
**Impact**: ⚠️ **HIGH** - Slow startup, unnecessary file I/O
#### 🟡 **PERFORMANCE ISSUE #1: Untrained Isolation Forest**
```60:60:track_drive.py
iso_forest = IsolationForest(contamination=0.1, random_state=42)
```
**Problem**: Isolation Forest is instantiated but never trained. Will produce random predictions.
**Impact**: ⚠️ **MEDIUM** - Anomaly detection non-functional
#### 🟡 **PERFORMANCE ISSUE #2: Multiple Heavy Models Loaded Simultaneously**
All models (YOLO, VideoMAE, MediaPipe, Roboflow, Isolation Forest) load at startup regardless of usage.
**Impact**: ⚠️ **HIGH** - Very slow startup, high memory usage
#### 🟡 **PERFORMANCE ISSUE #3: Redundant Color Conversions**
```101:101:track_drive.py
rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
```
And later:
```253:253:track_drive.py
frame_rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
```
**Impact**: ⚠️ **MEDIUM** - Unnecessary CPU cycles
#### 🟡 **PERFORMANCE ISSUE #4: VideoMAE Processing Every Frame**
VideoMAE (large transformer) processes 8-frame sequences even when not needed.
**Impact**: ⚠️ **HIGH** - Major CPU bottleneck on low-spec hardware
#### 🟡 **PERFORMANCE ISSUE #5: No Model Quantization for VideoMAE**
VideoMAE runs in FP32, consuming significant memory and compute.
**Impact**: ⚠️ **HIGH** - Not suitable for low-spec CPUs
#### 🟡 **PERFORMANCE ISSUE #6: Inefficient YOLO ONNX Parsing**
```87:91:track_drive.py
bboxes = outputs[0][0, :, :4] # xyxy
confs = outputs[0][0, :, 4]
classes = np.argmax(outputs[0][0, :, 5:], axis=1) # COCO classes
high_conf = confs > CONFIG['conf_threshold']
return {'bboxes': bboxes[high_conf], 'confs': confs[high_conf], 'classes': classes[high_conf]}
```
**Problem**: This indexing assumes an `(N, 85)` detection layout with a separate objectness score. A YOLOv8 ONNX export actually outputs `(1, 84, 8400)` for a 640×640 COCO model: 4 box coordinates plus 80 class scores per candidate, transposed and without an objectness column, so the slices above read the wrong values.
**Impact**: ⚠️ **HIGH** - Detection results will be incorrect
### 1.2 Dependency Status
**Current Installation Status:**
- ✅ numpy (1.26.4)
- ✅ yaml (6.0.1)
- ❌ streamlit - MISSING
- ❌ opencv-python - MISSING
- ❌ ultralytics - MISSING
- ❌ mediapipe - MISSING
- ❌ roboflow - MISSING
- ❌ scikit-learn - MISSING
- ❌ transformers - MISSING
- ❌ torch - MISSING
- ❌ onnxruntime - MISSING
**Installation Required**: 9 packages missing (~2GB download, ~5GB disk space)
### 1.3 Algorithm Analysis
**Current Techniques:**
1. **Object Detection**: YOLOv8n (nano) - ✅ Good choice for low-spec
2. **Face Analysis**: MediaPipe Face Mesh - ✅ Efficient, CPU-friendly
3. **Action Recognition**: VideoMAE-base - ❌ Too heavy for low-spec CPUs
4. **Seatbelt Detection**: Roboflow custom model - ⚠️ Unknown performance
5. **Optical Flow**: Incorrect implementation - ❌ Will crash
6. **Anomaly Detection**: Isolation Forest (untrained) - ❌ Non-functional
---
## 2. Evaluation Criteria
### 2.1 Success Metrics
**Accuracy Targets:**
- DSMS Alerts: >90% precision, >85% recall
- ADAS Alerts: >95% precision, >90% recall
- False Positive Rate: <5%
**Performance Targets (Low-Spec CPU - 4 cores, 2GHz, 8GB RAM):**
- Frame Processing: >10 FPS sustained
- Model Loading: <30 seconds
- Memory Usage: <4GB peak
- CPU Utilization: <80% average
- Latency: <100ms per frame (with skipping)
**Resource Utilization:**
- Model Size: <500MB total (quantized)
- Disk I/O: Minimal (cached models)
- Network: None after initial download
### 2.2 Open-Source Tool Evaluation
**Current Tools:**
| Tool | Status | CPU Efficiency | Accuracy | Recommendation |
|------|--------|----------------|----------|----------------|
| YOLOv8n | ✅ Good | ⭐⭐⭐⭐ | ⭐⭐⭐⭐ | **Keep** - Optimize |
| MediaPipe | ✅ Good | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐ | **Keep** |
| VideoMAE-base | ❌ Too Heavy | ⭐ | ⭐⭐⭐⭐⭐ | **Replace** |
| Roboflow API | ⚠️ Unknown | ⭐⭐⭐ | ⭐⭐⭐ | **Evaluate** |
| Isolation Forest | ⚠️ Untrained | ⭐⭐⭐⭐ | N/A | **Fix** |
---
## 3. Improvement Suggestions
### 3.1 Critical Bug Fixes (Priority 1)
#### Fix #1: Correct Optical Flow Implementation
**Replace** `calcOpticalFlowPyrLK` with `calcOpticalFlowFarneback` (dense flow) or implement proper Lucas-Kanade with feature detection.
**Recommended**: Use `cv2.calcOpticalFlowFarneback` for dense flow (simpler, faster).
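A minimal sketch of the Farneback replacement (the parameter values below are common defaults, not tuned for this pipeline):
```python
import cv2
import numpy as np

def optical_flow(prev_frame, curr_frame):
    """Mean dense-flow magnitude as a proxy for ego-motion/speed changes."""
    prev_gray = cv2.cvtColor(prev_frame, cv2.COLOR_BGR2GRAY)
    curr_gray = cv2.cvtColor(curr_frame, cv2.COLOR_BGR2GRAY)
    # Farneback returns an HxWx2 array of per-pixel (dx, dy) displacements.
    flow = cv2.calcOpticalFlowFarneback(
        prev_gray, curr_gray, None,
        pyr_scale=0.5, levels=3, winsize=15,
        iterations=3, poly_n=5, poly_sigma=1.2, flags=0)
    return float(np.mean(np.linalg.norm(flow, axis=2)))
```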
#### Fix #2: Remove VideoMAE JIT Scripting
**Replace** with direct model loading, or with ONNX conversion if quantization is needed.
**Alternative**: Use lighter action recognition (MediaPipe Pose + heuristics).
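If VideoMAE is retained at all, a sketch of loading it without scripting (the dynamic INT8 quantization step is an optional CPU-side speedup, not part of the original code):
```python
import torch
from transformers import VideoMAEForVideoClassification

videomae = VideoMAEForVideoClassification.from_pretrained(CONFIG['videomae_model'])
videomae.eval()  # plain eager-mode inference; no torch.jit.script

# Optional: dynamic INT8 quantization of Linear layers for CPU inference.
videomae = torch.quantization.quantize_dynamic(
    videomae, {torch.nn.Linear}, dtype=torch.qint8)
```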
#### Fix #3: Conditional ONNX Export
**Add** file existence check before export.
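A sketch of the guarded export (mirrors the original export call; the output path is assumed from the existing code):
```python
import os
import onnxruntime as ort
from ultralytics import YOLO

ONNX_PATH = 'yolov8n.onnx'
if not os.path.exists(ONNX_PATH):
    # Export once; subsequent cold starts reuse the cached file.
    YOLO(CONFIG['yolo_base']).export(format='onnx', int8=True)
yolo_session = ort.InferenceSession(ONNX_PATH)
```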
#### Fix #4: Fix YOLO ONNX Output Parsing
**Use** Ultralytics built-in ONNX post-processing or correct output format.
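The simplest correct path is to let Ultralytics run the exported ONNX model itself, since it applies the right decode and NMS for the raw `(1, 84, 8400)` output; a sketch:
```python
from ultralytics import YOLO

# Ultralytics accepts the .onnx file directly and handles post-processing,
# so no manual output-tensor parsing is needed.
yolo = YOLO('yolov8n.onnx')
results = yolo(frame, conf=CONFIG['conf_threshold'])
boxes = results[0].boxes  # .xyxy, .conf, .cls -- already confidence-filtered
```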
### 3.2 Performance Optimizations (Priority 2)
#### Optimization #1: Replace VideoMAE with Lightweight Alternative
**Options:**
- **Option A**: MediaPipe Pose + Temporal Logic (yawn detection via mouth opening)
- **Option B**: Lightweight 2D CNN (MobileNet-based) for action classification
- **Option C**: Remove action recognition, use face analysis only
**Recommendation**: **Option A** - Zero additional model, uses existing MediaPipe.
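A sketch of the Option A yawn check from Face Mesh landmarks (indices 13/14 for the inner lips and 61/291 for the mouth corners are commonly used choices; the threshold is an assumption to calibrate on real footage):
```python
import numpy as np

YAWN_MAR_THRESHOLD = 0.6  # assumed value; tune per camera placement

def mouth_aspect_ratio(landmarks) -> float:
    """Mouth opening relative to mouth width, from normalized landmarks."""
    top, bottom = landmarks[13], landmarks[14]
    left, right = landmarks[61], landmarks[291]
    vertical = np.hypot(top.x - bottom.x, top.y - bottom.y)
    horizontal = np.hypot(left.x - right.x, left.y - right.y)
    return vertical / horizontal

def is_yawning(landmarks) -> bool:
    return mouth_aspect_ratio(landmarks) > YAWN_MAR_THRESHOLD
```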
#### Optimization #2: Lazy Model Loading
**Implement**: Load models only when needed, not all at startup.
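One way to do this in Streamlit is a cached factory per model, so each loads on first use only (function names here are illustrative):
```python
import streamlit as st

@st.cache_resource
def get_yolo():
    from ultralytics import YOLO
    return YOLO('yolov8n.onnx')

@st.cache_resource
def get_face_mesh():
    import mediapipe as mp
    return mp.solutions.face_mesh.FaceMesh(max_num_faces=1)

# Each model is constructed on its first call, then cached across reruns.
```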
#### Optimization #3: Model Quantization
- YOLO: ✅ Already ONNX INT8 (verify)
- VideoMAE: Convert to INT8 ONNX or remove
- MediaPipe: Already optimized
#### Optimization #4: Frame Processing Pipeline
- Cache color conversions (see the sketch after this list)
- Reduce resolution further (320x240 for face, 640x480 for objects)
- Process different regions at different rates
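A sketch of the cached, per-consumer conversions (resolutions taken from the list above; `frame` is the current BGR capture):
```python
import cv2

# Convert and resize once per captured frame; pass the cached copies to
# every consumer instead of re-converting inside each model wrapper.
rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
face_input = cv2.resize(rgb, (320, 240))      # MediaPipe face analysis
detect_input = cv2.resize(frame, (640, 480))  # YOLO object detection
```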
#### Optimization #5: Smart Frame Skipping
- Different skip rates for different models
- Face analysis: Every frame (fast)
- Object detection: Every 3rd frame
- Action recognition: Every 10th frame, if kept (see the sketch below)
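A minimal sketch of the per-model skip rates (`should_run` and the rate table are illustrative names):
```python
# Frames between runs for each model, matching the plan above.
SKIP_RATES = {'face': 1, 'objects': 3, 'action': 10}

def should_run(model_name: str, frame_idx: int) -> bool:
    """True on the frames where this model should execute."""
    return frame_idx % SKIP_RATES[model_name] == 0

# In the capture loop:
#   if should_run('objects', idx):
#       detections = run_yolo(frame)
```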
### 3.3 Algorithm Enhancements (Priority 3)
#### Enhancement #1: Train Isolation Forest
Collect normal driving features, train offline, save model.
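A sketch of the offline training step (`normal_features.npy` is a hypothetical file of features logged during normal driving):
```python
import joblib
import numpy as np
from sklearn.ensemble import IsolationForest

X = np.load('normal_features.npy')  # hypothetical: (n_samples, n_features)
iso_forest = IsolationForest(contamination=0.1, random_state=42).fit(X)
joblib.dump(iso_forest, 'iso_forest.joblib')

# At runtime, load the fitted model instead of instantiating a fresh one:
#   iso_forest = joblib.load('iso_forest.joblib')
#   is_anomaly = iso_forest.predict(features.reshape(1, -1))[0] == -1
```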
#### Enhancement #2: Improve Distance Estimation
Use camera calibration or stereo vision for accurate distance.
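For a single calibrated camera, a pinhole-model estimate is the usual starting point; the focal length and vehicle height below are placeholder values to replace with calibrated ones:
```python
def estimate_distance_m(bbox_height_px: float,
                        focal_px: float = 700.0,      # from one-time calibration
                        real_height_m: float = 1.5):  # assumed passenger-car height
    """Pinhole model: distance = focal_length * real_height / pixel_height."""
    return focal_px * real_height_m / max(bbox_height_px, 1e-6)
```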
#### Enhancement #3: Better PERCLOS Calculation
Use proper Eye Aspect Ratio (EAR) formula instead of simplified version.
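A sketch of the standard EAR formula (Soukupova and Cech, 2016), with `p` as the six eye landmarks in their conventional order:
```python
import numpy as np

def eye_aspect_ratio(p: np.ndarray) -> float:
    """EAR over six eye landmarks: p[0]/p[3] are the horizontal corners,
    (p[1], p[5]) and (p[2], p[4]) the two vertical pairs."""
    v1 = np.linalg.norm(p[1] - p[5])
    v2 = np.linalg.norm(p[2] - p[4])
    h = np.linalg.norm(p[0] - p[3])
    return (v1 + v2) / (2.0 * h)

# PERCLOS over a window = fraction of frames with EAR below a closed-eye
# threshold (0.2 is a common starting point; calibrate per driver/camera).
```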
#### Enhancement #4: Temporal Smoothing
Add moving average filters to reduce false positives.
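A minimal moving-average smoother (class name, window sizes, and the alert threshold are illustrative):
```python
from collections import deque

class MovingAverage:
    """Fixed-window mean to debounce noisy per-frame scores."""
    def __init__(self, window: int = 15):
        self.values = deque(maxlen=window)

    def update(self, value: float) -> float:
        self.values.append(value)
        return sum(self.values) / len(self.values)

# Alert only on the smoothed signal, e.g.:
#   drowsy_avg = MovingAverage(window=30)
#   if drowsy_avg.update(perclos) > 0.4:  # assumed threshold
#       raise_alert('drowsiness')
```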
---
## 4. Implementation Plan
### Phase 1: Critical Fixes (Week 1)
**Goal**: Make code functional and runnable
1. **Day 1-2: Fix Critical Bugs**
- [ ] Fix optical flow implementation
- [ ] Remove VideoMAE JIT scripting
- [ ] Fix YOLO ONNX parsing
- [ ] Add conditional ONNX export
- [ ] Add error handling
2. **Day 3-4: Dependency Setup**
- [ ] Install all dependencies
- [ ] Test basic functionality
- [ ] Fix import errors
3. **Day 5: Basic Testing**
- [ ] Run with webcam/video file
- [ ] Verify no crashes
- [ ] Measure baseline performance
### Phase 2: Performance Optimization (Week 2)
**Goal**: Achieve >10 FPS on low-spec CPU
1. **Day 1-2: Replace VideoMAE**
- [ ] Implement MediaPipe Pose-based action detection
- [ ] Remove VideoMAE dependencies
- [ ] Test accuracy vs. performance
2. **Day 3: Optimize Processing Pipeline**
- [ ] Implement multi-resolution processing
- [ ] Add frame caching
- [ ] Optimize color conversions
3. **Day 4: Model Quantization**
- [ ] Verify YOLO INT8 quantization
- [ ] Test accuracy retention
- [ ] Measure speedup
4. **Day 5: Smart Frame Skipping**
- [ ] Implement per-model skip rates
- [ ] Add temporal smoothing
- [ ] Benchmark performance
### Phase 3: Accuracy Improvements (Week 3)
**Goal**: Achieve >90% accuracy targets
1. **Day 1-2: Fix Detection Logic**
- [ ] Train Isolation Forest
- [ ] Improve PERCLOS calculation
- [ ] Fix distance estimation
2. **Day 3-4: Temporal Smoothing**
- [ ] Add moving averages
- [ ] Implement state machines for alerts
- [ ] Reduce false positives
3. **Day 5: Calibration Tools**
- [ ] Add distance calibration
- [ ] Add speed calibration
- [ ] Create config file
### Phase 4: Testing & Validation (Week 4)
**Goal**: Validate improvements
1. **Day 1-2: Unit Tests**
- [ ] Test each component
- [ ] Mock dependencies
- [ ] Verify edge cases
2. **Day 3-4: Integration Tests**
- [ ] Test full pipeline
- [ ] Measure metrics
- [ ] Compare before/after
3. **Day 5: Documentation**
- [ ] Update code comments
- [ ] Create user guide
- [ ] Document calibration
---
## 5. Testing and Validation Framework
### 5.1 Test Dataset Requirements
**Required Test Videos:**
- Normal driving (baseline)
- Drowsy driver (PERCLOS > threshold)
- Distracted driver (phone, looking away)
- No seatbelt scenarios
- FCW scenarios (approaching vehicle)
- LDW scenarios (lane departure)
- Mixed scenarios
**Minimum**: 10 videos, 30 seconds each, various lighting conditions
### 5.2 Metrics Collection
**Performance Metrics:**
```python
metrics = {
    'fps': float,              # Frames per second
    'latency_ms': float,       # Per-frame latency
    'memory_mb': float,        # Peak memory usage
    'cpu_percent': float,      # Average CPU usage
    'model_load_time': float,  # Startup time
}
```
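One way to populate these fields (psutil would be an added dev dependency; variable names are illustrative):
```python
import time
import psutil

proc = psutil.Process()

start = time.perf_counter()
# ... process one frame ...
latency_ms = (time.perf_counter() - start) * 1000.0
fps = 1000.0 / latency_ms if latency_ms > 0 else 0.0
memory_mb = proc.memory_info().rss / (1024 ** 2)
cpu_percent = proc.cpu_percent(interval=None)  # usage since the last call
```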
**Accuracy Metrics:**
```python
accuracy_metrics = {
    'precision': float,            # TP / (TP + FP)
    'recall': float,               # TP / (TP + FN)
    'f1_score': float,             # 2 * (precision * recall) / (precision + recall)
    'false_positive_rate': float,  # FP / (FP + TN)
}
```
### 5.3 Testing Script Structure
```python
# test_performance.py
def benchmark_inference():
    """Measure FPS, latency, memory."""
    pass

def test_accuracy():
    """Run on test dataset, compute metrics."""
    pass

def test_edge_cases():
    """Test with missing data, errors."""
    pass
```
### 5.4 Success Criteria
**Performance:**
- FPS > 10 on target hardware
- Latency < 100 ms per frame
- Memory < 4 GB peak
- CPU < 80% average
**Accuracy:**
- DSMS precision > 90%, recall > 85%
- ADAS precision > 95%, recall > 90%
- False positive rate < 5%
---
## 6. Documentation Requirements
### 6.1 Code Documentation
**Required:**
- Docstrings for all functions/classes
- Type hints where applicable
- Inline comments for complex logic
- Algorithm references (papers, docs)
**Template:**
```python
def function_name(param1: type, param2: type) -> return_type:
    """
    Brief description.

    Args:
        param1: Description
        param2: Description

    Returns:
        Description

    Raises:
        ExceptionType: When this happens

    References:
        - Paper/URL if applicable
    """
```
### 6.2 User Documentation
**Required Sections:**
1. **Installation Guide**
- System requirements
- Dependency installation
- Configuration setup
2. **Usage Guide**
- How to run the application
- Configuration options
- Calibration procedures
3. **Troubleshooting**
- Common issues
- Performance tuning
- Accuracy improvements
### 6.3 Technical Documentation
**Required:**
- Architecture diagram
- Model specifications
- Performance benchmarks
- Accuracy reports
---
## 7. Immediate Action Items
### 🔴 **CRITICAL - Do First:**
1. Fix optical flow bug (will crash)
2. Remove VideoMAE JIT scripting (will crash)
3. Fix YOLO ONNX parsing (incorrect results)
4. Install missing dependencies
### 🟡 **HIGH PRIORITY - Do Next:**
1. Replace VideoMAE with lightweight alternative
2. Add conditional ONNX export
3. Implement proper error handling
4. Train Isolation Forest
### 🟢 **MEDIUM PRIORITY - Do Later:**
1. Optimize frame processing
2. Add temporal smoothing
3. Improve calibration
4. Add comprehensive tests
---
## 8. Estimated Impact
**After Fixes:**
- **Functionality**: Code will run without crashes
- **Performance**: 🟡 5-8 FPS after fixes → 🟢 12-15 FPS after optimization (estimated)
- **Memory**: 🟡 6-8 GB → 🟢 2-3 GB after replacing VideoMAE (estimated)
- **Accuracy**: 🟡 unknown → 🟢 >90% (with improvements)
**Timeline**: 4 weeks for full implementation
**Effort**: ~160 hours (1 FTE month)
---
## Conclusion
The current implementation has a solid foundation but requires significant fixes and optimizations to be production-ready, especially for low-specification CPUs. The proposed improvements will address critical bugs, reduce resource usage by ~60%, and improve accuracy through better algorithms and temporal smoothing.
**Next Step**: Begin Phase 1 - Critical Fixes