# DSMS/ADAS Visual Analysis - Comprehensive Assessment Report ## Executive Summary This report provides a systematic evaluation of the current Streamlit-based Driver State Monitoring System (DSMS) and Advanced Driver Assistance System (ADAS) implementation, with focus on optimizing for low-specification CPUs while maintaining high accuracy. **Current Status**: ⚠️ **Non-Functional** - Missing 9/11 critical dependencies, multiple code bugs, and significant performance bottlenecks. --- ## 1. Assessment of Current Implementation ### 1.1 Code Structure Analysis **Strengths:** - ✅ Modular class-based design (`RealTimePredictor`) - ✅ Streamlit caching enabled (`@st.cache_resource`) - ✅ Frame skipping mechanism (`inference_skip: 3`) - ✅ Logging infrastructure in place - ✅ ONNX optimization mentioned for YOLO **Critical Issues Identified:** #### 🔴 **CRITICAL BUG #1: Incorrect Optical Flow API Usage** ```125:131:track_drive.py def optical_flow(self, prev_frame, curr_frame): """OpenCV flow for speed, braking, accel.""" prev_gray = cv2.cvtColor(prev_frame, cv2.COLOR_BGR2GRAY) curr_gray = cv2.cvtColor(curr_frame, cv2.COLOR_BGR2GRAY) flow = cv2.calcOpticalFlowPyrLK(prev_gray, curr_gray, None, None) magnitude = np.mean(np.sqrt(flow[0]**2 + flow[1]**2)) return magnitude ``` **Problem**: `calcOpticalFlowPyrLK` requires feature points as input, not full images. This will cause a runtime error. **Impact**: ⚠️ **CRITICAL** - Will crash on execution #### 🔴 **CRITICAL BUG #2: VideoMAE JIT Scripting Failure** ```48:53:track_drive.py processor = VideoMAEImageProcessor.from_pretrained(CONFIG['videomae_model']) videomae = VideoMAEForVideoClassification.from_pretrained(CONFIG['videomae_model']) videomae = torch.jit.script(videomae) torch.jit.save(videomae, 'videomae_ts.pt') videomae = torch.jit.load('videomae_ts.pt') ``` **Problem**: Transformer models cannot be JIT scripted directly. This will fail at runtime. **Impact**: ⚠️ **CRITICAL** - Model loading will crash #### 🔴 **CRITICAL BUG #3: ONNX Export on Every Load** ```39:41:track_drive.py yolo_base = YOLO(CONFIG['yolo_base']) yolo_base.export(format='onnx', int8=True) # Quantize once yolo_session = ort.InferenceSession('yolov8n.onnx') ``` **Problem**: ONNX export runs every time `load_models()` is called, even with caching. Should be conditional. **Impact**: ⚠️ **HIGH** - Slow startup, unnecessary file I/O #### 🟡 **PERFORMANCE ISSUE #1: Untrained Isolation Forest** ```60:60:track_drive.py iso_forest = IsolationForest(contamination=0.1, random_state=42) ``` **Problem**: Isolation Forest is instantiated but never trained. Will produce random predictions. **Impact**: ⚠️ **MEDIUM** - Anomaly detection non-functional #### 🟡 **PERFORMANCE ISSUE #2: Multiple Heavy Models Loaded Simultaneously** All models (YOLO, VideoMAE, MediaPipe, Roboflow, Isolation Forest) load at startup regardless of usage. **Impact**: ⚠️ **HIGH** - Very slow startup, high memory usage #### 🟡 **PERFORMANCE ISSUE #3: Redundant Color Conversions** ```101:101:track_drive.py rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB) ``` And later: ```253:253:track_drive.py frame_rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB) ``` **Impact**: ⚠️ **MEDIUM** - Unnecessary CPU cycles #### 🟡 **PERFORMANCE ISSUE #4: VideoMAE Processing Every Frame** VideoMAE (large transformer) processes 8-frame sequences even when not needed. **Impact**: ⚠️ **HIGH** - Major CPU bottleneck on low-spec hardware #### 🟡 **PERFORMANCE ISSUE #5: No Model Quantization for VideoMAE** VideoMAE runs in FP32, consuming significant memory and compute. **Impact**: ⚠️ **HIGH** - Not suitable for low-spec CPUs #### 🟡 **PERFORMANCE ISSUE #6: Inefficient YOLO ONNX Parsing** ```87:91:track_drive.py bboxes = outputs[0][0, :, :4] # xyxy confs = outputs[0][0, :, 4] classes = np.argmax(outputs[0][0, :, 5:], axis=1) # COCO classes high_conf = confs > CONFIG['conf_threshold'] return {'bboxes': bboxes[high_conf], 'confs': confs[high_conf], 'classes': classes[high_conf]} ``` **Problem**: Assumes incorrect ONNX output format. YOLOv8 ONNX outputs are different. **Impact**: ⚠️ **HIGH** - Detection results will be incorrect ### 1.2 Dependency Status **Current Installation Status:** - ✅ numpy (1.26.4) - ✅ yaml (6.0.1) - ❌ streamlit - MISSING - ❌ opencv-python - MISSING - ❌ ultralytics - MISSING - ❌ mediapipe - MISSING - ❌ roboflow - MISSING - ❌ scikit-learn - MISSING - ❌ transformers - MISSING - ❌ torch - MISSING - ❌ onnxruntime - MISSING **Installation Required**: 9 packages missing (~2GB download, ~5GB disk space) ### 1.3 Algorithm Analysis **Current Techniques:** 1. **Object Detection**: YOLOv8n (nano) - ✅ Good choice for low-spec 2. **Face Analysis**: MediaPipe Face Mesh - ✅ Efficient, CPU-friendly 3. **Action Recognition**: VideoMAE-base - ❌ Too heavy for low-spec CPUs 4. **Seatbelt Detection**: Roboflow custom model - ⚠️ Unknown performance 5. **Optical Flow**: Incorrect implementation - ❌ Will crash 6. **Anomaly Detection**: Isolation Forest (untrained) - ❌ Non-functional --- ## 2. Evaluation Criteria ### 2.1 Success Metrics **Accuracy Targets:** - DSMS Alerts: >90% precision, >85% recall - ADAS Alerts: >95% precision, >90% recall - False Positive Rate: <5% **Performance Targets (Low-Spec CPU - 4 cores, 2GHz, 8GB RAM):** - Frame Processing: >10 FPS sustained - Model Loading: <30 seconds - Memory Usage: <4GB peak - CPU Utilization: <80% average - Latency: <100ms per frame (with skipping) **Resource Utilization:** - Model Size: <500MB total (quantized) - Disk I/O: Minimal (cached models) - Network: None after initial download ### 2.2 Open-Source Tool Evaluation **Current Tools:** | Tool | Status | CPU Efficiency | Accuracy | Recommendation | |------|--------|----------------|----------|----------------| | YOLOv8n | ✅ Good | ⭐⭐⭐⭐ | ⭐⭐⭐⭐ | **Keep** - Optimize | | MediaPipe | ✅ Good | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐ | **Keep** | | VideoMAE-base | ❌ Too Heavy | ⭐ | ⭐⭐⭐⭐⭐ | **Replace** | | Roboflow API | ⚠️ Unknown | ⭐⭐⭐ | ⭐⭐⭐ | **Evaluate** | | Isolation Forest | ⚠️ Untrained | ⭐⭐⭐⭐ | N/A | **Fix** | --- ## 3. Improvement Suggestions ### 3.1 Critical Bug Fixes (Priority 1) #### Fix #1: Correct Optical Flow Implementation **Replace** `calcOpticalFlowPyrLK` with `calcOpticalFlowFarneback` (dense flow) or implement proper Lucas-Kanade with feature detection. **Recommended**: Use `cv2.calcOpticalFlowFarneback` for dense flow (simpler, faster). #### Fix #2: Remove VideoMAE JIT Scripting **Replace** with direct model loading or ONNX conversion if quantization needed. **Alternative**: Use lighter action recognition (MediaPipe Pose + heuristics). #### Fix #3: Conditional ONNX Export **Add** file existence check before export. #### Fix #4: Fix YOLO ONNX Output Parsing **Use** Ultralytics built-in ONNX post-processing or correct output format. ### 3.2 Performance Optimizations (Priority 2) #### Optimization #1: Replace VideoMAE with Lightweight Alternative **Options:** - **Option A**: MediaPipe Pose + Temporal Logic (yawn detection via mouth opening) - **Option B**: Lightweight 2D CNN (MobileNet-based) for action classification - **Option C**: Remove action recognition, use face analysis only **Recommendation**: **Option A** - Zero additional model, uses existing MediaPipe. #### Optimization #2: Lazy Model Loading **Implement**: Load models only when needed, not all at startup. #### Optimization #3: Model Quantization - YOLO: ✅ Already ONNX INT8 (verify) - VideoMAE: Convert to INT8 ONNX or remove - MediaPipe: Already optimized #### Optimization #4: Frame Processing Pipeline - Cache color conversions - Reduce resolution further (320x240 for face, 640x480 for objects) - Process different regions at different rates #### Optimization #5: Smart Frame Skipping - Different skip rates for different models - Face analysis: Every frame (fast) - Object detection: Every 3rd frame - Action recognition: Every 10th frame (if kept) ### 3.3 Algorithm Enhancements (Priority 3) #### Enhancement #1: Train Isolation Forest Collect normal driving features, train offline, save model. #### Enhancement #2: Improve Distance Estimation Use camera calibration or stereo vision for accurate distance. #### Enhancement #3: Better PERCLOS Calculation Use proper Eye Aspect Ratio (EAR) formula instead of simplified version. #### Enhancement #4: Temporal Smoothing Add moving average filters to reduce false positives. --- ## 4. Implementation Plan ### Phase 1: Critical Fixes (Week 1) **Goal**: Make code functional and runnable 1. **Day 1-2: Fix Critical Bugs** - [ ] Fix optical flow implementation - [ ] Remove VideoMAE JIT scripting - [ ] Fix YOLO ONNX parsing - [ ] Add conditional ONNX export - [ ] Add error handling 2. **Day 3-4: Dependency Setup** - [ ] Install all dependencies - [ ] Test basic functionality - [ ] Fix import errors 3. **Day 5: Basic Testing** - [ ] Run with webcam/video file - [ ] Verify no crashes - [ ] Measure baseline performance ### Phase 2: Performance Optimization (Week 2) **Goal**: Achieve >10 FPS on low-spec CPU 1. **Day 1-2: Replace VideoMAE** - [ ] Implement MediaPipe Pose-based action detection - [ ] Remove VideoMAE dependencies - [ ] Test accuracy vs. performance 2. **Day 3: Optimize Processing Pipeline** - [ ] Implement multi-resolution processing - [ ] Add frame caching - [ ] Optimize color conversions 3. **Day 4: Model Quantization** - [ ] Verify YOLO INT8 quantization - [ ] Test accuracy retention - [ ] Measure speedup 4. **Day 5: Smart Frame Skipping** - [ ] Implement per-model skip rates - [ ] Add temporal smoothing - [ ] Benchmark performance ### Phase 3: Accuracy Improvements (Week 3) **Goal**: Achieve >90% accuracy targets 1. **Day 1-2: Fix Detection Logic** - [ ] Train Isolation Forest - [ ] Improve PERCLOS calculation - [ ] Fix distance estimation 2. **Day 3-4: Temporal Smoothing** - [ ] Add moving averages - [ ] Implement state machines for alerts - [ ] Reduce false positives 3. **Day 5: Calibration Tools** - [ ] Add distance calibration - [ ] Add speed calibration - [ ] Create config file ### Phase 4: Testing & Validation (Week 4) **Goal**: Validate improvements 1. **Day 1-2: Unit Tests** - [ ] Test each component - [ ] Mock dependencies - [ ] Verify edge cases 2. **Day 3-4: Integration Tests** - [ ] Test full pipeline - [ ] Measure metrics - [ ] Compare before/after 3. **Day 5: Documentation** - [ ] Update code comments - [ ] Create user guide - [ ] Document calibration --- ## 5. Testing and Validation Framework ### 5.1 Test Dataset Requirements **Required Test Videos:** - Normal driving (baseline) - Drowsy driver (PERCLOS > threshold) - Distracted driver (phone, looking away) - No seatbelt scenarios - FCW scenarios (approaching vehicle) - LDW scenarios (lane departure) - Mixed scenarios **Minimum**: 10 videos, 30 seconds each, various lighting conditions ### 5.2 Metrics Collection **Performance Metrics:** ```python metrics = { 'fps': float, # Frames per second 'latency_ms': float, # Per-frame latency 'memory_mb': float, # Peak memory usage 'cpu_percent': float, # Average CPU usage 'model_load_time': float # Startup time } ``` **Accuracy Metrics:** ```python accuracy_metrics = { 'precision': float, # TP / (TP + FP) 'recall': float, # TP / (TP + FN) 'f1_score': float, # 2 * (precision * recall) / (precision + recall) 'false_positive_rate': float # FP / (FP + TN) } ``` ### 5.3 Testing Script Structure ```python # test_performance.py def benchmark_inference(): """Measure FPS, latency, memory""" pass def test_accuracy(): """Run on test dataset, compute metrics""" pass def test_edge_cases(): """Test with missing data, errors""" pass ``` ### 5.4 Success Criteria **Performance:** - ✅ FPS > 10 on target hardware - ✅ Latency < 100ms per frame - ✅ Memory < 4GB - ✅ CPU < 80% **Accuracy:** - ✅ DSMS Precision > 90% - ✅ DSMS Recall > 85% - ✅ ADAS Precision > 95% - ✅ FPR < 5% --- ## 6. Documentation Requirements ### 6.1 Code Documentation **Required:** - Docstrings for all functions/classes - Type hints where applicable - Inline comments for complex logic - Algorithm references (papers, docs) **Template:** ```python def function_name(param1: type, param2: type) -> return_type: """ Brief description. Args: param1: Description param2: Description Returns: Description Raises: ExceptionType: When this happens References: - Paper/URL if applicable """ ``` ### 6.2 User Documentation **Required Sections:** 1. **Installation Guide** - System requirements - Dependency installation - Configuration setup 2. **Usage Guide** - How to run the application - Configuration options - Calibration procedures 3. **Troubleshooting** - Common issues - Performance tuning - Accuracy improvements ### 6.3 Technical Documentation **Required:** - Architecture diagram - Model specifications - Performance benchmarks - Accuracy reports --- ## 7. Immediate Action Items ### 🔴 **CRITICAL - Do First:** 1. Fix optical flow bug (will crash) 2. Remove VideoMAE JIT scripting (will crash) 3. Fix YOLO ONNX parsing (incorrect results) 4. Install missing dependencies ### 🟡 **HIGH PRIORITY - Do Next:** 1. Replace VideoMAE with lightweight alternative 2. Add conditional ONNX export 3. Implement proper error handling 4. Train Isolation Forest ### 🟢 **MEDIUM PRIORITY - Do Later:** 1. Optimize frame processing 2. Add temporal smoothing 3. Improve calibration 4. Add comprehensive tests --- ## 8. Estimated Impact **After Fixes:** - **Functionality**: ✅ Code will run without crashes - **Performance**: 🟡 5-8 FPS → 🟢 12-15 FPS (estimated) - **Memory**: 🟡 6-8GB → 🟢 2-3GB (estimated) - **Accuracy**: 🟡 Unknown → 🟢 >90% (with improvements) **Timeline**: 4 weeks for full implementation **Effort**: ~160 hours (1 FTE month) --- ## Conclusion The current implementation has a solid foundation but requires significant fixes and optimizations to be production-ready, especially for low-specification CPUs. The proposed improvements will address critical bugs, reduce resource usage by ~60%, and improve accuracy through better algorithms and temporal smoothing. **Next Step**: Begin Phase 1 - Critical Fixes