# Data Collection Analysis - What We Have vs What We're Collecting
## Overview
This document compares the database structure with what we're currently collecting and recommends what we should start collecting for the Detailed Reports.
---
## 1. ACTIVITIES TABLE
### ✅ **Database Fields Available:**
```sql
- activity_id (PK)
- request_id (FK) COLLECTING
- user_id (FK) COLLECTING
- user_name COLLECTING
- activity_type COLLECTING
- activity_description COLLECTING
- activity_category NOT COLLECTING (set to NULL)
- severity NOT COLLECTING (set to NULL)
- metadata COLLECTING (partially)
- is_system_event COLLECTING
- ip_address NOT COLLECTING (set to NULL)
- user_agent NOT COLLECTING (set to NULL)
- created_at COLLECTING
```
### 🔴 **Currently NOT Collecting (But Should):**
1. **IP Address** (`ip_address`)
   - **Status:** Field exists, but always set to `null`
   - **Impact:** Cannot show IP in User Activity Log Report
   - **Fix:** Extract from `req.ip`, falling back to the first entry of `req.headers['x-forwarded-for']`, in controllers
   - **Priority:** HIGH (needed for security/audit)
2. **User Agent** (`user_agent`)
   - **Status:** Field exists, but always set to `null`
   - **Impact:** Cannot show device/browser info in reports
   - **Fix:** Extract from `req.headers['user-agent']` in controllers
   - **Priority:** MEDIUM (nice to have for analytics)
3. **Activity Category** (`activity_category`)
   - **Status:** Field exists, but always set to `null`
   - **Impact:** Cannot categorize activities (e.g., "AUTHENTICATION", "WORKFLOW", "DOCUMENT")
   - **Fix:** Map `activity_type` to category:
     - `created`, `approval`, `rejection`, `status_change` → "WORKFLOW"
     - `comment` → "COLLABORATION"
     - `document_added` → "DOCUMENT"
     - `sla_warning` → "SYSTEM"
   - **Priority:** MEDIUM (helps with filtering/reporting)
4. **Severity** (`severity`)
   - **Status:** Field exists, but always set to `null`
   - **Impact:** Cannot prioritize critical activities
   - **Fix:** Map based on activity type:
     - `rejection`, `sla_warning` → "WARNING"
     - `approval`, `closed` → "INFO"
     - `status_change` → "INFO"
   - **Priority:** LOW (optional enhancement)
### 📝 **Recommendation:**
**Update `activity.service.ts` to accept and store:**
```typescript
async log(entry: ActivityEntry & {
  ipAddress?: string;
  userAgent?: string;
  category?: string;
  severity?: string;
}) {
  // ... existing code ...
  const activityData = {
    // ... existing fields ...
    ipAddress: entry.ipAddress || null,
    userAgent: entry.userAgent || null,
    activityCategory: entry.category || this.inferCategory(entry.type),
    severity: entry.severity || this.inferSeverity(entry.type),
  };
}
```
**Update all controller calls to pass IP and User Agent:**
```typescript
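// x-forwarded-for may be an array or a comma-separated list; the first entry is the original client
const forwardedFor = req.headers['x-forwarded-for'];
const forwardedIp = (Array.isArray(forwardedFor) ? forwardedFor[0] : forwardedFor)?.split(',')[0].trim();

activityService.log({
  // ... existing fields ...
  ipAddress: req.ip || forwardedIp, // the service stores null when this is undefined
  userAgent: req.headers['user-agent'],
});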
```
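Since every controller needs the same extraction, it could be centralized in one helper. The following is a minimal sketch, not existing code: `getClientInfo`, `RequestLike`, and `ClientInfo` are hypothetical names, and the request shape is reduced to the two properties used so the snippet runs without Express installed.

```typescript
// Minimal request shape for this sketch (assumption: an Express request
// exposes `ip` and `headers` in a compatible form).
interface RequestLike {
  ip?: string;
  headers: Record<string, string | string[] | undefined>;
}

interface ClientInfo {
  ipAddress: string | null;
  userAgent: string | null;
}

// Hypothetical helper: not part of the existing codebase.
function getClientInfo(req: RequestLike): ClientInfo {
  const forwarded = req.headers['x-forwarded-for'];
  // The header can hold "client, proxy1, proxy2" or an array of such
  // strings; the first entry is the original client.
  const forwardedFirst = (Array.isArray(forwarded) ? forwarded[0] : forwarded)
    ?.split(',')[0]
    ?.trim();

  const userAgent = req.headers['user-agent'];

  return {
    ipAddress: req.ip || forwardedFirst || null,
    userAgent: (Array.isArray(userAgent) ? userAgent[0] : userAgent) || null,
  };
}
```

A controller could then spread the result into the log call, e.g. `activityService.log({ ...fields, ...getClientInfo(req) })` once the service accepts nullable values. Note that `X-Forwarded-For` is set by the client or by proxies, so it should only be trusted behind a known proxy; with Express `trust proxy` configured, `req.ip` already reflects it.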
---
## 2. APPROVAL_LEVELS TABLE
### ✅ **Database Fields Available:**
```sql
- level_id (PK)
- request_id (FK) COLLECTING
- level_number COLLECTING
- level_name OPTIONAL (may not be set)
- approver_id (FK) COLLECTING
- approver_email COLLECTING
- approver_name COLLECTING
- tat_hours COLLECTING
- tat_days COLLECTING (auto-calculated)
- status COLLECTING
- level_start_time COLLECTING
- level_end_time COLLECTING
- action_date COLLECTING
- comments COLLECTING
- rejection_reason COLLECTING
- is_final_approver COLLECTING
- elapsed_hours COLLECTING
- remaining_hours COLLECTING
- tat_percentage_used COLLECTING
- tat50_alert_sent COLLECTING
- tat75_alert_sent COLLECTING
- tat_breached COLLECTING
- tat_start_time COLLECTING
- created_at COLLECTING
- updated_at COLLECTING
```
### 🔴 **Currently NOT Collecting (But Should):**
1. **Level Name** (`level_name`)
   - **Status:** Field exists, but may be NULL
   - **Impact:** Cannot show stage name in reports (only level number)
   - **Fix:** When creating approval levels, prompt for or auto-generate level names:
     - "Department Head Review"
     - "Finance Approval"
     - "Final Approval"
   - **Priority:** MEDIUM (improves report readability)
### 📝 **Recommendation:**
**Ensure level_name is set when creating approval levels:**
```typescript
await ApprovalLevel.create({
  // ... existing fields ...
  levelName: levelData.levelName || `Level ${levelNumber}`,
});
```
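The fallback above can be pulled into a small function so that whitespace-only names are also treated as missing. This is a sketch under assumptions: `resolveLevelName` is a hypothetical helper, and using "Final Approval" for the final approver is an illustrative choice based on the example names listed earlier, not an existing rule.

```typescript
// Hypothetical helper: picks the provided name, else a generated one.
function resolveLevelName(
  provided: string | null | undefined,
  levelNumber: number,
  isFinalApprover: boolean
): string {
  const trimmed = provided?.trim();
  if (trimmed) return trimmed; // a real name was supplied
  // Assumption: the last level gets a descriptive default, others get "Level N".
  return isFinalApprover ? 'Final Approval' : `Level ${levelNumber}`;
}
```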
---
## 3. USER_SESSIONS TABLE
### ✅ **Database Fields Available:**
```sql
- session_id (PK)
- user_id (FK)
- session_token COLLECTING
- refresh_token COLLECTING
- ip_address CHECK IF COLLECTING
- user_agent CHECK IF COLLECTING
- device_type CHECK IF COLLECTING
- browser CHECK IF COLLECTING
- os CHECK IF COLLECTING
- login_at COLLECTING
- last_activity_at COLLECTING
- logout_at CHECK IF COLLECTING
- expires_at COLLECTING
- is_active COLLECTING
- logout_reason CHECK IF COLLECTING
```
### 🔴 **Missing for Login Activity Tracking:**
1. **Login Activities in Activities Table**
   - **Status:** Login events are NOT logged in `activities` table
   - **Impact:** Cannot show login activities in User Activity Log Report
   - **Fix:** Add login activity logging in auth middleware/controller:
     ```typescript
     // After successful login
     await activityService.log({
       // Special request ID for system events. Assumes the `request_id` FK
       // in `activities` accepts this sentinel value.
       requestId: 'SYSTEM_LOGIN',
       type: 'login',
       user: { userId, name: user.displayName },
       ipAddress: req.ip,
       userAgent: req.headers['user-agent'],
       category: 'AUTHENTICATION',
       severity: 'INFO',
       timestamp: new Date().toISOString(),
       action: 'User Login',
       details: `User logged in from ${req.ip}`
     });
     ```
   - **Priority:** HIGH (needed for security audit)
2. **Device/Browser Parsing**
   - **Status:** Fields exist but may not be populated
   - **Impact:** Cannot show device type in reports
   - **Fix:** Parse user agent to extract:
     - `device_type`: "WEB", "MOBILE"
     - `browser`: "Chrome", "Firefox", "Safari"
     - `os`: "Windows", "macOS", "iOS", "Android"
   - **Priority:** MEDIUM (nice to have)
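
The device/browser parsing described above can be sketched with ordered regular-expression rules. This is a rough heuristic for illustration only: `parseUserAgent`, `ParsedUserAgent`, and the rule tables are hypothetical names, and user-agent strings vary enough that a maintained parsing library may be the better choice in production.

```typescript
interface ParsedUserAgent {
  deviceType: 'WEB' | 'MOBILE';
  browser: string;
  os: string;
}

// Order matters: Edge UAs contain "Chrome", and Chrome UAs contain "Safari".
const BROWSER_RULES: Array<[RegExp, string]> = [
  [/Edg\//, 'Edge'],
  [/Firefox\//, 'Firefox'],
  [/Chrome\//, 'Chrome'],
  [/Safari\//, 'Safari'],
];

// Order matters: iOS UAs contain "like Mac OS X", Android UAs contain "Linux".
const OS_RULES: Array<[RegExp, string]> = [
  [/iPhone|iPad|iPod/, 'iOS'],
  [/Android/, 'Android'],
  [/Windows NT/, 'Windows'],
  [/Mac OS X/, 'macOS'],
  [/Linux/, 'Linux'],
];

// Returns the label of the first rule whose pattern matches, else "Unknown".
function firstMatch(rules: Array<[RegExp, string]>, ua: string): string {
  const hit = rules.find(([pattern]) => pattern.test(ua));
  return hit ? hit[1] : 'Unknown';
}

function parseUserAgent(ua: string): ParsedUserAgent {
  const os = firstMatch(OS_RULES, ua);
  const isMobile = os === 'iOS' || os === 'Android' || /Mobile/.test(ua);
  return {
    deviceType: isMobile ? 'MOBILE' : 'WEB',
    browser: firstMatch(BROWSER_RULES, ua),
    os,
  };
}
```

The three returned values map onto the `device_type`, `browser`, and `os` columns of `user_sessions`.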
---
## 4. WORKFLOW_REQUESTS TABLE
### ✅ **All Fields Are Being Collected:**
- All fields in `workflow_requests` are properly collected
- No missing data here
### 📝 **Note:**
- `submission_date` vs `created_at`: Use `submission_date` for "days open" calculation
- `closure_date`: Available for completed requests
---
## 5. TAT_TRACKING TABLE
### ✅ **Database Fields Available:**
```sql
- tracking_id (PK)
- request_id (FK)
- level_id (FK)
- tracking_type COLLECTING
- tat_status COLLECTING
- total_tat_hours COLLECTING
- elapsed_hours COLLECTING
- remaining_hours COLLECTING
- percentage_used COLLECTING
- threshold_50_breached COLLECTING
- threshold_50_alerted_at COLLECTING
- threshold_80_breached COLLECTING
- threshold_80_alerted_at COLLECTING
- threshold_100_breached COLLECTING
- threshold_100_alerted_at COLLECTING
- alert_count COLLECTING
- last_calculated_at COLLECTING
```
### ✅ **All Fields Are Being Collected:**
- TAT tracking appears to be fully implemented
---
## 6. AUDIT_LOGS TABLE
### ✅ **Database Fields Available:**
```sql
- audit_id (PK)
- user_id (FK)
- entity_type
- entity_id
- action
- action_category
- old_values (JSONB)
- new_values (JSONB)
- changes_summary
- ip_address
- user_agent
- session_id
- request_method
- request_url
- response_status
- execution_time_ms
- created_at
```
### 🔴 **Status:**
- **Audit logging may not be fully implemented**
- **Impact:** Cannot track all system changes for audit purposes
- **Priority:** MEDIUM (for compliance/security)
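
One part of audit logging that can be sketched independently of the web framework is deriving `old_values`, `new_values`, and `changes_summary` from a before/after snapshot of an entity. The helper below is hypothetical (`diffForAudit` and its types do not exist in the codebase) and assumes the snapshots are plain JSON-compatible objects.

```typescript
type Values = Record<string, unknown>;

interface AuditChange {
  oldValues: Values;
  newValues: Values;
  changesSummary: string;
}

// Hypothetical helper: keeps only the keys whose values differ.
function diffForAudit(before: Values, after: Values): AuditChange {
  const oldValues: Values = {};
  const newValues: Values = {};
  const keys = Array.from(
    new Set(Object.keys(before).concat(Object.keys(after)))
  );

  for (const key of keys) {
    // Comparing serialized values is enough for plain JSON data, but it is
    // sensitive to key order inside nested objects.
    if (JSON.stringify(before[key]) !== JSON.stringify(after[key])) {
      oldValues[key] = before[key];
      newValues[key] = after[key];
    }
  }

  const changed = Object.keys(newValues);
  return {
    oldValues,
    newValues,
    changesSummary: changed.length
      ? `Changed: ${changed.join(', ')}`
      : 'No changes',
  };
}
```

The request-level columns (`request_method`, `request_url`, `response_status`, `execution_time_ms`) would still need to be captured separately, for example in middleware.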
---
## SUMMARY: What to Start Collecting
### 🔴 **HIGH PRIORITY (Must Have for Reports):**
1. **IP Address in Activities** ✅ Field exists, just need to populate
   - Extract from `req.ip` or `req.headers['x-forwarded-for']`
   - Update `activity.service.ts` to accept IP
   - Update all controller calls
2. **User Agent in Activities** ✅ Field exists, just need to populate
   - Extract from `req.headers['user-agent']`
   - Update `activity.service.ts` to accept user agent
   - Update all controller calls
3. **Login Activities** ❌ Not currently logged
   - Add login activity logging in auth controller
   - Use special `requestId: 'SYSTEM_LOGIN'` for system events
   - Include IP and user agent
### 🟡 **MEDIUM PRIORITY (Nice to Have):**
4. **Activity Category** ✅ Field exists, just need to populate
   - Auto-infer from `activity_type`
   - Helps with filtering and reporting
5. **Level Names** ✅ Field exists, ensure it's set
   - Improve readability in reports
   - Auto-generate if not provided
6. **Severity** ✅ Field exists, just need to populate
   - Auto-infer from `activity_type`
   - Helps prioritize critical activities
### 🟢 **LOW PRIORITY (Future Enhancement):**
7. **Device/Browser Parsing**
   - Parse user agent to extract device type, browser, OS
   - Store in `user_sessions` table
8. **Audit Logging**
   - Implement comprehensive audit logging
   - Track all system changes
---
## 7. BUSINESS DAYS CALCULATION FOR WORKFLOW AGING
### ✅ **Available:**
- `calculateElapsedWorkingHours()` - Calculates working hours (excludes weekends/holidays)
- Working hours configuration (9 AM - 6 PM, Mon-Fri)
- Holiday support (from database)
- Priority-based calculation (express vs standard)
### ❌ **Missing:**
1. **Business Days Count Function**
   - Need a function to calculate business days (not hours)
   - For Workflow Aging Report: "Days Open" should be business days
   - Currently only have working hours calculation
2. **TAT Processor Using Wrong Calculation**
   - `tatProcessor.ts` uses simple calendar hours:
     ```typescript
     const elapsedMs = now.getTime() - new Date(levelStartTime).getTime();
     const elapsedHours = elapsedMs / (1000 * 60 * 60);
     ```
   - Should use `calculateElapsedWorkingHours()` instead
   - This causes incorrect TAT breach calculations
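
To see how far the two calculations can diverge, consider a level that starts on Friday at 5 PM and is checked on Monday at 10 AM. The sketch below uses a simplified stand-in for `calculateElapsedWorkingHours()` (UTC only, 9 AM to 6 PM, Mon-Fri, no holidays); `calendarHours` and `workingHours` are illustrative names, not the real implementation.

```typescript
const WORK_START_HOUR = 9; // 9 AM, per the configuration listed above
const WORK_END_HOUR = 18; // 6 PM
const MS_PER_HOUR = 60 * 60 * 1000;

// Same arithmetic as the calendar-hours snippet from tatProcessor.ts.
function calendarHours(start: Date, end: Date): number {
  return (end.getTime() - start.getTime()) / MS_PER_HOUR;
}

// Simplified stand-in: sums the overlap of [start, end] with each
// weekday's working window. Ignores holidays and time zones.
function workingHours(start: Date, end: Date): number {
  let totalMs = 0;
  const day = new Date(
    Date.UTC(start.getUTCFullYear(), start.getUTCMonth(), start.getUTCDate())
  );
  while (day.getTime() <= end.getTime()) {
    const dow = day.getUTCDay(); // 0 = Sunday, 6 = Saturday
    if (dow >= 1 && dow <= 5) {
      const open = day.getTime() + WORK_START_HOUR * MS_PER_HOUR;
      const close = day.getTime() + WORK_END_HOUR * MS_PER_HOUR;
      const overlap =
        Math.min(close, end.getTime()) - Math.max(open, start.getTime());
      if (overlap > 0) totalMs += overlap;
    }
    day.setUTCDate(day.getUTCDate() + 1);
  }
  return totalMs / MS_PER_HOUR;
}

const fridayEvening = new Date('2024-01-05T17:00:00Z'); // Friday 5 PM
const mondayMorning = new Date('2024-01-08T10:00:00Z'); // Monday 10 AM

console.log(calendarHours(fridayEvening, mondayMorning)); // 65
console.log(workingHours(fridayEvening, mondayMorning)); // 2
```

With a 24-hour TAT, the calendar calculation reports the level as breached (65 hours elapsed), while only 2 working hours have actually passed.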
### 🔧 **What Needs to be Built:**
1. **Add Business Days Calculation Function:**
   ```typescript
   // In tatTimeUtils.ts
   export async function calculateBusinessDays(
     startDate: Date | string,
     endDate: Date | string = new Date(),
     priority: string = 'standard'
   ): Promise<number> {
     await loadWorkingHoursCache();
     await loadHolidaysCache();

     const end = dayjs(endDate);
     // The defaults must define startDay and endDay (dayjs: 0 = Sunday),
     // otherwise no day would qualify as a working day.
     const config = workingHoursCache || { /* defaults */ };

     let businessDays = 0;
     let current = dayjs(startDate).startOf('day');

     // Count every day from the start date through the end date, inclusive
     while (current.isBefore(end) || current.isSame(end, 'day')) {
       const dayOfWeek = current.day();
       const dateStr = current.format('YYYY-MM-DD');

       // Express priority counts weekends too; holidays are still excluded
       const isWorkingDay = priority === 'express'
         ? true
         : (dayOfWeek >= config.startDay && dayOfWeek <= config.endDay);
       const isNotHoliday = !holidaysCache.has(dateStr);

       if (isWorkingDay && isNotHoliday) {
         businessDays++;
       }
       current = current.add(1, 'day');
     }
     return businessDays;
   }
   ```
2. **Fix TAT Processor:**
   - Replace calendar hours calculation with `calculateElapsedWorkingHours()`
   - This will fix TAT breach alerts to use proper working hours
3. **Update Workflow Aging Report:**
   - Use `calculateBusinessDays()` instead of calendar days
   - Filter by business days threshold
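
The threshold filter for the Workflow Aging Report could look roughly like the sketch below. It uses a simplified, synchronous stand-in for `calculateBusinessDays()` (UTC, Mon-Fri, holidays passed in explicitly) so it runs on its own; `countBusinessDays`, `filterAgedRequests`, and the `OpenRequest` shape are hypothetical, and treating the threshold as inclusive (`>=`) is an assumption.

```typescript
interface OpenRequest {
  requestNumber: string;
  submissionDate: string; // ISO date, e.g. "2024-01-05"
}

interface AgedRequest extends OpenRequest {
  daysOpen: number; // business days, both endpoints counted
}

// Simplified stand-in for calculateBusinessDays(): Mon-Fri, UTC,
// holidays given as "YYYY-MM-DD" strings.
function countBusinessDays(
  startIso: string,
  endIso: string,
  holidays: Set<string>
): number {
  const current = new Date(`${startIso.slice(0, 10)}T00:00:00Z`);
  const end = new Date(`${endIso.slice(0, 10)}T00:00:00Z`);
  let days = 0;
  while (current.getTime() <= end.getTime()) {
    const dow = current.getUTCDay(); // 0 = Sunday, 6 = Saturday
    const key = current.toISOString().slice(0, 10);
    if (dow >= 1 && dow <= 5 && !holidays.has(key)) days++;
    current.setUTCDate(current.getUTCDate() + 1);
  }
  return days;
}

// Keeps requests open for at least `thresholdDays` business days
// (assumption: the threshold is inclusive).
function filterAgedRequests(
  requests: OpenRequest[],
  asOfIso: string,
  thresholdDays: number,
  holidays: Set<string>
): AgedRequest[] {
  return requests
    .map((r) => ({
      requestNumber: r.requestNumber,
      submissionDate: r.submissionDate,
      daysOpen: countBusinessDays(r.submissionDate, asOfIso, holidays),
    }))
    .filter((r) => r.daysOpen >= thresholdDays);
}
```

Per the note in section 4, the start date here is `submission_date`, not `created_at`.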
---
## IMPLEMENTATION CHECKLIST
### Phase 1: Quick Wins (Fields Exist, Just Need to Populate)
- [ ] Update `activity.service.ts` to accept `ipAddress` and `userAgent`
- [ ] Update all controller calls to pass IP and user agent
- [ ] Add activity category inference
- [ ] Add severity inference
### Phase 2: Fix TAT Calculations (CRITICAL)
- [x] Fix `tatProcessor.ts` to use `calculateElapsedWorkingHours()` instead of calendar hours ✅
- [x] Add `calculateBusinessDays()` function to `tatTimeUtils.ts`
- [ ] Test TAT breach calculations with working hours
### Phase 3: New Functionality
- [x] Add login activity logging ✅ (Implemented in auth.controller.ts for SSO and token exchange)
- [x] Ensure level names are set when creating approval levels ✅ (levelName set in workflow.service.ts)
- [x] Add device/browser parsing for user sessions ✅ (userAgentParser.ts utility created - can be used for parsing user agent strings)
### Phase 4: Enhanced Reporting
- [x] Build report endpoints using collected data ✅ (getLifecycleReport, getActivityLogReport, getWorkflowAgingReport)
- [x] Add filtering by category, severity ✅ (Filtering by category and severity added to getActivityLogReport, frontend UI added)
- [x] Add IP/user agent to activity log reports ✅ (IP and user agent captured and displayed)
- [x] Use business days in Workflow Aging Report ✅ (calculateBusinessDays implemented and used)
---
## CODE CHANGES NEEDED
### 1. Update Activity Service (`activity.service.ts`)
```typescript
export type ActivityEntry = {
  requestId: string;
  type: 'created' | 'assignment' | 'approval' | 'rejection' | 'status_change' | 'comment' | 'reminder' | 'document_added' | 'sla_warning' | 'ai_conclusion_generated' | 'closed' | 'login';
  user?: { userId: string; name?: string; email?: string };
  timestamp: string;
  action: string;
  details: string;
  metadata?: any;
  ipAddress?: string; // NEW
  userAgent?: string; // NEW
  category?: string; // NEW
  severity?: string; // NEW
};

class ActivityService {
  private inferCategory(type: string): string {
    const categoryMap: Record<string, string> = {
      'created': 'WORKFLOW',
      'approval': 'WORKFLOW',
      'rejection': 'WORKFLOW',
      'status_change': 'WORKFLOW',
      'assignment': 'WORKFLOW',
      'comment': 'COLLABORATION',
      'document_added': 'DOCUMENT',
      'sla_warning': 'SYSTEM',
      'reminder': 'SYSTEM',
      'ai_conclusion_generated': 'SYSTEM',
      'closed': 'WORKFLOW',
      'login': 'AUTHENTICATION'
    };
    return categoryMap[type] || 'OTHER';
  }

  private inferSeverity(type: string): string {
    const severityMap: Record<string, string> = {
      'rejection': 'WARNING',
      'sla_warning': 'WARNING',
      'approval': 'INFO',
      'closed': 'INFO',
      'status_change': 'INFO',
      'login': 'INFO',
      'created': 'INFO',
      'comment': 'INFO',
      'document_added': 'INFO'
    };
    return severityMap[type] || 'INFO';
  }

  async log(entry: ActivityEntry) {
    // ... existing code ...
    const activityData = {
      requestId: entry.requestId,
      userId: entry.user?.userId || null,
      userName: entry.user?.name || entry.user?.email || null,
      activityType: entry.type,
      activityDescription: entry.details,
      activityCategory: entry.category || this.inferCategory(entry.type),
      severity: entry.severity || this.inferSeverity(entry.type),
      metadata: entry.metadata || null,
      isSystemEvent: !entry.user,
      ipAddress: entry.ipAddress || null, // NEW
      userAgent: entry.userAgent || null, // NEW
    };
    // ... rest of code ...
  }
}
```
### 2. Update Controller Calls (Example)
```typescript
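// In workflow.controller.ts, approval.controller.ts, etc.
// x-forwarded-for may be an array or a comma-separated list; the first entry is the original client
const forwardedFor = req.headers['x-forwarded-for'];
const forwardedIp = (Array.isArray(forwardedFor) ? forwardedFor[0] : forwardedFor)?.split(',')[0].trim();

activityService.log({
  requestId: workflow.requestId,
  type: 'created',
  user: { userId, name: user.displayName },
  timestamp: new Date().toISOString(),
  action: 'Request Created',
  details: `Request ${workflow.requestNumber} created`,
  ipAddress: req.ip || forwardedIp, // NEW
  userAgent: req.headers['user-agent'], // NEW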
});
```
### 3. Add Login Activity Logging
```typescript
// In auth.controller.ts after successful login
// x-forwarded-for may be an array or a comma-separated list; the first entry is the original client
const forwardedFor = req.headers['x-forwarded-for'];
const forwardedIp = (Array.isArray(forwardedFor) ? forwardedFor[0] : forwardedFor)?.split(',')[0].trim();

await activityService.log({
  requestId: 'SYSTEM_LOGIN', // Special ID for system events
  type: 'login',
  user: { userId: user.userId, name: user.displayName },
  timestamp: new Date().toISOString(),
  action: 'User Login',
  details: 'User logged in successfully',
  ipAddress: req.ip || forwardedIp,
  userAgent: req.headers['user-agent'],
  category: 'AUTHENTICATION',
  severity: 'INFO'
});
```
---
## CONCLUSION
**Good News:** Most fields already exist in the database! We just need to:
1. Populate existing fields (IP, user agent, category, severity)
2. Add login activity logging
3. Ensure level names are set
**Estimated Effort:**
- Phase 1 (Quick Wins): 2-4 hours
- Phase 3 (New Functionality): 4-6 hours
- Phase 4 (Enhanced Reporting): 8-12 hours
**Total: ~14-22 hours of development work** (excludes Phase 2, the TAT calculation fixes, which is not estimated here)