Data Collection Analysis - What We Have vs What We're Collecting

Overview

This document compares the database structure with what we're currently collecting and recommends what we should start collecting for the Detailed Reports.


1. ACTIVITIES TABLE

Database Fields Available:

- activity_id (PK)
- request_id (FK)  COLLECTING
- user_id (FK)  COLLECTING
- user_name  COLLECTING
- activity_type  COLLECTING
- activity_description  COLLECTING
- activity_category  NOT COLLECTING (set to NULL)
- severity  NOT COLLECTING (set to NULL)
- metadata  COLLECTING (partially)
- is_system_event  COLLECTING
- ip_address  NOT COLLECTING (set to NULL)
- user_agent  NOT COLLECTING (set to NULL)
- created_at  COLLECTING

🔴 Currently NOT Collecting (But Should):

  1. IP Address (ip_address)

    • Status: Field exists, but always set to null
    • Impact: Cannot show IP in User Activity Log Report
    • Fix: Extract from req.ip, falling back to the first entry of req.headers['x-forwarded-for'] (the header can hold a comma-separated list of addresses), in controllers
    • Priority: HIGH (needed for security/audit)
  2. User Agent (user_agent)

    • Status: Field exists, but always set to null
    • Impact: Cannot show device/browser info in reports
    • Fix: Extract from req.headers['user-agent'] in controllers
    • Priority: MEDIUM (nice to have for analytics)
  3. Activity Category (activity_category)

    • Status: Field exists, but always set to null
    • Impact: Cannot categorize activities (e.g., "AUTHENTICATION", "WORKFLOW", "DOCUMENT")
    • Fix: Map activity_type to category:
      • created, approval, rejection, status_change → "WORKFLOW"
      • comment → "COLLABORATION"
      • document_added → "DOCUMENT"
      • sla_warning → "SYSTEM"
    • Priority: MEDIUM (helps with filtering/reporting)
  4. Severity (severity)

    • Status: Field exists, but always set to null
    • Impact: Cannot prioritize critical activities
    • Fix: Map based on activity type:
      • rejection, sla_warning → "WARNING"
      • approval, closed → "INFO"
      • status_change → "INFO"
    • Priority: LOW (optional enhancement)

📝 Recommendation:

Update activity.service.ts to accept and store:

async log(entry: ActivityEntry & { 
  ipAddress?: string; 
  userAgent?: string;
  category?: string;
  severity?: string;
}) {
  // ... existing code ...
  const activityData = {
    // ... existing fields ...
    ipAddress: entry.ipAddress || null,
    userAgent: entry.userAgent || null,
    activityCategory: entry.category || this.inferCategory(entry.type),
    severity: entry.severity || this.inferSeverity(entry.type),
  };
}

Update all controller calls to pass IP and User Agent:

activityService.log({
  // ... existing fields ...
  ipAddress: req.ip || req.headers['x-forwarded-for'] || null,
  userAgent: req.headers['user-agent'] || null,
});
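The IP fallback above can be pulled into one small helper so that every controller resolves the address the same way. The following is a minimal sketch, not the project's code: the name getClientIp and the reduced RequestLike shape are assumptions made for illustration, and it keeps the same precedence as the snippet above (req.ip first, then the forwarded header).

```typescript
// Hypothetical helper (not part of the existing codebase).
// RequestLike is an assumed, minimal stand-in for the Express request object.
interface RequestLike {
  ip?: string;
  headers: Record<string, string | string[] | undefined>;
}

function getClientIp(req: RequestLike): string | null {
  const forwarded = req.headers['x-forwarded-for'];
  // The header may arrive as one comma-separated string or as an array of values.
  const raw = Array.isArray(forwarded) ? forwarded[0] : forwarded;
  // Keep only the first address in the chain.
  const firstForwarded = raw?.split(',')[0]?.trim();
  return req.ip || firstForwarded || null;
}
```

A controller could then pass ipAddress: getClientIp(req) to activityService.log(). Whether req.ip or the forwarded header should win depends on how the app sits behind its proxy, so the precedence here is a choice to revisit rather than a rule.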

2. APPROVAL_LEVELS TABLE

Database Fields Available:

- level_id (PK)
- request_id (FK)  COLLECTING
- level_number  COLLECTING
- level_name  OPTIONAL (may not be set)
- approver_id (FK)  COLLECTING
- approver_email  COLLECTING
- approver_name  COLLECTING
- tat_hours  COLLECTING
- tat_days  COLLECTING (auto-calculated)
- status  COLLECTING
- level_start_time  COLLECTING
- level_end_time  COLLECTING
- action_date  COLLECTING
- comments  COLLECTING
- rejection_reason  COLLECTING
- is_final_approver  COLLECTING
- elapsed_hours  COLLECTING
- remaining_hours  COLLECTING
- tat_percentage_used  COLLECTING
- tat50_alert_sent  COLLECTING
- tat75_alert_sent  COLLECTING
- tat_breached  COLLECTING
- tat_start_time  COLLECTING
- created_at  COLLECTING
- updated_at  COLLECTING

🔴 Currently NOT Collecting (But Should):

  1. Level Name (level_name)
    • Status: Field exists, but may be NULL
    • Impact: Cannot show stage name in reports (only level number)
    • Fix: When creating approval levels, prompt for or auto-generate level names:
      • "Department Head Review"
      • "Finance Approval"
      • "Final Approval"
    • Priority: MEDIUM (improves report readability)

📝 Recommendation:

Ensure level_name is set when creating approval levels:

await ApprovalLevel.create({
  // ... existing fields ...
  levelName: levelData.levelName || `Level ${levelNumber}`,
});

3. USER_SESSIONS TABLE

Database Fields Available:

- session_id (PK)
- user_id (FK)
- session_token  COLLECTING
- refresh_token  COLLECTING
- ip_address  CHECK IF COLLECTING
- user_agent  CHECK IF COLLECTING
- device_type  CHECK IF COLLECTING
- browser  CHECK IF COLLECTING
- os  CHECK IF COLLECTING
- login_at  COLLECTING
- last_activity_at  COLLECTING
- logout_at  CHECK IF COLLECTING
- expires_at  COLLECTING
- is_active  COLLECTING
- logout_reason  CHECK IF COLLECTING

🔴 Missing for Login Activity Tracking:

  1. Login Activities in Activities Table

    • Status: Login events are NOT logged in activities table
    • Impact: Cannot show login activities in User Activity Log Report
    • Fix: Add login activity logging in auth middleware/controller:
      // After successful login
      await activityService.log({
        requestId: 'SYSTEM_LOGIN', // Special request ID for system events
        type: 'login',
        user: { userId, name: user.displayName },
        ipAddress: req.ip,
        userAgent: req.headers['user-agent'],
        category: 'AUTHENTICATION',
        severity: 'INFO',
        timestamp: new Date().toISOString(),
        action: 'User Login',
        details: `User logged in from ${req.ip}`
      });
      
    • Priority: HIGH (needed for security audit)
  2. Device/Browser Parsing

    • Status: Fields exist but may not be populated
    • Impact: Cannot show device type in reports
    • Fix: Parse user agent to extract:
      • device_type: "WEB", "MOBILE"
      • browser: "Chrome", "Firefox", "Safari"
      • os: "Windows", "macOS", "iOS", "Android"
    • Priority: MEDIUM (nice to have)
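The parsing step above could look roughly like the sketch below. It is a simplified illustration using regular expressions, not the project's userAgentParser.ts; the function name parseUserAgent and the ParsedUserAgent shape are assumptions, and the patterns only cover the common browsers and operating systems listed above.

```typescript
// Hypothetical sketch; names and return shape are assumptions for illustration.
interface ParsedUserAgent {
  deviceType: 'WEB' | 'MOBILE';
  browser: string;
  os: string;
}

function parseUserAgent(ua: string): ParsedUserAgent {
  // Order matters: Edge and Chrome user agents also contain "Safari",
  // and Chrome on iOS identifies itself as "CriOS".
  const browser =
    /Edg\//.test(ua) ? 'Edge'
    : /Firefox\//.test(ua) ? 'Firefox'
    : /Chrome\/|CriOS\//.test(ua) ? 'Chrome'
    : /Safari\//.test(ua) ? 'Safari'
    : 'Unknown';

  // iOS is checked before macOS because iPhone user agents contain "like Mac OS X";
  // Android is checked before Linux for the same kind of reason.
  const os =
    /iPhone|iPad|iPod/.test(ua) ? 'iOS'
    : /Android/.test(ua) ? 'Android'
    : /Windows/.test(ua) ? 'Windows'
    : /Mac OS X/.test(ua) ? 'macOS'
    : /Linux/.test(ua) ? 'Linux'
    : 'Unknown';

  const deviceType = /Mobi|Android|iPhone|iPad|iPod/.test(ua) ? 'MOBILE' : 'WEB';

  return { deviceType, browser, os };
}
```

User agent strings are inconsistent in practice, so a dedicated parsing library is likely a better fit than hand-written patterns once more than a handful of browsers need to be recognized.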

4. WORKFLOW_REQUESTS TABLE

All Fields Are Being Collected:

  • All fields in workflow_requests are properly collected
  • No missing data here

📝 Note:

  • submission_date vs created_at: Use submission_date for "days open" calculation
  • closure_date: Available for completed requests

5. TAT_TRACKING TABLE

Database Fields Available:

- tracking_id (PK)
- request_id (FK)
- level_id (FK)
- tracking_type  COLLECTING
- tat_status  COLLECTING
- total_tat_hours  COLLECTING
- elapsed_hours  COLLECTING
- remaining_hours  COLLECTING
- percentage_used  COLLECTING
- threshold_50_breached  COLLECTING
- threshold_50_alerted_at  COLLECTING
- threshold_80_breached  COLLECTING
- threshold_80_alerted_at  COLLECTING
- threshold_100_breached  COLLECTING
- threshold_100_alerted_at  COLLECTING
- alert_count  COLLECTING
- last_calculated_at  COLLECTING

All Fields Are Being Collected:

  • TAT tracking appears to be fully implemented

6. AUDIT_LOGS TABLE

Database Fields Available:

- audit_id (PK)
- user_id (FK)
- entity_type
- entity_id
- action
- action_category
- old_values (JSONB)
- new_values (JSONB)
- changes_summary
- ip_address
- user_agent
- session_id
- request_method
- request_url
- response_status
- execution_time_ms
- created_at

🔴 Status:

  • Audit logging may not be fully implemented
  • Impact: Cannot track all system changes for audit purposes
  • Priority: MEDIUM (for compliance/security)
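If audit logging is built out, the old_values, new_values, and changes_summary fields could be derived by diffing an entity before and after an update. The sketch below is one possible approach under stated assumptions: the function name diffForAudit, the Row type, and the summary wording are all hypothetical, and values are compared via JSON serialization, which is a simplification suited to plain JSONB-style data.

```typescript
// Hypothetical sketch; not part of the existing codebase.
type Row = Record<string, unknown>;

interface AuditDiff {
  oldValues: Row;        // maps to old_values (JSONB)
  newValues: Row;        // maps to new_values (JSONB)
  changesSummary: string; // maps to changes_summary
}

function diffForAudit(before: Row, after: Row): AuditDiff {
  const oldValues: Row = {};
  const newValues: Row = {};

  // Union of keys from both snapshots, so added and removed fields are caught too.
  const keys = Object.keys({ ...before, ...after });

  for (const key of keys) {
    // Simplification: compare serialized values; key order inside nested objects matters.
    if (JSON.stringify(before[key]) !== JSON.stringify(after[key])) {
      oldValues[key] = before[key] ?? null;
      newValues[key] = after[key] ?? null;
    }
  }

  const changed = Object.keys(newValues);
  const changesSummary = changed.length > 0 ? `Changed: ${changed.join(', ')}` : 'No changes';

  return { oldValues, newValues, changesSummary };
}
```

Storing only the changed keys keeps each audit row small while still allowing the before and after state of a field to be reconstructed.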

SUMMARY: What to Start Collecting

🔴 HIGH PRIORITY (Must Have for Reports):

  1. IP Address in Activities: field exists, just needs to be populated

    • Extract from req.ip or req.headers['x-forwarded-for']
    • Update activity.service.ts to accept IP
    • Update all controller calls
  2. User Agent in Activities: field exists, just needs to be populated

    • Extract from req.headers['user-agent']
    • Update activity.service.ts to accept user agent
    • Update all controller calls
  3. Login Activities: not currently logged

    • Add login activity logging in auth controller
    • Use special requestId: 'SYSTEM_LOGIN' for system events
    • Include IP and user agent

🟡 MEDIUM PRIORITY (Nice to Have):

  1. Activity Category: field exists, just needs to be populated

    • Auto-infer from activity_type
    • Helps with filtering and reporting
  2. Level Names: field exists, ensure it is set

    • Improves readability in reports
    • Auto-generate if not provided
  3. Device/Browser Parsing

    • Parse user agent to extract device type, browser, OS
    • Store in user_sessions table
  4. Audit Logging

    • Implement comprehensive audit logging
    • Track all system changes

🟢 LOW PRIORITY (Future Enhancement):

  1. Severity: field exists, just needs to be populated

    • Auto-infer from activity_type
    • Helps prioritize critical activities

7. BUSINESS DAYS CALCULATION FOR WORKFLOW AGING

Available:

  • calculateElapsedWorkingHours() - Calculates working hours (excludes weekends/holidays)
  • Working hours configuration (9 AM - 6 PM, Mon-Fri)
  • Holiday support (from database)
  • Priority-based calculation (express vs standard)

Missing:

  1. Business Days Count Function

    • Need a function to calculate business days (not hours)
    • For Workflow Aging Report: "Days Open" should be business days
    • Currently only have working hours calculation
  2. TAT Processor Using Wrong Calculation

    • tatProcessor.ts uses simple calendar hours:
      const elapsedMs = now.getTime() - new Date(levelStartTime).getTime();
      const elapsedHours = elapsedMs / (1000 * 60 * 60);
      
    • Should use calculateElapsedWorkingHours() instead
    • This causes incorrect TAT breach calculations
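To make the size of the discrepancy concrete, the sketch below compares calendar hours with working hours for a level that starts late on a Friday. It is a simplified stand-in for calculateElapsedWorkingHours(), not its actual implementation: the function name workingHoursBetween is hypothetical, it works in UTC, it assumes the 9 AM - 6 PM, Mon-Fri window described above, and it ignores holidays and priority.

```typescript
// Simplified illustration only; the real utility also handles holidays and priority.
const WORK_START_HOUR = 9;   // 9 AM
const WORK_END_HOUR = 18;    // 6 PM
const MS_PER_HOUR = 1000 * 60 * 60;

function workingHoursBetween(start: Date, end: Date): number {
  let totalMs = 0;
  // Walk day by day (UTC) and add the overlap with each weekday's working window.
  const day = new Date(Date.UTC(start.getUTCFullYear(), start.getUTCMonth(), start.getUTCDate()));
  while (day.getTime() <= end.getTime()) {
    const dayOfWeek = day.getUTCDay(); // 0 = Sunday, 6 = Saturday
    if (dayOfWeek >= 1 && dayOfWeek <= 5) {
      const windowStart = day.getTime() + WORK_START_HOUR * MS_PER_HOUR;
      const windowEnd = day.getTime() + WORK_END_HOUR * MS_PER_HOUR;
      const overlap = Math.min(end.getTime(), windowEnd) - Math.max(start.getTime(), windowStart);
      if (overlap > 0) totalMs += overlap;
    }
    day.setUTCDate(day.getUTCDate() + 1);
  }
  return totalMs / MS_PER_HOUR;
}

const levelStart = new Date('2024-01-05T17:00:00Z'); // Friday, 5 PM
const now = new Date('2024-01-08T10:00:00Z');        // Monday, 10 AM

// Calendar approach, as currently used in tatProcessor.ts
const calendarHours = (now.getTime() - levelStart.getTime()) / MS_PER_HOUR; // 65

// Working-hours approach: 1 hour on Friday + 1 hour on Monday
const workingHours = workingHoursBetween(levelStart, now); // 2
```

With a 48-hour TAT, the calendar calculation would report a breach (65 hours elapsed), while only 2 working hours have actually been consumed.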

🔧 What Needs to be Built:

  1. Add Business Days Calculation Function:

    // In tatTimeUtils.ts
    // Counts business days from startDate through endDate, inclusive of both dates.
    export async function calculateBusinessDays(
      startDate: Date | string,
      endDate: Date | string = new Date(),
      priority: string = 'standard'
    ): Promise<number> {
      // Make sure the working-hours and holiday caches are loaded before counting
      await loadWorkingHoursCache();
      await loadHolidaysCache();
    
      const start = dayjs(startDate);
      const end = dayjs(endDate);
      const config = workingHoursCache || { /* defaults */ };
    
      let businessDays = 0;
      let current = start.startOf('day');
    
      // Visit every calendar day up to and including the end date
      while (current.isBefore(end) || current.isSame(end, 'day')) {
        const dayOfWeek = current.day(); // 0 = Sunday ... 6 = Saturday
        const dateStr = current.format('YYYY-MM-DD');
    
        // Express requests count every day of the week; standard requests
        // count only the configured working days
        const isWorkingDay = priority === 'express' 
          ? true 
          : (dayOfWeek >= config.startDay && dayOfWeek <= config.endDay);
        const isNotHoliday = !holidaysCache.has(dateStr);
    
        if (isWorkingDay && isNotHoliday) {
          businessDays++;
        }
    
        current = current.add(1, 'day');
      }
    
      return businessDays;
    }
    
  2. Fix TAT Processor:

    • Replace calendar hours calculation with calculateElapsedWorkingHours()
    • This will fix TAT breach alerts to use proper working hours
  3. Update Workflow Aging Report:

    • Use calculateBusinessDays() instead of calendar days
    • Filter by business days threshold

IMPLEMENTATION CHECKLIST

Phase 1: Quick Wins (Fields Exist, Just Need to Populate)

  • Update activity.service.ts to accept ipAddress and userAgent
  • Update all controller calls to pass IP and user agent
  • Add activity category inference
  • Add severity inference

Phase 2: Fix TAT Calculations (CRITICAL)

  • Fix tatProcessor.ts to use calculateElapsedWorkingHours() instead of calendar hours
  • Add calculateBusinessDays() function to tatTimeUtils.ts
  • Test TAT breach calculations with working hours

Phase 3: New Functionality

  • Add login activity logging (Implemented in auth.controller.ts for SSO and token exchange)
  • Ensure level names are set when creating approval levels (levelName set in workflow.service.ts)
  • Add device/browser parsing for user sessions (userAgentParser.ts utility created - can be used for parsing user agent strings)

Phase 4: Enhanced Reporting

  • Build report endpoints using collected data (getLifecycleReport, getActivityLogReport, getWorkflowAgingReport)
  • Add filtering by category, severity (Filtering by category and severity added to getActivityLogReport, frontend UI added)
  • Add IP/user agent to activity log reports (IP and user agent captured and displayed)
  • Use business days in Workflow Aging Report (calculateBusinessDays implemented and used)

CODE CHANGES NEEDED

1. Update Activity Service (activity.service.ts)

export type ActivityEntry = {
  requestId: string;
  type: 'created' | 'assignment' | 'approval' | 'rejection' | 'status_change' | 'comment' | 'reminder' | 'document_added' | 'sla_warning' | 'ai_conclusion_generated' | 'closed' | 'login';
  user?: { userId: string; name?: string; email?: string };
  timestamp: string;
  action: string;
  details: string;
  metadata?: any;
  ipAddress?: string;  // NEW
  userAgent?: string;  // NEW
  category?: string;   // NEW
  severity?: string;   // NEW
};

class ActivityService {
  private inferCategory(type: string): string {
    const categoryMap: Record<string, string> = {
      'created': 'WORKFLOW',
      'approval': 'WORKFLOW',
      'rejection': 'WORKFLOW',
      'status_change': 'WORKFLOW',
      'assignment': 'WORKFLOW',
      'comment': 'COLLABORATION',
      'document_added': 'DOCUMENT',
      'sla_warning': 'SYSTEM',
      'reminder': 'SYSTEM',
      'ai_conclusion_generated': 'SYSTEM',
      'closed': 'WORKFLOW',
      'login': 'AUTHENTICATION'
    };
    return categoryMap[type] || 'OTHER';
  }

  private inferSeverity(type: string): string {
    const severityMap: Record<string, string> = {
      'rejection': 'WARNING',
      'sla_warning': 'WARNING',
      'approval': 'INFO',
      'closed': 'INFO',
      'status_change': 'INFO',
      'login': 'INFO',
      'created': 'INFO',
      'comment': 'INFO',
      'document_added': 'INFO'
    };
    return severityMap[type] || 'INFO';
  }

  async log(entry: ActivityEntry) {
    // ... existing code ...
    const activityData = {
      requestId: entry.requestId,
      userId: entry.user?.userId || null,
      userName: entry.user?.name || entry.user?.email || null,
      activityType: entry.type,
      activityDescription: entry.details,
      activityCategory: entry.category || this.inferCategory(entry.type),
      severity: entry.severity || this.inferSeverity(entry.type),
      metadata: entry.metadata || null,
      isSystemEvent: !entry.user,
      ipAddress: entry.ipAddress || null,  // NEW
      userAgent: entry.userAgent || null,  // NEW
    };
    // ... rest of code ...
  }
}

2. Update Controller Calls (Example)

// In workflow.controller.ts, approval.controller.ts, etc.
activityService.log({
  requestId: workflow.requestId,
  type: 'created',
  user: { userId, name: user.displayName },
  timestamp: new Date().toISOString(),
  action: 'Request Created',
  details: `Request ${workflow.requestNumber} created`,
  ipAddress: req.ip || req.headers['x-forwarded-for'] || null,  // NEW
  userAgent: req.headers['user-agent'] || null,  // NEW
});

3. Add Login Activity Logging

// In auth.controller.ts after successful login
await activityService.log({
  requestId: 'SYSTEM_LOGIN',  // Special ID for system events
  type: 'login',
  user: { userId: user.userId, name: user.displayName },
  timestamp: new Date().toISOString(),
  action: 'User Login',
  details: `User logged in successfully`,
  ipAddress: req.ip || req.headers['x-forwarded-for'] || null,
  userAgent: req.headers['user-agent'] || null,
  category: 'AUTHENTICATION',
  severity: 'INFO'
});

CONCLUSION

Good News: Most fields already exist in the database! We just need to:

  1. Populate existing fields (IP, user agent, category, severity)
  2. Add login activity logging
  3. Ensure level names are set

Estimated Effort:

  • Phase 1 (Quick Wins): 2-4 hours
  • Phase 3 (New Functionality): 4-6 hours
  • Phase 4 (Enhanced Reporting): 8-12 hours

Total: ~14-22 hours of development work, excluding Phase 2 (Fix TAT Calculations), which is not estimated here