# TAT (Turnaround Time) Notification System
## Overview
The TAT Notification System automatically tracks and notifies approvers about their approval deadlines at key milestones (50%, 75%, and 100% of allotted time). It uses a queue-based architecture with BullMQ and Redis to ensure reliable, scheduled notifications.
## Architecture
```
┌─────────────────┐
│    Workflow     │
│   Submission    │
└────────┬────────┘
         ├──> Schedule TAT Jobs (50%, 75%, 100%)
┌────────▼────────┐     ┌──────────────┐     ┌─────────────┐
│    TAT Queue    │────>│  TAT Worker  │────>│  Processor  │
│    (BullMQ)     │     │ (Background) │     │   Handler   │
└─────────────────┘     └──────────────┘     └──────┬──────┘
                                                    ├──> Send Notification
                                                    ├──> Update Database
                                                    └──> Log Activity
```
## Components
### 1. TAT Time Utilities (`tatTimeUtils.ts`)
Handles working hours calculations (Monday-Friday, 9 AM - 6 PM):
```typescript
// Calculate TAT milestones considering working hours
const { halfTime, seventyFive, full } = calculateTatMilestones(startDate, tatHours);
```
**Key Functions:**
- `addWorkingHours()`: Adds working hours to a start date, skipping weekends and time outside working hours
- `calculateTatMilestones()`: Calculates 50%, 75%, and 100% time points
- `calculateDelay()`: Computes delay in milliseconds from now to target
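
As a rough illustration of how these helpers could fit together, the sketch below implements the three functions with plain `Date` arithmetic in the server's local timezone. It is not the actual `tatTimeUtils.ts`, which may be built on Day.js: `alignToWorkingTime` is a hypothetical helper, the optional `now` parameter is an assumption, and the return shape of `calculateTatMilestones` is inferred from the destructuring shown above.

```typescript
// Assumed defaults, taken from the working hours described in this document.
const WORK_START_HOUR = 9; // 9 AM
const WORK_END_HOUR = 18; // 6 PM
const MS_PER_HOUR = 60 * 60 * 1000;

// Hypothetical helper: move a date forward to the next instant inside working hours.
function alignToWorkingTime(date: Date): Date {
  const out = new Date(date.getTime());
  for (;;) {
    const day = out.getDay(); // 0 = Sunday, 6 = Saturday
    if (day === 0 || day === 6 || out.getHours() >= WORK_END_HOUR) {
      // Weekend or after hours: jump to the start of the next calendar day.
      out.setDate(out.getDate() + 1);
      out.setHours(WORK_START_HOUR, 0, 0, 0);
    } else if (out.getHours() < WORK_START_HOUR) {
      out.setHours(WORK_START_HOUR, 0, 0, 0);
    } else {
      return out;
    }
  }
}

// Adds working hours day by day, carrying the remainder into the next working day.
function addWorkingHours(start: Date, hours: number): Date {
  let cursor = alignToWorkingTime(start);
  let remainingMs = hours * MS_PER_HOUR;
  while (remainingMs > 0) {
    const endOfDay = new Date(cursor.getTime());
    endOfDay.setHours(WORK_END_HOUR, 0, 0, 0);
    const availableMs = endOfDay.getTime() - cursor.getTime();
    if (remainingMs <= availableMs) {
      return new Date(cursor.getTime() + remainingMs);
    }
    remainingMs -= availableMs;
    cursor = alignToWorkingTime(endOfDay);
  }
  return cursor;
}

// Returns the 50%, 75%, and 100% time points for a TAT measured in working hours.
function calculateTatMilestones(start: Date, tatHours: number) {
  return {
    halfTime: addWorkingHours(start, tatHours * 0.5),
    seventyFive: addWorkingHours(start, tatHours * 0.75),
    full: addWorkingHours(start, tatHours),
  };
}

// Milliseconds from `now` to `target`, never negative.
function calculateDelay(target: Date, now: Date = new Date()): number {
  return Math.max(0, target.getTime() - now.getTime());
}
```

In this sketch, adding 9 working hours to a Friday 2:00 PM start lands on Monday 2:00 PM: only four working hours remain on Friday, the weekend is skipped, and the remaining five hours are counted from Monday 9:00 AM.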
### 2. TAT Queue (`tatQueue.ts`)
BullMQ queue configuration with Redis:
```typescript
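export const tatQueue = new Queue('tatQueue', {
  connection, // a connected ioredis instance (or connection options), not the IORedis class itself
  defaultJobOptions: {
    removeOnComplete: true, // drop finished jobs from Redis
    removeOnFail: false, // keep failed jobs for inspection
    attempts: 3,
    backoff: { type: 'exponential', delay: 2000 } // base delay in ms
  }
});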
```
### 3. TAT Processor (`tatProcessor.ts`)
Handles job execution when TAT milestones are reached:
```typescript
export async function handleTatJob(job: Job<TatJobData>) {
  // Process tat50, tat75, or tatBreach
  // - Send notification to approver
  // - Update database flags
  // - Log activity
}
```
**Job Types:**
- `tat50`: ⏳ 50% of TAT elapsed (gentle reminder)
- `tat75`: ⚠️ 75% of TAT elapsed (escalation warning)
- `tatBreach`: ⏰ 100% of TAT elapsed (breach notification)
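
A minimal sketch of how the handler might dispatch on these job types, assuming a payload with `type`, `requestId`, `levelId`, and `approverId` fields. `TatDeps` and its methods are hypothetical stand-ins for the notification service, the database update, and the activity log. The already-handled check is an assumption added here so that a retried job does not send a second alert; it is not confirmed to exist in `tatProcessor.ts`.

```typescript
type TatJobType = 'tat50' | 'tat75' | 'tatBreach';

// Assumed payload shape; the real TatJobData may carry more fields.
interface TatJobData {
  type: TatJobType;
  requestId: string;
  levelId: string;
  approverId: string;
}

// Hypothetical dependencies, injected so the sketch runs without Redis or a database.
interface TatDeps {
  alreadyHandled(levelId: string, type: TatJobType): Promise<boolean>;
  notify(approverId: string, message: string): Promise<void>;
  markHandled(levelId: string, type: TatJobType): Promise<void>;
  logActivity(requestId: string, message: string): Promise<void>;
}

const LABELS: Record<TatJobType, string> = {
  tat50: '50% of TAT elapsed',
  tat75: '75% of TAT elapsed',
  tatBreach: 'TAT breached',
};

async function processTatJob(
  data: TatJobData,
  deps: TatDeps
): Promise<'sent' | 'skipped'> {
  // Skip milestones that an earlier attempt already handled.
  if (await deps.alreadyHandled(data.levelId, data.type)) {
    return 'skipped';
  }
  const message = `${LABELS[data.type]} for request ${data.requestId}`;
  await deps.notify(data.approverId, message);
  await deps.markHandled(data.levelId, data.type);
  await deps.logActivity(data.requestId, message);
  return 'sent';
}
```

Inside a BullMQ worker, a function like this would be called with `job.data` and the real service implementations.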
### 4. TAT Worker (`tatWorker.ts`)
Background worker that processes jobs from the queue:
```typescript
export const tatWorker = new Worker('tatQueue', handleTatJob, {
  connection,
  concurrency: 5,
  limiter: { max: 10, duration: 1000 }
});
```
**Features:**
- Concurrent job processing (up to 5 jobs)
- Rate limiting (10 jobs/second)
- Automatic retry on failure
- Graceful shutdown on SIGTERM/SIGINT
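
The graceful shutdown listed above could be wired roughly as follows. This is a sketch under assumptions rather than the code in `tatWorker.ts`: `createShutdownHandler` and `ClosableWorker` are hypothetical names, and the only worker method relied on is `close()`, which BullMQ's `Worker` provides and which waits for active jobs to finish.

```typescript
// Minimal structural type so the sketch runs without BullMQ installed.
interface ClosableWorker {
  close(): Promise<void>;
}

// Returns a handler that closes the worker once, then reports an exit code.
function createShutdownHandler(
  worker: ClosableWorker,
  exit: (code: number) => void
): (signal: string) => Promise<void> {
  let shuttingDown = false;
  return async (signal: string) => {
    if (shuttingDown) return; // ignore repeated signals
    shuttingDown = true;
    console.log(`TAT Worker: received ${signal}, closing`);
    try {
      await worker.close();
      exit(0);
    } catch (err) {
      console.error('TAT Worker: error during shutdown', err);
      exit(1);
    }
  };
}

// Wiring, assuming `tatWorker` from the block above:
// const shutdown = createShutdownHandler(tatWorker, (code) => process.exit(code));
// process.on('SIGTERM', () => void shutdown('SIGTERM'));
// process.on('SIGINT', () => void shutdown('SIGINT'));
```

Passing `exit` in as a parameter keeps the handler testable without terminating the process.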
### 5. TAT Scheduler Service (`tatScheduler.service.ts`)
Service for scheduling and managing TAT jobs:
```typescript
// Schedule TAT jobs for an approval level
await tatSchedulerService.scheduleTatJobs(
  requestId,
  levelId,
  approverId,
  tatHours,
  startTime
);
// Cancel TAT jobs when level is completed
await tatSchedulerService.cancelTatJobs(requestId, levelId);
```
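
To make the scheduling step more concrete, here is a sketch of how the three delayed jobs might be described before they are added to the queue. `buildTatJobSpecs` and `TatJobSpec` are hypothetical names. The job ID pattern extends the `tat50-{requestId}-{levelId}` form mentioned under Best Practices to the other two job types, which is an assumption.

```typescript
type TatJobType = 'tat50' | 'tat75' | 'tatBreach';

interface TatJobSpec {
  name: TatJobType;
  jobId: string; // e.g. tat50-{requestId}-{levelId}
  delay: number; // milliseconds from `now` until the milestone
}

// Builds one spec per milestone. Milestone dates are passed in so the sketch
// stays independent of how working hours are calculated.
function buildTatJobSpecs(
  requestId: string,
  levelId: string,
  milestones: { halfTime: Date; seventyFive: Date; full: Date },
  now: Date
): TatJobSpec[] {
  const targets: Array<[TatJobType, Date]> = [
    ['tat50', milestones.halfTime],
    ['tat75', milestones.seventyFive],
    ['tatBreach', milestones.full],
  ];
  return targets
    .map(([name, target]): TatJobSpec => ({
      name,
      jobId: `${name}-${requestId}-${levelId}`,
      delay: target.getTime() - now.getTime(),
    }))
    // Milestones already in the past are dropped in this sketch; the real
    // service may choose to fire them immediately instead.
    .filter((spec) => spec.delay > 0);
}
```

Each spec maps onto a BullMQ call of the form `queue.add(spec.name, data, { jobId: spec.jobId, delay: spec.delay })`. Because the IDs are deterministic, cancellation can rebuild the same IDs, look each job up with `queue.getJob(jobId)`, and call `job.remove()` on any that still exist.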
## Database Schema
### New Fields in `approval_levels` Table
```sql
ALTER TABLE approval_levels ADD COLUMN tat50_alert_sent BOOLEAN NOT NULL DEFAULT false;
ALTER TABLE approval_levels ADD COLUMN tat75_alert_sent BOOLEAN NOT NULL DEFAULT false;
ALTER TABLE approval_levels ADD COLUMN tat_breached BOOLEAN NOT NULL DEFAULT false;
ALTER TABLE approval_levels ADD COLUMN tat_start_time TIMESTAMP WITH TIME ZONE;
```
**Field Descriptions:**
- `tat50_alert_sent`: Tracks if 50% notification was sent
- `tat75_alert_sent`: Tracks if 75% notification was sent
- `tat_breached`: Tracks if TAT deadline was breached
- `tat_start_time`: Timestamp when TAT monitoring started
## Integration Points
### 1. Workflow Submission
When a workflow is submitted, TAT monitoring starts for the first approval level:
```typescript
// workflow.service.ts - submitWorkflow()
await current.update({
  levelStartTime: now,
  tatStartTime: now,
  status: ApprovalStatus.IN_PROGRESS
});
await tatSchedulerService.scheduleTatJobs(
  requestId,
  levelId,
  approverId,
  tatHours,
  now
);
```
### 2. Approval Flow
When a level is approved, TAT jobs are cancelled and new ones are scheduled for the next level:
```typescript
// approval.service.ts - approveLevel()
// Cancel current level TAT jobs
await tatSchedulerService.cancelTatJobs(requestId, levelId);
// Schedule TAT jobs for next level
await tatSchedulerService.scheduleTatJobs(
  requestId, // same request, next approval level
  nextLevelId,
  nextApproverId,
  nextTatHours,
  now
);
```
### 3. Rejection Flow
When a level is rejected, all pending TAT jobs are cancelled:
```typescript
// approval.service.ts - approveLevel()
await tatSchedulerService.cancelTatJobs(requestId, levelId);
```
## Notification Flow
### 50% TAT Alert (⏳)
**Message:** "50% of TAT elapsed for Request REQ-XXX: [Title]"
**Actions:**
- Send push notification to approver
- Update `tat50_alert_sent = true`
- Update `tat_percentage_used = 50`
- Log activity: "50% of TAT time has elapsed"
### 75% TAT Alert (⚠️)
**Message:** "75% of TAT elapsed for Request REQ-XXX: [Title]. Please take action soon."
**Actions:**
- Send push notification to approver
- Update `tat75_alert_sent = true`
- Update `tat_percentage_used = 75`
- Log activity: "75% of TAT time has elapsed - Escalation warning"
### 100% TAT Breach (⏰)
**Message:** "TAT breached for Request REQ-XXX: [Title]. Immediate action required!"
**Actions:**
- Send push notification to approver
- Update `tat_breached = true`
- Update `tat_percentage_used = 100`
- Log activity: "TAT deadline reached - Breach notification"
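
The three alerts above follow the same pattern, so they can be summarized as a single lookup. The sketch below is illustrative only: `buildTatAlert` and `TatAlert` are hypothetical names, while the message text, column names, and activity strings are copied from the lists above.

```typescript
type TatJobType = 'tat50' | 'tat75' | 'tatBreach';

interface TatAlert {
  message: string; // push notification text
  updates: Record<string, boolean | number>; // column name -> new value
  activity: string; // activity log entry
}

function buildTatAlert(
  type: TatJobType,
  requestNumber: string,
  title: string
): TatAlert {
  const subject = `Request ${requestNumber}: ${title}`;
  switch (type) {
    case 'tat50':
      return {
        message: `50% of TAT elapsed for ${subject}`,
        updates: { tat50_alert_sent: true, tat_percentage_used: 50 },
        activity: '50% of TAT time has elapsed',
      };
    case 'tat75':
      return {
        message: `75% of TAT elapsed for ${subject}. Please take action soon.`,
        updates: { tat75_alert_sent: true, tat_percentage_used: 75 },
        activity: '75% of TAT time has elapsed - Escalation warning',
      };
    case 'tatBreach':
      return {
        message: `TAT breached for ${subject}. Immediate action required!`,
        updates: { tat_breached: true, tat_percentage_used: 100 },
        activity: 'TAT deadline reached - Breach notification',
      };
  }
}
```

Keeping the text and the column updates in one place makes it harder for a notification and its database flag to drift apart.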
## Configuration
### Environment Variables
```bash
# Redis connection for TAT queue
REDIS_URL=redis://localhost:6379
# Optional: TAT monitoring settings
TAT_CHECK_INTERVAL_MINUTES=30
TAT_REMINDER_THRESHOLD_1=50
TAT_REMINDER_THRESHOLD_2=75
```
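
If these variables are read at startup, a small loader can validate them in one place. The sketch below is an assumption about how that might look: `loadTatConfig`, `TatConfig`, and `parsePercent` are hypothetical names, and the fallback values (30 minutes, 50 and 75 percent) are based on the milestones described in this document. This document does not state whether the current code consumes the optional settings.

```typescript
interface TatConfig {
  redisUrl: string;
  checkIntervalMinutes: number;
  reminderThresholds: [number, number]; // percentages of TAT
}

// Parses a percentage strictly between 0 and 100, or returns the fallback when unset.
function parsePercent(raw: string | undefined, fallback: number, name: string): number {
  if (raw === undefined || raw === '') return fallback;
  const value = Number(raw);
  if (!Number.isFinite(value) || value <= 0 || value >= 100) {
    throw new Error(`${name} must be a number between 0 and 100, got "${raw}"`);
  }
  return value;
}

function loadTatConfig(env: Record<string, string | undefined>): TatConfig {
  const first = parsePercent(env['TAT_REMINDER_THRESHOLD_1'], 50, 'TAT_REMINDER_THRESHOLD_1');
  const second = parsePercent(env['TAT_REMINDER_THRESHOLD_2'], 75, 'TAT_REMINDER_THRESHOLD_2');
  if (first >= second) {
    throw new Error('TAT_REMINDER_THRESHOLD_1 must be lower than TAT_REMINDER_THRESHOLD_2');
  }
  const rawInterval = env['TAT_CHECK_INTERVAL_MINUTES'];
  const interval = rawInterval ? Number(rawInterval) : 30;
  if (!Number.isInteger(interval) || interval <= 0) {
    throw new Error(`TAT_CHECK_INTERVAL_MINUTES must be a positive integer, got "${rawInterval}"`);
  }
  return {
    redisUrl: env['REDIS_URL'] || 'redis://localhost:6379',
    checkIntervalMinutes: interval,
    reminderThresholds: [first, second],
  };
}
```

In a Node.js process this would typically be called once as `loadTatConfig(process.env)`, so that a malformed value fails at startup rather than when the first job is scheduled.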
### Docker Compose
Redis service is automatically configured:
```yaml
redis:
  image: redis:7-alpine
  container_name: re_workflow_redis
  ports:
    - "6379:6379"
  volumes:
    - redis_data:/data
  networks:
    - re_workflow_network
  restart: unless-stopped
## Working Hours Configuration
**Default Schedule:**
- Working Days: Monday - Friday
- Working Hours: 9:00 AM - 6:00 PM (9 hours/day)
- Timezone: Server timezone
**To Modify:**
Edit `WORK_START_HOUR` and `WORK_END_HOUR` in `tatTimeUtils.ts`
## Example Scenario
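### Scenario: 48-hour TAT Approval
With the default schedule, the 48 hours are working hours (9 per day, Monday to Friday), so the milestones fall later than they would on a wall clock. Monday contributes 8 working hours (10:00 AM to 6:00 PM) and each following weekday contributes 9.
1. **Workflow Submitted**: Monday 10:00 AM
2. **50% Alert (24 working hours)**: Wednesday 4:00 PM
   - Notification sent to approver
   - Database updated: `tat50_alert_sent = true`
3. **75% Alert (36 working hours)**: Friday 10:00 AM
   - Escalation warning sent
   - Database updated: `tat75_alert_sent = true`
4. **100% Breach (48 working hours)**: the following Monday, 1:00 PM
   - Breach alert sent
   - Database updated: `tat_breached = true`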
## Error Handling
### Queue Job Failures
- **Automatic Retry**: Failed jobs retry up to 3 times with exponential backoff
- **Error Logging**: All failures are logged to the console and to the application log
- **Non-Blocking**: TAT failures don't block workflow approval process
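
To make the retry timing concrete, the sketch below computes the wait before each retry for the `attempts: 3` and `delay: 2000` settings used by the queue. It assumes BullMQ's built-in exponential strategy, which to the best of our understanding waits `delay * 2^(attemptsMade - 1)` milliseconds. `retryDelays` is a hypothetical helper for illustration, not part of BullMQ.

```typescript
// Delay before each retry under an exponential backoff of
// baseDelayMs * 2^(attemptsMade - 1).
function retryDelays(attempts: number, baseDelayMs: number): number[] {
  const delays: number[] = [];
  // `attempts` includes the first try, so there are attempts - 1 retries.
  for (let attemptsMade = 1; attemptsMade < attempts; attemptsMade++) {
    delays.push(baseDelayMs * Math.pow(2, attemptsMade - 1));
  }
  return delays;
}

console.log(retryDelays(3, 2000)); // [ 2000, 4000 ]
```

With these settings a job that keeps failing is retried after roughly 2 seconds and again after roughly 4 seconds, then stays in the failed set because `removeOnFail` is `false`.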
### Redis Connection Failures
- **Graceful Degradation**: Application continues to work even if Redis is down
- **Reconnection**: Automatic reconnection attempts
- **Logging**: Connection status logged
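
One way to get this kind of graceful degradation is to wrap scheduling calls so that a Redis outage is logged rather than thrown into the approval path. The sketch below shows the idea. `scheduleSafely` is a hypothetical helper, and the log message format is an assumption.

```typescript
// Runs a TAT scheduling call and reports failure instead of throwing, so the
// workflow action that triggered it can still complete.
async function scheduleSafely(
  schedule: () => Promise<void>,
  log: (message: string, error: unknown) => void = console.error
): Promise<boolean> {
  try {
    await schedule();
    return true;
  } catch (error) {
    log('TAT Scheduler: failed to schedule jobs, continuing without TAT alerts', error);
    return false;
  }
}

// Usage, assuming the scheduler service described earlier:
// await scheduleSafely(() =>
//   tatSchedulerService.scheduleTatJobs(requestId, levelId, approverId, tatHours, now)
// );
```

The boolean result lets the caller record that TAT monitoring is missing for a level, so it can be rescheduled once Redis is reachable again.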
## Monitoring & Debugging
### Check Queue Status
```bash
# View jobs in Redis
redis-cli
> KEYS bull:tatQueue:*
# Delayed jobs are stored in a sorted set, so use ZRANGE rather than LRANGE
> ZRANGE bull:tatQueue:delayed 0 -1
```
### View Worker Logs
```bash
# Check worker status in application logs
grep "TAT Worker" logs/app.log
grep "TAT Scheduler" logs/app.log
grep "TAT Processor" logs/app.log
```
### Database Queries
```sql
-- Check TAT status for all approval levels
SELECT
  level_id,
  request_id,
  approver_name,
  tat_hours,
  tat_percentage_used,
  tat50_alert_sent,
  tat75_alert_sent,
  tat_breached,
  level_start_time,
  tat_start_time
FROM approval_levels
WHERE status IN ('PENDING', 'IN_PROGRESS');
-- Find breached TATs
SELECT * FROM approval_levels WHERE tat_breached = true;
```
## Best Practices
1. **Always Schedule on Level Start**: Ensure `tatStartTime` is set when a level becomes active
2. **Always Cancel on Level Complete**: Cancel jobs when level is approved/rejected to avoid duplicate notifications
3. **Use Job IDs**: Unique job IDs (`tat50-{requestId}-{levelId}`) allow easy cancellation
4. **Monitor Queue Health**: Regularly check Redis and worker status
5. **Test with Short TATs**: Use short TAT durations in development for testing
## Troubleshooting
### Notifications Not Sent
1. Check Redis connection: `redis-cli ping`
2. Verify worker is running: Check logs for "TAT Worker: Initialized"
3. Check job scheduling: Look for "TAT jobs scheduled" logs
4. Verify VAPID configuration for push notifications
### Duplicate Notifications
1. Ensure jobs are cancelled when level is completed
2. Check for duplicate job IDs in Redis
3. Verify `tat50_alert_sent` and `tat75_alert_sent` flags
### Jobs Not Executing
1. Check system time (jobs use timestamps)
2. Verify working hours calculation
3. Check job delays in Redis
4. Review worker concurrency and rate limits
## Future Enhancements
1. **Configurable Working Hours**: Allow per-organization working hours
2. **Holiday Calendar**: Skip public holidays in TAT calculations
3. **Escalation Rules**: Auto-escalate to manager on breach
4. **TAT Dashboard**: Real-time visualization of TAT statuses
5. **Email Notifications**: Add email alerts alongside push notifications
6. **SMS Notifications**: Critical breach alerts via SMS
## API Endpoints (Future)
Potential API endpoints for TAT management:
```
GET  /api/tat/status/:requestId  - Get TAT status for request
GET  /api/tat/breaches           - List all breached requests
POST /api/tat/extend/:levelId    - Extend TAT for a level
GET  /api/tat/analytics          - TAT analytics and reports
```
## References
- [BullMQ Documentation](https://docs.bullmq.io/)
- [Redis Documentation](https://redis.io/documentation)
- [Day.js Documentation](https://day.js.org/)
- [Web Push Notifications](https://developer.mozilla.org/en-US/docs/Web/API/Push_API)
---
**Last Updated**: November 4, 2025
**Version**: 1.0.0
**Maintained By**: Royal Enfield Workflow Team