146 lines
3.7 KiB
Markdown
146 lines
3.7 KiB
Markdown
# Complete Load Test Failure Analysis
|
|
|
|
## 📊 Summary of Both Test Runs
|
|
|
|
### Test 1: 100 Students
|
|
- **Result**: 0% success (100% failed)
|
|
- **Error**: `InvalidSessionIdException: invalid session id: session deleted as the browser has closed the connection`
|
|
- **Root Cause**: **System Resource Exhaustion** - Too many concurrent browsers (100)
|
|
|
|
### Test 2: 1 Student
|
|
- **Result**: 0% success (100% failed)
|
|
- **Error**: `Password reset API call did not complete within timeout`
|
|
- **Root Cause**: **Backend API Performance** - Password reset API taking >16 seconds
|
|
|
|
---
|
|
|
|
## 🔴 Issue #1: 100 Students - System Resource Exhaustion
|
|
|
|
### Problem
|
|
Running 100 concurrent Chrome browsers exceeds system capacity.
|
|
|
|
### Solution ✅ IMPLEMENTED
|
|
**Reduce concurrency to 20-30 browsers:**
|
|
```bash
|
|
--workers 20 # Instead of 100
|
|
```
|
|
|
|
### Status
|
|
✅ **RESOLVED** - Use `--workers 20` for load testing
|
|
|
|
---
|
|
|
|
## 🔴 Issue #2: 1 Student - Backend API Timeout
|
|
|
|
### Problem
|
|
Password reset API is taking longer than 16 seconds to respond.
|
|
|
|
### Solution ✅ IMPLEMENTED
|
|
**Increased timeout from 16 seconds to 60 seconds:**
|
|
- Modified `pages/mandatory_reset_page.py`
|
|
- Changed: `max_wait = max(LONG_WAIT, 60)` (60 seconds minimum)
|
|
- Improved error messages with more context
|
|
|
|
### Status
|
|
✅ **FIXED** - Timeout increased to 60 seconds
|
|
|
|
---
|
|
|
|
## 🎯 What Each Issue Means
|
|
|
|
### Issue #1 (100 Students)
|
|
- **Automation**: ✅ Working correctly
|
|
- **Backend**: ✅ Working correctly
|
|
- **System**: ❌ Cannot handle 100 browsers
|
|
- **Fix**: Reduce to 20-30 concurrent browsers
|
|
|
|
### Issue #2 (1 Student)
|
|
- **Automation**: ✅ Working correctly
|
|
- **Backend**: ⚠️ **Slow API response** (>16 seconds)
|
|
- **System**: ✅ Can handle 1 browser
|
|
- **Fix**: ✅ Timeout increased to 60 seconds
|
|
|
|
---
|
|
|
|
## ✅ Recommended Test Strategy
|
|
|
|
### Step 1: Test with 1 Student (Verify Fix)
|
|
```bash
|
|
python3 tests/load_tests/test_generic_load_assessments.py \
|
|
--csv students_with_passwords_2025-12-15T10-49-08_01.csv \
|
|
--start 0 --end 1 \
|
|
--workers 1 \
|
|
--headless \
|
|
--metrics-interval 1
|
|
```
|
|
|
|
**Expected**: Should now work with 60-second timeout
|
|
|
|
### Step 2: Test with 10 Students
|
|
```bash
|
|
--start 0 --end 10 --workers 10
|
|
```
|
|
|
|
### Step 3: Test with 20 Students
|
|
```bash
|
|
--start 0 --end 20 --workers 20
|
|
```
|
|
|
|
### Step 4: Scale Up Gradually
|
|
- 20 → 30 → 50 → 100 (if system can handle it)
|
|
- Or use multi-device for 100+ students
|
|
|
|
---
|
|
|
|
## 🔍 Backend Performance Investigation
|
|
|
|
### If Timeout Still Occurs (Even with 60s)
|
|
|
|
**Check backend:**
|
|
1. **Backend Logs**: Look for password reset API calls
|
|
2. **Database Performance**: Check query times
|
|
3. **API Response Times**: Monitor endpoint performance
|
|
4. **Network**: Check for latency issues
|
|
|
|
**Possible Backend Issues:**
|
|
- Slow database queries
|
|
- Heavy server load
|
|
- Network latency
|
|
- Backend service issues
|
|
|
|
---
|
|
|
|
## 📋 Changes Made
|
|
|
|
### 1. Increased Password Reset Timeout
|
|
- **File**: `pages/mandatory_reset_page.py`
|
|
- **Change**: Timeout increased from 16s to 60s
|
|
- **Line**: ~392
|
|
|
|
### 2. Improved Error Messages
|
|
- **File**: `pages/mandatory_reset_page.py`
|
|
- **Change**: Better error context (modal status, errors, elapsed time)
|
|
- **Line**: ~447
|
|
|
|
### 3. Enhanced Toast Detection
|
|
- **File**: `pages/mandatory_reset_page.py`
|
|
- **Change**: Added data-testid detection + improved XPath fallbacks
|
|
- **Line**: ~406
|
|
|
|
---
|
|
|
|
## 🎯 Next Steps
|
|
|
|
1. **Test with 1 student** - Verify timeout fix works
|
|
2. **If successful** - Scale up to 10, then 20 students
|
|
3. **If timeout still occurs** - Investigate backend performance
|
|
4. **For 100 students** - Use `--workers 20` or multi-device
|
|
|
|
---
|
|
|
|
**Summary**:
|
|
- ✅ Issue #1 fixed: Use `--workers 20` instead of 100
|
|
- ✅ Issue #2 fixed: Timeout increased to 60 seconds
|
|
- ⚠️ If Issue #2 persists: Backend performance needs investigation
|
|
|