Performance Optimization Analysis - Question Timing
🔍 Current Problem
Observed: ~25 seconds per question (~42 minutes for 100 questions)
Expected: ~4-8 seconds per question (7-13 minutes for 100 questions)
📊 Current Wait Chain Per Question
Breakdown of Current Waits:
- Start of loop: RandomizedWait.wait_for_page_load('navigation') → 1-3s
- Get question type: time.sleep(0.5) → 0.5s (hardcoded!)
- Answer question: Depends on type → 2-6s (multiple choice)
- After answer: RandomizedWait.wait_for_question_answer() → 2-6s
- Click Next: click_next(), which calls:
  - wait_for_page_load() → Up to 15s (waits for document.readyState)
  - wait_for_loading_to_disappear() → Up to 15s (waits for loading indicator)
- After Next: RandomizedWait.wait_for_navigation('next') → 1-3s (REDUNDANT!)
Total Per Question:
- Minimum: 1 + 0.5 + 2 + 2 + 0 + 0 + 1 = 6.5s
- Maximum: 3 + 0.5 + 6 + 6 + 15 + 15 + 3 = 48.5s
- Average: ~2 + 0.5 + 4 + 4 + 5 + 5 + 2 = 22.5s ✅ Matches your observation!
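The totals above can be reproduced with a short script. This is only a sketch: the step labels are informal, and the "typical" column simply reuses the per-step estimates from the Average line rather than measured values.

```python
# Per-step wait ranges in seconds, copied from the breakdown above.
# "typical" reuses the estimates from the Average line; these are not measurements.
WAITS = [
    # (step, minimum, typical, maximum)
    ("randomized wait at start of loop", 1.0, 2.0, 3.0),
    ("time.sleep in get_question_type", 0.5, 0.5, 0.5),
    ("answer question", 2.0, 4.0, 6.0),
    ("randomized wait after answer", 2.0, 4.0, 6.0),
    ("wait_for_page_load", 0.0, 5.0, 15.0),
    ("wait_for_loading_to_disappear", 0.0, 5.0, 15.0),
    ("randomized wait after Next", 1.0, 2.0, 3.0),
]


def totals(waits):
    """Sum each column and return (minimum, typical, maximum) in seconds."""
    minimum = sum(step[1] for step in waits)
    typical = sum(step[2] for step in waits)
    maximum = sum(step[3] for step in waits)
    return minimum, typical, maximum


minimum, typical, maximum = totals(WAITS)
print(f"min={minimum}s typical={typical}s max={maximum}s")
```

Keeping the budget in a table like this makes it easy to zero out a row and see what each proposed removal would save.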
🐛 Issues Identified
1. Redundant Waits (Major Issue)
- click_next() already waits for page load
- Then we wait again with RandomizedWait.wait_for_navigation()
- Waste: 1-3 seconds per question
2. Hardcoded Sleep (Minor Issue)
- get_question_type() has time.sleep(0.5)
- Should use smart wait instead
- Waste: 0.5 seconds per question
3. Excessive Explicit Waits (Major Issue)
- wait_for_loading_to_disappear() waits up to 15 seconds
- Loading indicators usually disappear in 1-2 seconds
- Waste: 10-13 seconds per question (if the loading indicator doesn't exist)
4. Double Page Load Check (Minor Issue)
- wait_for_page_load() checks document.readyState
- wait_for_loading_to_disappear() checks loading indicator
- Both might be redundant
- Waste: 5-10 seconds per question
5. Unnecessary Start Wait (Minor Issue)
- RandomizedWait.wait_for_page_load('navigation') at start of loop
- Page is already loaded from previous Next click
- Waste: 1-3 seconds per question
✅ Optimization Strategy
Phase 1: Remove Redundant Waits (Quick Win)
- Remove RandomizedWait.wait_for_navigation('next') after click_next()
- Remove RandomizedWait.wait_for_page_load('navigation') at start of loop
- Savings: ~2-6 seconds per question
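A minimal sketch of what the loop might look like once both redundant waits are gone. FakeQuizPage and run_questions are hypothetical names invented for this illustration; the real page object and loop will differ, and the fake only records calls so the resulting order is visible.

```python
class FakeQuizPage:
    """Hypothetical stand-in for the real page object; it only records calls."""

    def __init__(self):
        self.calls = []

    def get_question_type(self):
        self.calls.append("get_question_type")
        return "multiple_choice"

    def answer(self, question_type):
        self.calls.append(f"answer:{question_type}")

    def wait_for_question_answer(self):
        # Stands in for RandomizedWait.wait_for_question_answer(), which is kept.
        self.calls.append("wait_for_question_answer")

    def click_next(self):
        # Stands in for click_next(), which already waits for the page load.
        self.calls.append("click_next")


def run_questions(page, count):
    """Trimmed loop: no wait at the top and no wait after click_next()."""
    for _ in range(count):
        question_type = page.get_question_type()
        page.answer(question_type)
        page.wait_for_question_answer()
        page.click_next()


page = FakeQuizPage()
run_questions(page, count=2)
print(page.calls)
```

The only waits left in the loop body are the one after answering and whatever click_next() does internally.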
Phase 2: Optimize Explicit Waits (Medium Win)
- Reduce wait_for_loading_to_disappear() timeout to 2-3 seconds
- Make it non-blocking (don't fail if the loading indicator doesn't exist)
- Savings: ~10-13 seconds per question (if the loading indicator doesn't exist)
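One way to get a short, non-failing wait is a small polling helper that returns a boolean instead of raising on timeout. The sketch below is library-agnostic. wait_until and make_fake_indicator are hypothetical names, and the fake indicator stands in for whatever check the real wait_for_loading_to_disappear() performs.

```python
import time


def wait_until(condition, timeout=3.0, poll_interval=0.1):
    """Poll condition() until it is truthy or the timeout elapses.

    Returns True as soon as the condition holds and False on timeout.
    It never raises on timeout, so a missing indicator cannot abort the run.
    """
    deadline = time.monotonic() + timeout
    while True:
        if condition():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_interval)


def make_fake_indicator(visible_for):
    """Hypothetical loading indicator that stays visible for visible_for seconds."""
    start = time.monotonic()
    return lambda: time.monotonic() - start < visible_for


loading_visible = make_fake_indicator(visible_for=0.2)
started = time.monotonic()
gone = wait_until(lambda: not loading_visible(), timeout=3.0)
elapsed = time.monotonic() - started
print(f"indicator gone: {gone}, waited {elapsed:.2f}s of a 3.0s budget")
```

Because the helper returns as soon as the condition holds, the timeout only bounds the worst case. The caller decides whether a False result is worth logging or can be ignored.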
Phase 3: Replace Hardcoded Sleeps (Small Win)
- Replace time.sleep(0.5) in get_question_type() with smart wait
- Wait for question element to be visible instead
- Savings: up to ~0.4 seconds per question (the smart wait itself takes 0.1-0.5s)
Phase 4: Smart Wait Strategy (Advanced)
- Use explicit waits that return immediately when condition is met
- Don't wait full timeout if element is ready
- Savings: Variable, but significant
🎯 Expected Results After Optimization
Optimized Wait Chain:
- Get question type: Smart wait → 0.1-0.5s (was 0.5s)
- Answer question: Depends on type → 2-6s (unchanged)
- After answer: RandomizedWait.wait_for_question_answer() → 2-6s (unchanged)
- Click Next: click_next() with optimized waits → 1-3s (was up to 30s)
- No redundant wait → 0s (was 1-3s)
Total Per Question (Optimized):
- Minimum: 0.1 + 2 + 2 + 1 = 5.1s
- Maximum: 0.5 + 6 + 6 + 3 = 15.5s
- Average: ~0.3 + 4 + 4 + 2 = 10.3s
For 100 Questions:
- Current: ~25s × 100 = ~42 minutes
- Optimized: ~10s × 100 = ~17 minutes
- Improvement: ~60% less total time (25 minutes saved!)
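The projection above is plain arithmetic and can be checked with a short sketch. minutes_for is a hypothetical helper name; the 25s and 10s inputs are the rounded per-question averages used in this section.

```python
def minutes_for(seconds_per_question, questions=100):
    """Convert a per-question time into total minutes for a run."""
    return seconds_per_question * questions / 60


current = minutes_for(25)      # rounded observed average
optimized = minutes_for(10)    # rounded optimized average
saved = current - optimized
reduction = saved / current    # fraction of total run time removed

print(f"current ~{current:.0f} min, optimized ~{optimized:.0f} min")
print(f"saved ~{saved:.0f} min ({reduction:.0%} less time)")
```

The 60% figure is a reduction in total run time. Expressed as throughput, the same change is a 2.5x speedup.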
🚀 Implementation Plan
- ✅ Remove redundant RandomizedWait.wait_for_navigation() calls
- ✅ Remove redundant RandomizedWait.wait_for_page_load() at start
- ✅ Optimize wait_for_loading_to_disappear() timeout (2-3s instead of 15s)
- ✅ Replace hardcoded time.sleep(0.5) with smart wait
- ✅ Test with 1 student to verify no breakage
- ✅ Test with 10 students to verify performance improvement
⚠️ Safety Considerations
- Don't remove explicit waits entirely - they ensure elements are ready
- Keep minimum wait times - prevents race conditions
- Test thoroughly - ensure no flakiness introduced
- Monitor for failures - if optimization causes issues, revert
📈 Success Metrics
- Target: <15 seconds per question (average)
- Target: <20 minutes for 100 questions
- Target: 0% increase in failure rate
- Target: Maintain 100% reliability
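The targets above could be checked automatically after each test run. The sketch below is one possible shape: check_targets is a hypothetical helper, and the sample durations are made-up placeholder values, not measurements.

```python
from statistics import mean


def check_targets(durations_s, failures_before, failures_after, questions=100):
    """Compare per-question timings and failure counts against the targets."""
    avg = mean(durations_s)
    projected_min = avg * questions / 60
    return {
        "avg_seconds": avg,
        "projected_minutes": projected_min,
        "avg_under_15s": avg < 15,
        "under_20_min": projected_min < 20,
        "no_failure_increase": failures_after <= failures_before,
    }


# Placeholder timings in seconds, invented for illustration only.
sample = [10.0, 12.0, 11.0, 11.0]
report = check_targets(sample, failures_before=0, failures_after=0)
for name, value in report.items():
    print(f"{name}: {value}")
```

The 20-minute target is the stricter of the two timing targets: 20 minutes over 100 questions requires an average under 12 seconds, so a run averaging 13 seconds would pass the per-question target and still miss the total.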