# AI Analysis Service - Documentation Index

Welcome to the AI Analysis Service documentation. This service analyzes code repositories using Claude AI and integrates with the Git Integration Service and the API Gateway.

---

## Quick Navigation

### Getting Started

- [Quick Reference Guide](./QUICK_REFERENCE.md) - Fast commands and common operations
- [Architecture Overview](#architecture-overview) (below)
- [Environment Setup](#environment-setup) (below)

### In-Depth Documentation

- [Complete Architecture Guide](./SERVICE_COMMUNICATION_ARCHITECTURE.md) - Comprehensive documentation
- [Flow Diagrams](./FLOW_DIAGRAMS.md) - Visual representations of data flow
- [Integration Examples](./INTEGRATION_EXAMPLE.md) - Code examples and usage patterns

### Technical Reference

- [API Documentation](#api-endpoints) (below)
- [Service Configuration](#configuration) (below)
- [Troubleshooting Guide](#troubleshooting) (below)

---

## Architecture Overview

### System Components

```
┌──────────┐     ┌──────────────┐     ┌────────────────┐     ┌─────────────┐
│ Frontend │────▶│ API Gateway  │────▶│  AI Analysis   │◀───▶│     Git     │
│ (Next.js)│     │ (Express.js) │     │   (FastAPI)    │     │ Integration │
│  :3000   │     │    :8000     │     │     :8022      │     │    :8012    │
└──────────┘     └──────────────┘     └───────┬────────┘     └─────┬───────┘
                                              │                    │
                                              ▼                    ▼
                                         ┌─────────┐          ┌──────────┐
                                         │  Redis  │          │PostgreSQL│
                                         │  :6379  │          │  :5432   │
                                         └─────────┘          └──────────┘
```

### Key Features

1. **AI-Powered Analysis**: Uses the Claude API for intelligent code review
2. **Rate Limiting**: Manages Claude API limits (90 requests/minute)
3. **Smart Caching**: Redis-based caching reduces API calls by 60-70%
4. **Content Optimization**: Intelligently truncates large files
5. **Report Generation**: Creates PDF and JSON reports
6. **Multi-Service Integration**: Seamless communication between services

---

## Environment Setup

### Prerequisites

- Docker & Docker Compose
- Node.js 18+ (for local development)
- Python 3.11+ (for local development)
- Anthropic API key
- GitHub OAuth credentials

### Installation

```bash
# 1. Clone the repository
git clone https://github.com/your-org/codenuk.git
cd codenuk

# 2. Set up environment variables
cp backend/codenuk_backend_mine/services/ai-analysis-service/.env.example \
   backend/codenuk_backend_mine/services/ai-analysis-service/.env

# 3. Configure .env files
# Edit the .env files with your API keys and credentials

# 4. Start services
docker-compose up -d

# 5. Verify services
curl http://localhost:8000/health
curl http://localhost:8022/health
curl http://localhost:8012/health
```

### Environment Variables

#### AI Analysis Service

```bash
ANTHROPIC_API_KEY=sk-ant-api03-...
GIT_INTEGRATION_SERVICE_URL=http://git-integration:8012
REDIS_HOST=redis
REDIS_PORT=6379
PORT=8022
```

#### API Gateway

```bash
AI_ANALYSIS_URL=http://localhost:8022
GIT_INTEGRATION_URL=http://localhost:8012
PORT=8000
```

#### Git Integration

```bash
GITHUB_CLIENT_ID=your_client_id
GITHUB_CLIENT_SECRET=your_client_secret
PUBLIC_BASE_URL=https://backend.codenuk.com
POSTGRES_HOST=postgres
PORT=8012
```

---

## API Endpoints

### AI Analysis Service

#### Analyze Repository

```http
POST /analyze-repository
Content-Type: application/json

{
  "repository_id": "uuid",
  "user_id": "user-uuid",
  "output_format": "pdf",
  "max_files": 100
}
```

**Response:**

```json
{
  "success": true,
  "analysis_id": "repo_analysis_uuid_timestamp",
  "report_path": "/app/reports/..._analysis.pdf",
  "stats": {
    "total_files": 85,
    "code_quality_score": 7.8,
    "total_issues": 23
  }
}
```

#### Get Repository Info

```http
GET /repository/{id}/info?user_id={userId}
```

#### Download Report

```http
GET /reports/{filename}
```

#### Health Check

```http
GET /health
```

### Via API Gateway

All endpoints are accessible through the API Gateway:
```
Direct:      http://localhost:8022/analyze-repository
Via Gateway: http://localhost:8000/api/ai-analysis/analyze-repository
```

---

## Configuration

### Service Ports

| Service | Port | Protocol |
|---------|------|----------|
| Frontend | 3000 | HTTP |
| API Gateway | 8000 | HTTP |
| AI Analysis | 8022 | HTTP |
| Git Integration | 8012 | HTTP |
| PostgreSQL | 5432 | TCP |
| Redis | 6379 | TCP |

### Rate Limiting

- **Claude API**: 90 requests per minute (configurable)
- **Sliding Window**: Tracks requests over a 60-second window
- **Automatic Waiting**: Delays requests to prevent rate limit violations

### Caching

- **Storage**: Redis
- **TTL**: 24 hours (configurable)
- **Key Format**: `analysis:{file_hash}`
- **Hash Algorithm**: SHA-256

### Content Optimization

- **Threshold**: 8000 tokens (~32 KB)
- **Strategy**: Extract imports, functions, classes
- **Truncation**: Intelligent context preservation

---

## Communication Flow

### 1. Repository Analysis Request

```
Frontend → API Gateway → AI Analysis → Git Integration
```

1. User clicks "Analyze Repository" in the frontend
2. Frontend sends a POST request to the API Gateway
3. Gateway forwards the request to the AI Analysis Service
4. AI Analysis requests repository info from Git Integration
5. Git Integration returns the file tree and metadata
6. AI Analysis processes each file:
   - Check the Redis cache
   - Apply rate limiting
   - Optimize content
   - Send to the Claude API
   - Cache the result
7. Generate the repository-level analysis
8. Create the PDF/JSON report
9. Return results through the Gateway to the frontend

### 2. File Content Retrieval

```
AI Analysis → Git Integration → File System
```

1. AI Analysis requests file content
2. Git Integration resolves the file path (case-insensitive)
3. Reads content from local storage
4. Returns content + metadata

### 3. OAuth Authentication

```
Frontend → API Gateway → Git Integration → GitHub → Git Integration → Frontend
```

1. User attempts to access a private repository
2. Git Integration detects the authentication requirement
3. Returns an OAuth URL
4. Frontend redirects to GitHub OAuth
5. User approves access
6. GitHub redirects back with a code
7. Git Integration exchanges the code for a token
8. Token is stored in PostgreSQL
9. User can now access the private repository

---

## Troubleshooting

### Common Issues

#### Service Connection Failed

**Symptoms**: "Failed to get repository info" error

**Solution**:

```bash
# Check service status
docker ps | grep git-integration

# Check network connectivity
docker network inspect backend-network

# Restart the service
docker-compose restart git-integration
```

#### Rate Limit Exceeded

**Symptoms**: Analysis fails with a rate limit error

**Solution**:

```bash
# Option 1: Reduce max_files in the request body
# { "max_files": 50 }   # instead of 100

# Option 2: Lower the rate limit
CLAUDE_REQUESTS_PER_MINUTE=50   # in .env
docker-compose restart ai-analysis
```

#### Redis Connection Failed

**Symptoms**: Warning about the Redis connection

**Solution**:

```bash
# Check Redis status
docker exec redis redis-cli ping
# Expected: PONG

# If it fails, restart Redis
docker-compose restart redis
```

#### Authentication Errors

**Symptoms**: 401 Unauthorized for private repos

**Solution**:

- Verify the GitHub OAuth credentials
- Check whether the user has completed the OAuth flow
- Verify the token is stored in the database

---

## Performance Optimization

### Analysis Speed

| Configuration | Time for 100 Files | API Calls |
|---------------|--------------------|-----------|
| No optimization | 50-90 minutes | 100 |
| With caching (60% hit) | 20-35 minutes | 40 |
| With rate limiting | 2-4 minutes slower | Same |
| With content optimization | Same | 70% smaller payloads |

### Best Practices

1. **Use Caching**: Enable Redis for repeated analyses
2. **Optimize Content**: Keep the 8000-token threshold
3. **Respect Rate Limits**: Don't increase beyond Claude's limits
4. **Batch Processing**: Analyze during off-peak hours
5. **Monitor Resources**: Watch CPU, memory, and network usage

---

## Security Considerations

### API Keys

- Store in environment variables only
- Never commit to version control
- Rotate regularly
- Use different keys for dev/prod

### OAuth Tokens

- Encrypted at rest in PostgreSQL
- Secure transmission (HTTPS in production)
- Automatic expiration handling
- User-specific token isolation

### Network Security

- Internal Docker network for service communication
- API Gateway as the single entry point
- CORS configuration for the frontend
- Rate limiting to prevent abuse

---

## Monitoring and Logging

### Log Locations

```bash
# AI Analysis Service
docker logs ai-analysis -f

# API Gateway
docker logs api-gateway -f

# Git Integration
docker logs git-integration -f
```

### Key Metrics

- **Analysis Success Rate**: Track successful vs. failed analyses
- **Cache Hit Rate**: Monitor Redis cache effectiveness
- **API Response Times**: Track latency for each service
- **Rate Limit Usage**: Monitor Claude API usage

### Health Checks

```bash
# All services
curl http://localhost:8000/health
curl http://localhost:8022/health
curl http://localhost:8012/health

# Database
docker exec postgres pg_isready

# Cache
docker exec redis redis-cli ping
```

---

## Development

### Local Development Setup

```bash
# AI Analysis Service
cd services/ai-analysis-service
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
python server.py

# API Gateway
cd services/api-gateway
npm install
npm run dev

# Git Integration
cd services/git-integration
npm install
npm run dev

# Frontend
cd fronend/codenuk_frontend_mine
npm install
npm run dev
```

### Testing

```bash
# Test AI Analysis directly
curl -X POST http://localhost:8022/analyze-repository \
  -H "Content-Type: application/json" \
  -d '{"repository_id": "test", "user_id": "test", "output_format": "json", "max_files": 5}'

# Test through the Gateway
curl -X POST http://localhost:8000/api/ai-analysis/analyze-repository \
  -H "Content-Type: application/json" \
  -d '{"repository_id": "test", "user_id": "test", "output_format": "json", "max_files": 5}'
```

### Debugging

```bash
# Enable debug mode
export DEBUG=*
export LOG_LEVEL=debug
export PYTHONUNBUFFERED=1

# Watch logs in real time
docker-compose logs -f ai-analysis | grep "ERROR"

# Inspect the container
docker exec -it ai-analysis bash
```

---

## Deployment

### Production Checklist

- [ ] Set secure environment variables
- [ ] Configure HTTPS
- [ ] Set up SSL certificates
- [ ] Enable production logging
- [ ] Configure monitoring (Prometheus, Grafana)
- [ ] Set up a backup strategy
- [ ] Configure auto-scaling (if needed)
- [ ] Test failover scenarios
- [ ] Document recovery procedures
- [ ] Set up alerts

### Docker Compose Production

```yaml
services:
  ai-analysis:
    image: codenuk/ai-analysis:latest
    restart: always
    environment:
      - NODE_ENV=production
      - ANTHROPIC_API_KEY=${ANTHROPIC_API_KEY}
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8022/health"]
      interval: 30s
      timeout: 10s
      retries: 3
    deploy:
      replicas: 2
      resources:
        limits:
          cpus: '2'
          memory: 4G
```

---

## Additional Resources

### Documentation Files

1. **[SERVICE_COMMUNICATION_ARCHITECTURE.md](./SERVICE_COMMUNICATION_ARCHITECTURE.md)**
   - Complete architecture documentation
   - Detailed service descriptions
   - Request/response examples
   - Error handling strategies
   - Deployment configuration
2. **[QUICK_REFERENCE.md](./QUICK_REFERENCE.md)**
   - Quick-start commands
   - Common API calls
   - Troubleshooting commands
   - Performance tuning tips
   - Development shortcuts
3. **[FLOW_DIAGRAMS.md](./FLOW_DIAGRAMS.md)**
   - Visual request flow
   - Service communication diagrams
   - Data flow illustrations
   - Authentication flow
   - Error handling flow
   - Caching strategy
4. **[INTEGRATION_EXAMPLE.md](./INTEGRATION_EXAMPLE.md)**
   - Frontend integration code
   - API usage examples
   - React hooks
   - Error handling patterns
5. **[README.md](./README.md)**
   - Service overview
   - Installation instructions
   - Basic usage
   - API reference

### External Links

- [Anthropic Claude API Documentation](https://docs.anthropic.com/)
- [FastAPI Documentation](https://fastapi.tiangolo.com/)
- [Express.js Documentation](https://expressjs.com/)
- [Docker Compose Documentation](https://docs.docker.com/compose/)
- [Redis Documentation](https://redis.io/docs/)
- [PostgreSQL Documentation](https://www.postgresql.org/docs/)

---

## Support

### Getting Help

1. Check the troubleshooting guide
2. Review service logs
3. Test endpoints individually
4. Verify environment variables
5. Check Docker network connectivity

### Common Questions

**Q: How long does analysis take?**
A: Typically 20-35 minutes for 100 files with caching, 50-90 minutes without (see the Analysis Speed table above).

**Q: Can I analyze private repositories?**
A: Yes; users need to authenticate via GitHub OAuth.

**Q: What happens if the Claude API is down?**
A: The service returns appropriate errors. Cached results still work.

**Q: How much does it cost?**
A: It depends on Claude API usage. Caching reduces costs by 60-70%.

**Q: Can I increase max_files beyond 100?**
A: Yes, but consider rate limits and timeout settings.

---

## Version History

- **v1.0.0** (December 2024)
  - Initial release
  - Claude AI integration
  - Redis caching
  - Rate limiting
  - Content optimization
  - Multi-service architecture

---

## Contributing

For contributions or improvements to this documentation:

1. Ensure accuracy by testing commands
2. Follow the existing format and style
3. Update the version history
4. Add examples where helpful
5. Keep diagrams up to date

---

**Last Updated**: December 2024
**Version**: 1.0.0
**Maintained By**: CodeNuk Team
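---

## Appendix: Rate Limiting Sketch

The sliding-window rate limiting described in the Configuration section (90 requests per 60-second window, with automatic waiting) can be sketched as follows. This is an illustrative sketch, not the service's actual implementation; the class and method names are hypothetical.

```python
import time
from collections import deque


class SlidingWindowRateLimiter:
    """Illustrative sliding-window limiter: allow at most `max_requests`
    calls per `window_seconds`, sleeping until a slot frees up."""

    def __init__(self, max_requests: int = 90, window_seconds: float = 60.0):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.timestamps: deque[float] = deque()  # monotonic times of recent calls

    def acquire(self) -> float:
        """Block until a request slot is available; return seconds waited."""
        waited = 0.0
        while True:
            now = time.monotonic()
            # Drop timestamps that have fallen out of the window
            while self.timestamps and now - self.timestamps[0] >= self.window_seconds:
                self.timestamps.popleft()
            if len(self.timestamps) < self.max_requests:
                self.timestamps.append(now)
                return waited
            # Sleep until the oldest request leaves the window, then re-check
            sleep_for = self.window_seconds - (now - self.timestamps[0])
            time.sleep(sleep_for)
            waited += sleep_for


# Call limiter.acquire() before each Claude API request
limiter = SlidingWindowRateLimiter(max_requests=90, window_seconds=60.0)
```

With 90 requests per minute, the limiter adds no delay until the window fills, then spaces further calls so the per-minute budget is never exceeded.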
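---

## Appendix: Cache Key Sketch

The Caching section specifies the key format `analysis:{file_hash}` with SHA-256 hashing. A minimal sketch of the key derivation follows; the function name is hypothetical and the actual service may hash different inputs (e.g. file path plus content).

```python
import hashlib


def analysis_cache_key(file_content: str) -> str:
    """Derive a Redis cache key in the `analysis:{file_hash}` format
    (SHA-256 over the file content, hex-encoded)."""
    file_hash = hashlib.sha256(file_content.encode("utf-8")).hexdigest()
    return f"analysis:{file_hash}"


key = analysis_cache_key("def hello():\n    return 'world'\n")
```

Because the key is content-addressed, an unchanged file hits the same cache entry on re-analysis, which is what drives the 60-70% reduction in API calls.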
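---

## Appendix: Content Optimization Threshold Sketch

The Content Optimization section states a threshold of 8000 tokens (~32 KB), implying a rough 4-bytes-per-token estimate. The decision step can be sketched as below; this is a heuristic sketch, not Claude's actual tokenizer, and both function names are hypothetical.

```python
def estimate_tokens(content: str) -> int:
    """Rough token estimate: ~4 bytes per token, so 8000 tokens
    corresponds to roughly 32 KB of UTF-8 text."""
    return len(content.encode("utf-8")) // 4


def needs_optimization(content: str, token_threshold: int = 8000) -> bool:
    """True when a file exceeds the threshold and should be reduced to
    imports, function and class signatures before being sent to Claude."""
    return estimate_tokens(content) > token_threshold
```

Files under the threshold are sent whole; larger files are truncated with the context-preserving strategy described above.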