# AI Analysis Service - Documentation Index

Welcome to the AI Analysis Service documentation. This service analyzes code repositories using Claude AI and integrates with the Git Integration Service and API Gateway.

---

## Quick Navigation

### Getting Started

- [Quick Reference Guide](./QUICK_REFERENCE.md) - Fast commands and common operations
- [Architecture Overview](#architecture-overview) (below)
- [Environment Setup](#environment-setup) (below)

### In-Depth Documentation

- [Complete Architecture Guide](./SERVICE_COMMUNICATION_ARCHITECTURE.md) - Comprehensive documentation
- [Flow Diagrams](./FLOW_DIAGRAMS.md) - Visual representations of data flow
- [Integration Examples](./INTEGRATION_EXAMPLE.md) - Code examples and usage patterns

### Technical Reference

- [API Documentation](#api-endpoints) (below)
- [Service Configuration](#configuration) (below)
- [Troubleshooting Guide](#troubleshooting) (below)

---

## Architecture Overview

### System Components

```
┌──────────┐     ┌──────────────┐     ┌────────────────┐     ┌─────────────┐
│ Frontend │────▶│ API Gateway  │────▶│  AI Analysis   │◀───▶│     Git     │
│ (Next.js)│     │ (Express.js) │     │   (FastAPI)    │     │ Integration │
│  :3000   │     │    :8000     │     │     :8022      │     │    :8012    │
└──────────┘     └──────────────┘     └───────┬────────┘     └─────┬───────┘
                                              │                    │
                                              ▼                    ▼
                                         ┌─────────┐         ┌──────────┐
                                         │  Redis  │         │PostgreSQL│
                                         │  :6379  │         │  :5432   │
                                         └─────────┘         └──────────┘
```

### Key Features

1. **AI-Powered Analysis**: Uses Claude API for intelligent code review
2. **Rate Limiting**: Manages Claude API limits (90 requests/minute)
3. **Smart Caching**: Redis-based caching reduces API calls by 60-70%
4. **Content Optimization**: Intelligently truncates large files
5. **Report Generation**: Creates PDF and JSON reports
6. **Multi-Service Integration**: Seamless communication between services

---

## Environment Setup

### Prerequisites

- Docker & Docker Compose
- Node.js 18+ (for local development)
- Python 3.11+ (for local development)
- Anthropic API Key
- GitHub OAuth credentials

### Installation

```bash
# 1. Clone repository
git clone https://github.com/your-org/codenuk.git
cd codenuk

# 2. Set up environment variables
cp backend/codenuk_backend_mine/services/ai-analysis-service/.env.example \
   backend/codenuk_backend_mine/services/ai-analysis-service/.env

# 3. Configure .env files
# Edit .env files with your API keys and credentials

# 4. Start services
docker-compose up -d

# 5. Verify services
curl http://localhost:8000/health
curl http://localhost:8022/health
curl http://localhost:8012/health
```

### Environment Variables

#### AI Analysis Service

```bash
ANTHROPIC_API_KEY=sk-ant-api03-...
GIT_INTEGRATION_SERVICE_URL=http://git-integration:8012
REDIS_HOST=redis
REDIS_PORT=6379
PORT=8022
```
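
A minimal sketch of how a Python service could read these variables at startup; the loader below is illustrative only, not the service's actual configuration code:

```python
import os

# Illustrative settings loader for the variables listed above (not the real config module).
ANTHROPIC_API_KEY = os.environ["ANTHROPIC_API_KEY"]  # required; fail fast if missing
GIT_INTEGRATION_SERVICE_URL = os.getenv("GIT_INTEGRATION_SERVICE_URL", "http://git-integration:8012")
REDIS_HOST = os.getenv("REDIS_HOST", "redis")
REDIS_PORT = int(os.getenv("REDIS_PORT", "6379"))
PORT = int(os.getenv("PORT", "8022"))
```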

#### API Gateway

```bash
AI_ANALYSIS_URL=http://localhost:8022
GIT_INTEGRATION_URL=http://localhost:8012
PORT=8000
```

#### Git Integration

```bash
GITHUB_CLIENT_ID=your_client_id
GITHUB_CLIENT_SECRET=your_client_secret
PUBLIC_BASE_URL=https://backend.codenuk.com
POSTGRES_HOST=postgres
PORT=8012
```

---

## API Endpoints

### AI Analysis Service

#### Analyze Repository

```http
POST /analyze-repository
Content-Type: application/json

{
  "repository_id": "uuid",
  "user_id": "user-uuid",
  "output_format": "pdf",
  "max_files": 100
}
```

**Response:**

```json
{
  "success": true,
  "analysis_id": "repo_analysis_uuid_timestamp",
  "report_path": "/app/reports/..._analysis.pdf",
  "stats": {
    "total_files": 85,
    "code_quality_score": 7.8,
    "total_issues": 23
  }
}
```
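
For reference, a small Python client that submits this request with the `requests` library. The repository and user IDs are placeholders, and the direct service URL can be swapped for the gateway route shown in the "Via API Gateway" section:

```python
import requests

# Illustrative client for the analyze-repository endpoint (direct service URL).
payload = {
    "repository_id": "uuid",    # placeholder - use a real repository UUID
    "user_id": "user-uuid",     # placeholder - use a real user UUID
    "output_format": "json",    # or "pdf"
    "max_files": 100,
}

response = requests.post(
    "http://localhost:8022/analyze-repository",
    json=payload,
    timeout=600,  # analyses can run for several minutes
)
response.raise_for_status()

result = response.json()
print(result["analysis_id"], result["stats"]["code_quality_score"])
```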

#### Get Repository Info

```http
GET /repository/{id}/info?user_id={userId}
```

#### Download Report

```http
GET /reports/{filename}
```

#### Health Check

```http
GET /health
```

### Via API Gateway

All endpoints are accessible through the API Gateway:

```
Direct:      http://localhost:8022/analyze-repository
Via Gateway: http://localhost:8000/api/ai-analysis/analyze-repository
```

---

## Configuration

### Service Ports

| Service | Port | Protocol |
|---------|------|----------|
| Frontend | 3000 | HTTP |
| API Gateway | 8000 | HTTP |
| AI Analysis | 8022 | HTTP |
| Git Integration | 8012 | HTTP |
| PostgreSQL | 5432 | TCP |
| Redis | 6379 | TCP |

### Rate Limiting

- **Claude API**: 90 requests per minute (configurable)
- **Sliding Window**: Tracks requests over a 60-second window
- **Automatic Waiting**: Delays requests to prevent rate limit violations
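
A minimal sketch of that sliding-window behaviour in Python (illustrative only; the service's actual limiter may differ):

```python
import time
from collections import deque

class SlidingWindowRateLimiter:
    """Illustrative sliding-window limiter: at most `max_requests` per `window` seconds."""

    def __init__(self, max_requests: int = 90, window: float = 60.0):
        self.max_requests = max_requests
        self.window = window
        self.timestamps: deque[float] = deque()

    def acquire(self) -> None:
        """Block until a request slot is available, then record it."""
        while True:
            now = time.monotonic()
            # Drop timestamps that fell out of the window.
            while self.timestamps and now - self.timestamps[0] >= self.window:
                self.timestamps.popleft()
            if len(self.timestamps) < self.max_requests:
                self.timestamps.append(now)
                return
            # Wait until the oldest request ages out of the window.
            time.sleep(self.window - (now - self.timestamps[0]))

limiter = SlidingWindowRateLimiter(max_requests=90, window=60.0)
limiter.acquire()  # call before each Claude API request
```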

### Caching

- **Storage**: Redis
- **TTL**: 24 hours (configurable)
- **Key Format**: `analysis:{file_hash}`
- **Hash Algorithm**: SHA-256
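
A sketch of this caching scheme with `redis-py`, assuming the key format and TTL listed above (helper names are illustrative):

```python
import hashlib
import json
import redis

r = redis.Redis(host="redis", port=6379, decode_responses=True)
CACHE_TTL = 24 * 60 * 60  # 24 hours, in seconds

def cache_key(file_content: str) -> str:
    """Build the analysis:{file_hash} key from the SHA-256 of the file content."""
    file_hash = hashlib.sha256(file_content.encode("utf-8")).hexdigest()
    return f"analysis:{file_hash}"

def get_cached_analysis(file_content: str) -> dict | None:
    cached = r.get(cache_key(file_content))
    return json.loads(cached) if cached else None

def store_analysis(file_content: str, analysis: dict) -> None:
    r.set(cache_key(file_content), json.dumps(analysis), ex=CACHE_TTL)
```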

### Content Optimization

- **Threshold**: 8000 tokens (~32KB)
- **Strategy**: Extract imports, functions, classes
- **Truncation**: Intelligent context preservation
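
A rough illustration of the strategy in Python; the 4-characters-per-token estimate and the structural-line regex are assumptions, not the service's actual optimizer:

```python
import re

TOKEN_THRESHOLD = 8000
CHARS_PER_TOKEN = 4  # rough estimate: 8000 tokens is roughly 32 KB of text

def optimize_content(source: str) -> str:
    """Return the file unchanged if small; otherwise keep only structural lines."""
    if len(source) // CHARS_PER_TOKEN <= TOKEN_THRESHOLD:
        return source
    # Keep imports and top-level definitions so Claude still sees the file's shape.
    structural = re.compile(r"^\s*(import |from |def |class |function |export )")
    kept = [line for line in source.splitlines() if structural.match(line)]
    return "\n".join(kept)[: TOKEN_THRESHOLD * CHARS_PER_TOKEN]
```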

---

## Communication Flow

### 1. Repository Analysis Request

```
Frontend → API Gateway → AI Analysis → Git Integration
```

1. User clicks "Analyze Repository" in the frontend
2. Frontend sends a POST request to the API Gateway
3. Gateway forwards the request to the AI Analysis Service
4. AI Analysis requests repository info from Git Integration
5. Git Integration returns the file tree and metadata
6. AI Analysis processes each file (see the sketch after this list):
   - Check Redis cache
   - Apply rate limiting
   - Optimize content
   - Send to Claude API
   - Cache result
7. Generate repository-level analysis
8. Create PDF/JSON report
9. Return results through the Gateway to the Frontend
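
Condensing step 6, a Python sketch of the per-file loop. It reuses the illustrative cache, rate-limiter, and optimizer helpers sketched under Configuration, and `analyze_with_claude` is a stand-in name for the actual Claude API call:

```python
def analyze_file(path: str, content: str) -> dict:
    """Illustrative per-file pipeline: cache -> rate limit -> optimize -> Claude -> cache."""
    cached = get_cached_analysis(content)             # 1. check Redis cache
    if cached is not None:
        return cached

    limiter.acquire()                                 # 2. respect the Claude rate limit
    optimized = optimize_content(content)             # 3. shrink oversized files
    analysis = analyze_with_claude(path, optimized)   # 4. stand-in for the Claude API call
    store_analysis(content, analysis)                 # 5. cache the result for 24 hours
    return analysis
```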

### 2. File Content Retrieval

```
AI Analysis → Git Integration → File System
```

1. AI Analysis requests file content
2. Git Integration resolves the file path (case-insensitive)
3. Reads content from local storage
4. Returns content + metadata
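
Git Integration is an Express.js service, but the case-insensitive resolution in step 2 can be sketched in Python for illustration (the storage root and helper name are assumptions):

```python
from pathlib import Path

def resolve_case_insensitive(repo_root: Path, requested: str) -> Path | None:
    """Walk the requested path one segment at a time, matching each segment case-insensitively."""
    current = repo_root
    for segment in Path(requested).parts:
        if not current.is_dir():
            return None
        match = next(
            (child for child in current.iterdir()
             if child.name.lower() == segment.lower()),
            None,
        )
        if match is None:
            return None  # no entry matches this segment in any casing
        current = match
    return current

# e.g. resolve_case_insensitive(Path("/app/repos/some-repo"), "SRC/Main.PY")
```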

### 3. OAuth Authentication

```
Frontend → API Gateway → Git Integration → GitHub → Git Integration → Frontend
```

1. User attempts to access private repository
2. Git Integration detects authentication requirement
3. Returns OAuth URL
4. Frontend redirects to GitHub OAuth
5. User approves access
6. GitHub redirects back with code
7. Git Integration exchanges code for token
8. Token stored in PostgreSQL
9. User can now access private repository
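
Step 7 uses GitHub's standard code-for-token exchange endpoint; a Python sketch of that call (Git Integration itself is Express.js, and the token persistence is omitted):

```python
import os
import requests

def exchange_code_for_token(code: str) -> str:
    """Exchange the OAuth callback code for a GitHub access token (step 7 above)."""
    response = requests.post(
        "https://github.com/login/oauth/access_token",
        headers={"Accept": "application/json"},
        data={
            "client_id": os.environ["GITHUB_CLIENT_ID"],
            "client_secret": os.environ["GITHUB_CLIENT_SECRET"],
            "code": code,
        },
        timeout=10,
    )
    response.raise_for_status()
    return response.json()["access_token"]  # then persist it in PostgreSQL
```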

---

## Troubleshooting

### Common Issues

#### Service Connection Failed

**Symptoms**: "Failed to get repository info" error

**Solution**:
```bash
# Check service status
docker ps | grep git-integration

# Check network connectivity
docker network inspect backend-network

# Restart service
docker-compose restart git-integration
```

#### Rate Limit Exceeded

**Symptoms**: Analysis fails with a rate limit error

**Solution**:
```bash
# Option 1: Reduce max_files in the request body
#   {"max_files": 50}    # instead of 100

# Option 2: Lower the rate limit in .env, then restart
CLAUDE_REQUESTS_PER_MINUTE=50
docker-compose restart ai-analysis
```

#### Redis Connection Failed

**Symptoms**: Warning about the Redis connection

**Solution**:
```bash
# Check Redis status
docker exec redis redis-cli ping
# Expected: PONG

# If it fails, restart Redis
docker-compose restart redis
```

#### Authentication Errors

**Symptoms**: 401 Unauthorized for private repos

**Solution**:
- Verify GitHub OAuth credentials
- Check if user has completed OAuth flow
- Verify token is stored in database

---

## Performance Optimization

### Analysis Speed

| Configuration | Time for 100 Files | API Calls |
|--------------|-------------------|-----------|
| No optimization | 50-90 minutes | 100 |
| With caching (60% hit) | 20-35 minutes | 40 |
| With rate limiting | 2-4 minutes slower | Same |
| With content optimization | Same | 70% smaller payloads |

### Best Practices

1. **Use Caching**: Enable Redis for repeated analyses
2. **Optimize Content**: Keep 8000 token threshold
3. **Respect Rate Limits**: Don't increase beyond Claude limits
4. **Batch Processing**: Analyze during off-peak hours
5. **Monitor Resources**: Watch CPU, memory, and network usage

---

## Security Considerations

### API Keys

- Store in environment variables only
- Never commit to version control
- Rotate regularly
- Use different keys for dev/prod

### OAuth Tokens

- Encrypted at rest in PostgreSQL
- Secure transmission (HTTPS in production)
- Automatic expiration handling
- User-specific token isolation

### Network Security

- Internal Docker network for service communication
- API Gateway as single entry point
- CORS configuration for frontend
- Rate limiting to prevent abuse

---

## Monitoring and Logging

### Log Locations

```bash
# AI Analysis Service
docker logs ai-analysis -f

# API Gateway
docker logs api-gateway -f

# Git Integration
docker logs git-integration -f
```

### Key Metrics

- **Analysis Success Rate**: Track successful vs. failed analyses
- **Cache Hit Rate**: Monitor Redis cache effectiveness
- **API Response Times**: Track latency for each service
- **Rate Limit Usage**: Monitor Claude API usage
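
For the cache hit rate, Redis already tracks keyspace hits and misses; a small Python check (note these are server-wide counters, not scoped to `analysis:*` keys):

```python
import redis

r = redis.Redis(host="redis", port=6379)

stats = r.info("stats")  # server-wide counters since the last restart
hits, misses = stats["keyspace_hits"], stats["keyspace_misses"]
total = hits + misses
hit_rate = hits / total if total else 0.0
print(f"Redis cache hit rate: {hit_rate:.1%} ({hits} hits / {misses} misses)")
```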

### Health Checks

```bash
# All services
curl http://localhost:8000/health
curl http://localhost:8022/health
curl http://localhost:8012/health

# Database
docker exec postgres pg_isready

# Cache
docker exec redis redis-cli ping
```

---

## Development

### Local Development Setup

```bash
# AI Analysis Service
cd services/ai-analysis-service
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
python server.py

# API Gateway
cd services/api-gateway
npm install
npm run dev

# Git Integration
cd services/git-integration
npm install
npm run dev

# Frontend
cd fronend/codenuk_frontend_mine
npm install
npm run dev
```

### Testing

```bash
# Test AI Analysis directly
curl -X POST http://localhost:8022/analyze-repository \
  -H "Content-Type: application/json" \
  -d '{"repository_id": "test", "user_id": "test", "output_format": "json", "max_files": 5}'

# Test through Gateway
curl -X POST http://localhost:8000/api/ai-analysis/analyze-repository \
  -H "Content-Type: application/json" \
  -d '{"repository_id": "test", "user_id": "test", "output_format": "json", "max_files": 5}'
```

### Debugging

```bash
# Enable debug mode
export DEBUG=*
export LOG_LEVEL=debug
export PYTHONUNBUFFERED=1

# Watch logs in real time
docker-compose logs -f ai-analysis | grep "ERROR"

# Inspect container
docker exec -it ai-analysis bash
```

---

## Deployment

### Production Checklist

- [ ] Set secure environment variables
- [ ] Configure HTTPS
- [ ] Set up SSL certificates
- [ ] Enable production logging
- [ ] Configure monitoring (Prometheus, Grafana)
- [ ] Set up backup strategy
- [ ] Configure auto-scaling (if needed)
- [ ] Test failover scenarios
- [ ] Document recovery procedures
- [ ] Set up alerts

### Docker Compose Production

```yaml
services:
  ai-analysis:
    image: codenuk/ai-analysis:latest
    restart: always
    environment:
      - NODE_ENV=production
      - ANTHROPIC_API_KEY=${ANTHROPIC_API_KEY}
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8022/health"]
      interval: 30s
      timeout: 10s
      retries: 3
    deploy:
      replicas: 2
      resources:
        limits:
          cpus: '2'
          memory: 4G
```

---

## Additional Resources

### Documentation Files

1. **[SERVICE_COMMUNICATION_ARCHITECTURE.md](./SERVICE_COMMUNICATION_ARCHITECTURE.md)**
   - Complete architecture documentation
   - Detailed service descriptions
   - Request/response examples
   - Error handling strategies
   - Deployment configuration

2. **[QUICK_REFERENCE.md](./QUICK_REFERENCE.md)**
   - Quick start commands
   - Common API calls
   - Troubleshooting commands
   - Performance tuning tips
   - Development shortcuts

3. **[FLOW_DIAGRAMS.md](./FLOW_DIAGRAMS.md)**
   - Visual request flow
   - Service communication diagrams
   - Data flow illustrations
   - Authentication flow
   - Error handling flow
   - Caching strategy

4. **[INTEGRATION_EXAMPLE.md](./INTEGRATION_EXAMPLE.md)**
   - Frontend integration code
   - API usage examples
   - React hooks
   - Error handling patterns

5. **[README.md](./README.md)**
   - Service overview
   - Installation instructions
   - Basic usage
   - API reference

### External Links

- [Anthropic Claude API Documentation](https://docs.anthropic.com/)
- [FastAPI Documentation](https://fastapi.tiangolo.com/)
- [Express.js Documentation](https://expressjs.com/)
- [Docker Compose Documentation](https://docs.docker.com/compose/)
- [Redis Documentation](https://redis.io/docs/)
- [PostgreSQL Documentation](https://www.postgresql.org/docs/)

---

## Support

### Getting Help

1. Check the troubleshooting guide
2. Review service logs
3. Test endpoints individually
4. Verify environment variables
5. Check Docker network connectivity

### Common Questions

**Q: How long does analysis take?**
A: Typically 2-4 minutes for 100 files with caching, 30-60 minutes without.

**Q: Can I analyze private repositories?**
A: Yes, users need to authenticate via GitHub OAuth.

**Q: What happens if the Claude API is down?**
A: The service will return appropriate errors. Cached results still work.

**Q: How much does it cost?**
A: It depends on Claude API usage. Caching reduces costs by 60-70%.

**Q: Can I increase max_files beyond 100?**
A: Yes, but consider rate limits and timeout settings.

---

## Version History

- **v1.0.0** (December 2024)
  - Initial release
  - Claude AI integration
  - Redis caching
  - Rate limiting
  - Content optimization
  - Multi-service architecture

---

## Contributing

For contributions or improvements to this documentation:

1. Ensure accuracy by testing commands
2. Follow existing format and style
3. Update version history
4. Add examples where helpful
5. Keep diagrams up to date

---

**Last Updated**: December 2024

**Version**: 1.0.0

**Maintained By**: CodeNuk Team