# AI Analysis Service - Documentation Index

Welcome to the AI Analysis Service documentation. This service analyzes code repositories using Claude AI and integrates with the Git Integration Service and API Gateway.

---

## Quick Navigation

### Getting Started

- [Quick Reference Guide](./QUICK_REFERENCE.md) - Fast commands and common operations
- [Architecture Overview](#architecture-overview) (below)
- [Environment Setup](#environment-setup) (below)

### In-Depth Documentation

- [Complete Architecture Guide](./SERVICE_COMMUNICATION_ARCHITECTURE.md) - Comprehensive documentation
- [Flow Diagrams](./FLOW_DIAGRAMS.md) - Visual representations of data flow
- [Integration Examples](./INTEGRATION_EXAMPLE.md) - Code examples and usage patterns

### Technical Reference

- [API Documentation](#api-endpoints) (below)
- [Service Configuration](#configuration) (below)
- [Troubleshooting Guide](#troubleshooting) (below)

---

## Architecture Overview

### System Components

```
┌──────────┐     ┌──────────────┐     ┌────────────────┐     ┌─────────────┐
│ Frontend │────▶│ API Gateway  │────▶│  AI Analysis   │◀───▶│     Git     │
│ (Next.js)│     │ (Express.js) │     │   (FastAPI)    │     │ Integration │
│  :3000   │     │    :8000     │     │     :8022      │     │    :8012    │
└──────────┘     └──────────────┘     └───────┬────────┘     └─────┬───────┘
                                              │                    │
                                              ▼                    ▼
                                         ┌─────────┐         ┌──────────┐
                                         │  Redis  │         │PostgreSQL│
                                         │  :6379  │         │  :5432   │
                                         └─────────┘         └──────────┘
```

### Key Features

1. **AI-Powered Analysis**: Uses Claude API for intelligent code review
2. **Rate Limiting**: Manages Claude API limits (90 requests/minute)
3. **Smart Caching**: Redis-based caching reduces API calls by 60-70%
4. **Content Optimization**: Intelligently truncates large files
5. **Report Generation**: Creates PDF and JSON reports
6. **Multi-Service Integration**: Seamless communication between services

---

## Environment Setup

### Prerequisites

- Docker & Docker Compose
- Node.js 18+ (for local development)
- Python 3.11+ (for local development)
- Anthropic API Key
- GitHub OAuth credentials

### Installation

```bash
# 1. Clone repository
git clone https://github.com/your-org/codenuk.git
cd codenuk

# 2. Set up environment variables
cp backend/codenuk_backend_mine/services/ai-analysis-service/.env.example \
   backend/codenuk_backend_mine/services/ai-analysis-service/.env

# 3. Configure .env files
# Edit .env files with your API keys and credentials

# 4. Start services
docker-compose up -d

# 5. Verify services
curl http://localhost:8000/health
curl http://localhost:8022/health
curl http://localhost:8012/health
```

### Environment Variables

#### AI Analysis Service

```bash
ANTHROPIC_API_KEY=sk-ant-api03-...
GIT_INTEGRATION_SERVICE_URL=http://git-integration:8012
REDIS_HOST=redis
REDIS_PORT=6379
PORT=8022
```
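
A minimal sketch of how a Python service could read these variables at startup; the loader below is illustrative only, not the service's actual configuration code:

```python
import os

# Illustrative settings loader for the variables listed above (not the real config module).
ANTHROPIC_API_KEY = os.environ["ANTHROPIC_API_KEY"]  # required; fail fast if missing
GIT_INTEGRATION_SERVICE_URL = os.getenv("GIT_INTEGRATION_SERVICE_URL", "http://git-integration:8012")
REDIS_HOST = os.getenv("REDIS_HOST", "redis")
REDIS_PORT = int(os.getenv("REDIS_PORT", "6379"))
PORT = int(os.getenv("PORT", "8022"))
```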

#### API Gateway

```bash
AI_ANALYSIS_URL=http://localhost:8022
GIT_INTEGRATION_URL=http://localhost:8012
PORT=8000
```

#### Git Integration

```bash
GITHUB_CLIENT_ID=your_client_id
GITHUB_CLIENT_SECRET=your_client_secret
PUBLIC_BASE_URL=https://backend.codenuk.com
POSTGRES_HOST=postgres
PORT=8012
```

---

## API Endpoints

### AI Analysis Service

#### Analyze Repository

```http
POST /analyze-repository
Content-Type: application/json

{
  "repository_id": "uuid",
  "user_id": "user-uuid",
  "output_format": "pdf",
  "max_files": 100
}
```

**Response:**

```json
{
  "success": true,
  "analysis_id": "repo_analysis_uuid_timestamp",
  "report_path": "/app/reports/..._analysis.pdf",
  "stats": {
    "total_files": 85,
    "code_quality_score": 7.8,
    "total_issues": 23
  }
}
```
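
For reference, a small Python client that submits this request with the `requests` library. The repository and user IDs are placeholders, and the direct service URL can be swapped for the gateway route shown in the "Via API Gateway" section:

```python
import requests

# Illustrative client for the analyze-repository endpoint (direct service URL).
payload = {
    "repository_id": "uuid",    # placeholder - use a real repository UUID
    "user_id": "user-uuid",     # placeholder - use a real user UUID
    "output_format": "json",    # or "pdf"
    "max_files": 100,
}

response = requests.post(
    "http://localhost:8022/analyze-repository",
    json=payload,
    timeout=600,  # analyses can run for several minutes
)
response.raise_for_status()

result = response.json()
print(result["analysis_id"], result["stats"]["code_quality_score"])
```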

#### Get Repository Info

```http
GET /repository/{id}/info?user_id={userId}
```

#### Download Report

```http
GET /reports/{filename}
```

#### Health Check

```http
GET /health
```

### Via API Gateway

All endpoints are accessible through the API Gateway:

```
Direct:      http://localhost:8022/analyze-repository
Via Gateway: http://localhost:8000/api/ai-analysis/analyze-repository
```

---

## Configuration

### Service Ports

| Service | Port | Protocol |
|---------|------|----------|
| Frontend | 3000 | HTTP |
| API Gateway | 8000 | HTTP |
| AI Analysis | 8022 | HTTP |
| Git Integration | 8012 | HTTP |
| PostgreSQL | 5432 | TCP |
| Redis | 6379 | TCP |

### Rate Limiting

- **Claude API**: 90 requests per minute (configurable)
- **Sliding Window**: Tracks requests over a 60-second window
- **Automatic Waiting**: Delays requests to prevent rate limit violations
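
A minimal sketch of that sliding-window behaviour in Python (illustrative only; the service's actual limiter may differ):

```python
import time
from collections import deque

class SlidingWindowRateLimiter:
    """Illustrative sliding-window limiter: at most `max_requests` per `window` seconds."""

    def __init__(self, max_requests: int = 90, window: float = 60.0):
        self.max_requests = max_requests
        self.window = window
        self.timestamps: deque[float] = deque()

    def acquire(self) -> None:
        """Block until a request slot is available, then record it."""
        while True:
            now = time.monotonic()
            # Drop timestamps that fell out of the window.
            while self.timestamps and now - self.timestamps[0] >= self.window:
                self.timestamps.popleft()
            if len(self.timestamps) < self.max_requests:
                self.timestamps.append(now)
                return
            # Wait until the oldest request ages out of the window.
            time.sleep(self.window - (now - self.timestamps[0]))

limiter = SlidingWindowRateLimiter(max_requests=90, window=60.0)
limiter.acquire()  # call before each Claude API request
```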

### Caching

- **Storage**: Redis
- **TTL**: 24 hours (configurable)
- **Key Format**: `analysis:{file_hash}`
- **Hash Algorithm**: SHA-256
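
A sketch of this caching scheme with `redis-py`, assuming the key format and TTL listed above (helper names are illustrative):

```python
import hashlib
import json
import redis

r = redis.Redis(host="redis", port=6379, decode_responses=True)
CACHE_TTL = 24 * 60 * 60  # 24 hours, in seconds

def cache_key(file_content: str) -> str:
    """Build the analysis:{file_hash} key from the SHA-256 of the file content."""
    file_hash = hashlib.sha256(file_content.encode("utf-8")).hexdigest()
    return f"analysis:{file_hash}"

def get_cached_analysis(file_content: str) -> dict | None:
    cached = r.get(cache_key(file_content))
    return json.loads(cached) if cached else None

def store_analysis(file_content: str, analysis: dict) -> None:
    r.set(cache_key(file_content), json.dumps(analysis), ex=CACHE_TTL)
```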

### Content Optimization

- **Threshold**: 8000 tokens (~32KB)
- **Strategy**: Extract imports, functions, classes
- **Truncation**: Intelligent context preservation
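
A rough illustration of the strategy in Python; the 4-characters-per-token estimate and the structural-line regex are assumptions, not the service's actual optimizer:

```python
import re

TOKEN_THRESHOLD = 8000
CHARS_PER_TOKEN = 4  # rough estimate: 8000 tokens is roughly 32 KB of text

def optimize_content(source: str) -> str:
    """Return the file unchanged if small; otherwise keep only structural lines."""
    if len(source) // CHARS_PER_TOKEN <= TOKEN_THRESHOLD:
        return source
    # Keep imports and top-level definitions so Claude still sees the file's shape.
    structural = re.compile(r"^\s*(import |from |def |class |function |export )")
    kept = [line for line in source.splitlines() if structural.match(line)]
    return "\n".join(kept)[: TOKEN_THRESHOLD * CHARS_PER_TOKEN]
```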

---

## Communication Flow

### 1. Repository Analysis Request

```
Frontend → API Gateway → AI Analysis → Git Integration
```

1. User clicks "Analyze Repository" in the frontend
2. Frontend sends a POST request to the API Gateway
3. Gateway forwards the request to the AI Analysis Service
4. AI Analysis requests repository info from Git Integration
5. Git Integration returns the file tree and metadata
6. AI Analysis processes each file (see the sketch after this list):
   - Check Redis cache
   - Apply rate limiting
   - Optimize content
   - Send to Claude API
   - Cache result
7. Generate repository-level analysis
8. Create PDF/JSON report
9. Return results through the Gateway to the Frontend
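
Condensing step 6, a Python sketch of the per-file loop. It reuses the illustrative cache, rate-limiter, and optimizer helpers sketched under Configuration, and `analyze_with_claude` is a stand-in name for the actual Claude API call:

```python
def analyze_file(path: str, content: str) -> dict:
    """Illustrative per-file pipeline: cache -> rate limit -> optimize -> Claude -> cache."""
    cached = get_cached_analysis(content)             # 1. check Redis cache
    if cached is not None:
        return cached

    limiter.acquire()                                 # 2. respect the Claude rate limit
    optimized = optimize_content(content)             # 3. shrink oversized files
    analysis = analyze_with_claude(path, optimized)   # 4. stand-in for the Claude API call
    store_analysis(content, analysis)                 # 5. cache the result for 24 hours
    return analysis
```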

### 2. File Content Retrieval

```
AI Analysis → Git Integration → File System
```

1. AI Analysis requests file content
2. Git Integration resolves the file path (case-insensitive)
3. Reads content from local storage
4. Returns content + metadata
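
Git Integration is an Express.js service, but the case-insensitive resolution in step 2 can be sketched in Python for illustration (the storage root and helper name are assumptions):

```python
from pathlib import Path

def resolve_case_insensitive(repo_root: Path, requested: str) -> Path | None:
    """Walk the requested path one segment at a time, matching each segment case-insensitively."""
    current = repo_root
    for segment in Path(requested).parts:
        if not current.is_dir():
            return None
        match = next(
            (child for child in current.iterdir()
             if child.name.lower() == segment.lower()),
            None,
        )
        if match is None:
            return None  # no entry matches this segment in any casing
        current = match
    return current

# e.g. resolve_case_insensitive(Path("/app/repos/some-repo"), "SRC/Main.PY")
```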

### 3. OAuth Authentication

```
Frontend → API Gateway → Git Integration → GitHub → Git Integration → Frontend
```

1. User attempts to access private repository
2. Git Integration detects authentication requirement
3. Returns OAuth URL
4. Frontend redirects to GitHub OAuth
5. User approves access
6. GitHub redirects back with code
7. Git Integration exchanges code for token
8. Token stored in PostgreSQL
9. User can now access private repository
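
Step 7 uses GitHub's standard code-for-token exchange endpoint; a Python sketch of that call (Git Integration itself is Express.js, and the token persistence is omitted):

```python
import os
import requests

def exchange_code_for_token(code: str) -> str:
    """Exchange the OAuth callback code for a GitHub access token (step 7 above)."""
    response = requests.post(
        "https://github.com/login/oauth/access_token",
        headers={"Accept": "application/json"},
        data={
            "client_id": os.environ["GITHUB_CLIENT_ID"],
            "client_secret": os.environ["GITHUB_CLIENT_SECRET"],
            "code": code,
        },
        timeout=10,
    )
    response.raise_for_status()
    return response.json()["access_token"]  # then persist it in PostgreSQL
```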

---

## Troubleshooting

### Common Issues

#### Service Connection Failed

**Symptoms**: "Failed to get repository info" error

**Solution**:
```bash
# Check service status
docker ps | grep git-integration

# Check network connectivity
docker network inspect backend-network

# Restart service
docker-compose restart git-integration
```

#### Rate Limit Exceeded

**Symptoms**: Analysis fails with a rate limit error

**Solution**:
```bash
# Option 1: Reduce max_files in the request body
#   {"max_files": 50}    # instead of 100

# Option 2: Lower the rate limit in .env, then restart
CLAUDE_REQUESTS_PER_MINUTE=50
docker-compose restart ai-analysis
```

#### Redis Connection Failed

**Symptoms**: Warning about the Redis connection

**Solution**:
```bash
# Check Redis status
docker exec redis redis-cli ping
# Expected: PONG

# If it fails, restart Redis
docker-compose restart redis
```

#### Authentication Errors

**Symptoms**: 401 Unauthorized for private repos

**Solution**:
- Verify GitHub OAuth credentials
- Check if user has completed OAuth flow
- Verify token is stored in database

---

## Performance Optimization

### Analysis Speed

| Configuration | Time for 100 Files | API Calls |
|--------------|-------------------|-----------|
| No optimization | 50-90 minutes | 100 |
| With caching (60% hit) | 20-35 minutes | 40 |
| With rate limiting | 2-4 minutes slower | Same |
| With content optimization | Same | 70% smaller payloads |

### Best Practices

1. **Use Caching**: Enable Redis for repeated analyses
2. **Optimize Content**: Keep 8000 token threshold
3. **Respect Rate Limits**: Don't increase beyond Claude limits
4. **Batch Processing**: Analyze during off-peak hours
5. **Monitor Resources**: Watch CPU, memory, and network usage

---

## Security Considerations

### API Keys

- Store in environment variables only
- Never commit to version control
- Rotate regularly
- Use different keys for dev/prod

### OAuth Tokens

- Encrypted at rest in PostgreSQL
- Secure transmission (HTTPS in production)
- Automatic expiration handling
- User-specific token isolation

### Network Security

- Internal Docker network for service communication
- API Gateway as single entry point
- CORS configuration for frontend
- Rate limiting to prevent abuse

---

## Monitoring and Logging

### Log Locations

```bash
# AI Analysis Service
docker logs ai-analysis -f

# API Gateway
docker logs api-gateway -f

# Git Integration
docker logs git-integration -f
```

### Key Metrics

- **Analysis Success Rate**: Track successful vs. failed analyses
- **Cache Hit Rate**: Monitor Redis cache effectiveness
- **API Response Times**: Track latency for each service
- **Rate Limit Usage**: Monitor Claude API usage
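
For the cache hit rate, Redis already tracks keyspace hits and misses; a small Python check (note these are server-wide counters, not scoped to `analysis:*` keys):

```python
import redis

r = redis.Redis(host="redis", port=6379)

stats = r.info("stats")  # server-wide counters since the last restart
hits, misses = stats["keyspace_hits"], stats["keyspace_misses"]
total = hits + misses
hit_rate = hits / total if total else 0.0
print(f"Redis cache hit rate: {hit_rate:.1%} ({hits} hits / {misses} misses)")
```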

### Health Checks

```bash
# All services
curl http://localhost:8000/health
curl http://localhost:8022/health
curl http://localhost:8012/health

# Database
docker exec postgres pg_isready

# Cache
docker exec redis redis-cli ping
```

---

## Development

### Local Development Setup

```bash
# AI Analysis Service
cd services/ai-analysis-service
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
python server.py

# API Gateway
cd services/api-gateway
npm install
npm run dev

# Git Integration
cd services/git-integration
npm install
npm run dev

# Frontend
cd fronend/codenuk_frontend_mine
npm install
npm run dev
```

### Testing

```bash
# Test AI Analysis directly
curl -X POST http://localhost:8022/analyze-repository \
  -H "Content-Type: application/json" \
  -d '{"repository_id": "test", "user_id": "test", "output_format": "json", "max_files": 5}'

# Test through Gateway
curl -X POST http://localhost:8000/api/ai-analysis/analyze-repository \
  -H "Content-Type: application/json" \
  -d '{"repository_id": "test", "user_id": "test", "output_format": "json", "max_files": 5}'
```

### Debugging

```bash
# Enable debug mode
export DEBUG=*
export LOG_LEVEL=debug
export PYTHONUNBUFFERED=1

# Watch logs in real time
docker-compose logs -f ai-analysis | grep "ERROR"

# Inspect container
docker exec -it ai-analysis bash
```

---

## Deployment

### Production Checklist

- [ ] Set secure environment variables
- [ ] Configure HTTPS
- [ ] Set up SSL certificates
- [ ] Enable production logging
- [ ] Configure monitoring (Prometheus, Grafana)
- [ ] Set up backup strategy
- [ ] Configure auto-scaling (if needed)
- [ ] Test failover scenarios
- [ ] Document recovery procedures
- [ ] Set up alerts

### Docker Compose Production

```yaml
services:
  ai-analysis:
    image: codenuk/ai-analysis:latest
    restart: always
    environment:
      - NODE_ENV=production
      - ANTHROPIC_API_KEY=${ANTHROPIC_API_KEY}
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8022/health"]
      interval: 30s
      timeout: 10s
      retries: 3
    deploy:
      replicas: 2
      resources:
        limits:
          cpus: '2'
          memory: 4G
```

---

## Additional Resources

### Documentation Files

1. **[SERVICE_COMMUNICATION_ARCHITECTURE.md](./SERVICE_COMMUNICATION_ARCHITECTURE.md)**
   - Complete architecture documentation
   - Detailed service descriptions
   - Request/response examples
   - Error handling strategies
   - Deployment configuration

2. **[QUICK_REFERENCE.md](./QUICK_REFERENCE.md)**
   - Quick start commands
   - Common API calls
   - Troubleshooting commands
   - Performance tuning tips
   - Development shortcuts

3. **[FLOW_DIAGRAMS.md](./FLOW_DIAGRAMS.md)**
   - Visual request flow
   - Service communication diagrams
   - Data flow illustrations
   - Authentication flow
   - Error handling flow
   - Caching strategy

4. **[INTEGRATION_EXAMPLE.md](./INTEGRATION_EXAMPLE.md)**
   - Frontend integration code
   - API usage examples
   - React hooks
   - Error handling patterns

5. **[README.md](./README.md)**
   - Service overview
   - Installation instructions
   - Basic usage
   - API reference

### External Links

- [Anthropic Claude API Documentation](https://docs.anthropic.com/)
- [FastAPI Documentation](https://fastapi.tiangolo.com/)
- [Express.js Documentation](https://expressjs.com/)
- [Docker Compose Documentation](https://docs.docker.com/compose/)
- [Redis Documentation](https://redis.io/docs/)
- [PostgreSQL Documentation](https://www.postgresql.org/docs/)

---

## Support

### Getting Help

1. Check the troubleshooting guide
2. Review service logs
3. Test endpoints individually
4. Verify environment variables
5. Check Docker network connectivity

### Common Questions

**Q: How long does analysis take?**
A: Typically 2-4 minutes for 100 files with caching, 30-60 minutes without.

**Q: Can I analyze private repositories?**
A: Yes, users need to authenticate via GitHub OAuth.

**Q: What happens if the Claude API is down?**
A: The service will return appropriate errors. Cached results still work.

**Q: How much does it cost?**
A: It depends on Claude API usage. Caching reduces costs by 60-70%.

**Q: Can I increase max_files beyond 100?**
A: Yes, but consider rate limits and timeout settings.

---

## Version History

- **v1.0.0** (December 2024)
  - Initial release
  - Claude AI integration
  - Redis caching
  - Rate limiting
  - Content optimization
  - Multi-service architecture

---

## Contributing

For contributions or improvements to this documentation:

1. Ensure accuracy by testing commands
2. Follow existing format and style
3. Update version history
4. Add examples where helpful
5. Keep diagrams up to date

---

**Last Updated**: December 2024

**Version**: 1.0.0

**Maintained By**: CodeNuk Team