codenuk_backend_mine/services/ai-analysis-service/README.md
2025-10-16 10:52:33 +05:30

# Complete AI Repository Analysis Service
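An AI-powered repository analysis tool that automatically analyzes **all supported files** in a repository, with no cap on the number of files and no user query required. Files over the per-file size limit and files inside excluded directories are skipped (see Configuration).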
## 🚀 Features
- **Complete Analysis**: Analyzes ALL files in the repository (no max-files limit)
- **Fully Automated**: No user query required - runs completely automatically
- **Memory-Enhanced**: Learns from previous analyses using advanced memory systems
- **Comprehensive Reports**: Generates detailed PDF reports with executive summaries
- **Multi-Database Support**: Uses PostgreSQL, MongoDB, and Redis for optimal performance
- **Security Focus**: Identifies security vulnerabilities and code quality issues
- **Architecture Assessment**: Provides architectural insights and recommendations
## 📋 Requirements
### System Dependencies
- Python 3.8+
- PostgreSQL with pgvector extension
- MongoDB
- Redis
### Python Dependencies
```bash
pip install anthropic python-dotenv GitPython redis pymongo psycopg2-binary numpy reportlab
```
## 🛠️ Setup
1. **Install Dependencies**:
```bash
pip install -r requirements.txt
```
2. **Database Setup**:
```bash
# Run the database migration (the repo_vectors database must already exist)
psql -U postgres -d repo_vectors -f 001-schema.sql
```
3. **Environment Variables**:
Create a `.env` file with:
```env
ANTHROPIC_API_KEY=your_api_key_here
REDIS_HOST=localhost
REDIS_PORT=6379
REDIS_DB=0
MONGODB_URL=mongodb://localhost:27017/
MONGODB_DB=repo_analyzer
POSTGRES_HOST=localhost
POSTGRES_PORT=5432
POSTGRES_DB=repo_vectors
POSTGRES_USER=postgres
POSTGRES_PASSWORD=your_password
```
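As a rough illustration of how these variables might be consumed, the sketch below reads them from a mapping and converts the numeric ones. The `Settings` class and `load_settings` function are hypothetical names, and the fallback defaults are assumptions copied from the example values above. In practice `load_dotenv()` from `python-dotenv` can populate `os.environ` from the `.env` file before this runs.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    """Connection settings; field names mirror the .env keys (hypothetical class)."""
    anthropic_api_key: str
    redis_host: str
    redis_port: int
    redis_db: int
    mongodb_url: str
    mongodb_db: str
    postgres_host: str
    postgres_port: int
    postgres_db: str
    postgres_user: str
    postgres_password: str


def load_settings(env=None):
    """Build Settings from a mapping; defaults to os.environ."""
    env = os.environ if env is None else env
    api_key = env.get("ANTHROPIC_API_KEY", "")
    if not api_key:
        raise ValueError("ANTHROPIC_API_KEY is not set")
    # Fallback values below are assumptions taken from the example .env file.
    return Settings(
        anthropic_api_key=api_key,
        redis_host=env.get("REDIS_HOST", "localhost"),
        redis_port=int(env.get("REDIS_PORT", "6379")),
        redis_db=int(env.get("REDIS_DB", "0")),
        mongodb_url=env.get("MONGODB_URL", "mongodb://localhost:27017/"),
        mongodb_db=env.get("MONGODB_DB", "repo_analyzer"),
        postgres_host=env.get("POSTGRES_HOST", "localhost"),
        postgres_port=int(env.get("POSTGRES_PORT", "5432")),
        postgres_db=env.get("POSTGRES_DB", "repo_vectors"),
        postgres_user=env.get("POSTGRES_USER", "postgres"),
        postgres_password=env.get("POSTGRES_PASSWORD", ""),
    )
```

Failing early on a missing API key keeps the error close to its cause instead of surfacing on the first API call.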
## 🎯 Usage
### Basic Usage
```bash
python ai-analyze.py /path/to/repository
```
### With Custom Output
```bash
python ai-analyze.py /path/to/repository --output my_analysis.pdf
```
### With API Key Override
```bash
python ai-analyze.py /path/to/repository --api-key your_api_key
```
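The three invocations above imply a command-line interface with one positional argument and two optional flags. A minimal `argparse` sketch of that shape follows. The function name `build_parser` is hypothetical, and the default output filename is an assumption taken from the example output later in this README, not from the script itself.

```python
import argparse


def build_parser():
    """Return a parser matching the usage examples (illustrative only)."""
    parser = argparse.ArgumentParser(
        prog="ai-analyze.py",
        description="Analyze every supported file in a repository.",
    )
    parser.add_argument("repo_path", help="Path to the repository to analyze")
    parser.add_argument(
        "--output",
        default="complete_repository_analysis.pdf",  # assumed default
        help="Path of the PDF report to write",
    )
    parser.add_argument(
        "--api-key",
        default=None,
        help="Anthropic API key; overrides ANTHROPIC_API_KEY from the environment",
    )
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(f"Repository: {args.repo_path}")
    print(f"Output: {args.output}")
```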
## 📊 What It Analyzes
### File Types Supported
- **Programming Languages**: Python, JavaScript, TypeScript, Java, C++, C#, Go, Rust, PHP, Ruby, Swift, Kotlin
- **Web Technologies**: HTML, CSS, SCSS, SASS
- **Configuration Files**: JSON, YAML, XML, SQL
- **Build Files**: Dockerfile, Makefile, CMake, package.json, requirements.txt, Cargo.toml, pom.xml, build.gradle
- **Documentation**: README.md, Markdown files
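One common way to decide whether a file belongs to a supported type is a lookup on its name and extension. The sketch below is an assumption about how such a mapping could look, not the tool's actual table. The extension spellings and the `detect_language` helper are hypothetical.

```python
from pathlib import Path

# Extension-to-language table (illustrative; the real tool may differ).
EXTENSION_LANGUAGES = {
    ".py": "Python", ".js": "JavaScript", ".ts": "TypeScript",
    ".java": "Java", ".cpp": "C++", ".cs": "C#", ".go": "Go",
    ".rs": "Rust", ".php": "PHP", ".rb": "Ruby", ".swift": "Swift",
    ".kt": "Kotlin", ".html": "HTML", ".css": "CSS", ".scss": "SCSS",
    ".sass": "SASS", ".json": "JSON", ".yaml": "YAML", ".yml": "YAML",
    ".xml": "XML", ".sql": "SQL", ".md": "Markdown",
}

# Files recognized by exact name because they have no useful extension.
FILENAME_LANGUAGES = {
    "Dockerfile": "Dockerfile",
    "Makefile": "Makefile",
    "CMakeLists.txt": "CMake",
}


def detect_language(path):
    """Return a language label for a path, or None if it is unsupported."""
    p = Path(path)
    if p.name in FILENAME_LANGUAGES:
        return FILENAME_LANGUAGES[p.name]
    # Lower-case the suffix so "INDEX.HTML" and "index.html" match alike.
    return EXTENSION_LANGUAGES.get(p.suffix.lower())
```

Checking exact filenames first lets `CMakeLists.txt` resolve to CMake rather than falling through to the extension table.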
### Analysis Coverage
- **Code Quality**: Complexity, maintainability, best practices
- **Security**: Vulnerabilities, injection attacks, authentication issues
- **Architecture**: Project structure, scalability, design patterns
- **Performance**: Optimization opportunities, bottlenecks
- **Documentation**: Completeness and quality
## 📈 Output
### Console Output
- Real-time analysis progress
- Repository statistics
- Quality breakdown by file
- Language distribution
- Memory system statistics
### PDF Report
- Executive summary for leadership
- Repository overview with metrics
- Detailed file-by-file analysis
- Security assessment
- Architecture evaluation
- Recommendations and next steps
## 🧠 Memory System
The tool uses a sophisticated three-tier memory system:
1. **Working Memory (Redis)**: Temporary, fast access for current analysis
2. **Episodic Memory (MongoDB)**: User interactions and analysis sessions
3. **Persistent Memory (PostgreSQL)**: Long-term knowledge and best practices
This allows the tool to learn from previous analyses and provide increasingly accurate insights.
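To make the division of responsibilities concrete, here is a toy model of the three tiers. It uses plain Python containers as stand-ins for Redis, MongoDB, and PostgreSQL, and every class and method name in it is hypothetical. It shows only which kind of data lives where, not how the tool actually stores it.

```python
from dataclasses import dataclass, field


@dataclass
class MemorySketch:
    """In-memory stand-in for the three tiers (illustrative only)."""
    working: dict = field(default_factory=dict)    # stands in for Redis
    episodes: list = field(default_factory=list)   # stands in for MongoDB
    knowledge: dict = field(default_factory=dict)  # stands in for PostgreSQL

    def note(self, key, value):
        """Keep scratch data for the analysis currently in progress."""
        self.working[key] = value

    def finish_run(self, repo, summary):
        """Record the session, then discard the temporary working data."""
        self.episodes.append({"repo": repo, "summary": summary})
        self.working.clear()

    def learn(self, topic, fact):
        """Store a long-lived finding under a topic."""
        self.knowledge.setdefault(topic, []).append(fact)

    def previous_runs(self, repo):
        """Return earlier sessions for the same repository."""
        return [e for e in self.episodes if e["repo"] == repo]
```

A later run on the same repository could call `previous_runs` to seed its analysis with earlier summaries, which is one way to read "learning from previous analyses".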
## 🔧 Configuration
### File Size Limits
- Default: 2 MB per file (configurable in code)
- Files over the limit are skipped with a notification
### Excluded Directories
- `.git`, `node_modules`, `__pycache__`, `build`, `dist`, `target`
- `venv`, `env`, `.next`, `coverage`, `vendor`
- `bower_components`, `.gradle`, `.m2`, `.cargo`
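The size limit and the excluded directories can both be applied during a single directory walk. The sketch below assumes an `os.walk`-based scan. `scan_repository`, `EXCLUDED_DIRS`, and `MAX_FILE_SIZE` are hypothetical names, and the tool's real implementation may differ.

```python
import os

# Directory names listed in this README; matched at any depth.
EXCLUDED_DIRS = {
    ".git", "node_modules", "__pycache__", "build", "dist", "target",
    "venv", "env", ".next", "coverage", "vendor",
    "bower_components", ".gradle", ".m2", ".cargo",
}
MAX_FILE_SIZE = 2 * 1024 * 1024  # 2 MB default from this README


def scan_repository(root, max_size=MAX_FILE_SIZE):
    """Return (files_to_analyze, skipped_large_files) as sorted path lists."""
    selected, skipped = [], []
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune in place so os.walk never descends into excluded directories.
        dirnames[:] = [d for d in dirnames if d not in EXCLUDED_DIRS]
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                size = os.path.getsize(path)
            except OSError:
                continue  # unreadable or vanished file; ignore it
            if size > max_size:
                skipped.append(path)
            else:
                selected.append(path)
    return sorted(selected), sorted(skipped)
```

Pruning `dirnames` in place avoids walking large trees such as `node_modules` at all, which is usually much faster than filtering paths afterwards.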
### Rate Limiting
- A 0.1-second delay between file analyses helps stay within API rate limits
- Configurable in the code
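A fixed delay between calls is the simplest form of rate limiting. The sketch below shows the idea with a hypothetical `analyze_all` helper. The `analyze_file` callback stands in for whatever performs the per-file API request, and the `sleep` parameter exists only so the delay can be replaced in tests.

```python
import time


def analyze_all(files, analyze_file, delay_seconds=0.1, sleep=time.sleep):
    """Call analyze_file on each path, pausing between consecutive calls."""
    results = []
    for index, path in enumerate(files):
        if index > 0:
            # Pause before every call except the first one.
            sleep(delay_seconds)
        results.append(analyze_file(path))
    return results
```

With the default delay, a repository of 127 files spends about 12.6 seconds waiting in total, in addition to the time taken by the analyses themselves.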
## 📝 Example Output
```
🚀 Starting Complete AI Repository Analysis
============================================================
Repository: /path/to/my-project
Output: complete_repository_analysis.pdf
Mode: Complete automated analysis of ALL files
============================================================
Scanning repository: /path/to/my-project
Found 127 files to analyze
Starting comprehensive analysis of 127 files...
Analyzing file 1/127: main.py
Analyzing file 2/127: config.js
...
🎯 COMPLETE ANALYSIS FINISHED
============================================================
📊 Repository Statistics:
• Files Analyzed: 127
• Lines of Code: 15,432
• Languages: 8
• Code Quality: 7.2/10
📈 Quality Breakdown:
• High Quality Files (8-10): 45
• Medium Quality Files (5-7): 67
• Low Quality Files (1-4): 15
• Total Issues Found: 89
🔤 Language Distribution:
• Python: 45 files
• JavaScript: 32 files
• TypeScript: 28 files
• HTML: 12 files
• CSS: 10 files
📄 Complete PDF Report: complete_repository_analysis.pdf
✅ Complete analysis finished successfully!
```
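The quality breakdown in the example groups per-file scores into three bands. The following is a sketch of that grouping, assuming scores on a 1 to 10 scale with cut-offs at 8 and 5. Both function names are hypothetical, and the handling of fractional scores such as 7.5 is an assumption, since the example output lists only whole-number ranges.

```python
def quality_bucket(score):
    """Map a 1-10 score to 'high' (8-10), 'medium' (5-7), or 'low' (1-4)."""
    if score >= 8:
        return "high"
    if score >= 5:
        return "medium"
    return "low"


def quality_breakdown(scores):
    """Count how many scores fall into each quality band."""
    counts = {"high": 0, "medium": 0, "low": 0}
    for score in scores:
        counts[quality_bucket(score)] += 1
    return counts
```

In the example output the three bands sum to the number of files analyzed (45 + 67 + 15 = 127), which is a quick consistency check for any such breakdown.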
## 🚨 Troubleshooting
### Common Issues
1. **Database Connection Errors**:
   - Ensure PostgreSQL, MongoDB, and Redis are running
   - Check connection credentials in `.env` file
2. **API Key Issues**:
   - Verify Anthropic API key is valid and has sufficient credits
   - Check rate limits if analysis fails
3. **Memory Issues**:
   - Large repositories may require more RAM
   - Consider increasing system memory or processing in batches
4. **File Permission Errors**:
   - Ensure read access to repository files
   - Check write permissions for output directory
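When diagnosing connection errors, a quick first step is to check whether anything is listening on the expected ports. The sketch below uses only the standard library and tests TCP reachability, not authentication or database health. The helper names are hypothetical, and the hosts and ports are assumptions taken from the example `.env` values.

```python
import socket

# Default endpoints, assumed from the example .env file in this README.
DEFAULT_SERVICES = {
    "Redis": ("localhost", 6379),
    "MongoDB": ("localhost", 27017),
    "PostgreSQL": ("localhost", 5432),
}


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_services(services=None):
    """Return {service_name: reachable} for each configured endpoint."""
    services = DEFAULT_SERVICES if services is None else services
    return {
        name: port_is_open(host, port)
        for name, (host, port) in services.items()
    }


if __name__ == "__main__":
    for name, ok in check_services().items():
        print(f"{name}: {'reachable' if ok else 'NOT reachable'}")
```

A service reported as reachable can still reject the credentials in `.env`, so this check narrows the problem down rather than ruling it out.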
## 🤝 Contributing
This is a complete automated analysis system. The tool will:
- Analyze every file in the repository
- Generate comprehensive reports
- Learn from previous analyses
- Provide actionable insights
No user interaction required - just run and get results!