
Complete AI Repository Analysis Service

A comprehensive AI-powered repository analysis tool that automatically analyzes ALL files in a repository, with no file limits and no user queries required.

🚀 Features

  • Complete Analysis: Analyzes ALL files in the repository (no max-files limit)
  • Fully Automated: No user query required - runs completely automatically
  • Memory-Enhanced: Learns from previous analyses using advanced memory systems
  • Comprehensive Reports: Generates detailed PDF reports with executive summaries
  • Multi-Database Support: Uses PostgreSQL, MongoDB, and Redis for optimal performance
  • Security Focus: Identifies security vulnerabilities and code quality issues
  • Architecture Assessment: Provides architectural insights and recommendations

📋 Requirements

System Dependencies

  • Python 3.8+
  • PostgreSQL with pgvector extension
  • MongoDB
  • Redis

Python Dependencies

pip install anthropic python-dotenv git redis pymongo psycopg2-binary numpy reportlab

🛠️ Setup

  1. Install Dependencies:

    pip install -r requirements.txt
    
  2. Database Setup:

    # Run the database migration
    psql -U postgres -d repo_vectors -f 001-schema.sql
    
  3. Environment Variables: Create a .env file with:

    ANTHROPIC_API_KEY=your_api_key_here
    REDIS_HOST=localhost
    REDIS_PORT=6379
    REDIS_DB=0
    MONGODB_URL=mongodb://localhost:27017/
    MONGODB_DB=repo_analyzer
    POSTGRES_HOST=localhost
    POSTGRES_PORT=5432
    POSTGRES_DB=repo_vectors
    POSTGRES_USER=postgres
    POSTGRES_PASSWORD=your_password
    
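A minimal sketch of how these settings could be loaded in Python with python-dotenv (the variable names come from the .env example above; the settings dictionary itself is illustrative, not the service's actual config object):

from dotenv import load_dotenv
import os

load_dotenv()  # read variables from the .env file into the process environment

settings = {
    "anthropic_api_key": os.getenv("ANTHROPIC_API_KEY"),
    "redis": {
        "host": os.getenv("REDIS_HOST", "localhost"),
        "port": int(os.getenv("REDIS_PORT", "6379")),
        "db": int(os.getenv("REDIS_DB", "0")),
    },
    "mongodb_url": os.getenv("MONGODB_URL"),
    "mongodb_db": os.getenv("MONGODB_DB"),
    "postgres": {
        "host": os.getenv("POSTGRES_HOST", "localhost"),
        "port": int(os.getenv("POSTGRES_PORT", "5432")),
        "dbname": os.getenv("POSTGRES_DB"),
        "user": os.getenv("POSTGRES_USER"),
        "password": os.getenv("POSTGRES_PASSWORD"),
    },
}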

🎯 Usage

Basic Usage

python ai-analyze.py /path/to/repository

With Custom Output

python ai-analyze.py /path/to/repository --output my_analysis.pdf

With API Key Override

python ai-analyze.py /path/to/repository --api-key your_api_key
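
The commands above map to a small argument parser. A hedged sketch of that interface (the flag names are taken from the commands shown; the parser below is illustrative, not a copy of ai-analyze.py):

import argparse

def parse_args():
    # Mirrors the documented CLI: a repository path plus optional output and API key overrides
    parser = argparse.ArgumentParser(description="Complete AI repository analysis")
    parser.add_argument("repository", help="Path to the repository to analyze")
    parser.add_argument("--output", default="complete_repository_analysis.pdf",
                        help="Path of the generated PDF report")
    parser.add_argument("--api-key", default=None,
                        help="Anthropic API key override (falls back to ANTHROPIC_API_KEY)")
    return parser.parse_args()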

📊 What It Analyzes

File Types Supported

  • Programming Languages: Python, JavaScript, TypeScript, Java, C++, C#, Go, Rust, PHP, Ruby, Swift, Kotlin
  • Web Technologies: HTML, CSS, SCSS, SASS
  • Configuration Files: JSON, YAML, XML, SQL
  • Build Files: Dockerfile, Makefile, CMake, package.json, requirements.txt, Cargo.toml, pom.xml, build.gradle
  • Documentation: README.md, Markdown files
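
A rough sketch of how this support list might be expressed as an extension and filename allow-list (the exact sets used by the analyzer are assumptions derived from the list above):

import os

SUPPORTED_EXTENSIONS = {
    ".py", ".js", ".ts", ".tsx", ".java", ".cpp", ".cc", ".cs", ".go", ".rs",
    ".php", ".rb", ".swift", ".kt",                     # programming languages
    ".html", ".css", ".scss", ".sass",                  # web technologies
    ".json", ".yaml", ".yml", ".xml", ".sql",           # configuration files
    ".md",                                              # documentation
}
SUPPORTED_FILENAMES = {
    "Dockerfile", "Makefile", "CMakeLists.txt", "package.json",
    "requirements.txt", "Cargo.toml", "pom.xml", "build.gradle",
}

def is_supported(path):
    # Accept either a known build/config filename or a supported extension
    name = os.path.basename(path)
    _, ext = os.path.splitext(name)
    return name in SUPPORTED_FILENAMES or ext.lower() in SUPPORTED_EXTENSIONS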

Analysis Coverage

  • Code Quality: Complexity, maintainability, best practices
  • Security: Vulnerabilities, injection attacks, authentication issues
  • Architecture: Project structure, scalability, design patterns
  • Performance: Optimization opportunities, bottlenecks
  • Documentation: Completeness and quality
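
The per-file findings that feed the report roughly amount to one record per file. A hypothetical shape for that record, based on the coverage areas above and the scores shown in the example output below (field names are assumptions, not the service's actual schema):

from dataclasses import dataclass, field
from typing import List

@dataclass
class FileAnalysis:
    path: str
    language: str
    lines_of_code: int
    quality_score: float                                   # 1-10, as in the quality breakdown
    security_issues: List[str] = field(default_factory=list)
    performance_notes: List[str] = field(default_factory=list)
    architecture_notes: List[str] = field(default_factory=list)
    recommendations: List[str] = field(default_factory=list)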

📈 Output

Console Output

  • Real-time analysis progress
  • Repository statistics
  • Quality breakdown by file
  • Language distribution
  • Memory system statistics

PDF Report

  • Executive summary for leadership
  • Repository overview with metrics
  • Detailed file-by-file analysis
  • Security assessment
  • Architecture evaluation
  • Recommendations and next steps
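
PDF generation relies on reportlab, which is already in the dependency list. A minimal sketch of writing an executive summary page; the real report's section layout is richer than this:

from reportlab.lib.pagesizes import A4
from reportlab.lib.styles import getSampleStyleSheet
from reportlab.platypus import SimpleDocTemplate, Paragraph, Spacer

def write_summary_pdf(path, summary_text):
    # Build a one-page report with a title, a heading, and the executive summary text
    doc = SimpleDocTemplate(path, pagesize=A4)
    styles = getSampleStyleSheet()
    story = [
        Paragraph("Complete Repository Analysis", styles["Title"]),
        Spacer(1, 12),
        Paragraph("Executive Summary", styles["Heading1"]),
        Paragraph(summary_text, styles["BodyText"]),
    ]
    doc.build(story)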

🧠 Memory System

The tool uses a sophisticated three-tier memory system:

  1. Working Memory (Redis): Temporary, fast access for current analysis
  2. Episodic Memory (MongoDB): User interactions and analysis sessions
  3. Persistent Memory (PostgreSQL): Long-term knowledge and best practices

This allows the tool to learn from previous analyses and provide increasingly accurate insights.
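
A hedged sketch of how the three tiers might be written to during an analysis (connection details are abbreviated; the key format, collection name, and table name are illustrative, not the service's actual schema):

import json
import redis
import pymongo
import psycopg2

r = redis.Redis(host="localhost", port=6379, db=0)                           # working memory
mongo = pymongo.MongoClient("mongodb://localhost:27017/")["repo_analyzer"]   # episodic memory
pg = psycopg2.connect(dbname="repo_vectors", user="postgres", host="localhost")  # persistent memory

def remember(session_id, file_path, finding):
    # Working memory: cache the finding for the current run, expiring after an hour
    r.setex(f"analysis:{session_id}:{file_path}", 3600, json.dumps(finding))
    # Episodic memory: record the interaction for this analysis session
    mongo["analysis_sessions"].insert_one(
        {"session": session_id, "file": file_path, "finding": finding})
    # Persistent memory: keep long-term knowledge available to future analyses
    with pg, pg.cursor() as cur:
        cur.execute(
            "INSERT INTO analysis_knowledge (file_path, finding) VALUES (%s, %s)",
            (file_path, json.dumps(finding)))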

🔧 Configuration

File Size Limits

  • Default: 2MB per file (configurable in code)
  • Large files are skipped with notification

Excluded Directories

  • .git, node_modules, __pycache__, build, dist, target
  • venv, env, .next, coverage, vendor
  • bower_components, .gradle, .m2, .cargo

Rate Limiting

  • 0.1 second delay between file analyses to avoid API rate limits
  • Configurable in the code
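
These knobs live in the code rather than in .env. A sketch of how they might look and be applied (the constant names are assumptions; the values mirror the defaults listed above):

import os
import time

MAX_FILE_SIZE = 2 * 1024 * 1024   # 2 MB default per-file limit
RATE_LIMIT_DELAY = 0.1            # seconds to wait between file analyses
EXCLUDED_DIRS = {
    ".git", "node_modules", "__pycache__", "build", "dist", "target",
    "venv", "env", ".next", "coverage", "vendor",
    "bower_components", ".gradle", ".m2", ".cargo",
}

def iter_analyzable_files(repo_root):
    for root, dirs, files in os.walk(repo_root):
        # Prune excluded directories in place so os.walk never descends into them
        dirs[:] = [d for d in dirs if d not in EXCLUDED_DIRS]
        for name in files:
            path = os.path.join(root, name)
            if os.path.getsize(path) > MAX_FILE_SIZE:
                print(f"Skipping large file: {path}")
                continue
            yield path

# In the analysis loop, a short pause keeps API usage under rate limits:
# for path in iter_analyzable_files(repo_root):
#     analyze(path)
#     time.sleep(RATE_LIMIT_DELAY)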

📝 Example Output

🚀 Starting Complete AI Repository Analysis
============================================================
Repository: /path/to/my-project
Output: complete_repository_analysis.pdf
Mode: Complete automated analysis of ALL files
============================================================

Scanning repository: /path/to/my-project
Found 127 files to analyze
Starting comprehensive analysis of 127 files...
Analyzing file 1/127: main.py
Analyzing file 2/127: config.js
...

🎯 COMPLETE ANALYSIS FINISHED
============================================================
📊 Repository Statistics:
   • Files Analyzed: 127
   • Lines of Code: 15,432
   • Languages: 8
   • Code Quality: 7.2/10

📈 Quality Breakdown:
   • High Quality Files (8-10): 45
   • Medium Quality Files (5-7): 67
   • Low Quality Files (1-4): 15
   • Total Issues Found: 89

🔤 Language Distribution:
   • Python: 45 files
   • JavaScript: 32 files
   • TypeScript: 28 files
   • HTML: 12 files
   • CSS: 10 files

📄 Complete PDF Report: complete_repository_analysis.pdf
✅ Complete analysis finished successfully!

🚨 Troubleshooting

Common Issues

  1. Database Connection Errors:

    • Ensure PostgreSQL, MongoDB, and Redis are running
    • Check the connection credentials in the .env file (a quick connectivity check is sketched below)
  2. API Key Issues:

    • Verify Anthropic API key is valid and has sufficient credits
    • Check rate limits if analysis fails
  3. Memory Issues:

    • Large repositories may require more RAM
    • Consider increasing system memory or processing in batches
  4. File Permission Errors:

    • Ensure read access to repository files
    • Check write permissions for output directory
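
For the database connection errors above, a quick standalone check in the spirit of test_db_connections.py can narrow things down. A sketch, reading the same .env variables shown in the setup section:

import os
import psycopg2
import pymongo
import redis
from dotenv import load_dotenv

load_dotenv()

def check(name, fn):
    # Run one connectivity probe and report success or the error message
    try:
        fn()
        print(f"{name}: OK")
    except Exception as exc:
        print(f"{name}: FAILED ({exc})")

check("Redis", lambda: redis.Redis(
    host=os.getenv("REDIS_HOST", "localhost"),
    port=int(os.getenv("REDIS_PORT", "6379")),
    db=int(os.getenv("REDIS_DB", "0"))).ping())

check("MongoDB", lambda: pymongo.MongoClient(
    os.getenv("MONGODB_URL"), serverSelectionTimeoutMS=3000).admin.command("ping"))

check("PostgreSQL", lambda: psycopg2.connect(
    host=os.getenv("POSTGRES_HOST", "localhost"),
    port=os.getenv("POSTGRES_PORT", "5432"),
    dbname=os.getenv("POSTGRES_DB"),
    user=os.getenv("POSTGRES_USER"),
    password=os.getenv("POSTGRES_PASSWORD")).close())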

🤝 Contributing

This is a fully automated analysis system. When run, the tool will:

  • Analyze every file in the repository
  • Generate comprehensive reports
  • Learn from previous analyses
  • Provide actionable insights

No user interaction required - just run and get results!