Rebuild Instructions - Multi-Document Upload Service
Issue: Empty Graph in Neo4j
Problem: Query returns "(no changes, no records)" because the job completed with 0 relations.
Root Cause: PDF extraction failed due to missing dependencies (unstructured[pdf]).
Fixes Applied
- ✅ Added PDF dependencies (unstructured[pdf], unstructured[docx], etc.)
- ✅ Added fallback extractors (pdfplumber, python-docx, python-pptx)
- ✅ Improved error handling and logging
- ✅ Fixed Neo4j query syntax
- ✅ Better status messages
Rebuild Steps
Step 1: Rebuild the Service
cd /home/tech4biz/Desktop/prakash/codenuk/backend_new1/codenuk_backend_mine
# Stop the service
docker-compose stop multi-document-upload-service
# Rebuild with new dependencies
docker-compose build --no-cache multi-document-upload-service
# Start the service
docker-compose up -d multi-document-upload-service
# Check logs to verify it's starting correctly
docker-compose logs -f multi-document-upload-service
Step 2: Verify Dependencies
# Check if unstructured[pdf] is installed
docker-compose exec multi-document-upload-service pip list | grep unstructured
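# You should see the unstructured package in the output.
# Extras such as [pdf] and [docx] are not separate packages, so they do not
# appear as unstructured-pdf or unstructured-docx; they pull in additional
# dependencies instead.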
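Because `grep unstructured` will not show the fallback extractors, a small filter over the `pip list` output can report everything that is missing in one pass. The following is a minimal sketch, not part of the service: `missing_packages` is a hypothetical helper, and the package names passed to it are assumed to match the distribution names listed under Fixes Applied.

```shell
# Hypothetical helper: read `pip list` output on stdin and print each of the
# given package names that does not appear in the first column.
# Returns 0 if everything is present, 1 if at least one package is missing.
missing_packages() {
  # First column of `pip list`, lowercased for a case-insensitive comparison.
  mp_installed="$(awk '{ print tolower($1) }')"
  mp_status=0
  for mp_pkg in "$@"; do
    mp_name="$(printf '%s' "$mp_pkg" | tr '[:upper:]' '[:lower:]')"
    # -x: whole-line match, -F: treat the name as a fixed string.
    if ! printf '%s\n' "$mp_installed" | grep -qxF -- "$mp_name"; then
      echo "missing: $mp_pkg"
      mp_status=1
    fi
  done
  return "$mp_status"
}
```

A possible invocation, using `-T` so that docker-compose does not allocate a TTY when the output is piped:
docker-compose exec -T multi-document-upload-service pip list | missing_packages unstructured pdfplumber python-docx python-pptx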
Step 3: Test the Service
# Check health endpoint
curl http://localhost:8024/health
# Should return:
# {
# "status": "ok",
# "claude_model": "claude-3-5-haiku-latest",
# ...
# }
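Right after `docker-compose up -d` the container may still be starting, so a single curl can fail even when the rebuild is fine. The sketch below polls the health endpoint until the "status" field reads "ok". Both function names are hypothetical, the retry count and delay are arbitrary defaults, and the parsing assumes the response shape shown above.

```shell
# Hypothetical helper: extract the "status" field from a /health JSON body
# read on stdin. Prints nothing if the field is absent.
health_status() {
  # Join lines first so pretty-printed JSON is handled as well.
  tr -d '\n' | sed -n 's/.*"status"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Hypothetical helper: poll the health endpoint until it reports "ok".
# Arguments: URL, number of attempts, delay between attempts in seconds.
wait_for_health() {
  wh_url="${1:-http://localhost:8024/health}"
  wh_attempts="${2:-30}"
  wh_delay="${3:-2}"
  wh_i=1
  while [ "$wh_i" -le "$wh_attempts" ]; do
    # curl -f fails on HTTP errors; an unreachable service yields an empty status.
    wh_status="$(curl -fsS "$wh_url" 2>/dev/null | health_status)"
    if [ "$wh_status" = "ok" ]; then
      echo "service healthy after $wh_i attempt(s)"
      return 0
    fi
    sleep "$wh_delay"
    wh_i=$((wh_i + 1))
  done
  echo "service not healthy after $wh_attempts attempts" >&2
  return 1
}
```

Calling `wait_for_health` with no arguments checks http://localhost:8024/health for up to about a minute; a non-zero exit is a cue to look at the service logs before re-uploading anything.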
Step 4: Re-upload Documents
- Open frontend: http://localhost:3001/project-builder
- Go to Step 1: Project Type
- Find "Upload Documents for Knowledge Graph" section
- Upload a PDF or other document
- Wait for processing to complete
- Check status - should show relation count > 0
Step 5: Verify in Neo4j
Run these queries in Neo4j Browser (http://localhost:7474):
// Check if any nodes exist
MATCH (n)
RETURN count(n) AS node_count;
// Check for CAUSES relationships
MATCH (n:Concept)-[r:CAUSES]->(m:Concept)
RETURN n.name AS cause,
       m.name AS effect,
       r.confidence AS confidence,
       r.job_id AS job_id
LIMIT 50;
Expected Results
After rebuilding and re-uploading:
- PDF extraction succeeds ✅
- Text is extracted ✅
- Relations are extracted ✅
- Relations are written to Neo4j ✅
- Query returns results ✅
Troubleshooting
If you still see 0 relations:
- Check service logs:
  docker-compose logs multi-document-upload-service | tail -50
- Check extraction logs:
  docker-compose logs multi-document-upload-service | grep -i "extract\|pdf"
- Check Claude analysis:
  docker-compose logs multi-document-upload-service | grep -i "claude\|analyze\|relation"
- Check Neo4j connection:
  docker-compose logs multi-document-upload-service | grep -i "neo4j\|graph\|write"
- Verify document has causal language:
  - Not all documents contain causal relationships
  - Try uploading a document with clear cause-effect statements
  - Example: "Smoking causes lung cancer"
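The three grep checks above can be combined into one pass that shows which pipeline stage produced log output at all. This is a rough sketch: `log_stage_counts` is a hypothetical helper, the stage labels are invented for illustration, and the patterns are the same keywords used in the grep commands above, not a documented log format.

```shell
# Hypothetical helper: read service logs on stdin and print, per pipeline
# stage, how many lines match that stage's keywords (case-insensitive).
log_stage_counts() {
  lsc_logs="$(cat)"
  # Each entry is "label:extended-regex".
  for lsc_entry in \
    'extraction:extract|pdf' \
    'analysis:claude|analyze|relation' \
    'graph:neo4j|graph|write'
  do
    lsc_name="${lsc_entry%%:*}"
    lsc_pattern="${lsc_entry#*:}"
    # grep -c prints 0 but exits non-zero when nothing matches, hence "|| true".
    lsc_count="$(printf '%s\n' "$lsc_logs" | grep -ciE "$lsc_pattern" || true)"
    echo "$lsc_name: $lsc_count"
  done
}
```

For example:
docker-compose logs multi-document-upload-service | log_stage_counts
A stage with a count of 0 is a reasonable place to start looking, since the job may never have reached it.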
Quick Test
Test with a simple text file:
- Create a test file test_causal.txt:
  Smoking cigarettes causes lung cancer.
  Heavy rain causes flooding.
  Exercise improves health.
- Upload it via the frontend
- Check Neo4j for relationships
- Should see 3 causal relationships
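For repeatable runs, the test file can be generated from the shell rather than typed by hand. The sketch below writes the three sample sentences, one per line; `create_causal_test_file` is a hypothetical helper, and the one-sentence-per-line layout is an assumption.

```shell
# Hypothetical helper: write the three sample sentences to the given path
# (default: test_causal.txt in the current directory).
create_causal_test_file() {
  cat > "${1:-test_causal.txt}" <<'EOF'
Smoking cigarettes causes lung cancer.
Heavy rain causes flooding.
Exercise improves health.
EOF
}
```

Running `create_causal_test_file` with no arguments produces test_causal.txt, ready to upload through the frontend.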
Next Steps
- Rebuild the service
- Re-upload documents
- Check Neo4j for relationships
- If still no results, check service logs
- Verify the document contains causal language