dld_backend/CONTEXT_AWARE_QUERIES.md
2025-10-30 12:13:02 +05:30


# Context-Aware Query Processing
## Overview
The Dubai DLD Analytics API now supports context-aware query processing, allowing users to refine their queries progressively in a conversation-like manner.
## How It Works
### Example Conversation Flow
**Q1: Initial Query**
```http
POST /api/query
Content-Type: application/json

{
  "query": "Give me the last 6 months rental price trend for Business Bay",
  "sessionId": "user123"
}
```
**Response:** Returns the monthly rental price trend for Business Bay over the last 6 months.

---
**Q2: Refinement - Change Grouping**
```http
POST /api/query
Content-Type: application/json

{
  "query": "Summarise by week",
  "sessionId": "user123"
}
```
**Response:** Returns the **same data** (Business Bay, last 6 months) but with **weekly grouping** instead of monthly.

---
**Q3: Refinement - Add Filter**
```http
POST /api/query
Content-Type: application/json

{
  "query": "Apartments only",
  "sessionId": "user123"
}
```
**Response:** Returns Business Bay data from the last 6 months, grouped by week, showing **only apartments**.
## Implementation Details
### Session Management
Each session is identified by a unique `sessionId`, and its context is kept for 30 minutes after the last query:
- First query: Establishes context (area, time period, property type)
- Follow-up queries: Refine existing context
- Session expires after 30 minutes of inactivity
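
The inactivity-based expiry described above can be sketched as a small in-memory store. This is a minimal illustration, not the actual `ContextManager`: the `SessionStore` class, its method names, and the injectable clock are assumptions introduced here.

```javascript
// Hypothetical in-memory session store; the real ContextManager may differ.
const SESSION_TTL_MS = 30 * 60 * 1000; // 30 minutes, as stated above

class SessionStore {
  constructor(ttlMs = SESSION_TTL_MS, now = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now; // injectable clock, useful for testing expiry
    this.sessions = new Map(); // sessionId -> { context, lastAccess }
  }

  // Returns the stored context, or null if the session is missing or expired.
  get(sessionId) {
    const entry = this.sessions.get(sessionId);
    if (!entry) return null;
    if (this.now() - entry.lastAccess > this.ttlMs) {
      this.sessions.delete(sessionId); // expired: drop the stale context
      return null;
    }
    entry.lastAccess = this.now(); // any activity resets the inactivity timer
    return entry.context;
  }

  // Stores (or overwrites) the context and marks the session as active now.
  set(sessionId, context) {
    this.sessions.set(sessionId, { context, lastAccess: this.now() });
  }
}
```

Because every successful `get` refreshes `lastAccess`, a session that keeps receiving follow-ups stays alive; it only expires once 30 minutes pass without any query.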
### Supported Refinements
#### Time Grouping
- `"Summarise by week"` or `"weekly"` → Weekly grouping
- `"Summarise by month"` or `"monthly"` → Monthly grouping
- `"Summarise by year"` or `"yearly"` → Yearly grouping
#### Property Filters
- `"Apartments only"` or `"apartment only"` → Filter to flats
- `"Villas only"` or `"villa only"` → Filter to villas
- `"Commercial only"` → Filter to commercial properties
- `"Residential only"` → Filter to residential properties
#### Room Type Filters
- `"3BHK"` or `"3 bhk"` → Filter to 3 bedroom properties
- `"2BHK"` or `"2 bhk"` → Filter to 2 bedroom properties
- `"Studio"` → Filter to studio apartments
#### Limits
- `"Top 5 areas"` → Limit to top 5 results
- `"Top 10 areas"` → Limit to top 10 results
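
One plausible way to recognise the phrases listed above is a handful of regular expressions. The sketch below is an assumption about how such extraction could work, not the project's actual code; `extractRefinements` and the field names on the returned object are hypothetical.

```javascript
// Hypothetical refinement extractor; patterns mirror the phrases listed above.
function extractRefinements(query) {
  const q = query.toLowerCase();
  const refinements = {};

  // "Summarise by week", "weekly", "monthly", "yearly", ...
  const grouping = q.match(/\b(week|month|year)(?:ly)?\b/);
  if (grouping) refinements.grouping = grouping[1];

  // "Apartments only", "villa only"
  const propertyType = q.match(/\b(apartment|villa)s?\s+only\b/);
  if (propertyType) refinements.propertyType = propertyType[1];

  // "Commercial only", "Residential only"
  const usage = q.match(/\b(commercial|residential)\s+only\b/);
  if (usage) refinements.usage = usage[1];

  // "3BHK", "2 bhk", "Studio"
  const rooms = q.match(/\b(\d+)\s*bhk\b/);
  if (rooms) refinements.rooms = `${rooms[1]}BHK`;
  else if (/\bstudio\b/.test(q)) refinements.rooms = "studio";

  // "Top 5 areas", "Top 10 areas"
  const limit = q.match(/\btop\s+(\d+)\b/);
  if (limit) refinements.limit = Number(limit[1]);

  return refinements;
}
```

A query that matches none of the patterns yields an empty object, which a caller could treat as "no refinement detected".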
## Technical Architecture
### Components
1. **ContextManager** (`src/services/contextManager.js`)
   - Stores and retrieves session context
   - Detects follow-up queries
   - Extracts refinements from queries
   - Manages session TTL
2. **ContextAwareSQLGenerator** (`src/services/contextAwareSQLGenerator.js`)
   - Generates SQL with context awareness
   - Merges refinements with existing context
   - Handles both new and follow-up queries
3. **QueryTemplates** (`src/services/queryTemplates.js`)
   - Contains hardcoded SQL templates
   - Matches queries to templates
   - Provides a context-aware query template
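
To illustrate what "generating SQL with context awareness" might involve, here is a minimal sketch that turns a merged context object into a parameterised query. Everything schema-related is an assumption: the table `rent_contracts`, the columns `area_name`, `property_type`, `contract_start_date` and `annual_amount`, the PostgreSQL-style `$n` placeholders, and the function name `buildRentalTrendQuery` do not come from this document.

```javascript
// Hypothetical table and column names; the real schema is not shown here.
const GROUPING_SQL = {
  week: "DATE_TRUNC('week', contract_start_date)",
  month: "DATE_TRUNC('month', contract_start_date)",
  year: "DATE_TRUNC('year', contract_start_date)",
};

// Builds a parameterised rental-trend query from a session context.
function buildRentalTrendQuery(context) {
  const groupExpr = GROUPING_SQL[context.grouping];
  if (!groupExpr) throw new Error(`Unsupported grouping: ${context.grouping}`);

  const where = [];
  const params = [];
  // Values are always passed as parameters, never concatenated into the SQL.
  const addParam = (value) => {
    params.push(value);
    return `$${params.length}`;
  };

  if (context.area) where.push(`area_name = ${addParam(context.area)}`);
  if (context.months) {
    where.push(
      `contract_start_date >= NOW() - make_interval(months => ${addParam(context.months)}::int)`
    );
  }
  if (context.propertyType) {
    where.push(`property_type = ${addParam(context.propertyType)}`);
  }

  const text = [
    `SELECT ${groupExpr} AS period, AVG(annual_amount) AS avg_rent`,
    "FROM rent_contracts",
    where.length ? `WHERE ${where.join(" AND ")}` : "",
    "GROUP BY period",
    "ORDER BY period",
  ].filter(Boolean).join("\n");

  return { text, params };
}
```

Keeping the grouping expression in a fixed lookup table means user text never reaches the SQL string directly; only whitelisted expressions and bound parameters do.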
### Flow Diagram
```
User Query → NLP Parser → Context Check
                               ↓
                         Is Follow-up?
                          /         \
                        Yes          No
                         ↓            ↓
                   Get Context   Create Context
                         ↓            ↓
                      Extract Refinements
                               ↓
                         Merge Context
                               ↓
                         Generate SQL
                               ↓
                         Execute Query
                               ↓
                        Format Response
```
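
The branch at the top of this flow can be sketched as a single function that decides between reusing and creating context. The follow-up heuristic, the `parsed` object shape, and the default monthly grouping are assumptions made for illustration; the actual detection logic in `ContextManager` is not shown in this document.

```javascript
// Hypothetical heuristic: a query that names no area or time range but
// carries at least one refinement is treated as a follow-up.
function isFollowUp(parsed, existingContext) {
  return (
    existingContext !== null &&
    !parsed.area &&
    !parsed.months &&
    Object.keys(parsed.refinements).length > 0
  );
}

// Returns the context to use for SQL generation.
function resolveContext(parsed, existingContext) {
  if (isFollowUp(parsed, existingContext)) {
    // Follow-up: keep what was established, override only what was refined.
    return { ...existingContext, ...parsed.refinements };
  }
  // New query: start from scratch (monthly grouping assumed as the default).
  return {
    area: parsed.area ?? null,
    months: parsed.months ?? null,
    grouping: "month",
    ...parsed.refinements,
  };
}
```

Replaying the conversation from the start of this document: the first query creates `{ area: "Business Bay", months: 6, grouping: "month" }`, "Summarise by week" changes only `grouping`, and "Apartments only" adds `propertyType` while the area, period, and weekly grouping carry over.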
## API Usage Examples
### Example 1: Basic Context Flow
```bash
# Q1: Initial query
curl -X POST http://localhost:3000/api/query \
  -H "Content-Type: application/json" \
  -d '{
    "query": "Give me the last 6 months rental price trend for Business Bay",
    "sessionId": "user123"
  }'

# Q2: Refine to weekly
curl -X POST http://localhost:3000/api/query \
  -H "Content-Type: application/json" \
  -d '{
    "query": "Summarise by week",
    "sessionId": "user123"
  }'

# Q3: Filter to apartments
curl -X POST http://localhost:3000/api/query \
  -H "Content-Type: application/json" \
  -d '{
    "query": "Apartments only",
    "sessionId": "user123"
  }'
```
### Example 2: Multiple Refinements
```bash
# Q1: Initial query
curl -X POST http://localhost:3000/api/query \
  -H "Content-Type: application/json" \
  -d '{
    "query": "Which area is having more rental transactions?",
    "sessionId": "user456"
  }'

# Q2: Limit to top 5
curl -X POST http://localhost:3000/api/query \
  -H "Content-Type: application/json" \
  -d '{
    "query": "Top 5 areas only",
    "sessionId": "user456"
  }'

# Q3: Show only residential
curl -X POST http://localhost:3000/api/query \
  -H "Content-Type: application/json" \
  -d '{
    "query": "Residential only",
    "sessionId": "user456"
  }'
```
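
The same conversation can be driven from Node.js (18 or later, which provides a global `fetch`). This is a sketch under stated assumptions: only the endpoint and the `query` / `sessionId` body fields come from the examples above, while the helper names `buildQueryRequest`, `ask`, and `runConversation` are hypothetical, and the response is returned as parsed JSON without assuming its shape.

```javascript
// Hypothetical helper names; only the endpoint and body fields are from the docs.
const API_URL = "http://localhost:3000/api/query";

// Builds the fetch options for one query in a session.
function buildQueryRequest(query, sessionId) {
  return {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ query, sessionId }),
  };
}

// Sends one query and returns the parsed JSON response.
async function ask(query, sessionId) {
  const res = await fetch(API_URL, buildQueryRequest(query, sessionId));
  if (!res.ok) throw new Error(`Query failed with HTTP ${res.status}`);
  return res.json();
}

// Runs a conversation: each query reuses the same sessionId, strictly in order.
async function runConversation(sessionId, queries) {
  const responses = [];
  for (const query of queries) {
    // Sequential on purpose: each refinement builds on the previous context.
    responses.push(await ask(query, sessionId));
  }
  return responses;
}
```

For instance, `runConversation("user456", ["Which area is having more rental transactions?", "Top 5 areas only", "Residential only"])` would replay Example 2. The requests are awaited one at a time because sending refinements in parallel could apply them before the initial context exists.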
## Supported Query Patterns
### 1. Rental Price Trends (with context)
- Initial: "Give me the last 6 months rental price trend for [Area]"
- Refinement: "Summarise by week"
- Refinement: "Apartments only"
### 2. Area Comparison (with context)
- Initial: "Which area is having more rental transactions?"
- Refinement: "Top 5 areas only"
- Refinement: "Commercial only"
### 3. Project Analysis (with context)
- Initial: "Brief about the Project"
- Refinement: "Show only active projects"
- Refinement: "Top 10 only"
## Benefits
1. **Natural Conversation**: Users can interact naturally, refining queries progressively
2. **Efficiency**: No need to repeat entire queries
3. **Context Retention**: System remembers what the user asked
4. **Flexibility**: Multiple refinements can be applied in sequence
5. **User-Friendly**: Reduces cognitive load on users
## Session Management
### Creating Sessions
- Sessions are created automatically with a unique `sessionId`
- Each session keeps its context for 30 minutes after the last query
- Sessions can be shared across devices with the same ID
### Clearing Sessions
To start fresh, do one of the following:
- Use a different `sessionId`
- Wait 30 minutes for automatic expiration
- Make a query without `sessionId` (creates new context)
## Best Practices
1. **Always provide sessionId** for follow-up queries
2. **Use clear refinement language** (e.g., "Summarise by week")
3. **Start with specific queries** for better context
4. **Verify context** by checking the metadata in the response
5. **Clear context** when starting a new analysis thread
## Troubleshooting
### Issue: Follow-up not working
**Solution**: Ensure the `sessionId` matches the one sent with the initial query.
### Issue: Context lost
**Solution**: The session has expired (30-minute TTL). Start a new session by sending the initial query again.
### Issue: Wrong refinements applied
**Solution**: Use clear refinement phrases, and check the extracted refinements in the response metadata.
## Future Enhancements
- [ ] Multi-turn conversation support
- [ ] Context-based suggestions
- [ ] Query history per session
- [ ] Custom context timeouts
- [ ] Context export/import
- [ ] Voice-based refinements
---
**Status**: ✅ Fully Implemented
**Last Updated**: 2024