# n8n Bulk Read Integration Guide

## Overview

The bulk read functionality uses **n8n as the orchestration layer**, which keeps the backend simple and provider-agnostic. The backend sends job requests to n8n, and n8n handles all provider-specific API interactions and callbacks.

---

## Architecture Flow

```
┌─────────────┐      ┌─────────────┐      ┌──────────────┐      ┌─────────────┐
│   Backend   │─────▶│     n8n     │─────▶│   Provider   │      │  Database   │
│ API Server  │      │  Workflow   │      │ (Salesforce/ │      │   (MySQL)   │
│             │◀─────│             │◀─────│    Zoho)     │      │             │
└─────────────┘      └─────────────┘      └──────────────┘      └─────────────┘
       │                                                               │
       └───────────────────────────────────────────────────────────────┘
                          Stores job & processes data
```

### Process Steps:

1. **User initiates bulk read** → Backend API
2. **Backend sends request** → n8n webhook
3. **n8n calls provider API** (Salesforce, Zoho, etc.)
4. **Provider processes job** asynchronously
5. **n8n receives data** from provider
6. **n8n calls backend webhook** with processed data
7. **Backend stores data** in MySQL tables

---

## 1. Backend → n8n Request

### Endpoint Configuration

```javascript
// Environment variable (.env):
//   N8N_BULK_READ_WEBHOOK_URL=https://workflows.tech4bizsolutions.com/webhook-test/48b613f6-1bb8-4e9c-b35a-a93748acddb3

// Or hardcoded in the BulkReadService constructor
this.n8nWebhookUrl = 'https://workflows.tech4bizsolutions.com/webhook-test/48b613f6-1bb8-4e9c-b35a-a93748acddb3';
```

Note: `/webhook-test/` is n8n's test URL, which only responds while the workflow is listening in the editor. An activated workflow is served from the corresponding `/webhook/` URL.

### Request Format

#### URL

```
POST https://workflows.tech4bizsolutions.com/webhook-test/48b613f6-1bb8-4e9c-b35a-a93748acddb3
```

#### Query Parameters (Salesforce Only)

```
?instance_url=https://yourorg.my.salesforce.com
```

#### Request Body

```json
{
  "provider": "salesforce",
  "service": "crm",
  "module": "contacts",
  "fields": ["Id", "FirstName", "LastName", "Email", "Phone"],
  "access_token": "00D5g000008XXXX!AQEAQXXX...",
  "callback_url": "https://your-backend.com/api/v1/bulk-read/webhook/callback?access_token=backend_jwt_token",
  "job_id": "salesforce_contacts_1698765432_abc123xyz",
  "user_id": "550e8400-e29b-41d4-a716-446655440000",
  "options": {
    "page": 1,
    "limit": 10000
  }
}
```

### Field Descriptions

| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `provider` | string | ✅ | Provider name: `salesforce`, `zoho`, `hubspot`, etc. |
| `service` | string | ✅ | Service type: `crm`, `books`, `hr`, `accounting` |
| `module` | string | ✅ | Module name: `contacts`, `leads`, `accounts`, etc. |
| `fields` | array | ✅ | Array of field names to fetch from provider |
| `access_token` | string | ✅ | Decrypted provider access token |
| `callback_url` | string | ✅ | Backend webhook URL to call when complete |
| `job_id` | string | ✅ | Unique job identifier (generated by backend) |
| `user_id` | string | ✅ | User UUID from backend |
| `options` | object | ❌ | Additional options (page, limit, filters) |

### Query Parameters

| Parameter | Required | When | Description |
|-----------|----------|------|-------------|
| `instance_url` | ✅ | Salesforce only | User's Salesforce instance URL |

---

## 2. n8n → Provider API

### What n8n Should Do

1. **Receive the webhook request** from backend
2. **Extract parameters** from body and query
3. **Format provider-specific request**:
   - Salesforce: Create bulk query job
   - Zoho: Create bulk read job
   - Others: Provider-specific format
4. **Call provider API** with access token
5. **Poll for job completion** (if needed)
6. **Fetch results** when ready
7. **Call backend callback** with processed data

### Example: Salesforce Flow in n8n

```javascript
// n8n Code node sketch ("Run Once for Each Item" mode). Assumes axios is
// allowed in the Code node (self-hosted: NODE_FUNCTION_ALLOW_EXTERNAL=axios).
const axios = require('axios');

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Module name → Salesforce object name (contacts → Contact)
const OBJECT_BY_MODULE = { contacts: 'Contact', leads: 'Lead', accounts: 'Account' };
const mapModuleToObject = (name) => {
  if (!OBJECT_BY_MODULE[name]) throw new Error(`Unsupported module: ${name}`);
  return OBJECT_BY_MODULE[name];
};

// Bulk API 2.0 returns query results as CSV, so a small parser is needed.
// Handles quoted fields, escaped quotes ("") and CRLF line endings.
function parseCsv(text) {
  const rows = [];
  let row = [], field = '', quoted = false;
  for (let i = 0; i < text.length; i++) {
    const c = text[i];
    if (quoted) {
      if (c === '"' && text[i + 1] === '"') { field += '"'; i++; }
      else if (c === '"') quoted = false;
      else field += c;
    } else if (c === '"') quoted = true;
    else if (c === ',') { row.push(field); field = ''; }
    else if (c === '\n') { row.push(field); rows.push(row); row = []; field = ''; }
    else if (c !== '\r') field += c;
  }
  if (field !== '' || row.length) { row.push(field); rows.push(row); }
  if (rows.length === 0) return [];
  const [header, ...data] = rows;
  return data.map((r) => Object.fromEntries(header.map((h, i) => [h, r[i]])));
}

// Step 1: Parse webhook input. The Webhook node exposes the POST body under
// `json.body` and the query string under `json.query`.
const { body, query: queryParams } = $input.item.json;
const { provider, service, module: moduleName, fields, access_token, callback_url, job_id } = body;
const instance_url = queryParams.instance_url;
const authHeaders = { Authorization: `Bearer ${access_token}` };
const startedAt = Date.now();

// Step 2: Create Salesforce bulk query job
const salesforceObject = mapModuleToObject(moduleName);
const soql = `SELECT ${fields.join(',')} FROM ${salesforceObject}`;
const bulkJobResponse = await axios.post(
  `${instance_url}/services/data/v57.0/jobs/query`,
  { operation: 'query', query: soql },
  { headers: { ...authHeaders, 'Content-Type': 'application/json' } }
);
const salesforceJobId = bulkJobResponse.data.id;
const jobUrl = `${instance_url}/services/data/v57.0/jobs/query/${salesforceJobId}`;

// Step 3: Poll until the job reaches a terminal state (30-minute cap)
let state = bulkJobResponse.data.state;
while (!['JobComplete', 'Failed', 'Aborted'].includes(state)) {
  if (Date.now() - startedAt > 30 * 60 * 1000) { state = 'Timeout'; break; }
  await sleep(5000); // Wait 5 seconds between polls
  state = (await axios.get(jobUrl, { headers: authHeaders })).data.state;
}

// Step 4: Fetch results (CSV). Large result sets are paginated through the
// Sforce-Locator response header, which this sketch does not follow.
let records = [];
if (state === 'JobComplete') {
  const resultsResponse = await axios.get(`${jobUrl}/results`, {
    headers: { ...authHeaders, Accept: 'text/csv' },
    responseType: 'text',
  });
  records = parseCsv(resultsResponse.data);
}

// Step 5: Call backend callback with either the records or the failure
const succeeded = state === 'JobComplete';
await axios.post(callback_url, {
  job_id,
  status: succeeded ? 'completed' : 'failed',
  provider,
  service,
  module: moduleName,
  records,
  ...(succeeded ? {} : { error_message: `Salesforce job ended in state ${state}` }),
  metadata: {
    salesforce_job_id: salesforceJobId,
    state,
    total_records: records.length,
    processing_time: `${Math.round((Date.now() - startedAt) / 1000)} seconds`,
  },
});

return { json: { job_id, state } };
```

---

## 3. n8n → Backend Callback

### When Job Completes

When the job reaches a terminal state, n8n should call the backend callback URL in the following format.

### Callback URL

```
POST https://your-backend.com/api/v1/bulk-read/webhook/callback?access_token=backend_jwt_token
```

### Expected Request Body Format

#### ✅ Success Response

```json
{
  "job_id": "salesforce_contacts_1698765432_abc123xyz",
  "status": "completed",
  "provider": "salesforce",
  "service": "crm",
  "module": "contacts",
  "records": [
    {
      "Id": "0035g00000XXXXX",
      "FirstName": "John",
      "LastName": "Doe",
      "Email": "john.doe@example.com",
      "Phone": "+1234567890",
      "Account": { "Name": "Acme Corp" },
      "CreatedDate": "2024-01-15T10:30:00.000Z",
      "LastModifiedDate": "2024-01-20T14:45:00.000Z"
    },
    {
      "Id": "0035g00000YYYYY",
      "FirstName": "Jane",
      "LastName": "Smith",
      "Email": "jane.smith@example.com",
      "Phone": "+0987654321",
      "Account": { "Name": "Tech Solutions" },
      "CreatedDate": "2024-01-16T09:15:00.000Z",
      "LastModifiedDate": "2024-01-21T11:30:00.000Z"
    }
  ],
  "metadata": {
    "salesforce_job_id": "7504x00000AbCdEf",
    "state": "JobComplete",
    "total_records": 2,
    "processing_time": "45 seconds",
    "query_executed": "SELECT Id,FirstName,LastName,Email,Phone FROM Contact"
  }
}
```

#### ❌ Failure Response

```json
{
  "job_id": "salesforce_contacts_1698765432_abc123xyz",
  "status": "failed",
  "provider": "salesforce",
  "service": "crm",
  "module": "contacts",
  "records": [],
  "error_message": "INVALID_SESSION_ID: Session expired or invalid",
  "metadata": {
    "salesforce_job_id": "7504x00000AbCdEf",
    "state": "Failed",
    "error_code": "INVALID_SESSION_ID",
    "failed_at": "2024-01-15T10:35:00.000Z"
  }
}
```

### Field Descriptions

| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `job_id` | string | ✅ | The `job_id` sent by backend (same as request) |
| `status` | string | ✅ | Job status: `completed`, `failed`, `in_progress` |
| `provider` | string | ✅ | Provider name (same as request) |
| `service` | string | ✅ | Service name (same as request) |
| `module` | string | ✅ | Module name (same as request) |
| `records` | array | ✅ | Array of record objects (empty if failed) |
| `metadata` | object | ❌ | Additional metadata about job execution |
| `error_message` | string | ❌ | Error message (required if status is `failed`) |

### Record Format

Each record in the `records` array should contain:

- **All requested fields** from the provider API
- **Original field names** from provider (e.g., `FirstName`, not `first_name`)
- **Nested objects** preserved (e.g., `Account.Name`)
- **Date fields** in ISO 8601 format

The backend will automatically map these to the standardized database schema.

---

## 4. Backend Processing

### What Backend Does After Receiving Callback

1. **Updates job status** in `bulk_read_jobs` table
2. **Maps provider fields** to standardized schema
3. **Inserts records** into module-specific table (e.g., `contacts_bulk`)
4. **Updates processed count**
5. **Sends response** to n8n

### Automatic Field Mapping

The backend automatically maps provider-specific fields to standardized fields:

```javascript
// Salesforce                         →  Database
// "FirstName": "John"                →  "first_name": "John"
// "LastName": "Doe"                  →  "last_name": "Doe"
// "Email": "john@example.com"        →  "email": "john@example.com"
// "Phone": "+1234567890"             →  "phone": "+1234567890"
// "Account": { "Name": "Acme" }      →  "account_name": "Acme"
// "CreatedDate": "2024-01-15..."     →  "created_time": "2024-01-15..."
```

### Database Storage

Records are stored in module-specific tables with:

- `external_id`: Provider's record ID
- `user_uuid`: User identifier
- `provider`: Provider name
- `service`: Service name
- `raw_data`: Original JSON from provider
- `bulk_job_id`: Job identifier
- All mapped standardized fields

---

## 5. Example: Complete Salesforce Flow

### User Request

```http
POST /api/v1/bulk-read/initiate
Authorization: Bearer <token>
Content-Type: application/json

{
  "provider": "salesforce",
  "service": "crm",
  "module": "contacts",
  "fields": ["Id", "FirstName", "LastName", "Email", "Phone"]
}
```

### Backend → n8n

```http
POST https://workflows.tech4bizsolutions.com/webhook-test/48b613f6-1bb8-4e9c-b35a-a93748acddb3?instance_url=https://yourorg.my.salesforce.com

{
  "provider": "salesforce",
  "service": "crm",
  "module": "contacts",
  "fields": ["Id", "FirstName", "LastName", "Email", "Phone"],
  "access_token": "00D5g000008XXXX!AQEAQXXX...",
  "callback_url": "https://backend.com/api/v1/bulk-read/webhook/callback?access_token=jwt_123",
  "job_id": "salesforce_contacts_1698765432_abc123",
  "user_id": "550e8400-e29b-41d4-a716-446655440000",
  "options": {}
}
```

### n8n Processing

1. Creates Salesforce bulk query job
2. Polls for completion
3. Fetches results

### n8n → Backend Callback

```http
POST https://backend.com/api/v1/bulk-read/webhook/callback?access_token=jwt_123

{
  "job_id": "salesforce_contacts_1698765432_abc123",
  "status": "completed",
  "provider": "salesforce",
  "service": "crm",
  "module": "contacts",
  "records": [...],
  "metadata": {...}
}
```

### Backend Response to User

The backend returns this response as soon as the job has been handed to n8n; it does not wait for the callback.

```json
{
  "status": "success",
  "message": "Bulk read job initiated via n8n for salesforce crm contacts. Processing will be handled asynchronously.",
  "data": {
    "jobId": "salesforce_contacts_1698765432_abc123",
    "status": "initiated",
    "provider": "salesforce",
    "service": "crm",
    "estimatedTime": "2 minutes"
  }
}
```

---

## 6. Error Handling

### n8n Should Handle

1. **API Errors**: Catch provider API errors and send failed status
2. **Timeout Errors**: Set maximum processing time (e.g., 30 minutes)
3. **Token Expiry**: Detect and report authentication errors
4. **Rate Limiting**: Handle rate limits with retries

### Example Error Callback

```json
{
  "job_id": "salesforce_contacts_1698765432_abc123",
  "status": "failed",
  "provider": "salesforce",
  "service": "crm",
  "module": "contacts",
  "records": [],
  "error_message": "INVALID_SESSION_ID: Session expired or invalid",
  "metadata": {
    "error_code": "INVALID_SESSION_ID",
    "failed_at": "2024-01-15T10:35:00.000Z"
  }
}
```

---

## 7. Provider-Specific Notes

### Salesforce

- Requires `instance_url` in query parameter
- Uses bulk query API v2.0
- Field names are PascalCase
- Maximum 50,000 records per job

### Zoho

- Uses bulk read API v2
- Field names can vary (First_Name, First Name)
- Supports callback URLs natively
- Maximum 200,000 records per job

### HubSpot

- Uses CRM API v3
- Pagination with `after` parameter
- Property-based queries
- Maximum 100 records per page

---

## 8. Configuration

### Environment Variables

```bash
# n8n webhook URL
N8N_BULK_READ_WEBHOOK_URL=https://workflows.tech4bizsolutions.com/webhook-test/48b613f6-1bb8-4e9c-b35a-a93748acddb3

# Backend callback base URL
API_BASE_URL=https://your-backend.com
```

---

## Benefits of n8n Integration

- ✅ **Simplified Backend**: No provider-specific code in backend
- ✅ **Centralized Logic**: All provider integrations in n8n
- ✅ **Easy Updates**: Update workflows without deploying backend
- ✅ **Visual Workflows**: See and debug flows in n8n UI
- ✅ **Error Handling**: n8n handles retries and error workflows
- ✅ **Scalability**: n8n can handle high volume processing

---

## Testing

### Test n8n Webhook

```bash
curl -X POST 'https://workflows.tech4bizsolutions.com/webhook-test/48b613f6-1bb8-4e9c-b35a-a93748acddb3?instance_url=https://test.my.salesforce.com' \
  -H 'Content-Type: application/json' \
  -d '{
    "provider": "salesforce",
    "service": "crm",
    "module": "contacts",
    "fields": ["Id", "FirstName", "LastName"],
    "access_token": "test_token",
    "callback_url": "https://backend.com/api/v1/bulk-read/webhook/callback",
    "job_id": "test_job_123",
    "user_id": "test_user_456"
  }'
```

### Test Backend Callback

```bash
curl -X POST 'https://your-backend.com/api/v1/bulk-read/webhook/callback' \
  -H 'Content-Type: application/json' \
  -d '{
    "job_id": "test_job_123",
    "status": "completed",
    "provider": "salesforce",
    "service": "crm",
    "module": "contacts",
    "records": [
      {"Id": "001", "FirstName": "Test", "LastName": "User"}
    ]
  }'
```