# GCP Cloud Storage - Production Setup Guide
## Overview
This guide provides step-by-step instructions for setting up Google Cloud Storage (GCS) for the **Royal Enfield Workflow System** in the **Production** environment. It covers production deployment requirements, folder structure, and environment configuration.
---
## Table of Contents
1. [Production Requirements](#1-production-requirements)
2. [GCP Bucket Configuration](#2-gcp-bucket-configuration)
3. [Service Account Setup](#3-service-account-setup)
4. [Environment Variables Configuration](#4-environment-variables-configuration)
5. [Folder Structure in GCS](#5-folder-structure-in-gcs)
6. [Security & Access Control](#6-security--access-control)
7. [CORS Configuration](#7-cors-configuration)
8. [Lifecycle Management](#8-lifecycle-management)
9. [Monitoring & Alerts](#9-monitoring--alerts)
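10. [Verification & Testing](#10-verification--testing)
11. [Troubleshooting](#11-troubleshooting)
12. [Production Deployment Steps](#12-production-deployment-steps)
13. [Cost Estimation (Production)](#13-cost-estimation-production)
14. [Support & Contacts](#14-support--contacts)
15. [Quick Reference](#15-quick-reference)
- [Appendix: File Structure Reference](#appendix-file-structure-reference)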
---
## 1. Production Requirements
### 1.1 Application Details
| Item | Production Value |
|------|------------------|
| **Application** | Royal Enfield Workflow System |
| **Environment** | Production |
| **Domain** | `https://reflow.royalenfield.com` |
| **Purpose** | Store workflow documents, attachments, invoices, and credit notes |
| **Storage Type** | Google Cloud Storage (GCS) |
| **Region** | `asia-south1` (Mumbai) |
### 1.2 Storage Requirements
The application stores:
- **Workflow Documents**: Initial documents uploaded during request creation
- **Work Note Attachments**: Files attached during approval workflow
- **Invoice Files**: Generated e-invoice PDFs
- **Credit Note Files**: Generated credit note PDFs
- **Dealer Claim Documents**: Proposal documents, completion documents
---
## 2. GCP Bucket Configuration
### 2.1 Production Bucket Settings
| Setting | Production Value |
|---------|------------------|
| **Bucket Name** | `reflow-documents-prod` |
| **Location Type** | Region |
| **Region** | `asia-south1` (Mumbai) |
| **Storage Class** | Standard (for active files) |
| **Access Control** | Uniform bucket-level access |
| **Public Access Prevention** | Enforced (Block all public access) |
| **Versioning** | Enabled (for recovery) |
| **Lifecycle Rules** | Configured (see section 8) |
### 2.2 Create Production Bucket
```bash
# Create production bucket
gcloud storage buckets create gs://reflow-documents-prod \
--project=re-platform-workflow-dealer \
--location=asia-south1 \
--uniform-bucket-level-access \
--public-access-prevention
# Enable versioning
gcloud storage buckets update gs://reflow-documents-prod \
--versioning
# Verify bucket creation
gcloud storage buckets describe gs://reflow-documents-prod
```
### 2.3 Bucket Naming Convention
| Environment | Bucket Name | Purpose |
|-------------|-------------|---------|
| Development | `reflow-documents-dev` | Development testing |
| UAT | `reflow-documents-uat` | User acceptance testing |
| Production | `reflow-documents-prod` | Live production data |
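If the application picks its bucket from the deployment environment, the convention in the table reduces to a small lookup. The sketch below is only an illustration: the `AppEnvironment` type and the `bucketNameFor` helper are hypothetical names, not taken from the application code.

```typescript
// Hypothetical helper: maps a deployment environment to its bucket name,
// following the naming convention in the table above.
type AppEnvironment = 'development' | 'uat' | 'production';

const BUCKET_SUFFIX: Record<AppEnvironment, string> = {
  development: 'dev',
  uat: 'uat',
  production: 'prod',
};

function bucketNameFor(env: AppEnvironment): string {
  return `reflow-documents-${BUCKET_SUFFIX[env]}`;
}
```

Keeping the mapping in one place makes it harder for a UAT deployment to point at the production bucket by accident.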
---
## 3. Service Account Setup
### 3.1 Create Production Service Account
```bash
# Create service account for production
gcloud iam service-accounts create reflow-storage-prod-sa \
--display-name="RE Workflow Production Storage Service Account" \
--description="Service account for production file storage operations" \
--project=re-platform-workflow-dealer
```
### 3.2 Assign Required Roles
The service account needs the following IAM roles. Grant them on the production bucket rather than on the whole project, so the account cannot reach other buckets in `re-platform-workflow-dealer`:
| Role | Purpose | Required For |
|------|---------|--------------|
| `roles/storage.objectAdmin` | Full control over objects | Upload, download, preview, delete, update files |
| `roles/storage.legacyBucketReader` | Read bucket metadata | List files and check bucket status |

`roles/storage.objectAdmin` already includes every permission in `roles/storage.objectViewer`, so a separate viewer binding is not needed.
```bash
# Grant Storage Object Admin on the production bucket only
gcloud storage buckets add-iam-policy-binding gs://reflow-documents-prod \
--member="serviceAccount:reflow-storage-prod-sa@re-platform-workflow-dealer.iam.gserviceaccount.com" \
--role="roles/storage.objectAdmin"
# Grant Storage Legacy Bucket Reader (legacy bucket roles can only be granted at bucket level)
gcloud storage buckets add-iam-policy-binding gs://reflow-documents-prod \
--member="serviceAccount:reflow-storage-prod-sa@re-platform-workflow-dealer.iam.gserviceaccount.com" \
--role="roles/storage.legacyBucketReader"
```
### 3.3 Generate Service Account Key
```bash
# Generate JSON key file for production
gcloud iam service-accounts keys create ./config/gcp-key-prod.json \
--iam-account=reflow-storage-prod-sa@re-platform-workflow-dealer.iam.gserviceaccount.com \
--project=re-platform-workflow-dealer
```
⚠️ **Security Warning:**
- Store the key file securely (not in Git)
- Use secure file transfer methods
- Rotate keys periodically (every 90 days recommended)
- Restrict file permissions: `chmod 600 ./config/gcp-key-prod.json`
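The `600` permission requirement can also be checked from application code at startup. This is a minimal sketch under assumptions: `isOwnerOnly` is a hypothetical helper, and the numeric mode is expected to come from a call such as Node's `fs.statSync(path).mode` on a POSIX system.

```typescript
// Hypothetical check: true when a POSIX file mode grants no permissions
// to group or others (for example 600 or 400). File-type bits that
// fs.statSync includes in `mode` are ignored by the mask.
function isOwnerOnly(mode: number): boolean {
  return (mode & 0o077) === 0;
}
```

A startup routine could refuse to load the key, or at least log a warning, when `isOwnerOnly` returns `false`.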
---
## 4. Environment Variables Configuration
### 4.1 Required Environment Variables
Add the following environment variables to your production `.env` file:
```env
# ============================================
# Google Cloud Storage (GCP) Configuration
# ============================================
# GCP Project ID - Must match the project_id in your service account key file
GCP_PROJECT_ID=re-platform-workflow-dealer
# GCP Bucket Name - Production bucket name
GCP_BUCKET_NAME=reflow-documents-prod
# GCP Service Account Key File Path
# Can be relative to project root or absolute path
# Example: ./config/gcp-key-prod.json
# Example: /etc/reflow/config/gcp-key-prod.json
GCP_KEY_FILE=./config/gcp-key-prod.json
```
### 4.2 Environment Variable Details
| Variable | Description | Example Value | Required |
|----------|-------------|---------------|----------|
| `GCP_PROJECT_ID` | Your GCP project ID. Must match the `project_id` field in the service account JSON key file. | `re-platform-workflow-dealer` | ✅ Yes |
| `GCP_BUCKET_NAME` | Name of the GCS bucket where files will be stored. Must exist in your GCP project. | `reflow-documents-prod` | ✅ Yes |
| `GCP_KEY_FILE` | Path to the service account JSON key file. Can be relative (from project root) or absolute path. | `./config/gcp-key-prod.json` | ✅ Yes |
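Since all three variables are required, it can help to fail fast at startup when any of them is missing. The following is a sketch only: `loadGcsConfig`, `GcsConfig`, and `GcsEnv` are hypothetical names, and the real application may load its configuration differently.

```typescript
// Hypothetical startup check for the three variables in the table above.
type GcsEnv = Record<string, string | undefined>;

interface GcsConfig {
  projectId: string;
  bucketName: string;
  keyFile: string;
}

const REQUIRED_VARS = ['GCP_PROJECT_ID', 'GCP_BUCKET_NAME', 'GCP_KEY_FILE'] as const;

function loadGcsConfig(env: GcsEnv): GcsConfig {
  // Collect every missing or blank variable so a single error reports them all.
  const missing = REQUIRED_VARS.filter((name) => !(env[name] ?? '').trim());
  if (missing.length > 0) {
    throw new Error(`Missing GCS environment variables: ${missing.join(', ')}`);
  }
  return {
    projectId: env['GCP_PROJECT_ID'] as string,
    bucketName: env['GCP_BUCKET_NAME'] as string,
    keyFile: env['GCP_KEY_FILE'] as string,
  };
}
```

In a Node.js process this would typically be called once as `loadGcsConfig(process.env)` before the storage client is created.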
### 4.3 File Path Configuration
**Relative Path (Recommended for Development):**
```env
GCP_KEY_FILE=./config/gcp-key-prod.json
```
**Absolute Path (Recommended for Production):**
```env
GCP_KEY_FILE=/etc/reflow/config/gcp-key-prod.json
```
### 4.4 Verification
After setting environment variables, verify the configuration:
```bash
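# Load the .env file into the current shell first
set -a; . ./.env; set +a
# Check if variables are set
echo "$GCP_PROJECT_ID"
echo "$GCP_BUCKET_NAME"
echo "$GCP_KEY_FILE"
# Verify key file exists
ls -la "$GCP_KEY_FILE"
# Verify key file permissions (should be 600; `stat -c` is the GNU/Linux form)
stat -c "%a %n" "$GCP_KEY_FILE"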
```
---
## 5. Folder Structure in GCS
### 5.1 Production Bucket Structure
```
reflow-documents-prod/
├── requests/                      # All workflow-related files
│   ├── REQ-2025-12-0001/          # Request-specific folder
│   │   ├── documents/             # Initial request documents
│   │   │   ├── 1701234567890-abc123-proposal.pdf
│   │   │   ├── 1701234567891-def456-specification.docx
│   │   │   └── 1701234567892-ghi789-budget.xlsx
│   │   │
│   │   ├── attachments/           # Work note attachments
│   │   │   ├── 1701234567893-jkl012-approval_note.pdf
│   │   │   ├── 1701234567894-mno345-signature.png
│   │   │   └── 1701234567895-pqr678-supporting_doc.pdf
│   │   │
│   │   ├── invoices/              # Generated invoice files
│   │   │   └── 1701234567896-stu901-invoice_REQ-2025-12-0001.pdf
│   │   │
│   │   └── credit-notes/          # Generated credit note files
│   │       └── 1701234567897-vwx234-credit_note_REQ-2025-12-0001.pdf
│   │
│   ├── REQ-2025-12-0002/
│   │   ├── documents/
│   │   ├── attachments/
│   │   ├── invoices/
│   │   └── credit-notes/
│   │
│   └── REQ-2025-12-0003/
│       └── ...
└── temp/                          # Temporary uploads (auto-deleted after 24h)
    └── (temporary files before processing)
```
### 5.2 File Path Patterns
| File Type | Path Pattern | Example |
|-----------|--------------|---------|
| **Documents** | `requests/{requestNumber}/documents/{timestamp}-{hash}-{filename}` | `requests/REQ-2025-12-0001/documents/1701234567890-abc123-proposal.pdf` |
| **Attachments** | `requests/{requestNumber}/attachments/{timestamp}-{hash}-{filename}` | `requests/REQ-2025-12-0001/attachments/1701234567893-jkl012-approval_note.pdf` |
| **Invoices** | `requests/{requestNumber}/invoices/{timestamp}-{hash}-{filename}` | `requests/REQ-2025-12-0001/invoices/1701234567896-stu901-invoice_REQ-2025-12-0001.pdf` |
| **Credit Notes** | `requests/{requestNumber}/credit-notes/{timestamp}-{hash}-{filename}` | `requests/REQ-2025-12-0001/credit-notes/1701234567897-vwx234-credit_note_REQ-2025-12-0001.pdf` |
### 5.3 File Naming Convention
Files are automatically renamed with the following pattern:
```
{timestamp}-{randomHash}-{sanitizedOriginalName}
```
**Example:**
- Original: `My Proposal Document (Final).pdf`
- Stored: `1701234567890-abc123-My_Proposal_Document__Final_.pdf`
**Benefits:**
- Prevents filename conflicts
- Maintains original filename for reference
- Ensures unique file identifiers
- Safe for URL encoding
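The path patterns and naming convention above can be sketched as two small functions. This is an illustration under assumptions, not the application's actual code: `sanitizeFileName` and `buildObjectPath` are hypothetical names, and the sanitization rule (replace every character outside `A-Z`, `a-z`, `0-9`, `.`, `_`, `-` with `_`) is inferred from the single example shown.

```typescript
type FileCategory = 'documents' | 'attachments' | 'invoices' | 'credit-notes';

// Assumed rule: every character outside [A-Za-z0-9._-] becomes "_".
function sanitizeFileName(original: string): string {
  return original.replace(/[^A-Za-z0-9._-]/g, '_');
}

// Builds requests/{requestNumber}/{category}/{timestamp}-{hash}-{filename}.
function buildObjectPath(
  requestNumber: string,
  category: FileCategory,
  originalName: string,
  timestamp: number,
  randomHash: string,
): string {
  const storedName = `${timestamp}-${randomHash}-${sanitizeFileName(originalName)}`;
  return `requests/${requestNumber}/${category}/${storedName}`;
}
```

Passing the timestamp and hash in as arguments, instead of generating them inside the function, keeps the path construction deterministic and easy to test.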
---
## 6. Security & Access Control
### 6.1 Bucket Security Settings
```bash
# Enforce public access prevention
gcloud storage buckets update gs://reflow-documents-prod \
--public-access-prevention
# Enable uniform bucket-level access
gcloud storage buckets update gs://reflow-documents-prod \
--uniform-bucket-level-access
```
### 6.2 Access Control Strategy
**Production Approach:**
- **Private Bucket**: All files are private by default
- **Signed URLs**: Generate time-limited signed URLs for file access (recommended)
- **Service Account**: Only service account has direct access
- **IAM Policies**: Restrict access to specific service accounts only
### 6.3 Signed URL Configuration (Recommended)
For production, use signed URLs instead of public URLs:
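```typescript
import { Storage } from '@google-cloud/storage';

const storage = new Storage({ projectId: process.env.GCP_PROJECT_ID, keyFilename: process.env.GCP_KEY_FILE });
const file = storage.bucket('reflow-documents-prod').file('requests/REQ-2025-12-0001/documents/1701234567890-abc123-proposal.pdf');
// Example: Generate a V4 signed URL (valid for 1 hour); run inside an async function or ES module
const [url] = await file.getSignedUrl({
  version: 'v4',
  action: 'read',
  expires: Date.now() + 60 * 60 * 1000, // 1 hour
});
```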
### 6.4 Security Checklist
- [ ] Public access prevention enabled
- [ ] Uniform bucket-level access enabled
- [ ] Service account has minimal required permissions
- [ ] JSON key file stored securely (not in Git)
- [ ] Key file permissions set to 600
- [ ] CORS configured for specific domains only
- [ ] Bucket versioning enabled
- [ ] Access logging enabled
- [ ] Signed URLs used for file access (if applicable)
---
## 7. CORS Configuration
### 7.1 Production CORS Policy
Create `cors-config-prod.json`:
```json
[
  {
    "origin": [
      "https://reflow.royalenfield.com",
      "https://www.royalenfield.com"
    ],
    "method": ["GET", "PUT", "POST", "DELETE", "HEAD", "OPTIONS"],
    "responseHeader": [
      "Content-Type",
      "Content-Disposition",
      "Content-Length",
      "Cache-Control",
      "x-goog-meta-*"
    ],
    "maxAgeSeconds": 3600
  }
]
```
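To reason about which browser requests this policy admits, the matching can be modelled as a simple lookup. The sketch below is a deliberately simplified model, not GCS's actual implementation: `CorsRule` and `isAllowedByCors` are hypothetical names, and only origin and method matching are covered (response headers and caching are ignored).

```typescript
interface CorsRule {
  origin: string[];
  method: string[];
}

// Simplified model: a request passes when some rule lists both its
// origin (exact match or "*") and its HTTP method (exact match or "*").
function isAllowedByCors(rules: CorsRule[], origin: string, method: string): boolean {
  const wanted = method.toUpperCase();
  return rules.some(
    (rule) =>
      (rule.origin.indexOf('*') !== -1 || rule.origin.indexOf(origin) !== -1) &&
      (rule.method.indexOf('*') !== -1 || rule.method.indexOf(wanted) !== -1),
  );
}
```

With the production policy, a `GET` from `https://reflow.royalenfield.com` passes this check, while a request from any origin not listed in the file does not.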
### 7.2 Apply CORS Configuration
```bash
gcloud storage buckets update gs://reflow-documents-prod \
--cors-file=cors-config-prod.json
```
### 7.3 Verify CORS
```bash
# Check CORS configuration
gcloud storage buckets describe gs://reflow-documents-prod \
--format="default(cors_config)"
```
---
## 8. Lifecycle Management
### 8.1 Lifecycle Rules Configuration
Create `lifecycle-config-prod.json`. Each rule contains only `action` and `condition`, the two fields the GCS lifecycle rule schema defines; the purpose of each rule is listed in section 8.3. Because versioning is enabled on this bucket, a `Delete` action on a live object leaves a noncurrent version behind, so add a rule for noncurrent versions if those copies should be removed as well.
```json
{
  "lifecycle": {
    "rule": [
      {
        "action": { "type": "Delete" },
        "condition": {
          "age": 1,
          "matchesPrefix": ["temp/"]
        }
      },
      {
        "action": { "type": "SetStorageClass", "storageClass": "NEARLINE" },
        "condition": {
          "age": 90,
          "matchesPrefix": ["requests/"]
        }
      },
      {
        "action": { "type": "SetStorageClass", "storageClass": "COLDLINE" },
        "condition": {
          "age": 365,
          "matchesPrefix": ["requests/"]
        }
      }
    ]
  }
}
```
### 8.2 Apply Lifecycle Rules
```bash
gcloud storage buckets update gs://reflow-documents-prod \
--lifecycle-file=lifecycle-config-prod.json
```
### 8.3 Lifecycle Rule Benefits
| Rule | Purpose | Cost Savings |
|------|---------|--------------|
| Delete temp files | Remove temporary uploads after 24h | Prevents storage bloat |
| Move to Nearline | Archive files older than 90 days | ~50% cost reduction |
| Move to Coldline | Archive files older than 1 year | ~70% cost reduction |
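The combined effect of the three rules on a single object can be sketched as a function of its name and age. This is a simplified model for reasoning only: `expectedObjectState` is a hypothetical helper, ages are whole days since creation, and GCS applies lifecycle actions asynchronously, so the real transition can lag behind the threshold.

```typescript
type ObjectState = 'DELETED' | 'COLDLINE' | 'NEARLINE' | 'STANDARD';

// Simplified model of the temp/ deletion rule and the two requests/
// storage-class transitions. Objects outside both prefixes stay in STANDARD.
function expectedObjectState(objectName: string, ageDays: number): ObjectState {
  if (objectName.indexOf('temp/') === 0) {
    return ageDays >= 1 ? 'DELETED' : 'STANDARD';
  }
  if (objectName.indexOf('requests/') === 0) {
    if (ageDays >= 365) return 'COLDLINE';
    if (ageDays >= 90) return 'NEARLINE';
  }
  return 'STANDARD';
}
```

For example, an invoice under `requests/` that is 120 days old would be expected in Nearline, and the same file at 400 days in Coldline.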
---
## 9. Monitoring & Alerts
### 9.1 Enable Access Logging
```bash
# Create logging bucket (if not exists)
gcloud storage buckets create gs://reflow-logs-prod \
--project=re-platform-workflow-dealer \
--location=asia-south1
# Enable access logging
gcloud storage buckets update gs://reflow-documents-prod \
--log-bucket=gs://reflow-logs-prod \
--log-object-prefix=reflow-storage-logs/
```
### 9.2 Set Up Monitoring Alerts
**Recommended Alerts:**
1. **Storage Quota Alert**
- Trigger: Storage exceeds 80% of quota
- Action: Notify DevOps team
2. **Unusual Access Patterns**
- Trigger: Unusual download patterns detected
- Action: Security team notification
3. **Failed Access Attempts**
- Trigger: Multiple failed authentication attempts
- Action: Immediate security alert
4. **High Upload Volume**
- Trigger: Upload volume exceeds normal threshold
- Action: Performance team notification
### 9.3 Cost Monitoring
Monitor storage costs via:
- GCP Console → Billing → Reports
- Set up budget alerts at 50%, 75%, 90% of monthly budget
- Review storage class usage (Standard vs Nearline vs Coldline)
---
## 10. Verification & Testing
### 10.1 Pre-Deployment Verification
```bash
# 1. Verify bucket exists
gcloud storage buckets describe gs://reflow-documents-prod
# 2. Verify service account has access
gcloud storage ls gs://reflow-documents-prod \
--impersonate-service-account=reflow-storage-prod-sa@re-platform-workflow-dealer.iam.gserviceaccount.com
# 3. Test file upload
echo "test file" > test-upload.txt
gcloud storage cp test-upload.txt gs://reflow-documents-prod/temp/test-upload.txt
# 4. Test file download
gcloud storage cp gs://reflow-documents-prod/temp/test-upload.txt ./test-download.txt
# 5. Test file delete
gcloud storage rm gs://reflow-documents-prod/temp/test-upload.txt
# 6. Clean up
rm test-upload.txt test-download.txt
```
### 10.2 Application-Level Testing
1. **Upload Test:**
- Upload a document via API
- Verify file appears in GCS bucket
- Check database `storage_url` field contains GCS URL
2. **Download Test:**
- Download file via API
- Verify file is accessible
- Check response headers
3. **Delete Test:**
- Delete file via API
- Verify file is removed from GCS
- Check database record is updated
### 10.3 Production Readiness Checklist
- [ ] Bucket created and configured
- [ ] Service account created with correct permissions
- [ ] JSON key file generated and stored securely
- [ ] Environment variables configured in `.env`
- [ ] CORS policy applied
- [ ] Lifecycle rules configured
- [ ] Versioning enabled
- [ ] Access logging enabled
- [ ] Monitoring alerts configured
- [ ] Upload/download/delete operations tested
- [ ] Backup and recovery procedures documented
---
## 11. Troubleshooting
### 11.1 Common Issues
**Issue: Files not uploading to GCS**
- ✅ Check `.env` configuration matches credentials
- ✅ Verify service account has correct permissions
- ✅ Check bucket name exists and is accessible
- ✅ Review application logs for GCS errors
- ✅ Verify key file path is correct
**Issue: Files uploading but not accessible**
- ✅ Verify bucket permissions (private vs public)
- ✅ Check CORS configuration if accessing from browser
- ✅ Ensure `storage_url` is being saved correctly in database
- ✅ Verify signed URL generation (if using private bucket)
**Issue: Permission denied errors**
- ✅ Verify service account has `roles/storage.objectAdmin`
- ✅ Check bucket IAM policies
- ✅ Verify key file is valid and not expired
### 11.2 Log Analysis
Check application logs for GCS-related messages:
```bash
# Search for GCS initialization
grep "GCS.*Initialized" logs/app.log
# Search for GCS errors
grep "GCS.*Error" logs/app.log
# Search for upload failures
grep "GCS.*upload.*failed" logs/app.log
```
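The same three patterns can be applied in code, for example in a small log-scanning script. This is a sketch under assumptions: `classifyGcsLogLine` is a hypothetical helper, and the exact wording of the application's log messages is not documented here, so only the `grep` patterns themselves are taken from this guide.

```typescript
type GcsLogKind = 'initialized' | 'upload-failed' | 'error' | 'other';

// Uses the same case-sensitive patterns as the grep commands above.
// The most specific pattern (upload failure) is checked first.
function classifyGcsLogLine(line: string): GcsLogKind {
  if (/GCS.*upload.*failed/.test(line)) return 'upload-failed';
  if (/GCS.*Error/.test(line)) return 'error';
  if (/GCS.*Initialized/.test(line)) return 'initialized';
  return 'other';
}
```

Counting lines per category over `logs/app.log` gives a quick picture of whether failures are isolated or recurring.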
---
## 12. Production Deployment Steps
### 12.1 Deployment Checklist
1. **Pre-Deployment:**
- [ ] Create production bucket
- [ ] Create production service account
- [ ] Generate and secure key file
- [ ] Configure environment variables
- [ ] Test upload/download operations
2. **Deployment:**
- [ ] Deploy application with new environment variables
- [ ] Verify GCS initialization in logs
- [ ] Test file upload functionality
- [ ] Monitor for errors
3. **Post-Deployment:**
- [ ] Verify files are being stored in GCS
- [ ] Check database `storage_url` fields
- [ ] Monitor storage costs
- [ ] Review access logs
---
## 13. Cost Estimation (Production)
| Item | Monthly Estimate | Notes |
|------|------------------|-------|
| **Storage (500GB)** | ~$10.00 | Standard storage class |
| **Operations (100K)** | ~$0.50 | Upload/download operations |
| **Network Egress** | Variable | Depends on download volume |
| **Nearline Storage** | ~$5.00 | Files older than 90 days |
| **Coldline Storage** | ~$2.00 | Files older than 1 year |
**Total Estimated Monthly Cost:** ~$17.50 (excluding network egress)
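The total is simply the sum of the fixed line items in the table. The short sketch below reproduces that arithmetic so the estimate can be re-run with different figures; the key names and the `totalMonthlyUsd` helper are hypothetical, and the amounts are the table's own estimates rather than current GCP prices.

```typescript
// Line items from the table above, in USD per month (network egress excluded).
const monthlyEstimateUsd: Record<string, number> = {
  standardStorage: 10.0,
  operations: 0.5,
  nearlineStorage: 5.0,
  coldlineStorage: 2.0,
};

function totalMonthlyUsd(items: Record<string, number>): number {
  let total = 0;
  for (const key of Object.keys(items)) {
    total += items[key];
  }
  return total;
}
```

Here `totalMonthlyUsd(monthlyEstimateUsd)` evaluates to `17.5`, matching the stated total.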
---
## 14. Support & Contacts
| Role | Responsibility | Contact |
|------|----------------|---------|
| **DevOps Team** | GCP infrastructure setup | [DevOps Email] |
| **Application Team** | Application configuration | [App Team Email] |
| **Security Team** | Access control and permissions | [Security Email] |
---
## 15. Quick Reference
### 15.1 Essential Commands
```bash
# Create bucket
gcloud storage buckets create gs://reflow-documents-prod \
--project=re-platform-workflow-dealer \
--location=asia-south1 \
--uniform-bucket-level-access \
--public-access-prevention
# Create service account
gcloud iam service-accounts create reflow-storage-prod-sa \
--display-name="RE Workflow Production Storage" \
--project=re-platform-workflow-dealer
# Generate key
gcloud iam service-accounts keys create ./config/gcp-key-prod.json \
--iam-account=reflow-storage-prod-sa@re-platform-workflow-dealer.iam.gserviceaccount.com
# Set CORS
gcloud storage buckets update gs://reflow-documents-prod \
--cors-file=cors-config-prod.json
# Enable versioning
gcloud storage buckets update gs://reflow-documents-prod \
--versioning
```
### 15.2 Environment Variables Template
```env
# Production GCP Configuration
GCP_PROJECT_ID=re-platform-workflow-dealer
GCP_BUCKET_NAME=reflow-documents-prod
GCP_KEY_FILE=./config/gcp-key-prod.json
```
---
## Appendix: File Structure Reference
### Database Storage Fields
The application stores file information in the database:
| Table | Field | Description |
|-------|-------|-------------|
| `documents` | `file_path` | GCS path: `requests/{requestNumber}/documents/{filename}` |
| `documents` | `storage_url` | Full GCS URL: `https://storage.googleapis.com/bucket/path` |
| `work_note_attachments` | `file_path` | GCS path: `requests/{requestNumber}/attachments/{filename}` |
| `work_note_attachments` | `storage_url` | Full GCS URL |
| `claim_invoices` | `invoice_file_path` | GCS path: `requests/{requestNumber}/invoices/{filename}` |
| `claim_credit_notes` | `credit_note_file_path` | GCS path: `requests/{requestNumber}/credit-notes/{filename}` |
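The relationship between `file_path` and `storage_url` can be sketched as follows. This is an assumption-laden illustration: `toStorageUrl` is a hypothetical helper, and percent-encoding each path segment is a choice made here for safety, not something this guide states the application does.

```typescript
// Builds a `storage_url` value from a bucket name and a `file_path`.
// Each path segment is percent-encoded; the "/" separators are kept.
function toStorageUrl(bucketName: string, filePath: string): string {
  const encodedPath = filePath.split('/').map((segment) => encodeURIComponent(segment)).join('/');
  return `https://storage.googleapis.com/${bucketName}/${encodedPath}`;
}
```

Because the bucket is private, a URL of this form identifies the object but does not grant access to it; downloads still go through the API or a signed URL.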
---
**Document Version:** 1.0
**Last Updated:** December 2024
**Maintained By:** RE Workflow Development Team