GCP Cloud Storage - Production Setup Guide

Overview

This guide provides step-by-step instructions for setting up Google Cloud Storage (GCS) for the Royal Enfield Workflow System in the production environment. It covers production deployment requirements, folder structure, and environment configuration.


Table of Contents

  1. Production Requirements
  2. GCP Bucket Configuration
  3. Service Account Setup
  4. Environment Variables Configuration
  5. Folder Structure in GCS
  6. Security & Access Control
  7. CORS Configuration
  8. Lifecycle Management
  9. Monitoring & Alerts
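  10. Verification & Testing
  11. Troubleshooting
  12. Production Deployment Steps
  13. Cost Estimation (Production)
  14. Support & Contacts
  15. Quick Reference
  Appendix: File Structure Reference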

1. Production Requirements

1.1 Application Details

| Item | Production Value |
|---|---|
| Application | Royal Enfield Workflow System |
| Environment | Production |
| Domain | https://reflow.royalenfield.com |
| Purpose | Store workflow documents, attachments, invoices, and credit notes |
| Storage Type | Google Cloud Storage (GCS) |
| Region | asia-south1 (Mumbai) |

1.2 Storage Requirements

The application stores:

  • Workflow Documents: Initial documents uploaded during request creation
  • Work Note Attachments: Files attached during approval workflow
  • Invoice Files: Generated e-invoice PDFs
  • Credit Note Files: Generated credit note PDFs
  • Dealer Claim Documents: Proposal documents, completion documents

2. GCP Bucket Configuration

2.1 Production Bucket Settings

| Setting | Production Value |
|---|---|
| Bucket Name | reflow-documents-prod |
| Location Type | Region |
| Region | asia-south1 (Mumbai) |
| Storage Class | Standard (for active files) |
| Access Control | Uniform bucket-level access |
| Public Access Prevention | Enforced (Block all public access) |
| Versioning | Enabled (for recovery) |
| Lifecycle Rules | Configured (see section 8) |

2.2 Create Production Bucket

# Create production bucket
gcloud storage buckets create gs://reflow-documents-prod \
  --project=re-platform-workflow-dealer \
  --location=asia-south1 \
  --uniform-bucket-level-access \
  --public-access-prevention

# Enable versioning
gcloud storage buckets update gs://reflow-documents-prod \
  --versioning

# Verify bucket creation
gcloud storage buckets describe gs://reflow-documents-prod

2.3 Bucket Naming Convention

| Environment | Bucket Name | Purpose |
|---|---|---|
| Development | reflow-documents-dev | Development testing |
| UAT | reflow-documents-uat | User acceptance testing |
| Production | reflow-documents-prod | Live production data |

3. Service Account Setup

3.1 Create Production Service Account

# Create service account for production
gcloud iam service-accounts create reflow-storage-prod-sa \
  --display-name="RE Workflow Production Storage Service Account" \
  --description="Service account for production file storage operations" \
  --project=re-platform-workflow-dealer

3.2 Assign Required Roles

The service account needs the following IAM roles:

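| Role | Purpose | Required For |
|---|---|---|
| roles/storage.objectAdmin | Full control over objects | Upload, delete, update files |
| roles/storage.objectViewer | Read objects | Download and preview files |
| roles/storage.legacyBucketReader | Read bucket metadata | List files and check bucket status |

# Grant Storage Object Admin role
gcloud projects add-iam-policy-binding re-platform-workflow-dealer \
  --member="serviceAccount:reflow-storage-prod-sa@re-platform-workflow-dealer.iam.gserviceaccount.com" \
  --role="roles/storage.objectAdmin"

# Grant Storage Object Viewer role (for read operations; its permissions are already included in objectAdmin)
gcloud projects add-iam-policy-binding re-platform-workflow-dealer \
  --member="serviceAccount:reflow-storage-prod-sa@re-platform-workflow-dealer.iam.gserviceaccount.com" \
  --role="roles/storage.objectViewer"

# Grant Storage Legacy Bucket Reader role (legacy bucket roles can only be granted on the bucket itself)
gcloud storage buckets add-iam-policy-binding gs://reflow-documents-prod \
  --member="serviceAccount:reflow-storage-prod-sa@re-platform-workflow-dealer.iam.gserviceaccount.com" \
  --role="roles/storage.legacyBucketReader"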

3.3 Generate Service Account Key

# Generate JSON key file for production
gcloud iam service-accounts keys create ./config/gcp-key-prod.json \
  --iam-account=reflow-storage-prod-sa@re-platform-workflow-dealer.iam.gserviceaccount.com \
  --project=re-platform-workflow-dealer

⚠️ Security Warning:

  • Store the key file securely (not in Git)
  • Use secure file transfer methods
  • Rotate keys periodically (every 90 days recommended)
  • Restrict file permissions: chmod 600 ./config/gcp-key-prod.json
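
The 90-day rotation recommendation can be turned into a simple scheduled check. The sketch below assumes the key's creation time is known, for example from the CREATED_AT column of `gcloud iam service-accounts keys list`; the helper name `isRotationDue` is hypothetical and not part of the application.

```javascript
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Hypothetical helper: returns true when a key created at `createdAt`
// has reached the maximum allowed age (90 days by default).
function isRotationDue(createdAt, now = new Date(), maxAgeDays = 90) {
  const ageDays = (now.getTime() - new Date(createdAt).getTime()) / MS_PER_DAY;
  return ageDays >= maxAgeDays;
}

// Example: a key created on 1 January 2025, checked on 1 April 2025 (90 days later).
console.log(isRotationDue('2025-01-01T00:00:00Z', new Date('2025-04-01T00:00:00Z')));
```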

4. Environment Variables Configuration

4.1 Required Environment Variables

Add the following environment variables to your production .env file:

# ============================================
# Google Cloud Storage (GCP) Configuration
# ============================================
# GCP Project ID - Must match the project_id in your service account key file
GCP_PROJECT_ID=re-platform-workflow-dealer

# GCP Bucket Name - Production bucket name
GCP_BUCKET_NAME=reflow-documents-prod

# GCP Service Account Key File Path
# Can be relative to project root or absolute path
# Example: ./config/gcp-key-prod.json
# Example: /etc/reflow/config/gcp-key-prod.json
GCP_KEY_FILE=./config/gcp-key-prod.json

4.2 Environment Variable Details

| Variable | Description | Example Value | Required |
|---|---|---|---|
| GCP_PROJECT_ID | Your GCP project ID. Must match the project_id field in the service account JSON key file. | re-platform-workflow-dealer | Yes |
| GCP_BUCKET_NAME | Name of the GCS bucket where files will be stored. Must exist in your GCP project. | reflow-documents-prod | Yes |
| GCP_KEY_FILE | Path to the service account JSON key file. Can be relative (from project root) or absolute path. | ./config/gcp-key-prod.json | Yes |
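
Since all three variables are required, a startup check can fail fast when any of them is missing. The following is a minimal sketch of such a check, not the application's actual loader; the names `REQUIRED_GCS_VARS` and `readGcsEnv` are hypothetical.

```javascript
// Hypothetical helper: validates the three required GCS variables.
const REQUIRED_GCS_VARS = ['GCP_PROJECT_ID', 'GCP_BUCKET_NAME', 'GCP_KEY_FILE'];

function readGcsEnv(env) {
  // Collect every variable that is unset or blank so all problems are reported at once.
  const missing = REQUIRED_GCS_VARS.filter(
    (name) => typeof env[name] !== 'string' || env[name].trim() === ''
  );
  if (missing.length > 0) {
    throw new Error(`Missing required GCS variables: ${missing.join(', ')}`);
  }
  return {
    projectId: env.GCP_PROJECT_ID.trim(),
    bucketName: env.GCP_BUCKET_NAME.trim(),
    keyFile: env.GCP_KEY_FILE.trim(),
  };
}
```

In an application this would typically be called once at startup as `readGcsEnv(process.env)`.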

4.3 File Path Configuration

Relative Path (Recommended for Development):

GCP_KEY_FILE=./config/gcp-key-prod.json

Absolute Path (Recommended for Production):

GCP_KEY_FILE=/etc/reflow/config/gcp-key-prod.json
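
One reason an absolute path is safer in production is that a relative path only works if it is resolved against the project root rather than the process's current working directory. A minimal sketch of that resolution follows; the helper name `resolveKeyFile` and the project root `/srv/reflow` are assumptions for illustration.

```javascript
const path = require('path');

// Hypothetical helper: resolves GCP_KEY_FILE against the project root.
// Absolute paths are returned as-is (normalized); relative paths are anchored
// at projectRoot instead of the process's current working directory.
function resolveKeyFile(keyFile, projectRoot) {
  return path.isAbsolute(keyFile)
    ? path.normalize(keyFile)
    : path.resolve(projectRoot, keyFile);
}

// Example with an assumed project root of /srv/reflow.
console.log(resolveKeyFile('./config/gcp-key-prod.json', '/srv/reflow'));
console.log(resolveKeyFile('/etc/reflow/config/gcp-key-prod.json', '/srv/reflow'));
```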

4.4 Verification

After setting environment variables, verify the configuration:

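# Load the .env file into the current shell (variables in .env are not exported automatically)
set -a; . ./.env; set +a

# Check if variables are set
echo "$GCP_PROJECT_ID"
echo "$GCP_BUCKET_NAME"
echo "$GCP_KEY_FILE"

# Verify key file exists
ls -la "$GCP_KEY_FILE"

# Verify key file permissions (should be 600; GNU stat syntax)
stat -c "%a %n" "$GCP_KEY_FILE"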
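
The same checks can be run from Node.js before the application starts, including the requirement that GCP_PROJECT_ID matches the project_id in the key file. This is a sketch under the assumption of a Linux host; `checkKeyFile` is a hypothetical helper, not an existing function in the application.

```javascript
const fs = require('fs');

// Hypothetical pre-flight check: returns a list of problems (empty when all is well).
function checkKeyFile(keyPath, expectedProjectId) {
  if (!fs.existsSync(keyPath)) {
    return [`Key file not found: ${keyPath}`];
  }
  const problems = [];
  // Compare only the permission bits (owner/group/other) against 600.
  const mode = fs.statSync(keyPath).mode & 0o777;
  if (mode !== 0o600) {
    problems.push(`Key file mode is ${mode.toString(8)}, expected 600`);
  }
  let key;
  try {
    key = JSON.parse(fs.readFileSync(keyPath, 'utf8'));
  } catch (err) {
    problems.push(`Key file is not valid JSON: ${err.message}`);
    return problems;
  }
  if (key.project_id !== expectedProjectId) {
    problems.push(
      `project_id "${key.project_id}" does not match GCP_PROJECT_ID "${expectedProjectId}"`
    );
  }
  return problems;
}
```

A typical call would be `checkKeyFile(process.env.GCP_KEY_FILE, process.env.GCP_PROJECT_ID)`, aborting startup if the returned list is not empty.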

5. Folder Structure in GCS

5.1 Production Bucket Structure

reflow-documents-prod/
│
├── requests/                          # All workflow-related files
│   ├── REQ-2025-12-0001/             # Request-specific folder
│   │   ├── documents/                 # Initial request documents
│   │   │   ├── 1701234567890-abc123-proposal.pdf
│   │   │   ├── 1701234567891-def456-specification.docx
│   │   │   └── 1701234567892-ghi789-budget.xlsx
│   │   │
│   │   ├── attachments/                # Work note attachments
│   │   │   ├── 1701234567893-jkl012-approval_note.pdf
│   │   │   ├── 1701234567894-mno345-signature.png
│   │   │   └── 1701234567895-pqr678-supporting_doc.pdf
│   │   │
│   │   ├── invoices/                   # Generated invoice files
│   │   │   └── 1701234567896-stu901-invoice_REQ-2025-12-0001.pdf
│   │   │
│   │   └── credit-notes/               # Generated credit note files
│   │       └── 1701234567897-vwx234-credit_note_REQ-2025-12-0001.pdf
│   │
│   ├── REQ-2025-12-0002/
│   │   ├── documents/
│   │   ├── attachments/
│   │   ├── invoices/
│   │   └── credit-notes/
│   │
│   └── REQ-2025-12-0003/
│       └── ...
│
└── temp/                               # Temporary uploads (auto-deleted after 24h)
    └── (temporary files before processing)

5.2 File Path Patterns

| File Type | Path Pattern | Example |
|---|---|---|
| Documents | requests/{requestNumber}/documents/{timestamp}-{hash}-{filename} | requests/REQ-2025-12-0001/documents/1701234567890-abc123-proposal.pdf |
| Attachments | requests/{requestNumber}/attachments/{timestamp}-{hash}-{filename} | requests/REQ-2025-12-0001/attachments/1701234567893-jkl012-approval_note.pdf |
| Invoices | requests/{requestNumber}/invoices/{timestamp}-{hash}-{filename} | requests/REQ-2025-12-0001/invoices/1701234567896-stu901-invoice_REQ-2025-12-0001.pdf |
| Credit Notes | requests/{requestNumber}/credit-notes/{timestamp}-{hash}-{filename} | requests/REQ-2025-12-0001/credit-notes/1701234567897-vwx234-credit_note_REQ-2025-12-0001.pdf |
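
The four path patterns differ only in the folder name, so they can be produced by one function. The sketch below illustrates this; `CATEGORIES` and `buildObjectPath` are hypothetical names, and the separator check is an added safeguard rather than documented application behavior.

```javascript
// Folder names taken from the bucket structure in section 5.1.
const CATEGORIES = ['documents', 'attachments', 'invoices', 'credit-notes'];

// Hypothetical helper: builds the object path for an already-renamed file.
function buildObjectPath(requestNumber, category, storedFileName) {
  if (!CATEGORIES.includes(category)) {
    throw new Error(`Unknown category: ${category}`);
  }
  // Reject separators so a request number or file name cannot escape its folder.
  if (/[\/\\]/.test(requestNumber) || /[\/\\]/.test(storedFileName)) {
    throw new Error('requestNumber and storedFileName must not contain path separators');
  }
  return `requests/${requestNumber}/${category}/${storedFileName}`;
}

console.log(
  buildObjectPath('REQ-2025-12-0001', 'documents', '1701234567890-abc123-proposal.pdf')
);
```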

5.3 File Naming Convention

Files are automatically renamed with the following pattern:

{timestamp}-{randomHash}-{sanitizedOriginalName}

Example:

  • Original: My Proposal Document (Final).pdf
  • Stored: 1701234567890-abc123-My_Proposal_Document__Final_.pdf

Benefits:

  • Prevents filename conflicts
  • Maintains original filename for reference
  • Ensures unique file identifiers
  • Safe for URL encoding
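
The renaming rule can be sketched as follows. The sanitization rule here (every character other than letters, digits, dot, underscore, and hyphen becomes an underscore) is inferred from the single example above and may differ from the application's actual rule; the 6-character hex hash and the function names are likewise assumptions.

```javascript
const crypto = require('crypto');

// Assumed rule: replace every character outside letters, digits, ".", "_", "-" with "_".
function sanitizeFileName(originalName) {
  return originalName.replace(/[^A-Za-z0-9._-]/g, '_');
}

// timestamp: milliseconds since the Unix epoch.
// randomHash: 6 hex characters by default (the length is an assumption).
function buildStoredName(
  originalName,
  timestamp = Date.now(),
  randomHash = crypto.randomBytes(3).toString('hex')
) {
  return `${timestamp}-${randomHash}-${sanitizeFileName(originalName)}`;
}

// Reproduces the example above when the timestamp and hash are fixed.
console.log(buildStoredName('My Proposal Document (Final).pdf', 1701234567890, 'abc123'));
// prints "1701234567890-abc123-My_Proposal_Document__Final_.pdf"
```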

6. Security & Access Control

6.1 Bucket Security Settings

# Enforce public access prevention
gcloud storage buckets update gs://reflow-documents-prod \
  --public-access-prevention

# Enable uniform bucket-level access
gcloud storage buckets update gs://reflow-documents-prod \
  --uniform-bucket-level-access

6.2 Access Control Strategy

Production Approach:

  • Private Bucket: All files are private by default
  • Signed URLs: Generate time-limited signed URLs for file access (recommended)
  • Service Account: Only service account has direct access
  • IAM Policies: Restrict access to specific service accounts only

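6.3 Signed URLs

For production, use signed URLs instead of public URLs:

// Example: generate a signed URL (valid for 1 hour) for an object path
// (getSignedReadUrl is an illustrative helper name)
const { Storage } = require('@google-cloud/storage');
const bucket = new Storage({ keyFilename: process.env.GCP_KEY_FILE }).bucket(process.env.GCP_BUCKET_NAME);

async function getSignedReadUrl(filePath) {
  const [url] = await bucket.file(filePath).getSignedUrl({
    action: 'read',
    expires: Date.now() + 60 * 60 * 1000, // 1 hour
  });
  return url;
}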

6.4 Security Checklist

  • Public access prevention enabled
  • Uniform bucket-level access enabled
  • Service account has minimal required permissions
  • JSON key file stored securely (not in Git)
  • Key file permissions set to 600
  • CORS configured for specific domains only
  • Bucket versioning enabled
  • Access logging enabled
  • Signed URLs used for file access (if applicable)

7. CORS Configuration

7.1 Production CORS Policy

Create cors-config-prod.json:

[
  {
    "origin": [
      "https://reflow.royalenfield.com",
      "https://www.royalenfield.com"
    ],
    "method": ["GET", "PUT", "POST", "DELETE", "HEAD", "OPTIONS"],
    "responseHeader": [
      "Content-Type",
      "Content-Disposition",
      "Content-Length",
      "Cache-Control",
      "x-goog-meta-*"
    ],
    "maxAgeSeconds": 3600
  }
]
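
To reason about which browser requests this policy permits, the matching can be modeled as "origin is listed and method is listed". The sketch below is a simplified model for documentation purposes only; the actual evaluation is performed by GCS, and `isCorsAllowed` is a hypothetical name.

```javascript
// Origins and methods copied from cors-config-prod.json (response headers omitted).
const corsConfigProd = [
  {
    origin: ['https://reflow.royalenfield.com', 'https://www.royalenfield.com'],
    method: ['GET', 'PUT', 'POST', 'DELETE', 'HEAD', 'OPTIONS'],
  },
];

// Simplified model: a request is allowed when some entry lists both its origin
// (or the "*" wildcard) and its HTTP method.
function isCorsAllowed(corsConfig, origin, method) {
  return corsConfig.some(
    (entry) =>
      (entry.origin.includes('*') || entry.origin.includes(origin)) &&
      entry.method.includes(method.toUpperCase())
  );
}

console.log(isCorsAllowed(corsConfigProd, 'https://reflow.royalenfield.com', 'GET'));
console.log(isCorsAllowed(corsConfigProd, 'https://unknown.example.com', 'GET'));
```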

7.2 Apply CORS Configuration

gcloud storage buckets update gs://reflow-documents-prod \
  --cors-file=cors-config-prod.json

7.3 Verify CORS

# Check CORS configuration
gcloud storage buckets describe gs://reflow-documents-prod \
  --format="default(cors_config)"

8. Lifecycle Management

8.1 Lifecycle Rules Configuration

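Create lifecycle-config-prod.json. The rules, in order: delete files under temp/ once they are 1 day old, move files under requests/ to Nearline after 90 days, and move them to Coldline after 365 days. The lifecycle schema has no description field, so the file contains only actions and conditions:

{
  "lifecycle": {
    "rule": [
      {
        "action": { "type": "Delete" },
        "condition": {
          "age": 1,
          "matchesPrefix": ["temp/"]
        }
      },
      {
        "action": { "type": "SetStorageClass", "storageClass": "NEARLINE" },
        "condition": {
          "age": 90,
          "matchesPrefix": ["requests/"]
        }
      },
      {
        "action": { "type": "SetStorageClass", "storageClass": "COLDLINE" },
        "condition": {
          "age": 365,
          "matchesPrefix": ["requests/"]
        }
      }
    ]
  }
}

Note: because versioning is enabled on this bucket (section 2.1), the Delete action turns a live object into a noncurrent version rather than removing it permanently. Add a rule for noncurrent versions if temp/ files must be purged completely.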
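
When an object satisfies more than one rule, GCS applies Delete in preference to SetStorageClass, and among several SetStorageClass actions it picks the class with the lowest at-rest price. The sketch below models that selection for the three rules in this file; it is an illustration of the precedence logic only, and `lifecycleActionFor` is a hypothetical name.

```javascript
// Rules mirror lifecycle-config-prod.json (conditions and actions only).
const rules = [
  { action: { type: 'Delete' }, condition: { age: 1, matchesPrefix: ['temp/'] } },
  { action: { type: 'SetStorageClass', storageClass: 'NEARLINE' }, condition: { age: 90, matchesPrefix: ['requests/'] } },
  { action: { type: 'SetStorageClass', storageClass: 'COLDLINE' }, condition: { age: 365, matchesPrefix: ['requests/'] } },
];

// Lower index = cheaper at rest; used to choose between several SetStorageClass matches.
const COLD_TO_WARM = ['ARCHIVE', 'COLDLINE', 'NEARLINE', 'STANDARD'];

function lifecycleActionFor(objectName, ageDays, lifecycleRules) {
  // A rule matches when the object is old enough and its name starts with a listed prefix.
  const matching = lifecycleRules.filter(
    ({ condition }) =>
      ageDays >= condition.age &&
      condition.matchesPrefix.some((prefix) => objectName.startsWith(prefix))
  );
  // Delete takes precedence over any SetStorageClass action.
  const del = matching.find((rule) => rule.action.type === 'Delete');
  if (del) return del.action;
  // Otherwise the coldest (cheapest at-rest) target class wins.
  const byColdness = matching
    .map((rule) => rule.action)
    .sort((a, b) => COLD_TO_WARM.indexOf(a.storageClass) - COLD_TO_WARM.indexOf(b.storageClass));
  return byColdness[0] || null;
}

console.log(lifecycleActionFor('requests/REQ-2025-12-0001/documents/a.pdf', 400, rules));
```

For a 400-day-old file under requests/, both storage-class rules match and the Coldline action is selected.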

8.2 Apply Lifecycle Rules

gcloud storage buckets update gs://reflow-documents-prod \
  --lifecycle-file=lifecycle-config-prod.json

8.3 Lifecycle Rule Benefits

| Rule | Purpose | Cost Savings |
|---|---|---|
| Delete temp files | Remove temporary uploads after 24h | Prevents storage bloat |
| Move to Nearline | Archive files older than 90 days | ~50% cost reduction |
| Move to Coldline | Archive files older than 1 year | ~70% cost reduction |

9. Monitoring & Alerts

9.1 Enable Access Logging

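# Create logging bucket (if not exists)
gcloud storage buckets create gs://reflow-logs-prod \
  --project=re-platform-workflow-dealer \
  --location=asia-south1

# Allow Cloud Storage to write usage logs into the logging bucket
gcloud storage buckets add-iam-policy-binding gs://reflow-logs-prod \
  --member=group:cloud-storage-analytics@google.com \
  --role=roles/storage.objectCreator

# Enable access logging
gcloud storage buckets update gs://reflow-documents-prod \
  --log-bucket=gs://reflow-logs-prod \
  --log-object-prefix=reflow-storage-logs/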

9.2 Set Up Monitoring Alerts

Recommended Alerts:

  1. Storage Quota Alert

    • Trigger: Storage exceeds 80% of quota
    • Action: Notify DevOps team
  2. Unusual Access Patterns

    • Trigger: Unusual download patterns detected
    • Action: Security team notification
  3. Failed Access Attempts

    • Trigger: Multiple failed authentication attempts
    • Action: Immediate security alert
  4. High Upload Volume

    • Trigger: Upload volume exceeds normal threshold
    • Action: Performance team notification

9.3 Cost Monitoring

Monitor storage costs via:

  • GCP Console → Billing → Reports
  • Set up budget alerts at 50%, 75%, 90% of monthly budget
  • Review storage class usage (Standard vs Nearline vs Coldline)

10. Verification & Testing

10.1 Pre-Deployment Verification

# 1. Verify bucket exists
gcloud storage buckets describe gs://reflow-documents-prod

# 2. Verify service account has access
gcloud storage ls gs://reflow-documents-prod \
  --impersonate-service-account=reflow-storage-prod-sa@re-platform-workflow-dealer.iam.gserviceaccount.com

# 3. Test file upload
echo "test file" > test-upload.txt
gcloud storage cp test-upload.txt gs://reflow-documents-prod/temp/test-upload.txt

# 4. Test file download
gcloud storage cp gs://reflow-documents-prod/temp/test-upload.txt ./test-download.txt

# 5. Test file delete
gcloud storage rm gs://reflow-documents-prod/temp/test-upload.txt

# 6. Clean up
rm test-upload.txt test-download.txt

10.2 Application-Level Testing

  1. Upload Test:

    • Upload a document via API
    • Verify file appears in GCS bucket
    • Check database storage_url field contains GCS URL
  2. Download Test:

    • Download file via API
    • Verify file is accessible
    • Check response headers
  3. Delete Test:

    • Delete file via API
    • Verify file is removed from GCS
    • Check database record is updated

10.3 Production Readiness Checklist

  • Bucket created and configured
  • Service account created with correct permissions
  • JSON key file generated and stored securely
  • Environment variables configured in .env
  • CORS policy applied
  • Lifecycle rules configured
  • Versioning enabled
  • Access logging enabled
  • Monitoring alerts configured
  • Upload/download/delete operations tested
  • Backup and recovery procedures documented

11. Troubleshooting

11.1 Common Issues

Issue: Files not uploading to GCS

  • Check .env configuration matches credentials
  • Verify service account has correct permissions
  • Check bucket name exists and is accessible
  • Review application logs for GCS errors
  • Verify key file path is correct

Issue: Files uploading but not accessible

  • Verify bucket permissions (private vs public)
  • Check CORS configuration if accessing from browser
  • Ensure storage_url is being saved correctly in database
  • Verify signed URL generation (if using private bucket)

Issue: Permission denied errors

  • Verify service account has roles/storage.objectAdmin
  • Check bucket IAM policies
  • Verify key file is valid and not expired

11.2 Log Analysis

Check application logs for GCS-related messages:

# Search for GCS initialization
grep "GCS.*Initialized" logs/app.log

# Search for GCS errors
grep "GCS.*Error" logs/app.log

# Search for upload failures
grep "GCS.*upload.*failed" logs/app.log

12. Production Deployment Steps

12.1 Deployment Checklist

  1. Pre-Deployment:

    • Create production bucket
    • Create production service account
    • Generate and secure key file
    • Configure environment variables
    • Test upload/download operations
  2. Deployment:

    • Deploy application with new environment variables
    • Verify GCS initialization in logs
    • Test file upload functionality
    • Monitor for errors
  3. Post-Deployment:

    • Verify files are being stored in GCS
    • Check database storage_url fields
    • Monitor storage costs
    • Review access logs

13. Cost Estimation (Production)

| Item | Monthly Estimate (USD) | Notes |
|---|---|---|
| Storage (500GB) | ~$10.00 | Standard storage class |
| Operations (100K) | ~$0.50 | Upload/download operations |
| Network Egress | Variable | Depends on download volume |
| Nearline Storage | ~$5.00 | Files older than 90 days |
| Coldline Storage | ~$2.00 | Files older than 1 year |

Total Estimated Monthly Cost: ~$17.50 (excluding network egress)
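
The total is simply the sum of the fixed line items, which makes it easy to recompute when an estimate changes. A small sketch follows, with the figures copied from the table above and illustrative key names.

```javascript
// Line items from the table above, in USD per month (network egress excluded).
const monthlyEstimates = {
  standardStorage500GB: 10.0,
  operations100K: 0.5,
  nearlineStorage: 5.0,
  coldlineStorage: 2.0,
};

const totalMonthly = Object.values(monthlyEstimates).reduce((sum, value) => sum + value, 0);

console.log(`Total: ~$${totalMonthly.toFixed(2)}`); // prints "Total: ~$17.50"
```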


14. Support & Contacts

| Role | Responsibility | Contact |
|---|---|---|
| DevOps Team | GCP infrastructure setup | [DevOps Email] |
| Application Team | Application configuration | [App Team Email] |
| Security Team | Access control and permissions | [Security Email] |

15. Quick Reference

15.1 Essential Commands

# Create bucket
gcloud storage buckets create gs://reflow-documents-prod \
  --project=re-platform-workflow-dealer \
  --location=asia-south1 \
  --uniform-bucket-level-access \
  --public-access-prevention

# Create service account
gcloud iam service-accounts create reflow-storage-prod-sa \
  --display-name="RE Workflow Production Storage" \
  --project=re-platform-workflow-dealer

# Generate key
gcloud iam service-accounts keys create ./config/gcp-key-prod.json \
  --iam-account=reflow-storage-prod-sa@re-platform-workflow-dealer.iam.gserviceaccount.com

# Set CORS
gcloud storage buckets update gs://reflow-documents-prod \
  --cors-file=cors-config-prod.json

# Enable versioning
gcloud storage buckets update gs://reflow-documents-prod \
  --versioning

15.2 Environment Variables Template

# Production GCP Configuration
GCP_PROJECT_ID=re-platform-workflow-dealer
GCP_BUCKET_NAME=reflow-documents-prod
GCP_KEY_FILE=./config/gcp-key-prod.json

Appendix: File Structure Reference

Database Storage Fields

The application stores file information in the database:

| Table | Field | Description |
|---|---|---|
| documents | file_path | GCS path: requests/{requestNumber}/documents/{filename} |
| documents | storage_url | Full GCS URL: https://storage.googleapis.com/bucket/path |
| work_note_attachments | file_path | GCS path: requests/{requestNumber}/attachments/{filename} |
| work_note_attachments | storage_url | Full GCS URL |
| claim_invoices | invoice_file_path | GCS path: requests/{requestNumber}/invoices/{filename} |
| claim_credit_notes | credit_note_file_path | GCS path: requests/{requestNumber}/credit-notes/{filename} |
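
The relationship between file_path and storage_url can be sketched as a single function. This assumes storage_url is formed by joining the bucket name and the path under https://storage.googleapis.com, as the table suggests; `toStorageUrl` is a hypothetical helper. With a private bucket, this URL identifies the object but is not directly downloadable without credentials or a signed URL.

```javascript
// Hypothetical helper: derives storage_url from the bucket name and file_path.
// Each path segment is percent-encoded; the "/" separators are kept.
function toStorageUrl(bucketName, filePath) {
  const encodedPath = filePath
    .split('/')
    .map((segment) => encodeURIComponent(segment))
    .join('/');
  return `https://storage.googleapis.com/${bucketName}/${encodedPath}`;
}

console.log(
  toStorageUrl(
    'reflow-documents-prod',
    'requests/REQ-2025-12-0001/documents/1701234567890-abc123-proposal.pdf'
  )
);
```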

Document Version: 1.0
Last Updated: December 2024
Maintained By: RE Workflow Development Team