502 Error Fixes - Deployment Guide

This guide describes the fixes implemented to resolve the 502 errors in the Port Nimara Client Portal and explains how to deploy, verify, and roll them back.

Overview of Changes

1. Health Check System

  • New Files:
    • server/api/health.ts - Health check endpoint
    • server/utils/health-checker.ts - Application readiness management
    • server/utils/service-checks.ts - Service availability checks
    • server/plugins/startup-health.ts - Startup health monitoring
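
As an illustration of how service checks can be aggregated into a single health status, here is a minimal sketch. It assumes each check is an async function that resolves on success and rejects on failure. The names `ServiceCheck`, `runChecks`, and the `healthy`/`degraded`/`unhealthy` statuses are hypothetical and are not taken from the project's files.

```typescript
// Hypothetical types and names; the real server/api/health.ts may differ.
type CheckResult = { name: string; ok: boolean; ms: number; error?: string };
type ServiceCheck = { name: string; required: boolean; run: () => Promise<void> };

// Reject if the wrapped promise does not settle within `ms` milliseconds.
function withTimeout<T>(p: Promise<T>, ms: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => reject(new Error(`timed out after ${ms}ms`)), ms);
    p.then(
      (value) => { clearTimeout(timer); resolve(value); },
      (err) => { clearTimeout(timer); reject(err); },
    );
  });
}

// Run all checks in parallel and derive an overall status:
// a failed required check means "unhealthy", any other failure means "degraded".
async function runChecks(checks: ServiceCheck[], timeoutMs = 2000) {
  const results: CheckResult[] = await Promise.all(
    checks.map(async (c): Promise<CheckResult> => {
      const start = Date.now();
      try {
        await withTimeout(c.run(), timeoutMs);
        return { name: c.name, ok: true, ms: Date.now() - start };
      } catch (err) {
        return { name: c.name, ok: false, ms: Date.now() - start, error: String(err) };
      }
    }),
  );
  const failed = new Set(results.filter((r) => !r.ok).map((r) => r.name));
  const requiredDown = checks.some((c) => c.required && failed.has(c.name));
  const status = requiredDown ? "unhealthy" : failed.size > 0 ? "degraded" : "healthy";
  return { status, results };
}
```

Bounding every check with a timeout keeps a slow dependency from stalling the health endpoint itself, which matters when Nginx or Docker polls it on a fixed interval.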

2. Process Management (PM2)

  • New Files:
    • ecosystem.config.js - PM2 configuration with cluster mode
    • docker-entrypoint.sh - Custom entrypoint with service checks
  • Modified Files:
    • Dockerfile - Added PM2, health checks, and tini init process

3. Nginx Configuration

  • New Files:
    • nginx/upstream.conf - Upstream configuration with health checks
    • nginx/client.portnimara.dev.conf - Enhanced Nginx configuration
    • nginx/error-pages/error-502.html - Custom 502 error page with auto-retry

4. External Service Resilience

  • New Files:
    • server/utils/resilient-http.ts - HTTP client with retry logic and circuit breaker
    • server/utils/documeso.ts - Enhanced Documenso client
  • Modified Files:
    • server/utils/nocodb.ts - Updated to use resilient HTTP client
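
The combination of retry logic and a circuit breaker can be sketched as follows. This is a minimal illustration under assumed defaults (three attempts, exponential backoff, five consecutive failures to open the circuit); `withRetry` and `CircuitBreaker` are hypothetical names and do not reproduce the contents of server/utils/resilient-http.ts.

```typescript
// Hypothetical sketch; not the contents of server/utils/resilient-http.ts.
const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

// Retry a failing async call with exponential backoff.
async function withRetry<T>(fn: () => Promise<T>, attempts = 3, baseDelayMs = 200): Promise<T> {
  let lastError: unknown;
  for (let i = 0; i < attempts; i++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      // Wait baseDelayMs, then 2x, 4x, ... (no wait after the final attempt).
      if (i < attempts - 1) await sleep(baseDelayMs * 2 ** i);
    }
  }
  throw lastError;
}

// Fail fast once a dependency has failed `threshold` times in a row,
// then allow a trial call after `resetMs` has passed.
class CircuitBreaker {
  private failures = 0;
  private openedAt = 0;
  private readonly threshold: number;
  private readonly resetMs: number;
  private readonly now: () => number;

  constructor(threshold = 5, resetMs = 30000, now: () => number = Date.now) {
    this.threshold = threshold;
    this.resetMs = resetMs;
    this.now = now;
  }

  get isOpen(): boolean {
    return this.failures >= this.threshold && this.now() - this.openedAt < this.resetMs;
  }

  async call<T>(fn: () => Promise<T>): Promise<T> {
    if (this.isOpen) throw new Error("circuit open");
    try {
      const value = await fn();
      this.failures = 0; // any success closes the circuit
      return value;
    } catch (err) {
      this.failures += 1;
      if (this.failures >= this.threshold) this.openedAt = this.now();
      throw err;
    }
  }
}
```

One way to compose the two is `breaker.call(() => withRetry(() => someRequest()))`, where `someRequest` stands for any async call. In that arrangement the breaker counts one failure per exhausted retry sequence. Whether the project composes them this way is not stated in this guide.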

5. Memory & Performance Optimization

  • New Files:
    • server/plugins/memory-monitor.ts - Memory usage monitoring
    • server/utils/cleanup-manager.ts - Resource cleanup utilities
    • server/utils/request-queue.ts - Request queue system
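
A request queue of this kind typically caps how many tasks run at once and parks the rest. The sketch below shows one way to do that, assuming a fixed concurrency limit; `RequestQueue` and its members are hypothetical names and do not reproduce server/utils/request-queue.ts.

```typescript
// Hypothetical sketch; not the contents of server/utils/request-queue.ts.
class RequestQueue {
  private active = 0;
  private readonly waiting: Array<() => void> = [];
  private readonly maxConcurrent: number;

  constructor(maxConcurrent: number) {
    this.maxConcurrent = maxConcurrent;
  }

  // Number of tasks waiting for a free slot.
  get pending(): number {
    return this.waiting.length;
  }

  async run<T>(task: () => Promise<T>): Promise<T> {
    if (this.active >= this.maxConcurrent) {
      // Park until a finishing task hands over its slot.
      await new Promise<void>((resolve) => this.waiting.push(resolve));
    } else {
      this.active += 1;
    }
    try {
      return await task();
    } finally {
      const next = this.waiting.shift();
      if (next) next();        // hand the slot to the next waiter (active is unchanged)
      else this.active -= 1;   // nobody is waiting: free the slot
    }
  }
}
```

Handing the slot directly to the next waiter, rather than decrementing and re-incrementing the counter, prevents a newly arriving task from jumping ahead of tasks that are already queued.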

Deployment Steps

Step 1: Build Docker Image

# Build the new Docker image with PM2 support
docker build -t port-nimara-client-portal:resilient .

Step 2: Update Nginx Configuration

  1. Copy the new Nginx configuration files to your server:
# Copy upstream configuration
sudo cp nginx/upstream.conf /etc/nginx/conf.d/

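# Back up the current site configuration, if one exists (the rollback plan restores this file)
sudo cp /etc/nginx/sites-available/client.portnimara.dev /etc/nginx/sites-available/client.portnimara.dev.backup

# Copy site configuration
sudo cp nginx/client.portnimara.dev.conf /etc/nginx/sites-available/client.portnimara.dev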

# Create error pages directory
sudo mkdir -p /etc/nginx/error-pages
sudo cp nginx/error-pages/error-502.html /etc/nginx/error-pages/

# Test the Nginx configuration and reload only if the test passes
sudo nginx -t && sudo nginx -s reload

Step 3: Update Environment Variables

Ensure all required environment variables are set:

# Required for health checks
NUXT_NOCODB_URL=your-nocodb-url
NUXT_NOCODB_TOKEN=your-nocodb-token
NUXT_MINIO_ACCESS_KEY=your-minio-key
NUXT_MINIO_SECRET_KEY=your-minio-secret
NUXT_DOCUMENSO_API_KEY=your-documenso-key
NUXT_DOCUMENSO_BASE_URL=https://signatures.portnimara.dev

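# Optional - raises the V8 heap limit to 8 GB and exposes gc() for garbage collection monitoring
# Note: older Node.js releases reject --expose-gc in NODE_OPTIONS; if yours does, pass it as a node argument instead
NODE_OPTIONS="--max-old-space-size=8192 --expose-gc"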

Step 4: Deploy with Docker Compose

Create or update your docker-compose.yml:

version: '3.8'

services:
  client-portal:
    image: port-nimara-client-portal:resilient
    ports:
      - "3028:3000"
    environment:
      - NODE_ENV=production
      - NUXT_NOCODB_URL=${NUXT_NOCODB_URL}
      - NUXT_NOCODB_TOKEN=${NUXT_NOCODB_TOKEN}
      - NUXT_MINIO_ACCESS_KEY=${NUXT_MINIO_ACCESS_KEY}
      - NUXT_MINIO_SECRET_KEY=${NUXT_MINIO_SECRET_KEY}
      - NUXT_DOCUMENSO_API_KEY=${NUXT_DOCUMENSO_API_KEY}
      - NUXT_DOCUMENSO_BASE_URL=${NUXT_DOCUMENSO_BASE_URL}
    volumes:
      - ./logs:/app/logs
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:3000/api/health"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 60s
    restart: unless-stopped
    networks:
      - port-nimara-network

networks:
  port-nimara-network:
    external: true

Deploy:

docker-compose up -d

Step 5: Verify Deployment

  1. Check health endpoint:
curl https://client.portnimara.dev/api/health
  2. Monitor PM2 processes:
docker exec -it <container-id> pm2 list
docker exec -it <container-id> pm2 monit
  3. Check logs:
# PM2 logs
docker exec -it <container-id> pm2 logs

# Container logs
docker logs -f <container-id>
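
Because the container has a 60-second start period, the health endpoint may not answer immediately after deployment. A small polling helper can automate the first check above. This is a sketch under assumptions: `waitUntilHealthy` is a hypothetical name, and the probe is injected so that the polling logic does not depend on a particular HTTP client.

```typescript
// Hypothetical helper for post-deploy verification.
// Resolves with the number of attempts it took for the probe to report healthy.
async function waitUntilHealthy(
  probe: () => Promise<boolean>,
  maxAttempts = 20,
  intervalMs = 3000,
): Promise<number> {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    // A probe that throws (connection refused, DNS failure) counts as "not ready".
    const ok = await probe().catch(() => false);
    if (ok) return attempt;
    if (attempt < maxAttempts) {
      await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
    }
  }
  throw new Error(`not healthy after ${maxAttempts} attempts`);
}
```

On Node 18 or later, where `fetch` is available globally, the probe could be `async () => (await fetch("https://client.portnimara.dev/api/health")).ok`.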

Monitoring & Maintenance

Health Check Monitoring

The health endpoint (/api/health) reports:

  • Service availability (NocoDB, Directus, MinIO, Documenso)
  • Memory usage
  • Response times
  • Circuit breaker status
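
To show how these fields might be consumed by a monitoring script, the sketch below defines an assumed response shape and extracts a list of problems from it. The `HealthReport` interface, its field names, and the thresholds are all assumptions made for illustration; check the actual /api/health output before relying on any of them.

```typescript
// Hypothetical response shape: field names are assumptions, not the real API.
interface HealthReport {
  services: Record<string, { available: boolean; responseTimeMs: number }>;
  memory: { usedMb: number; limitMb: number };
  circuitBreakers: Record<string, "closed" | "open" | "half-open">;
}

// Turn a report into a list of human-readable problems (empty when all is well).
function findProblems(report: HealthReport, slowMs = 1000, memoryRatio = 0.85): string[] {
  const problems: string[] = [];
  for (const [name, s] of Object.entries(report.services)) {
    if (!s.available) problems.push(`${name}: unavailable`);
    else if (s.responseTimeMs > slowMs) problems.push(`${name}: slow (${s.responseTimeMs}ms)`);
  }
  for (const [name, state] of Object.entries(report.circuitBreakers)) {
    if (state !== "closed") problems.push(`${name}: circuit ${state}`);
  }
  if (report.memory.usedMb / report.memory.limitMb > memoryRatio) {
    problems.push(`memory: ${report.memory.usedMb}MB of ${report.memory.limitMb}MB`);
  }
  return problems;
}
```

An empty result means none of the assumed thresholds were crossed. Any entry points to the service, circuit breaker, or memory figure to investigate first.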

PM2 Commands

# Inside the container
pm2 list              # List all processes
pm2 monit             # Real-time monitoring
pm2 reload all        # Zero-downtime reload
pm2 logs              # View logs
pm2 flush             # Clear logs

Troubleshooting

  1. If 502 errors persist:

    • Check /api/health to identify failing services
    • Review PM2 logs for crash information
    • Verify all environment variables are set
    • Check Nginx error logs: sudo tail -f /var/log/nginx/error.log
  2. High memory usage:

    • Monitor memory via health endpoint
    • PM2 automatically restarts a process once it reaches the 7 GB memory limit
    • Check for memory leaks in logs
  3. Slow responses:

    • Check circuit breaker status in health endpoint
    • Review queue status for backlogs
    • Monitor external service response times

Benefits of These Changes

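  1. Automatic Recovery: PM2 restarts crashed processes, and cluster mode keeps the remaining workers serving requests in the meantime
  2. Graceful Degradation: Circuit breakers stop calls to a failing service so that failures do not cascade
  3. Better Error Handling: Retry logic absorbs temporary failures of external services
  4. Improved Visibility: Health checks and monitoring expose service availability, memory usage, response times, and circuit breaker status
  5. User Experience: The custom 502 page retries automatically instead of leaving users on a bare error
  6. Resource Management: Memory monitoring and cleanup help detect and contain leaks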

Rollback Plan

If issues arise:

  1. Before deploying, keep the previous Docker image tagged so that it remains available for rollback
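  2. Revert Nginx configuration (this assumes the previous configuration was saved as client.portnimara.dev.backup before it was replaced):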
    sudo cp /etc/nginx/sites-available/client.portnimara.dev.backup /etc/nginx/sites-available/client.portnimara.dev
    sudo nginx -s reload
    
  3. Deploy previous Docker image:
    # First point the image: entry in docker-compose.yml back at the previous tag
    docker-compose down
    docker-compose up -d