LetsBe Biz — CI/CD Strategy

Date: February 27, 2026
Team: Claude Opus 4.6 Architecture Team
Document: 08 of 09
Status: Proposal (competing with independent team)


Table of Contents

  1. CI/CD Overview
  2. Gitea Actions Pipelines
  3. Branch Strategy
  4. Build & Publish
  5. Deployment Workflows
  6. Rollback Procedures
  7. Secret Management in CI
  8. Quality Gates in CI
  9. Monitoring & Alerting

1. CI/CD Overview

Platform: Gitea Actions

Gitea Actions is the CI/CD platform (Architecture Brief §9.1). It uses GitHub Actions-compatible YAML workflow syntax, making migration straightforward if needed later.

Pipeline Architecture

Developer pushes code
        │
        ▼
┌──────────────────┐
│  Gitea Actions    │
│  Trigger: push    │
│                   │
│  1. Lint          │
│  2. Type Check    │
│  3. Unit Tests    │
│  4. Build         │
│  5. Security Scan │
└────────┬─────────┘
         │
    ┌────┴────┐
    │ Branch? │
    └────┬────┘
         │
    ┌────┼────────────┐
    │    │             │
feature  develop      main
    │    │             │
    │    ▼             ▼
    │  Build Docker  Build Docker
    │  Push :dev     Push :latest
    │    │             │
    │    ▼             ▼
    │  Deploy to     Deploy to
    │  staging       production
    │                  │
    │                  ▼
    │              Canary rollout
    │              (tenant servers)
    │
    └─► PR required to merge

Environments

| Environment | Branch | Trigger | Purpose |
|---|---|---|---|
| Local | Any | Manual | Developer testing |
| CI | Any push | Automatic | Lint, test, type check |
| Staging | develop | Automatic on merge | Integration testing, dogfooding |
| Production | main | Manual approval | Live customers |
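
The branch-to-environment mapping above can be expressed as a small lookup. This is a minimal sketch, not part of the proposed pipelines; the function name resolve_environment and the output labels are illustrative assumptions.

```shell
#!/usr/bin/env bash
# Map a Git ref to the environment it deploys to, following the table above.
# Prints "<environment> <trigger>"; refs that only run CI print "none ci-only".
# (Function name and output labels are hypothetical.)
resolve_environment() {
  local ref="$1"
  case "$ref" in
    refs/heads/main)    echo "production manual-approval" ;;
    refs/heads/develop) echo "staging automatic" ;;
    *)                  echo "none ci-only" ;;
  esac
}

resolve_environment "refs/heads/develop"   # prints: staging automatic
```

A deploy job could call this once and branch on the first word, rather than repeating ref comparisons in each workflow.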

2. Gitea Actions Pipelines

2.1 Monorepo CI Pipeline (All Packages)

# .gitea/workflows/ci.yml
name: CI

on:
  push:
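    # fix/* and hotfix/* are included so that every branch type in the naming conventions runs CI on push
    branches: [main, develop, 'feature/**', 'fix/**', 'hotfix/**']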
  pull_request:
    branches: [main, develop]

env:
  NODE_VERSION: '22'

jobs:
  lint-and-typecheck:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - uses: actions/setup-node@v4
        with:
          node-version: ${{ env.NODE_VERSION }}

      - name: Install dependencies
        run: npm ci

      - name: Lint
        run: npx turbo run lint

      - name: Type check
        run: npx turbo run typecheck

  unit-tests:
    runs-on: ubuntu-latest
    needs: lint-and-typecheck
    strategy:
      matrix:
        package:
          - safety-wrapper
          - secrets-proxy
          - hub
          - shared-types
    steps:
      - uses: actions/checkout@v4

      - uses: actions/setup-node@v4
        with:
          node-version: ${{ env.NODE_VERSION }}

      - name: Install dependencies
        run: npm ci

      - name: Run tests for ${{ matrix.package }}
        run: npx turbo run test --filter=${{ matrix.package }}

  security-scan:
    runs-on: ubuntu-latest
    needs: lint-and-typecheck
    steps:
      - uses: actions/checkout@v4

      - name: Check for secrets in code
        run: |
          npx @trufflesecurity/trufflehog git file://. --only-verified --fail          

      - name: Dependency audit
        run: npm audit --audit-level=high

2.2 Safety Wrapper Pipeline

# .gitea/workflows/safety-wrapper.yml
name: Safety Wrapper

on:
  push:
    paths:
      - 'packages/safety-wrapper/**'
      - 'packages/shared-types/**'
    branches: [main, develop]

jobs:
  p0-tests:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - uses: actions/setup-node@v4
        with:
          node-version: '22'

      - run: npm ci

      - name: P0 Secrets Redaction Tests
        run: npx turbo run test:p0 --filter=secrets-proxy

      - name: P0 Command Classification Tests
        run: npx turbo run test:p0 --filter=safety-wrapper

      - name: P1 Autonomy Tests
        run: npx turbo run test:p1 --filter=safety-wrapper

  build-image:
    runs-on: ubuntu-latest
    needs: p0-tests
    if: github.ref == 'refs/heads/main' || github.ref == 'refs/heads/develop'
    steps:
      - uses: actions/checkout@v4

      - name: Set tag
        id: tag
        run: |
          if [ "${{ github.ref }}" = "refs/heads/main" ]; then
            echo "tag=latest" >> $GITHUB_OUTPUT
          else
            echo "tag=dev" >> $GITHUB_OUTPUT
          fi          

      - name: Build Safety Wrapper image
        run: |
          docker build \
            -f packages/safety-wrapper/Dockerfile \
            -t code.letsbe.solutions/letsbe/safety-wrapper:${{ steps.tag.outputs.tag }} \
            -t code.letsbe.solutions/letsbe/safety-wrapper:${{ github.sha }} \
            .          

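      - name: Push to registry
        env:
          REGISTRY_USER: ${{ secrets.REGISTRY_USER }}
          REGISTRY_PASSWORD: ${{ secrets.REGISTRY_PASSWORD }}
        run: |
          # Credentials are read from the environment instead of being rendered into the script text
          echo "$REGISTRY_PASSWORD" | docker login code.letsbe.solutions -u "$REGISTRY_USER" --password-stdin
          docker push code.letsbe.solutions/letsbe/safety-wrapper:${{ steps.tag.outputs.tag }}
          docker push code.letsbe.solutions/letsbe/safety-wrapper:${{ github.sha }}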

2.3 Secrets Proxy Pipeline

# .gitea/workflows/secrets-proxy.yml
name: Secrets Proxy

on:
  push:
    paths:
      - 'packages/secrets-proxy/**'
      - 'packages/shared-types/**'
    branches: [main, develop]

jobs:
  p0-tests:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with: { node-version: '22' }
      - run: npm ci

      - name: P0 Redaction Tests (must pass 100%)
        run: npx turbo run test:p0 --filter=secrets-proxy

      - name: Performance Benchmark
        run: npx turbo run test:benchmark --filter=secrets-proxy

  build-image:
    runs-on: ubuntu-latest
    needs: p0-tests
    if: github.ref == 'refs/heads/main' || github.ref == 'refs/heads/develop'
    steps:
      - uses: actions/checkout@v4

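      - name: Build Secrets Proxy image
        run: |
          # Tag with the branch alias and the immutable git SHA (see Image Tagging)
          docker build \
            -f packages/secrets-proxy/Dockerfile \
            -t code.letsbe.solutions/letsbe/secrets-proxy:${{ github.ref == 'refs/heads/main' && 'latest' || 'dev' }} \
            -t code.letsbe.solutions/letsbe/secrets-proxy:${{ github.sha }} \
            .

      - name: Push to registry
        env:
          REGISTRY_USER: ${{ secrets.REGISTRY_USER }}
          REGISTRY_PASSWORD: ${{ secrets.REGISTRY_PASSWORD }}
        run: |
          echo "$REGISTRY_PASSWORD" | docker login code.letsbe.solutions -u "$REGISTRY_USER" --password-stdin
          docker push --all-tags code.letsbe.solutions/letsbe/secrets-proxy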

2.4 Hub Pipeline

# .gitea/workflows/hub.yml
name: Hub

on:
  push:
    paths:
      - 'packages/hub/**'
      - 'packages/shared-prisma/**'
    branches: [main, develop]

jobs:
  test:
    runs-on: ubuntu-latest
    services:
      postgres:
        image: postgres:16-alpine
        env:
          POSTGRES_DB: hub_test
          POSTGRES_USER: hub
          POSTGRES_PASSWORD: testpass
        ports: ['5432:5432']
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with: { node-version: '22' }
      - run: npm ci

      - name: Run Prisma migrations
        run: npx turbo run db:push --filter=hub
        env:
          DATABASE_URL: postgresql://hub:testpass@localhost:5432/hub_test

      - name: Run tests
        run: npx turbo run test --filter=hub
        env:
          DATABASE_URL: postgresql://hub:testpass@localhost:5432/hub_test

  build-image:
    runs-on: ubuntu-latest
    needs: test
    if: github.ref == 'refs/heads/main' || github.ref == 'refs/heads/develop'
    steps:
      - uses: actions/checkout@v4
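      - name: Build Hub image
        run: |
          # Tag with the branch alias and the immutable git SHA (see Image Tagging)
          docker build \
            -f packages/hub/Dockerfile \
            -t code.letsbe.solutions/letsbe/hub:${{ github.ref == 'refs/heads/main' && 'latest' || 'dev' }} \
            -t code.letsbe.solutions/letsbe/hub:${{ github.sha }} \
            .
      - name: Push to registry
        env:
          REGISTRY_USER: ${{ secrets.REGISTRY_USER }}
          REGISTRY_PASSWORD: ${{ secrets.REGISTRY_PASSWORD }}
        run: |
          echo "$REGISTRY_PASSWORD" | docker login code.letsbe.solutions -u "$REGISTRY_USER" --password-stdin
          docker push --all-tags code.letsbe.solutions/letsbe/hub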

2.5 Integration Test Pipeline

# .gitea/workflows/integration.yml
name: Integration Tests

on:
  push:
    branches: [develop]
  workflow_dispatch:

jobs:
  integration:
    runs-on: ubuntu-latest
    timeout-minutes: 30
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with: { node-version: '22' }
      - run: npm ci

      - name: Start integration stack
        run: docker compose -f test/docker-compose.integration.yml up -d --wait
        timeout-minutes: 5

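      - name: Wait for services
        run: |
          # Fail the step if the service never becomes healthy;
          # a bare "curl && break || sleep" loop exits 0 even after 30 failed probes
          for i in $(seq 1 30); do
            if curl -sf http://localhost:8200/health; then exit 0; fi
            sleep 2
          done
          echo "Service on :8200 did not become healthy in time" >&2
          exit 1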

      - name: Run integration tests
        run: npx turbo run test:integration

      - name: Collect logs on failure
        if: failure()
        run: docker compose -f test/docker-compose.integration.yml logs > integration-logs.txt

      - name: Upload logs
        if: failure()
        uses: actions/upload-artifact@v4
        with:
          name: integration-logs
          path: integration-logs.txt

      - name: Teardown
        if: always()
        run: docker compose -f test/docker-compose.integration.yml down -v
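
The health-polling loop used in this and later workflows could be factored into a reusable helper that returns non-zero when the attempts run out. The following is a sketch under that assumption; wait_until is a hypothetical name, and the probe command is passed in so the helper can be exercised without a running service.

```shell
#!/usr/bin/env bash
# Poll a probe command until it succeeds or the attempts run out.
# Usage: wait_until <attempts> <delay-seconds> <command> [args...]
# Returns 0 as soon as the probe succeeds, 1 if every attempt failed.
# (wait_until is a hypothetical helper, not an existing tool.)
wait_until() {
  local attempts="$1" delay="$2"
  shift 2
  local i
  for ((i = 1; i <= attempts; i++)); do
    if "$@"; then
      return 0
    fi
    # Do not sleep after the final failed attempt.
    if ((i < attempts)); then
      sleep "$delay"
    fi
  done
  echo "probe failed after ${attempts} attempts: $*" >&2
  return 1
}

# In a workflow step this might be invoked as:
#   wait_until 30 2 curl -sf http://localhost:8200/health
```

Because the function's exit status becomes the step's exit status, a service that never comes up fails the job instead of letting the tests run against nothing.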

3. Branch Strategy

Git Flow (Simplified)

main ─────────────────────────────────────────────────►
  │                                        ▲
  │                                        │ (merge via PR, requires approval)
  │                                        │
develop ──┬───────────┬───────────┬────────┤
           │           │           │
  feature/sw-skeleton  │  feature/hub-billing
           │           │
           │  feature/secrets-proxy
           │
  hotfix/critical-fix ──────────────────────► main (direct merge for critical fixes)

Branch Rules

| Branch | Protection | Merge Requirements |
|---|---|---|
| main | Protected; no direct pushes | PR from develop; 1 approval; all CI checks pass; security scan pass |
| develop | Protected; no direct pushes | PR from feature branch; all CI checks pass |
| feature/* | Unprotected | Free to push; PR to develop when ready |
| hotfix/* | Unprotected | Can merge to both main and develop; 1 approval required |

Naming Conventions

feature/sw-command-classification    # Safety Wrapper feature
feature/hub-tenant-api               # Hub feature
feature/mobile-chat-view             # Mobile app feature
feature/prov-step10-rewrite          # Provisioner feature
fix/secrets-proxy-jwt-detection      # Bug fix
hotfix/redaction-bypass-cve          # Critical security fix
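
A naming convention is easiest to keep when it is checked mechanically. The sketch below assumes the convention is "prefix, slash, lowercase kebab-case" as the examples above suggest; the function name is_valid_branch and the exact pattern are illustrative, not a rule stated elsewhere in this proposal.

```shell
#!/usr/bin/env bash
# Return 0 if a branch name follows the conventions above, 1 otherwise.
# Assumption: names are <prefix>/<lowercase-kebab-case>, plus the two
# long-lived branches main and develop.
is_valid_branch() {
  local name="$1"
  local pattern='^(feature|fix|hotfix)/[a-z0-9]+(-[a-z0-9]+)*$'
  [[ "$name" == "main" || "$name" == "develop" || "$name" =~ $pattern ]]
}

for branch in feature/hub-tenant-api Feature/HubTenantApi; do
  if is_valid_branch "$branch"; then
    echo "ok:       $branch"
  else
    echo "rejected: $branch"
  fi
done
```

Such a check could run as a pre-push hook or as the first step of the CI job.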

Release Tagging

v0.1.0   # First internal milestone (M1)
v0.2.0   # M2
v0.3.0   # M3
v1.0.0   # Founding member launch (M4)
v1.0.1   # First patch
v1.1.0   # First feature update post-launch
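
The tag sequence above follows semantic versioning. As an illustration, a helper along these lines could compute the next tag; bump_version is a hypothetical name, and the sketch handles only plain vMAJOR.MINOR.PATCH tags (no pre-release suffixes).

```shell
#!/usr/bin/env bash
# Compute the next release tag.
# Usage: bump_version <vMAJOR.MINOR.PATCH> <major|minor|patch>
# (bump_version is a hypothetical helper for illustration.)
bump_version() {
  local tag="$1" part="$2"
  local pattern='^v([0-9]+)\.([0-9]+)\.([0-9]+)$'
  if [[ ! "$tag" =~ $pattern ]]; then
    echo "not a release tag: $tag" >&2
    return 1
  fi
  # 10# forces base 10 so a component such as "08" is not parsed as octal.
  local major=$((10#${BASH_REMATCH[1]}))
  local minor=$((10#${BASH_REMATCH[2]}))
  local patch=$((10#${BASH_REMATCH[3]}))
  case "$part" in
    major) echo "v$((major + 1)).0.0" ;;
    minor) echo "v${major}.$((minor + 1)).0" ;;
    patch) echo "v${major}.${minor}.$((patch + 1))" ;;
    *)     echo "unknown part: $part" >&2; return 1 ;;
  esac
}

bump_version v1.0.0 patch   # prints: v1.0.1
bump_version v1.0.1 minor   # prints: v1.1.0
```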

4. Build & Publish

Docker Image Strategy

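| Image | Registry Path | Dockerfile Directory | Size Target |
|---|---|---|---|
| letsbe/safety-wrapper | code.letsbe.solutions/letsbe/safety-wrapper | packages/safety-wrapper/ | <150MB |
| letsbe/secrets-proxy | code.letsbe.solutions/letsbe/secrets-proxy | packages/secrets-proxy/ | <100MB |
| letsbe/hub | code.letsbe.solutions/letsbe/hub | packages/hub/ | <500MB |
| letsbe/ansible-runner | code.letsbe.solutions/letsbe/ansible-runner | packages/provisioner/ | Existing |

The pipelines in section 2 build every image with the repository root as the Docker build context (the trailing "." in docker build) and select the Dockerfile with -f. The root context is required because the Dockerfiles copy the root package.json, the lockfile, and shared workspace packages.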

Multi-Stage Dockerfile Pattern

# packages/safety-wrapper/Dockerfile
# Stage 1: Dependencies
FROM node:22-alpine AS deps
WORKDIR /app
COPY package.json package-lock.json ./
COPY packages/safety-wrapper/package.json ./packages/safety-wrapper/
COPY packages/shared-types/package.json ./packages/shared-types/
RUN npm ci --workspace=packages/safety-wrapper --workspace=packages/shared-types

# Stage 2: Build
FROM node:22-alpine AS builder
WORKDIR /app
COPY --from=deps /app/node_modules ./node_modules
COPY packages/safety-wrapper/ ./packages/safety-wrapper/
COPY packages/shared-types/ ./packages/shared-types/
COPY turbo.json package.json ./
RUN npx turbo run build --filter=safety-wrapper

# Stage 3: Production
FROM node:22-alpine AS runner
WORKDIR /app
# -G puts the user into the group created here; without it the group is unused
RUN addgroup -g 1001 -S letsbe && adduser -S letsbe -u 1001 -G letsbe
COPY --from=builder /app/packages/safety-wrapper/dist ./dist
COPY --from=builder /app/packages/safety-wrapper/package.json ./
COPY --from=deps /app/node_modules ./node_modules
USER letsbe
EXPOSE 8200
CMD ["node", "dist/index.js"]

Image Tagging

| Tag | When | Purpose |
|---|---|---|
| :dev | On merge to develop | Staging deployment |
| :latest | On merge to main | Production deployment |
| :<git-sha> | On every build | Immutable reference for debugging |
| :v1.0.0 | On release tag | Version-pinned deployment |
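
The tagging rules can be collected in one place so that every pipeline derives its tags the same way. The sketch below is an illustration only: image_refs is a hypothetical helper, and the handling of refs/tags/v* assumes release tags are pushed as Git tags, which this document does not spell out.

```shell
#!/usr/bin/env bash
REGISTRY="code.letsbe.solutions/letsbe"

# Print one fully qualified image reference per line for a build.
# Usage: image_refs <component> <git-ref> <git-sha>
# (image_refs is a hypothetical helper for illustration.)
image_refs() {
  local component="$1" ref="$2" sha="$3"
  local base="${REGISTRY}/${component}"
  case "$ref" in
    refs/heads/main)    echo "${base}:latest" ;;
    refs/heads/develop) echo "${base}:dev" ;;
    refs/tags/v*)       echo "${base}:${ref#refs/tags/}" ;;
  esac
  # Every build also gets the immutable SHA tag.
  echo "${base}:${sha}"
}

image_refs hub refs/heads/develop 3f2a1bc
```

Each output line could be passed to docker build as a -t argument, which keeps the mutable alias and the SHA tag pointing at the same image.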

5. Deployment Workflows

5.1 Central Platform (Hub) Deployment

# .gitea/workflows/deploy-hub.yml
name: Deploy Hub

on:
  push:
    branches: [main]
    paths: ['packages/hub/**', 'packages/shared-prisma/**']

jobs:
  deploy:
    runs-on: ubuntu-latest
    environment: production
    steps:
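      - name: Deploy to production
        run: |
          ssh -o StrictHostKeyChecking=no deploy@hub.letsbe.biz << 'EOF'
            # Stop at the first failing command so the job reports the failure
            set -e
            cd /opt/letsbe/hub
            docker compose pull hub
            docker compose up -d hub
            # Wait for health check; abort if the Hub never becomes healthy
            healthy=false
            for i in $(seq 1 30); do
              if curl -sf http://localhost:3847/api/health; then healthy=true; break; fi
              sleep 2
            done
            [ "$healthy" = true ] || { echo "Hub health check failed" >&2; exit 1; }
            # Run migrations (-T: no TTY is available in a non-interactive SSH session)
            docker compose exec -T hub npx prisma migrate deploy
          EOF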

5.2 Tenant Server Update Pipeline

Tenant servers are updated via the Hub push mechanism (see 03-DEPLOYMENT-STRATEGY §7).

# .gitea/workflows/tenant-update.yml
name: Tenant Server Update

on:
  workflow_dispatch:
    inputs:
      component:
        description: 'Component to update'
        required: true
        type: choice
        options: [safety-wrapper, secrets-proxy, openclaw]
      strategy:
        description: 'Rollout strategy'
        required: true
        type: choice
        options: [staging-only, canary-5pct, canary-25pct, full-rollout]

jobs:
  prepare:
    runs-on: ubuntu-latest
    steps:
      - name: Verify image exists
        run: |
          docker manifest inspect code.letsbe.solutions/letsbe/${{ inputs.component }}:latest          

  rollout:
    runs-on: ubuntu-latest
    needs: prepare
    steps:
      - name: Trigger Hub rollout API
        env:
          HUB_ADMIN_TOKEN: ${{ secrets.HUB_ADMIN_TOKEN }}
        run: |
          # --fail makes the step fail when the Hub answers with an HTTP error status
          curl --fail -X POST https://hub.letsbe.biz/api/v1/admin/rollout \
            -H "Authorization: Bearer $HUB_ADMIN_TOKEN" \
            -H "Content-Type: application/json" \
            -d '{
              "component": "${{ inputs.component }}",
              "tag": "latest",
              "strategy": "${{ inputs.strategy }}"
            }'
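
The request body sent to the rollout endpoint can be built and validated before any network call is made. This is a sketch under stated assumptions: rollout_payload is a hypothetical helper, the allowed values are copied from the workflow's choice inputs, and the JSON field names mirror the request shown above rather than a documented Hub API schema.

```shell
#!/usr/bin/env bash
# Build the rollout request body after checking the inputs against the
# workflow's choice lists. Usage: rollout_payload <component> <strategy> [tag]
# (rollout_payload is a hypothetical helper for illustration.)
rollout_payload() {
  local component="$1" strategy="$2" tag="${3:-latest}"
  case "$component" in
    safety-wrapper|secrets-proxy|openclaw) ;;
    *) echo "unknown component: $component" >&2; return 1 ;;
  esac
  case "$strategy" in
    staging-only|canary-5pct|canary-25pct|full-rollout) ;;
    *) echo "unknown strategy: $strategy" >&2; return 1 ;;
  esac
  # Docker tags allow only these characters, so nothing needs JSON escaping.
  local tag_pattern='^[A-Za-z0-9_][A-Za-z0-9_.-]{0,127}$'
  if [[ ! "$tag" =~ $tag_pattern ]]; then
    echo "invalid tag: $tag" >&2
    return 1
  fi
  printf '{"component":"%s","tag":"%s","strategy":"%s"}\n' \
    "$component" "$tag" "$strategy"
}

rollout_payload safety-wrapper canary-5pct
```

Passing a git-SHA tag as the third argument instead of relying on latest would pin the rollout to one specific build.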

5.3 Staging Deployment (Automatic)

# .gitea/workflows/deploy-staging.yml
name: Deploy Staging

on:
  push:
    branches: [develop]

jobs:
  deploy-staging:
    runs-on: ubuntu-latest
    environment: staging
    steps:
      - name: Deploy Hub to staging
        run: |
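          ssh deploy@staging.letsbe.biz << 'EOF'
            # Stop at the first failing command so the job reports the failure
            set -e
            cd /opt/letsbe/hub
            docker compose pull
            docker compose up -d
            # -T: no TTY is available in a non-interactive SSH session
            docker compose exec -T hub npx prisma migrate deploy
          EOF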

      - name: Deploy tenant stack to staging VPS
        run: |
          ssh deploy@staging-tenant.letsbe.biz << 'EOF'
            cd /opt/letsbe
            docker compose -f docker-compose.letsbe.yml pull
            docker compose -f docker-compose.letsbe.yml up -d
          EOF          

      - name: Run smoke tests
        run: |
          curl -sf https://staging.letsbe.biz/api/health
          curl -sf https://staging-tenant.letsbe.biz:8200/health
          curl -sf https://staging-tenant.letsbe.biz:8100/health          

6. Rollback Procedures

6.1 Hub Rollback

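# Rollback Hub to a known-good version
ssh deploy@hub.letsbe.biz << 'EOF'
  set -e
  cd /opt/letsbe/hub

  # Pin the previous image by its immutable git-SHA tag. Pulling :latest would
  # fetch the release being rolled back, not the one before it.
  # Assumption: the compose file references the image tag as ${HUB_TAG:-latest}.
  export HUB_TAG=<previous-sha>

  # Pull and deploy previous
  docker compose pull hub
  docker compose up -d hub

  # Verify health; exit non-zero if the Hub does not recover
  healthy=false
  for i in $(seq 1 30); do
    if curl -sf http://localhost:3847/api/health; then healthy=true; break; fi
    sleep 2
  done
  [ "$healthy" = true ] || { echo "Hub unhealthy after rollback" >&2; exit 1; }

  # Note: Prisma migrations are forward-only.
  # prisma migrate resolve only edits the migration history (for example,
  # marking a failed migration as rolled back); to undo a schema change,
  # ship a new forward migration that reverses it.
EOF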
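
Rolling back requires knowing which tag was running before the current one. One possible approach, sketched here as an assumption rather than an existing mechanism, is for the deploy script to append every deployed tag to a history file on the server; record_deploy, previous_tag, and the file location are all hypothetical.

```shell
#!/usr/bin/env bash
# Append a deployed tag to the history file.
# Usage: record_deploy <history-file> <tag>
record_deploy() {
  printf '%s\n' "$2" >> "$1"
}

# Print the most recent tag that differs from the currently deployed one.
# Usage: previous_tag <history-file>
previous_tag() {
  local file="$1" current line previous=""
  if [ ! -s "$file" ]; then
    echo "no deploy history in $file" >&2
    return 1
  fi
  current="$(tail -n 1 "$file")"
  # Walk the file oldest-first; the last line that differs from the
  # current tag is the rollback target.
  while IFS= read -r line; do
    if [ -n "$line" ] && [ "$line" != "$current" ]; then
      previous="$line"
    fi
  done < "$file"
  if [ -z "$previous" ]; then
    echo "no earlier tag to roll back to" >&2
    return 1
  fi
  printf '%s\n' "$previous"
}

# Example with a throwaway history file:
history_file="$(mktemp)"
record_deploy "$history_file" 1a2b3c4
record_deploy "$history_file" 5d6e7f8
previous_tag "$history_file"   # prints: 1a2b3c4
rm -f "$history_file"
```

Skipping entries equal to the current tag means a redeploy of the same SHA does not hide the real rollback target.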

6.2 Tenant Component Rollback

# Rollback Safety Wrapper on a specific tenant
ssh deploy@tenant-server << 'EOF'
  cd /opt/letsbe

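  # Roll back to pinned SHA. docker compose has no -e flag, so the variable is
  # set in the command's environment for the compose file to interpolate.
  SAFETY_WRAPPER_TAG=<previous-sha> docker compose -f docker-compose.letsbe.yml \
    up -d safety-wrapper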

  # Verify health
  curl -sf http://127.0.0.1:8200/health
EOF

6.3 Rollback Decision Matrix

| Symptom | Action | Automatic? |
|---|---|---|
| Health check fails after deploy | Rollback to previous image | Only if the deploy job implements it. A Docker restart policy restarts the same image and never pulls an earlier one, so the deploy script must redeploy the last known-good tag when the health check fails |
| P0 tests fail in CI | Block merge; no deployment | Yes (CI gate) |
| Secrets redaction miss detected | EMERGENCY: rollback all tenants immediately | Manual (requires admin trigger) |
| Hub API errors >5% | Rollback Hub to previous version | Manual (monitoring alert) |
| Billing discrepancy | Investigate first; rollback billing code if confirmed | Manual |

6.4 Emergency Rollback Checklist

For critical security issues (e.g., redaction bypass):

  1. STOP all tenant updates immediately (disable Hub rollout API)
  2. ROLLBACK all affected components to last known-good version
  3. VERIFY rollback successful (health checks, P0 tests)
  4. INVESTIGATE root cause
  5. FIX and add test case for the specific failure
  6. AUDIT all tenants for potential exposure during the window
  7. NOTIFY affected customers if secrets were potentially exposed
  8. POST-MORTEM within 24 hours

7. Secret Management in CI

Gitea Secrets Configuration

| Secret | Scope | Purpose |
|---|---|---|
| REGISTRY_USER | Organization | Docker registry login |
| REGISTRY_PASSWORD | Organization | Docker registry password |
| HUB_ADMIN_TOKEN | Repository | Hub API authentication for deployments |
| STAGING_SSH_KEY | Repository | SSH key for staging deployment |
| PRODUCTION_SSH_KEY | Repository | SSH key for production deployment |
| STRIPE_TEST_KEY | Repository | Stripe test mode for integration tests |

Rules

  1. Never put secrets in workflow YAML files
  2. Never echo secrets in CI logs (use ::add-mask::)
  3. Never pass secrets as command-line arguments (use environment variables)
  4. SSH keys: use deploy keys with minimal permissions (read-only for CI, write for deploy)
  5. Rotate all CI secrets quarterly
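
Rules 2 and 3 can be illustrated with two small helpers. This is a sketch, assuming the runner honors the ::add-mask:: workflow command named in rule 2; mask_value and registry_login are hypothetical names, and the environment variable names match the secrets table above.

```shell
#!/usr/bin/env bash
# Rule 2: emit the workflow command that masks a value in later log output.
mask_value() {
  printf '::add-mask::%s\n' "$1"
}

# Rule 3: read credentials from the environment and send the password over
# stdin, so it never appears in the process list or the rendered script.
registry_login() {
  if [ -z "${REGISTRY_USER:-}" ] || [ -z "${REGISTRY_PASSWORD:-}" ]; then
    echo "REGISTRY_USER and REGISTRY_PASSWORD must be set" >&2
    return 1
  fi
  printf '%s' "$REGISTRY_PASSWORD" |
    docker login code.letsbe.solutions -u "$REGISTRY_USER" --password-stdin
}

# Typical use inside a step whose env: block maps the secrets:
#   mask_value "$DERIVED_TOKEN"   # DERIVED_TOKEN is a hypothetical runtime value
#   registry_login
```

Masking matters most for values computed at runtime, since the platform only knows about secrets that were registered in its settings.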

8. Quality Gates in CI

Gate Configuration

# In each pipeline, quality gates are enforced as job dependencies:

jobs:
  # Gate 1: Code quality
  lint:
    # Must pass before tests run
    ...

  typecheck:
    # Must pass before tests run
    ...

  # Gate 2: Correctness
  unit-tests:
    needs: [lint, typecheck]
    # Must pass before build
    ...

  # Gate 3: Security
  security-scan:
    needs: [lint]
    # Must pass before deploy
    ...

  # Gate 4: Build
  build:
    needs: [unit-tests, security-scan]
    # Must succeed before deploy
    ...

  # Gate 5: Deploy (only on protected branches)
  deploy:
    needs: [build]
    if: github.ref == 'refs/heads/main' || github.ref == 'refs/heads/develop'
    ...

PR Merge Requirements

| Requirement | Enforcement |
|---|---|
| All CI checks pass | Gitea branch protection rule |
| At least 1 approval | Gitea branch protection rule |
| No unresolved review comments | Convention (not enforced by Gitea) |
| P0 tests pass if security code changed | CI pipeline condition |
| No secrets detected in diff | trufflehog scan |

9. Monitoring & Alerting

CI Pipeline Monitoring

| Metric | Alert Threshold | Action |
|---|---|---|
| Build duration | >15 min | Investigate; optimize caching |
| Test suite duration | >10 min | Investigate; parallelize tests |
| Failed builds on develop | >3 consecutive | Freeze merges; investigate |
| Failed deploys | Any | Automatic rollback; notify team |
| Security scan findings | Any critical | Block merge; assign to Security Lead |
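
The "more than 3 consecutive failed builds" rule can be evaluated from a list of recent build results. The sketch below assumes results are available oldest-first as the words success and failure; how they are fetched from Gitea is left out, and both function names are hypothetical.

```shell
#!/usr/bin/env bash
# Count the consecutive failures at the end of a build history.
# Usage: trailing_failures <result>...   (oldest first: success|failure)
trailing_failures() {
  local count=0 result
  for result in "$@"; do
    if [ "$result" = "failure" ]; then
      count=$((count + 1))
    else
      count=0
    fi
  done
  echo "$count"
}

# The threshold is ">3 consecutive", so four or more failures freeze merges.
should_freeze_merges() {
  [ "$(trailing_failures "$@")" -gt 3 ]
}

trailing_failures success failure failure   # prints: 2
```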

Deployment Monitoring

| Metric | Alert Threshold | Action |
|---|---|---|
| Hub health after deploy | Unhealthy for >60s | Automatic rollback |
| Tenant health after update | Unhealthy for >120s | Rollback specific tenant; pause rollout |
| Error rate post-deploy | >5% increase | Alert team; investigate |
| Latency post-deploy | >2× baseline | Alert team; investigate |
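
The error-rate and latency thresholds can be checked with integer arithmetic once the metrics are available. This sketch makes two assumptions that the table does not settle: ">5% increase" is read as more than 5 percentage points above the pre-deploy error rate, and all inputs are whole numbers. The function name post_deploy_alerts is hypothetical.

```shell
#!/usr/bin/env bash
# Compare post-deploy metrics with their pre-deploy baseline.
# Usage: post_deploy_alerts <base-error-%> <current-error-%> \
#                           <base-latency-ms> <current-latency-ms>
# Prints one line per breached threshold; returns 1 if any was breached.
post_deploy_alerts() {
  local base_err="$1" cur_err="$2" base_lat="$3" cur_lat="$4"
  local breached=0
  # Assumption: ">5% increase" means more than 5 percentage points.
  if (( cur_err - base_err > 5 )); then
    echo "error-rate"
    breached=1
  fi
  # Latency alert: strictly more than twice the baseline.
  if (( cur_lat > 2 * base_lat )); then
    echo "latency"
    breached=1
  fi
  return "$breached"
}

post_deploy_alerts 1 7 100 150   # prints: error-rate
```

If the intended meaning is a relative increase instead, only the first comparison needs to change.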

Notification Channels

| Event | Channel |
|---|---|
| CI failure on main | Team chat (immediate) |
| Security scan finding | Team chat + email to Security Lead |
| Deployment success | Team chat (informational) |
| Deployment failure | Team chat + email to on-call |
| Emergency rollback | Team chat + phone call to on-call |

End of Document — 08 CI/CD Strategy