# LetsBe Biz - Infrastructure Runbook

**Version:** 1.0  
**Date:** February 26, 2026  
**Authors:** Matt (Founder), Claude (Architecture)  
**Status:** Engineering Spec, Ready for Implementation  
**Companion docs:** Technical Architecture v1.2, Tool Catalog v2.2, Security & GDPR Framework v1.1  
**Decision refs:** Foundation Document Decisions #18, #27

---

## 1. Purpose

This runbook is the operational reference for provisioning, managing, monitoring, and maintaining LetsBe Biz infrastructure. It covers the full lifecycle, from ordering a VPS through Netcup to deprovisioning a customer's server at account termination.

**Target audience:** Matt (operations), future engineering team, and the IT Admin AI agent (for self-referencing operational procedures).

---

## 2. Infrastructure Overview

### 2.1 Hosting Provider: Netcup

| Item | Detail |
|------|--------|
| **Provider** | Netcup GmbH (Karlsruhe, Germany) |
| **Product line** | VPS (Virtual Private Server) |
| **EU data center** | Netcup Nürnberg/Karlsruhe, Germany |
| **NA data center** | Netcup Manassas, Virginia, USA |
| **API** | SCP (Server Control Panel) REST API with OAuth2 Device Flow |
| **Hub integration** | Full: server ordering, power actions, metrics, snapshots, rescue mode via `netcupService.ts` |

### 2.2 Server Tiers

| Tier | vCPUs | RAM | Disk | Recommended Tools | Monthly Cost (est.) |
|------|-------|-----|------|-------------------|---------------------|
| Lite (€29) | 4 | 8 GB | 160 GB SSD | 5–8 tools | ~€8–12 |
| Build (€45) | 8 | 16 GB | 320 GB SSD | 10–15 tools | ~€14–18 |
| Scale (€75) | 12 | 32 GB | 640 GB SSD | 15–25 tools | ~€22–28 |
| Enterprise (€109) | 16 | 64 GB | 1.2 TB SSD | 28+ tools | ~€35–45 |

### 2.3 Network Architecture

```
Internet
    │
    ▼
Netcup VPS (public IP)
    │
    ├── Port 80 (HTTP → 301 redirect to HTTPS)
    ├── Port 443 (HTTPS → nginx reverse proxy)
    ├── Port 22022 (SSH: hardened, key-only)
    │
    ▼
nginx (Alpine container)
    │
    ├── *.{{domain}} → Route by subdomain to tool containers
    │   ├── files.{{domain}} → 127.0.0.1:3023 (Nextcloud)
    │   ├── crm.{{domain}} → 127.0.0.1:3025 (Odoo)
    │   ├── chat.{{domain}} → 127.0.0.1:3026 (Chatwoot)
    │   ├── blog.{{domain}} → 127.0.0.1:3029 (Ghost)
    │   ├── mail.{{domain}} → 127.0.0.1:3031 (Stalwart Mail)
    │   ├── ... (33 nginx configs total)
    │   └── status.{{domain}} → 127.0.0.1:3008 (Uptime Kuma)
    │
    └── Internal only (not exposed via nginx):
        ├── 127.0.0.1:18789 (OpenClaw Gateway)
        ├── 127.0.0.1:8100 (Secrets Proxy)
        └── Various internal tool ports
```

---

## 3. Provisioning Pipeline

### 3.1 End-to-End Flow

```
Customer signs up → Stripe payment → Hub creates Order
    │
    ▼
Hub Automation Worker (state machine)
    │
    ├── PAYMENT_CONFIRMED → order VPS from Netcup (if AUTO mode)
    ├── AWAITING_SERVER → poll Netcup until VPS is ready
    ├── SERVER_READY → wait for DNS records
    ├── DNS_PENDING → verify A records for all subdomains
    ├── DNS_READY → trigger provisioning
    ├── PROVISIONING → spawn Docker provisioner container
    │       │
    │       ▼
    │   letsbe-provisioner (10-step pipeline via SSH)
    │       ├── Step 1: System packages (apt update, essentials)
    │       ├── Step 2: Docker CE installation
    │       ├── Step 3: Disable conflicting services
    │       ├── Step 4: nginx + fallback config
    │       ├── Step 5: UFW firewall (80, 443, 22022, plus mail ports 25, 587, 993)
    │       ├── Step 6: Admin user + SSH key (optional)
    │       ├── Step 7: SSH hardening (port 22022, key-only)
    │       ├── Step 8: Unattended security updates
    │       ├── Step 9: Deploy tool stacks (docker-compose)
    │       └── Step 10: Deploy OpenClaw + Safety Wrapper + bootstrap
    │
    ├── FULFILLED → server is live, customer notified
    └── FAILED → retry logic (1min / 5min / 15min backoff, max 3 attempts)
```

### 3.2 Provisioner Detail (setup.sh)

**Location:** `letsbe-provisioner/scripts/setup.sh` (~832 lines)

#### Step 1: System Packages

```bash
apt-get update && apt-get upgrade -y
apt-get install -y curl wget gnupg2 ca-certificates lsb-release apt-transport-https \
  software-properties-common unzip jq htop iotop net-tools dnsutils certbot \
  python3-certbot-nginx fail2ban rclone
```

#### Step 2: Docker CE

```bash
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | gpg --dearmor -o /usr/share/keyrings/docker-archive-keyring.gpg
echo "deb [arch=amd64 signed-by=/usr/share/keyrings/docker-archive-keyring.gpg] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable" > /etc/apt/sources.list.d/docker.list
apt-get update && apt-get install -y docker-ce docker-ce-cli containerd.io docker-compose-plugin
systemctl enable --now docker
```

#### Step 3: Disable Conflicting Services

```bash
systemctl stop apache2 2>/dev/null || true
systemctl disable apache2 2>/dev/null || true
systemctl stop postfix 2>/dev/null || true
systemctl disable postfix 2>/dev/null || true
```

#### Step 4: nginx

Deploy the nginx Alpine container with an initial fallback config. SSL certificates are provisioned via certbot after DNS is verified.

#### Step 5: UFW Firewall

```bash
ufw default deny incoming
ufw default allow outgoing
ufw allow 80/tcp     # HTTP
ufw allow 443/tcp    # HTTPS
ufw allow 22022/tcp  # SSH (hardened port)
ufw allow 25/tcp     # SMTP (Stalwart Mail)
ufw allow 587/tcp    # SMTP submission
ufw allow 993/tcp    # IMAPS
ufw --force enable
```

#### Step 6: Admin User

```bash
useradd -m -s /bin/bash -G docker letsbe-admin
mkdir -p /home/letsbe-admin/.ssh
echo "{{admin_ssh_public_key}}" > /home/letsbe-admin/.ssh/authorized_keys
chmod 700 /home/letsbe-admin/.ssh
chmod 600 /home/letsbe-admin/.ssh/authorized_keys
chown -R letsbe-admin:letsbe-admin /home/letsbe-admin/.ssh
```

#### Step 7: SSH Hardening

```bash
# /etc/ssh/sshd_config modifications:
Port 22022
PermitRootLogin no
PasswordAuthentication no
PubkeyAuthentication yes
MaxAuthTries 3
LoginGraceTime 30
AllowUsers letsbe-admin
```

#### Step 8: Unattended Security Updates

```bash
apt-get install -y unattended-upgrades
dpkg-reconfigure -plow unattended-upgrades
# Configure /etc/apt/apt.conf.d/50unattended-upgrades for security-only updates
```

#### Step 9: Deploy Tool Stacks

For each tool selected by the customer:

```bash
# 1. Generate credentials (env_setup.sh)
#    50+ secrets: database passwords, admin tokens, API keys, JWT secrets
#    Written to /opt/letsbe/env/credentials.env and per-tool .env files

# 2. Deploy Docker Compose stacks
for stack in {{selected_tools}}; do
  cd "/opt/letsbe/stacks/$stack"
  docker compose up -d
done

# 3. Deploy nginx configs per tool
for conf in {{selected_nginx_configs}}; do
  cp "/opt/letsbe/nginx/sites/$conf" /etc/nginx/sites-enabled/
done
nginx -t && nginx -s reload

# 4. Request SSL certificates
#    NOTE: Let's Encrypt issues wildcard certificates only via the DNS-01 challenge.
#    The --nginx plugin uses HTTP-01, so this command needs either a DNS plugin
#    or one -d flag per subdomain instead of the wildcard.
certbot --nginx -d "*.{{domain}}" --non-interactive --agree-tos -m "ssl@{{domain}}"
```

#### Step 10: Deploy OpenClaw + Safety Wrapper + Bootstrap

```bash
# 1. Deploy OpenClaw container with Safety Wrapper extension pre-installed
cd /opt/letsbe/stacks/openclaw
docker compose up -d

# 2. Deploy Secrets Proxy
cd /opt/letsbe/stacks/secrets-proxy
docker compose up -d

# 3. Seed secrets registry from credentials.env
docker exec letsbe-openclaw /opt/letsbe/scripts/seed-secrets.sh

# 4. Generate tool-registry.json from deployed tools
docker exec letsbe-openclaw /opt/letsbe/scripts/generate-tool-registry.sh

# 5. Deploy SOUL.md files for each agent
#    (generated from templates with tenant variables substituted)

# 6. Run initial setup browser automations
#    (Cal.com, Chatwoot, Keycloak, Nextcloud, Stalwart Mail, Umami, Uptime Kuma)

# 7. Register with Hub
docker exec letsbe-openclaw /opt/letsbe/scripts/hub-register.sh

# 8. Clean up config.json (CRITICAL: remove plaintext passwords)
rm -f /opt/letsbe/config.json
```

### 3.3 Credential Generation (env_setup.sh)

**Location:** `letsbe-provisioner/scripts/env_setup.sh` (~678 lines)

Generates 50+ unique credentials per tenant:

| Category | Count | Examples |
|----------|-------|----------|
| Database passwords | 18 | PostgreSQL passwords for each tool with a DB |
| Admin passwords | 12 | Nextcloud admin, Keycloak admin, Odoo admin, etc. |
| API tokens | 10 | NocoDB API token, Ghost admin API key, etc. |
| JWT secrets | 5 | Chatwoot, Cal.com, OpenClaw, etc. |
| Encryption keys | 3 | Safety Wrapper registry key, backup encryption key |
| SSH keys | 2 | Admin key pair, Hub communication key |
| SMTP credentials | 2 | Stalwart Mail admin, relay credentials |

**Generation method:** `openssl rand -base64 32` for passwords, `openssl rand -hex 32` for tokens, `ssh-keygen -t ed25519` for SSH keys.
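The generation method above can be sketched as a pair of shell helpers. This is a minimal illustration, not the contents of `env_setup.sh`: the function names `gen_password`, `gen_token`, and `write_env_file`, the suffix-based choice between the two generators, and the example key names are all assumptions made for this sketch. Only the two `openssl rand` invocations come from the runbook.

```shell
#!/usr/bin/env bash
# Sketch only: helper names and the key-suffix convention are hypothetical,
# not taken from env_setup.sh.
set -euo pipefail

# 32 random bytes, base64-encoded (44 characters including padding).
gen_password() { openssl rand -base64 32; }

# 32 random bytes, hex-encoded (64 characters).
gen_token() { openssl rand -hex 32; }

# write_env_file <path> <KEY>...
# Writes one KEY=value line per key to <path>, readable only by the owner.
# Keys ending in _TOKEN or _SECRET get a hex token; all others get a password.
write_env_file() {
  local out="$1"
  shift
  ( umask 077; : > "$out" )   # create or truncate with restrictive permissions
  chmod 600 "$out"            # also tighten a pre-existing file
  local key
  for key in "$@"; do
    case "$key" in
      *_TOKEN|*_SECRET) printf '%s=%s\n' "$key" "$(gen_token)" >> "$out" ;;
      *)                printf '%s=%s\n' "$key" "$(gen_password)" >> "$out" ;;
    esac
  done
}
```

Calling `write_env_file /tmp/example.env NEXTCLOUD_DB_PASSWORD GHOST_ADMIN_TOKEN` would write two lines, one base64 password and one hex token. Note that base64 values can contain `/`, `+`, and `=`, which matters wherever a generated value is later spliced into a `sed` expression or a URL.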
**Template rendering:** All `{{ variable }}` placeholders in Docker Compose files and nginx configs are substituted with generated values.

### 3.4 Post-Provisioning Verification

After step 10 completes, the provisioner runs health checks:

```bash
# 1. Verify all containers are running
#    (-a includes exited containers, which plain `docker ps` would hide)
docker ps -a --format "{{.Names}}: {{.Status}}" | grep -v "Up" && exit 1

# 2. Verify nginx is serving
curl -sf https://{{domain}} > /dev/null || exit 1

# 3. Verify each tool's health endpoint
for tool in {{health_check_urls}}; do
  curl -sf "$tool" > /dev/null || echo "WARNING: $tool not responding"
done

# 4. Verify Safety Wrapper registered with Hub
curl -sf http://127.0.0.1:8100/health || exit 1

# 5. Verify OpenClaw is responsive
curl -sf http://127.0.0.1:18789/health || exit 1

# 6. Report success to Hub
curl -X PATCH "{{hub_url}}/api/v1/jobs/{{job_id}}" \
  -H "Authorization: Bearer {{runner_token}}" \
  -H "Content-Type: application/json" \
  -d '{"status": "COMPLETED"}'
```

---

## 4. Backup System

### 4.1 Backup Architecture

**Location:** `letsbe-provisioner/scripts/backups.sh` (~473 lines)  
**Schedule:** Daily via cron at 02:00 server local time  
**Retention:** 7 daily backups + 4 weekly backups (rolling)

### 4.2 What Gets Backed Up

| Component | Method | Target |
|-----------|--------|--------|
| PostgreSQL databases (18) | `pg_dump --format=custom` | `/opt/letsbe/backups/daily/` |
| MySQL databases (2) | `mysqldump --single-transaction` | `/opt/letsbe/backups/daily/` |
| MongoDB databases (1) | `mongodump --archive` | `/opt/letsbe/backups/daily/` |
| Nextcloud files | rsync snapshot | `/opt/letsbe/backups/daily/nextcloud/` |
| Docker volumes (critical) | `docker run --volumes-from` tar | `/opt/letsbe/backups/daily/volumes/` |
| nginx configs | tar archive | `/opt/letsbe/backups/daily/nginx/` |
| OpenClaw state | tar of `~/.openclaw/` | `/opt/letsbe/backups/daily/openclaw/` |
| Safety Wrapper state | SQLite backup API | `/opt/letsbe/backups/daily/safety-wrapper/` |
| Credentials | Encrypted tar | `/opt/letsbe/backups/daily/credentials.enc` |

### 4.3 Remote Backup

After the local backup completes, `rclone` syncs to a remote destination:

```bash
rclone sync /opt/letsbe/backups/ remote:backups/{{tenant_id}}/ \
  --transfers 4 \
  --checkers 8 \
  --fast-list \
  --log-file /var/log/letsbe/rclone.log
```

Remote destination options (configured per tenant):

- Netcup S3 (default)
- Customer-provided S3 bucket
- Customer-provided rclone remote

### 4.4 Backup Status Reporting

After each backup run, `backups.sh` writes a `backup-status.json`:

```json
{
  "timestamp": "2026-02-26T02:15:00Z",
  "status": "success",
  "duration_seconds": 847,
  "databases_backed_up": 21,
  "files_backed_up": true,
  "remote_sync": "success",
  "total_size_gb": 4.2,
  "errors": []
}
```

The Safety Wrapper monitors this file (Decision #27) and reports status to the Hub via heartbeat.

### 4.5 Backup Rotation

```bash
# Daily: keep last 7
# (-mindepth 1 keeps find from ever matching the backup directory itself)
find /opt/letsbe/backups/daily/ -mindepth 1 -maxdepth 1 -mtime +7 -exec rm -rf {} \;

# Weekly: copy Sunday's backup to weekly/, keep last 4
if [ "$(date +%u)" = "7" ]; then
  cp -a /opt/letsbe/backups/daily/ "/opt/letsbe/backups/weekly/$(date +%Y-%m-%d)/"
fi
find /opt/letsbe/backups/weekly/ -mindepth 1 -maxdepth 1 -mtime +28 -exec rm -rf {} \;
```

---

## 5. Restore Procedures

### 5.1 Per-Tool Restore

**Location:** `letsbe-provisioner/scripts/restore.sh` (~512 lines)

```bash
# Restore a specific tool's database from a daily backup
./restore.sh --tool nextcloud --date 2026-02-25

# Steps:
# 1. Stop the tool container
# 2. Restore database from backup
# 3. Restore files (if applicable)
# 4. Start the tool container
# 5. Verify health check
# 6. Report to Hub
```

### 5.2 Full Server Restore

For complete server recovery (e.g., VPS failure):

```
1. Order new VPS from Netcup (same region, same tier)
2. Run provisioner with --restore flag
   - Steps 1-8: Standard server setup
   - Step 9: Deploy tool stacks (empty)
   - Step 10: Deploy OpenClaw + Safety Wrapper
3. Restore from remote backup:
   rclone sync remote:backups/{{tenant_id}}/latest/ /opt/letsbe/backups/daily/
4. Run restore.sh --all
   - Restores all 21 databases
   - Restores all file volumes
   - Restores OpenClaw state
   - Restores Safety Wrapper secrets registry
   - Restores credentials
5. Verify all tools are healthy
6. Update DNS if IP changed
7. Hub updates server connection record
```

### 5.3 Point-in-Time Recovery

For accidental data deletion by a user:

```
1. Identify the backup date that contains the needed data
2. Restore the specific tool to a temporary container:
   ./restore.sh --tool odoo --date 2026-02-23 --target temp
3. Extract the needed data from the temp container
4. Import the data into the production tool
5. Remove the temp container
```

---

## 6. Monitoring

### 6.1 Uptime Kuma (On-Tenant)

Each tenant VPS runs Uptime Kuma, which monitors all local services:

| Monitor | Type | Interval | Alert Threshold |
|---------|------|----------|-----------------|
| nginx | HTTP(S) | 60s | 3 failures |
| Each tool container | HTTP | 120s | 3 failures |
| OpenClaw Gateway | HTTP (port 18789) | 60s | 2 failures |
| Secrets Proxy | HTTP (port 8100) | 60s | 2 failures |
| SSL certificate expiry | Certificate | Daily | 14 days before expiry |
| Disk usage | Push | 300s | >85% |

### 6.2 Hub-Level Monitoring

The Hub monitors all tenant servers centrally:

| Metric | Source | Check Interval | Alert |
|--------|--------|----------------|-------|
| Heartbeat received | Safety Wrapper | Expected every 5 min | Missing >15 min |
| Token usage rate | Safety Wrapper heartbeat | Every heartbeat | >90% pool consumed |
| Backup status | Safety Wrapper (reads backup-status.json) | Daily | Any backup failure |
| Container health | Portainer API (via Hub) | Every 10 min | Container crash/OOM |
| VPS metrics | Netcup SCP API | Every 15 min | CPU >90% sustained, disk >90% |
| OpenClaw version | Safety Wrapper heartbeat | Every heartbeat | Version mismatch with expected |

### 6.3 GlitchTip (Error Tracking)

GlitchTip runs on each tenant and captures application errors from:

- OpenClaw (Node.js errors, unhandled rejections)
- Safety Wrapper (hook errors, tool execution failures)
- Tool containers that support Sentry-compatible error reporting

### 6.4 Diun (Container Update Notifications)

Diun monitors all Docker images for new releases:

```yaml
# /opt/letsbe/stacks/diun/docker-compose.yml
watch:
  schedule: "0 6 * * *"  # Check daily at 06:00
notif:
  webhook:
    endpoint: "http://127.0.0.1:8100/webhooks/diun"  # Safety Wrapper
    method: POST
```

The Safety Wrapper receives update notifications and:

1. Logs the available update
2. Reports to Hub via heartbeat
3. Does NOT auto-update (updates require IT Admin agent or manual action)

---

## 7. Maintenance Procedures

### 7.1 Tool Updates

Tool container updates are initiated by the IT Admin agent or manually:

```bash
# 1. Tag the current image so the rollback in step 5 has a target, then pull the new image
cd /opt/letsbe/stacks/{{tool}}
docker tag {{tool}}:latest {{tool}}:previous
docker compose pull

# 2. Backup the tool's database
/opt/letsbe/scripts/backups.sh --tool {{tool}}

# 3. Rolling update
docker compose up -d --force-recreate

# 4. Verify health check
curl -sf http://127.0.0.1:{{port}}/health

# 5. If health check fails, rollback:
docker compose down
docker tag {{tool}}:previous {{tool}}:latest
docker compose up -d
```

### 7.2 OpenClaw Updates

OpenClaw is pinned to a tested release tag. Update procedure:

```bash
# 1. Check upstream changelog for breaking changes
# 2. Test in staging VPS first

# 3. On tenant VPS:
cd /opt/letsbe/stacks/openclaw

# 4. Backup OpenClaw state
tar czf /opt/letsbe/backups/openclaw-pre-update.tar.gz ~/.openclaw/

# 5. Update image tag in docker-compose.yml
sed -i 's/openclaw:v2026.2.1/openclaw:v2026.3.0/' docker-compose.yml

# 6. Pull and recreate
docker compose pull && docker compose up -d --force-recreate

# 7. Verify
curl -sf http://127.0.0.1:18789/health
docker exec letsbe-openclaw openclaw --version

# 8. If verification fails, rollback:
docker compose down
sed -i 's/openclaw:v2026.3.0/openclaw:v2026.2.1/' docker-compose.yml
docker compose up -d
tar xzf /opt/letsbe/backups/openclaw-pre-update.tar.gz -C /
```

**Update cadence:** Monthly review of upstream changelog. Update only for security fixes or features we need. Never update on Fridays.

### 7.3 SSL Certificate Renewal

Let's Encrypt certificates auto-renew via certbot cron. Manual renewal if needed:

```bash
certbot renew --nginx --force-renewal
nginx -t && nginx -s reload
```

### 7.4 Credential Rotation

The IT Admin agent can rotate credentials for any tool:

```bash
# 1. Generate new credential
NEW_PASS=$(openssl rand -base64 32)

# 2. Update the tool's .env file
#    (base64 output can contain "/", so use "|" as the sed delimiter)
sed -i "s|^DB_PASSWORD=.*|DB_PASSWORD=$NEW_PASS|" /opt/letsbe/stacks/{{tool}}/.env

# 3. Update the database user's password
docker exec {{tool}}-db psql -c "ALTER USER {{user}} PASSWORD '$NEW_PASS';"

# 4. Restart the tool container
docker compose -f /opt/letsbe/stacks/{{tool}}/docker-compose.yml restart

# 5. Update the secrets registry
#    (Safety Wrapper detects .env change and updates registry automatically)

# 6. Verify tool health
curl -sf http://127.0.0.1:{{port}}/health
```

### 7.5 Disk Space Management

When disk usage exceeds 85%:

```bash
# 1. Check disk usage by directory
du -sh /opt/letsbe/stacks/* | sort -rh | head -20
du -sh /opt/letsbe/backups/* | sort -rh

# 2. Clean Docker resources
docker system prune -f     # Remove stopped containers, unused networks
docker image prune -a -f   # Remove unused images
docker volume prune -f     # Remove unused volumes (CAREFUL: verify first)

# 3. Clean old logs
find /var/log -name "*.gz" -mtime +30 -delete
# Gauge container log volume over the last 30 days (reports a line count only; deletes nothing)
docker container ls -a --format "{{.Names}}" | xargs -I {} docker logs {} --since 720h 2>/dev/null | wc -l

# 4. Clean old backups (if rotation isn't catching them)
find /opt/letsbe/backups/daily/ -mindepth 1 -maxdepth 1 -mtime +7 -exec rm -rf {} \;

# 5. If still above 85%, recommend tier upgrade to user
```

---

## 8. Deprovisioning

### 8.1 Customer Cancellation Flow

```
Customer requests cancellation
    │
    ▼
Hub: 48-hour cooling-off period
    │   (Customer can cancel the cancellation)
    ▼
Hub: 30-day data export window begins
    │   Customer can:
    │   - Download files via Nextcloud
    │   - Export CRM data via Odoo
    │   - Export email via IMAP
    │   - SSH into server for full access
    │   - Request a full backup via Hub
    ▼
Hub: After 30 days → trigger deprovisioning
    │
    ├── Revoke Safety Wrapper Hub API key
    ├── Stop all containers
    ├── Delete remote backups (rclone purge)
    ├── Request VPS deletion via Netcup API
    │   └── Netcup wipes disk and destroys VPS
    ├── Delete all Netcup snapshots
    ├── Remove DNS records
    └── Hub: soft-delete account data, retain billing records (7 years per HGB §257)
```

### 8.2 Emergency Server Isolation

If a tenant VPS is compromised or abusing the platform:

```bash
# 1. Revoke Hub API key immediately (Hub admin panel)

# 2. SSH into server (port 22022):
ssh -p 22022 letsbe-admin@{{server_ip}}

# 3. Stop the AI runtime
docker stop letsbe-openclaw letsbe-secrets-proxy

# 4. Block outbound traffic (except SSH)
#    The allow rule goes first: ufw evaluates rules in order, so a leading deny would shadow it.
#    Replies on the already-established inbound SSH session are permitted by ufw's connection tracking.
ufw allow out to any port 22022
ufw deny out to any

# 5. Take a forensic snapshot via Netcup API

# 6. Assess and decide: remediate or deprovision
```

---

## 9. Disaster Recovery

### 9.1 Scenarios

| Scenario | RTO | RPO | Procedure |
|----------|-----|-----|-----------|
| Single container crash | <5 min | 0 (no data loss) | Auto-restart via Docker restart policy |
| Multiple container failure | <30 min | 0 | IT Admin agent investigates, restarts services |
| VPS disk corruption | 2–4 hours | 24 hours (last backup) | New VPS + restore from remote backup |
| VPS total loss | 2–4 hours | 24 hours | New VPS (same region) + restore |
| Netcup data center outage | 4–8 hours | 24 hours | New VPS in alternate region + restore |
| Hub outage | <1 hour | 0 (tenant VPS operates independently) | Hub restart/failover |
| OpenRouter outage | <5 min | 0 | Model fallback chain engages automatically |

### 9.2 Tenant VPS Operates Independently

A key architectural property: **the tenant VPS continues operating even if the Hub is down.** The Safety Wrapper operates with its local config, the AI agents continue serving the user, and tools continue running. The Hub is needed only for:

- Billing and subscription management
- Config updates (new agents, autonomy changes)
- Approval queue (if approvals are routed through Hub instead of local)
- Monitoring dashboards

### 9.3 Recovery Testing

**Monthly:** Restore a random tool's database from backup on a staging VPS to verify backup integrity.

**Quarterly:** Full server restore drill: order a new VPS, run a complete restore from remote backup, and verify all tools and agents are functional.

---

## 10. Security Operations

### 10.1 SSH Access Audit

```bash
# Review successful SSH logins
journalctl -u sshd --since "7 days ago" | grep "Accepted"

# Review failed SSH attempts
journalctl -u sshd --since "7 days ago" | grep "Failed"

# Check fail2ban status
fail2ban-client status sshd
```

### 10.2 Container Security

```bash
# Check for containers running as root (should be minimal; an empty user field means root)
docker ps --format "{{.Names}}" | xargs -I {} docker inspect {} --format "{{.Name}}: user={{.Config.User}}"

# Check for containers with excessive privileges
docker ps --format "{{.Names}}" | xargs -I {} docker inspect {} --format "{{.Name}}: privileged={{.HostConfig.Privileged}}"

# Verify network isolation
docker network ls
docker network inspect bridge
```

### 10.3 Vulnerability Scanning

```bash
# Scan Docker images for known vulnerabilities (using Trivy)
docker run --rm -v /var/run/docker.sock:/var/run/docker.sock \
  aquasec/trivy image --severity HIGH,CRITICAL {{image_name}}

# Scan the image of every running container (same containerized Trivy, no host install needed)
docker ps --format "{{.Image}}" | sort -u | while read -r img; do
  docker run --rm -v /var/run/docker.sock:/var/run/docker.sock \
    aquasec/trivy image --severity HIGH,CRITICAL "$img"
done
```

### 10.4 Incident Response Checklist

```
[ ] 1. Contain: Isolate affected VPS (Section 8.2)
[ ] 2. Assess: Determine scope (which data, which users affected)
[ ] 3. Preserve: Take forensic snapshot before changes
[ ] 4. Notify: Hub alerts → Matt → customer (within timelines per GDPR Art. 33/34)
[ ] 5. Remediate: Fix the vulnerability, rotate compromised credentials
[ ] 6. Restore: From clean backup if data was corrupted
[ ] 7. Verify: Full health check on all services
[ ] 8. Document: Post-mortem with root cause, timeline, actions taken
[ ] 9. Improve: Update runbook/monitoring to prevent recurrence
```

---

## 11. Common Operations Quick Reference

| Task | Command / Procedure |
|------|---------------------|
| Check all containers | `docker ps --format "table {{.Names}}\t{{.Status}}\t{{.Ports}}"` |
| Restart a tool | `cd /opt/letsbe/stacks/{{tool}} && docker compose restart` |
| View tool logs | `docker logs --tail 100 -f {{container_name}}` |
| Check disk usage | `df -h /opt/letsbe` |
| Check RAM usage | `free -h` |
| Run manual backup | `/opt/letsbe/scripts/backups.sh` |
| Restore a tool | `/opt/letsbe/scripts/restore.sh --tool {{tool}} --date YYYY-MM-DD` |
| Check SSL expiry | `certbot certificates` |
| Renew SSL | `certbot renew --nginx` |
| Check Safety Wrapper | `curl http://127.0.0.1:8100/health` |
| Check OpenClaw | `curl http://127.0.0.1:18789/health` |
| View backup status | `cat /opt/letsbe/backups/backup-status.json \| jq` |
| Check firewall | `ufw status verbose` |
| Check fail2ban | `fail2ban-client status sshd` |

---

## 12. Changelog

| Version | Date | Changes |
|---------|------|---------|
| 1.0 | 2026-02-26 | Initial runbook. Covers: Netcup provisioning, 10-step pipeline, credential generation, backup/restore, monitoring stack, maintenance procedures, deprovisioning, disaster recovery, security operations, quick reference. |