Maintenance
Holden runs a nightly maintenance window that stops all containers, backs up data, and ensures all apps are reconciled with their latest configuration.
Why Stop Everything?
Every app stops during the maintenance window, even if there’s nothing to back up:
- Consistent backups: Databases and volumes are backed up while stopped, ensuring data consistency
- Clears memory leaks: Periodic restarts prevent gradual memory bloat
- Forces stateless design: If your app can’t handle restarts, you’ll find out during maintenance, not during a crisis
- Predictable window: All restarts happen at 3am (or whenever you schedule), not randomly throughout the day
The Maintenance Cycle
Queue Coordination
When maintenance starts:
- Set the isMaintenance = true flag
- Queue worker sees the flag and stops picking up new jobs
- Wait for any in-flight job to complete (e.g., a deployment mid-healthcheck)
- Proceed with maintenance
The flag is never cleared, because this Holden instance is replaced at the end of maintenance anyway. The fresh instance starts with the flag unset.
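The hand-off described above can be sketched as a small single-worker queue. This is an illustration under stated assumptions, not Holden's actual code: the JobQueue class and its method names are hypothetical, and only the isMaintenance flag (spelled is_maintenance here) comes from the text.

```python
import threading
from collections import deque


class JobQueue:
    """Hypothetical single-worker queue with a one-way maintenance flag."""

    def __init__(self):
        self._jobs = deque()
        self._lock = threading.Lock()
        self._idle = threading.Event()
        self._idle.set()              # no job is in flight yet
        self.is_maintenance = False   # never reset; the process is replaced instead

    def submit(self, job):
        with self._lock:
            self._jobs.append(job)

    def run_next(self):
        """Run one queued job. Returns False if nothing was picked up."""
        with self._lock:
            if self.is_maintenance or not self._jobs:
                return False
            job = self._jobs.popleft()
            self._idle.clear()        # mark the job as in flight
        try:
            job()
        finally:
            self._idle.set()          # job finished, successfully or not
        return True

    def begin_maintenance(self, timeout=None):
        """Set the flag, then wait for any in-flight job to finish."""
        with self._lock:
            self.is_maintenance = True
        return self._idle.wait(timeout)
```

Because the flag is set and the in-flight marker is cleared under the same lock, a job is either picked up before the flag is set (and then waited for) or not picked up at all.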
Per-App Processing
Holden processes apps one at a time to minimize total downtime:
1. Stop containers
2. Run backups
3. Pull new image (if update_policy: during_maintenance); this caches it for later
4. Start containers
After all apps are processed:
- Run cleanup (see below)
- Holden reboots via Overseer
The fresh Holden instance queues all apps on boot, triggering reconciliation. Apps with pre-pulled images (step 3) get updated; others are verified against their current config.
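As a rough sketch of the ordering only, the cycle can be written as a loop that reports each step to a callback. The App class, the run_maintenance function, and the step names are hypothetical; the only value taken from the text is update_policy: during_maintenance.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class App:
    name: str
    update_policy: str = ""   # "during_maintenance" opts in to pre-pulling


def run_maintenance(apps: List[App], act: Callable[[str, str], None]) -> None:
    """Call act(step, target) in the order the maintenance cycle describes."""
    for app in apps:                       # one app at a time
        act("stop", app.name)
        act("backup", app.name)
        if app.update_policy == "during_maintenance":
            act("pull", app.name)          # cached; applied after the reboot
        act("start", app.name)
    act("cleanup", "*")                    # only after every app is processed
    act("reboot", "holden")                # via Overseer
```

Passing a callback that appends to a list makes the resulting order easy to inspect without touching any containers.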
Timing
Maintenance is triggered when the cron-scheduled time passes. With schedule 0 3 * * *, maintenance runs after 3am daily.
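One way to read "when the scheduled time passes" is to compare the last maintenance run with the most recent scheduled time. The sketch below assumes that reading and only handles daily schedules of the form M H * * *; the function names are hypothetical, and how Holden actually tracks its last run is not stated here.

```python
from datetime import datetime, timedelta


def latest_scheduled(schedule: str, now: datetime) -> datetime:
    """Most recent firing time at or before `now` for a daily 'M H * * *' schedule."""
    minute, hour, dom, month, dow = schedule.split()
    if (dom, month, dow) != ("*", "*", "*"):
        raise ValueError("this sketch only handles daily schedules")
    fire = now.replace(hour=int(hour), minute=int(minute), second=0, microsecond=0)
    # If today's slot is still in the future, the latest slot was yesterday's.
    return fire if fire <= now else fire - timedelta(days=1)


def maintenance_due(schedule: str, last_run: datetime, now: datetime) -> bool:
    """True once a scheduled time has passed since the last run."""
    return last_run < latest_scheduled(schedule, now)
```

With schedule 0 3 * * * and a last run just after 3am yesterday, maintenance_due stays false until 3am today has passed.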
Configuration
Configure maintenance via environment variables:

    HOLDEN_MAINTENANCE_SCHEDULE="0 3 * * *"  # Cron format (default: 3am daily)
    HOLDEN_BACKUP_DIR=/mnt/backups            # Where backups are stored
    HOLDEN_MAX_BACKUPS=10                     # Keep this many per app (default: 10)

| Field | Required | Description |
|---|---|---|
| schedule | No | Cron expression (default: 0 3 * * *) |
| backup_dir | No | Backup staging directory (backups disabled if not set) |
| max_backups | No | Retention count per app (default: 10) |
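To make the defaults concrete, here is a minimal sketch of reading the three variables. MaintenanceConfig and load_config are hypothetical names, and treating an empty HOLDEN_BACKUP_DIR as unset is an assumption of this example.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class MaintenanceConfig:
    schedule: str = "0 3 * * *"
    backup_dir: Optional[str] = None   # None means backups are disabled
    max_backups: int = 10

    @property
    def backups_enabled(self) -> bool:
        return self.backup_dir is not None


def load_config(env: Mapping[str, str] = os.environ) -> MaintenanceConfig:
    """Read the documented variables, falling back to the documented defaults."""
    return MaintenanceConfig(
        schedule=env.get("HOLDEN_MAINTENANCE_SCHEDULE", "0 3 * * *"),
        # Assumption: an empty value is treated the same as an unset one.
        backup_dir=env.get("HOLDEN_BACKUP_DIR") or None,
        max_backups=int(env.get("HOLDEN_MAX_BACKUPS", "10")),
    )
```

Calling load_config({}) yields the defaults with backups disabled, which matches the "backups disabled if not set" note in the table.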
Backups
What Gets Backed Up
Needs (automatic) - All needs containers (postgres, valkey, garage) are backed up by default.
App volumes (opt-in) - Only volumes listed in backup_volumes:
    services:
      web:
        volumes:
          - ./uploads:/app/uploads
          - ./cache:/app/cache

    backup_volumes:
      - ./uploads  # Backed up
      # ./cache is not backed up

backup_volumes is defined at the app level (not per-service) and uses host paths (the left side of volume definitions).
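The matching rule can be sketched as a filter over the parsed app config. The example assumes the config has already been loaded into a plain dictionary; volumes_to_back_up is a hypothetical helper, not part of Holden.

```python
from typing import Dict, List


def volumes_to_back_up(app: Dict) -> List[str]:
    """Return the host paths from service volumes that are listed in backup_volumes."""
    wanted = set(app.get("backup_volumes", []))
    selected: List[str] = []
    for service in app.get("services", {}).values():
        for volume in service.get("volumes", []):
            host_path = volume.split(":", 1)[0]   # left side of host:container
            if host_path in wanted and host_path not in selected:
                selected.append(host_path)
    return selected
```

For the config shown above, this returns only ./uploads, and an app without backup_volumes gets no volume backups at all.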
Directory Structure
    /mnt/backups/
    ├── myapp/
    │   ├── 2024-01-15T03:00:00Z/
    │   │   ├── postgres/
    │   │   ├── valkey/
    │   │   └── uploads/
    │   └── 2024-01-14T03:00:00Z/
    │       └── ...
    └── other-app/
        └── ...

Offsite Sync
Holden stages backups locally. Getting them offsite is up to you:
    # /etc/cron.d/holden-offsite (runs after maintenance window)
    # Files in /etc/cron.d need a user field; root is assumed here.
    0 4 * * * root rclone sync /mnt/backups remote:holden-backups

Restore
To restore from a maintenance backup, see Backup & Restore.
Image Updates
Apps with update_policy: during_maintenance get their images pre-pulled during step 3 of per-app processing. This is the only time these apps check for new images, so there are no surprise updates during the day.
Use during_maintenance for production apps where you want predictable update windows.
Cleanup
After all apps have been processed, Holden removes resources that are no longer needed.
Stale Containers
Containers with the Holden label (holden.managed=true) that don’t match any registered app are removed. Containers end up in this state when you remove an app with holden app remove.
Data directories are never touched. If you want to delete an app’s data, do it manually from HOLDEN_BASE_DATA_DIR.
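The selection rule for stale containers can be sketched as a pure function over container metadata. The holden.managed label comes from the text; the holden.app label used to tie a container to its app is hypothetical, as is the stale_containers helper.

```python
from typing import Dict, Iterable, List, Set


def stale_containers(containers: Iterable[Dict], registered_apps: Set[str]) -> List[str]:
    """Names of Holden-managed containers whose app is no longer registered."""
    stale = []
    for c in containers:
        labels = c.get("labels", {})
        if labels.get("holden.managed") != "true":
            continue                      # not managed by Holden; leave it alone
        # "holden.app" is a hypothetical label linking a container to its app.
        if labels.get("holden.app") not in registered_apps:
            stale.append(c["name"])
    return stale
```

Only containers are selected here; nothing in this sketch touches data directories, in line with the rule above.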
Empty Networks
Holden attempts to remove all networks with the holden.managed=true label. Docker refuses to remove networks that have containers attached, so only empty networks are deleted.
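This "try everything and let Docker refuse" approach can be sketched with an injected remove function, so the example runs without a Docker daemon. NetworkInUse is a stand-in for whatever error the Docker client reports, and remove_empty_networks is a hypothetical helper.

```python
from typing import Callable, Iterable, List


class NetworkInUse(Exception):
    """Stand-in for the error raised when a network still has containers attached."""


def remove_empty_networks(names: Iterable[str], remove: Callable[[str], None]) -> List[str]:
    """Try to remove every network; return the ones that were actually removed."""
    removed = []
    for name in names:
        try:
            remove(name)
        except NetworkInUse:
            continue          # still has containers attached, so it stays
        removed.append(name)
    return removed
```

The benefit of this design is that no separate "is this network empty?" check is needed; the refusal itself is the check.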
Dangling Images
When HOLDEN_DANGLING_IMAGES is enabled, Holden prunes dangling Docker images after maintenance.
Holden Reboot
After cleanup, Holden always spawns an Overseer to recreate itself. This happens even if Holden’s image hasn’t changed.
Why always reboot?
- Fresh state: Any accumulated state or memory is cleared
- Config refresh: Env var changes (like HOLDEN_PUBLIC_DOMAIN) take effect
- Exercises the Overseer: Code paths that only run occasionally tend to bit-rot
- Queues all apps: The fresh Holden queues all apps on boot, ensuring reconciliation
The reboot adds ~10-30 seconds of Holden unavailability (during health check). Since this happens at 3am during a maintenance window, the impact is minimal.