Maintenance

Holden runs a nightly maintenance window that stops all containers, backs up data, and ensures all apps are reconciled with their latest configuration.

Why Stop Everything?

Every app stops during the maintenance window, even if there’s nothing to back up:

Consistent backups — Databases and volumes are backed up while stopped, ensuring data consistency
Clears memory leaks — Periodic restarts prevent gradual memory bloat
Forces stateless design — If your app can’t handle restarts, you’ll find out during maintenance, not during a crisis
Predictable window — All restarts happen at 3am (or whenever you schedule), not randomly throughout the day

The Maintenance Cycle

Queue Coordination

When maintenance starts:

Set isMaintenance = true flag
Queue worker sees the flag and stops picking up new jobs
Wait for any in-flight job to complete (e.g., a deployment mid-healthcheck)
Proceed with maintenance

The flag is never cleared — this Holden instance will die at the end of maintenance anyway. The fresh instance starts with the flag unset.
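The handshake above can be sketched as follows. This is a minimal illustration, not Holden's actual code: the `MaintenanceGate` class, its method names, and the use of a lock are all assumptions made for the example. The lock makes "check the flag and claim a job" a single step, so maintenance cannot slip in between the two.

```python
import queue
import threading


class MaintenanceGate:
    """Hypothetical stand-in for the isMaintenance flag plus the job queue."""

    def __init__(self):
        self._lock = threading.Lock()
        self.is_maintenance = False    # set once, never cleared
        self.idle = threading.Event()  # set while no job is in flight
        self.idle.set()
        self.jobs = queue.Queue()

    def worker_step(self):
        """Run at most one queued job; return False once maintenance has begun."""
        with self._lock:
            if self.is_maintenance:
                return False  # stop picking up new jobs
            try:
                job = self.jobs.get_nowait()
            except queue.Empty:
                return True
            self.idle.clear()  # a job is now in flight
        try:
            job()
        finally:
            self.idle.set()
        return True

    def begin_maintenance(self, timeout=None):
        """Set the flag, then wait for any in-flight job to finish."""
        with self._lock:
            self.is_maintenance = True
        return self.idle.wait(timeout)
```

A worker loop would call `worker_step()` repeatedly and exit when it returns `False`; the maintenance routine would call `begin_maintenance()` before touching any app.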

Per-App Processing

Holden processes apps one at a time, so each app is down only while its own containers are stopped and backed up:

Stop containers
Run backups
Pull new image (if update_policy: during_maintenance) — caches it for later
Start containers

After all apps are processed:

Run cleanup (see below)
Holden reboots via Overseer

The fresh Holden instance queues all apps on boot, triggering reconciliation. Apps with pre-pulled images (step 3) get updated; others are verified against their current config.
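The per-app sequence can be sketched as a plain loop. The `App` dataclass and the `stop`, `backup`, `pull`, and `start` callables are hypothetical placeholders for whatever Holden does internally; only the ordering and the `update_policy` check come from the steps above.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class App:
    name: str
    update_policy: str = ""  # "during_maintenance" enables the pre-pull


def run_maintenance(
    apps: Iterable[App],
    stop: Callable[[App], None],
    backup: Callable[[App], None],
    pull: Callable[[App], None],
    start: Callable[[App], None],
) -> List[str]:
    """Process apps sequentially; return names whose images were pre-pulled."""
    pre_pulled = []
    for app in apps:
        stop(app)                                      # step 1
        backup(app)                                    # step 2
        if app.update_policy == "during_maintenance":  # step 3 (conditional)
            pull(app)
            pre_pulled.append(app.name)
        start(app)                                     # step 4
    return pre_pulled
```

Error handling is deliberately left out: what happens when a backup or pull fails is not specified here, so the sketch does not guess.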

Timing

Maintenance starts once the cron-scheduled time has passed. With the schedule 0 3 * * *, it starts shortly after 3am each day, as soon as any in-flight job has completed.
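One way to read "triggered when the scheduled time passes" is: maintenance is due if a scheduled time falls between the last run and now. The sketch below assumes that interpretation and only handles daily schedules of the form `M H * * *`; the function names and the `last_run` bookkeeping are illustrative, not Holden's API.

```python
from datetime import datetime, timedelta
from typing import Tuple


def parse_daily(schedule: str) -> Tuple[int, int]:
    """Return (hour, minute) for a cron expression of the form 'M H * * *'."""
    minute, hour, dom, month, dow = schedule.split()
    if (dom, month, dow) != ("*", "*", "*"):
        raise ValueError("this sketch only handles daily schedules")
    return int(hour), int(minute)


def is_due(schedule: str, last_run: datetime, now: datetime) -> bool:
    """True if a scheduled time lies in the interval (last_run, now]."""
    hour, minute = parse_daily(schedule)
    # Most recent scheduled time that is not in the future.
    candidate = now.replace(hour=hour, minute=minute, second=0, microsecond=0)
    if candidate > now:
        candidate -= timedelta(days=1)
    return candidate > last_run
```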

Configuration

Configure maintenance via environment variables:

HOLDEN_MAINTENANCE_SCHEDULE="0 3 * * *"   # Cron format (default: 3am daily)
HOLDEN_BACKUP_DIR=/mnt/backups            # Where backups are stored
HOLDEN_MAX_BACKUPS=10                     # Keep this many per app (default: 10)

| Field | Required | Description |
|---|---|---|
| `schedule` | No | Cron expression (default: `0 3 * * *`) |
| `backup_dir` | No | Backup staging directory (backups disabled if not set) |
| `max_backups` | No | Retention count per app (default: 10) |
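As an illustration of how these settings and their defaults fit together, the following sketch reads the three environment variables into a small config object. The `MaintenanceConfig` class and `load_config` function are hypothetical; only the variable names and defaults come from the listing above.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class MaintenanceConfig:
    schedule: str
    backup_dir: Optional[str]
    max_backups: int

    @property
    def backups_enabled(self) -> bool:
        # Backups are disabled when no backup directory is configured.
        return self.backup_dir is not None


def load_config(env: Mapping[str, str] = os.environ) -> MaintenanceConfig:
    """Build the config from environment variables, applying documented defaults."""
    return MaintenanceConfig(
        schedule=env.get("HOLDEN_MAINTENANCE_SCHEDULE", "0 3 * * *"),
        backup_dir=env.get("HOLDEN_BACKUP_DIR") or None,
        max_backups=int(env.get("HOLDEN_MAX_BACKUPS", "10")),
    )
```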

Backups

What Gets Backed Up

Needs (automatic) - All needs containers (postgres, valkey, garage) are backed up by default.

App volumes (opt-in) - Only volumes listed in backup_volumes:

services:
  web:
    volumes:
      - ./uploads:/app/uploads
      - ./cache:/app/cache

backup_volumes:
  - ./uploads # Backed up
  # ./cache is not backed up

backup_volumes is defined at the app level (not per-service) and uses host paths (the left side of volume definitions).
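The opt-in rule can be expressed as a small filter over the parsed app config. This is a sketch under assumptions: the config is taken to be an already-parsed dictionary, the function name is made up, and requiring that a listed path is actually mounted by some service is a choice made for the example.

```python
def volumes_to_back_up(app_config: dict) -> list:
    """Return host paths that are mounted by a service and listed in backup_volumes."""
    wanted = set(app_config.get("backup_volumes", []))
    selected = []
    for service in app_config.get("services", {}).values():
        for volume in service.get("volumes", []):
            host_path = volume.split(":", 1)[0]  # left side of "host:container"
            if host_path in wanted and host_path not in selected:
                selected.append(host_path)
    return selected
```

With the config shown above, only `./uploads` is selected; `./cache` is mounted but not listed, so it is skipped.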

Directory Structure

/mnt/backups/
├── myapp/
│   ├── 2024-01-15T03:00:00Z/
│   │   ├── postgres/
│   │   ├── valkey/
│   │   └── uploads/
│   └── 2024-01-14T03:00:00Z/
│       └── ...
└── other-app/
    └── ...
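Because the snapshot directories are named with ISO 8601 UTC timestamps, sorting the names as strings also sorts them chronologically. A retention pass that keeps the newest `max_backups` snapshots per app could therefore look like the sketch below; the function name is hypothetical and the actual deletion is left to the caller.

```python
def backups_to_prune(snapshot_names, max_backups=10):
    """Return the snapshot names that fall outside the retention count.

    Names are expected to look like '2024-01-15T03:00:00Z', so plain
    string sorting orders them from oldest to newest.
    """
    if max_backups < 0:
        raise ValueError("max_backups must not be negative")
    newest_first = sorted(snapshot_names, reverse=True)
    return newest_first[max_backups:]
```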

Offsite Sync

Holden stages backups locally. Getting them offsite is up to you:

# /etc/cron.d/holden-offsite (runs after maintenance window)
# Entries in /etc/cron.d need a user field; root is assumed here
0 4 * * * root rclone sync /mnt/backups remote:holden-backups

Kopia, a fast and secure open-source backup/restore tool, is another option.

Restore

To restore from a maintenance backup, see Backup & Restore.

Image Updates

Apps with update_policy: during_maintenance get their images pre-pulled during step 3 of per-app processing. This is the only time these apps check for new images, so there are no surprise updates during the day.

Use during_maintenance for production apps where you want predictable update windows.

Cleanup

After all apps have been processed, Holden removes resources that are no longer needed.

Stale Containers

Containers with the Holden label (holden.managed=true) that don’t match any registered app are removed. Such containers are typically left behind after you remove an app with holden app remove.

Data directories are never touched. If you want to delete an app’s data, do it manually from HOLDEN_BASE_DATA_DIR.
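The stale-container check amounts to a label filter. In the sketch below, containers are plain dictionaries, and the `holden.app` label used to map a container to its app is hypothetical: the only label named here is `holden.managed=true`, so how Holden actually ties a container to an app is an assumption.

```python
def stale_containers(containers, registered_apps):
    """Return names of Holden-managed containers whose app is not registered."""
    registered = set(registered_apps)
    stale = []
    for container in containers:
        labels = container.get("labels", {})
        if labels.get("holden.managed") != "true":
            continue  # not managed by Holden; never touched
        # "holden.app" is a hypothetical label naming the owning app.
        if labels.get("holden.app") not in registered:
            stale.append(container["name"])
    return stale
```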

Empty Networks

Holden attempts to remove all networks with the holden.managed=true label. Docker refuses to remove networks that have containers attached, so only empty networks are deleted.
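The "try to remove everything and let Docker refuse" approach can be sketched as below. The `remove` callable and the `NetworkInUse` exception are hypothetical stand-ins for the real Docker call and its error; the point is only that a refusal is treated as a normal outcome rather than a failure.

```python
class NetworkInUse(Exception):
    """Hypothetical error standing in for Docker refusing to remove a busy network."""


def remove_empty_networks(networks, remove):
    """Attempt to remove every managed network; return the ones actually removed."""
    removed = []
    for name in networks:
        try:
            remove(name)
        except NetworkInUse:
            continue  # containers still attached, so the network stays
        removed.append(name)
    return removed
```

This avoids a separate "is the network empty?" query, which could be out of date by the time the removal runs.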

Dangling Images

When HOLDEN_DANGLING_IMAGES is enabled, Holden prunes dangling Docker images after maintenance.

Holden Reboot

After cleanup, Holden always spawns an Overseer to recreate itself. This happens even if Holden’s image hasn’t changed.

Why always reboot?

Fresh state — Any accumulated state or memory is cleared
Config refresh — Env var changes (like HOLDEN_PUBLIC_DOMAIN) take effect
Exercises the Overseer — Code paths that only run occasionally tend to bit-rot
Queues all apps — The fresh Holden queues all apps on boot, ensuring reconciliation

The reboot adds ~10-30 seconds of Holden unavailability (during health check). Since this happens at 3am during a maintenance window, the impact is minimal.