Reconciliation
Reconciliation compares desired state (from config) to running state (from Docker) and makes them match. This is the core of what Holden does when processing an app from the queue.
The Process
When an app is taken from the queue, Holden first checks whether anything has changed. If not, the app is skipped entirely.
flowchart TD
A["Quick check: containers, git, image digests"] --> B{Changed?}
B -->|No| C["Skip"]
B -->|Yes| D["Fetch config from git"]
D --> E["Resolve variables"]
E --> F["Compare desired vs running"]
F --> G["Apply changes"]
Quick Check
Before doing a full reconcile, Holden checks for changes cheaply:
- Container state: Are all expected containers running? If any are missing or stopped, reconcile.
- Git changes: `git ls-remote` compares the remote HEAD to the last deployed commit. No clone needed.
- Image digests: For apps with `update_policy: always` (the default), check the registry for newer images.
If all checks pass, the app is skipped — no clone, no diff, no Docker API calls.
The quick check only runs when Holden has already reconciled the app at least once (it needs a previous state to compare against). First-time reconciliations always go through the full process.
Git network errors trigger a full reconcile (fail-open — don’t skip when you can’t verify). Image digest errors skip the check (fail-closed — don’t trigger a reconcile you might not need).
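The decision logic above can be sketched as a single function. This is a minimal illustration, not Holden's implementation: `QuickCheckInputs`, its fields, and the use of `OSError` to signal a network failure are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class QuickCheckInputs:
    """Hypothetical inputs; none of these names come from Holden itself."""
    has_previous_state: bool                    # False before the first reconcile
    containers_ok: bool                         # all expected containers running
    last_deployed_commit: Optional[str]
    remote_head: Callable[[], str]              # e.g. wraps `git ls-remote`; may raise OSError
    update_policy: str                          # "always" enables the digest check
    newer_image_available: Callable[[], bool]   # registry lookup; may raise OSError


def needs_full_reconcile(inp: QuickCheckInputs) -> bool:
    """Return True when the app must go through the full reconcile."""
    if not inp.has_previous_state:
        return True  # nothing to compare against yet
    if not inp.containers_ok:
        return True  # a container is missing or stopped
    try:
        if inp.remote_head() != inp.last_deployed_commit:
            return True
    except OSError:
        return True  # git could not be verified, so do not skip
    if inp.update_policy == "always":
        try:
            if inp.newer_image_available():
                return True
        except OSError:
            pass  # digest lookup failed, so this check is ignored
    return False
```

The two `except` branches are deliberately asymmetric: a git failure forces a reconcile, while a registry failure only drops the digest check.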
Full Reconcile
When the quick check detects changes (or can’t run), Holden does a full reconcile:
- Fetch config from git (sparse checkout, only `holden.yml` and `holden.vars.yml`)
- Reconcile needs (postgres, valkey, etc.), health-checked before proceeding
- Resolve variables (`${needs.*}`, `${config.*}`, `${secret.*}`)
- Pull images: force-pull when `update_policy` is `always`, skip if image exists locally otherwise
- Compare desired state to running state
- Apply changes
Webhooks, `holden deploy`, and `holden deploy <app>` always trigger a full reconcile with force-pull; they bypass the quick check.
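To make the variable resolution step concrete, here is a rough sketch of placeholder substitution over the three namespaces named above. The key syntax accepted by the regular expression, the `values` layout, and the choice to raise on an unknown variable are assumptions made for illustration; Holden's actual resolver may behave differently.

```python
import re

# Matches ${needs.x}, ${config.x} and ${secret.x}. The set of characters
# allowed in the key is an assumption for this sketch.
_PLACEHOLDER = re.compile(r"\$\{(needs|config|secret)\.([A-Za-z0-9_.-]+)\}")


def resolve_variables(text: str, values: dict) -> str:
    """Replace each placeholder with values[namespace][key].

    Raises KeyError for an unknown variable rather than leaving it in place
    (a design choice of this sketch, not documented Holden behavior).
    """
    def replace(match: re.Match) -> str:
        namespace, key = match.group(1), match.group(2)
        try:
            return str(values[namespace][key])
        except KeyError:
            raise KeyError(f"unresolved variable: {match.group(0)}") from None

    return _PLACEHOLDER.sub(replace, text)
```

For example, with `values = {"needs": {"postgres.url": "postgres://db:5432/app"}, "config": {}, "secret": {}}`, the string `DATABASE_URL=${needs.postgres.url}` resolves to `DATABASE_URL=postgres://db:5432/app`.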
Comparing State
For each service in the config, Holden compares desired state to running state:
| Situation | Action |
|---|---|
| New service (not running) | Pull image, create container |
| Config changed | Zero-downtime update if health check defined, otherwise stop-and-recreate |
| New image available | Zero-downtime update if health check defined, otherwise stop-and-recreate |
| No changes | Nothing |
| Service removed from config | Remove container |
Running state is discovered via labels: Holden looks for containers labeled `holden.managed=true`.
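The table can be read as a small planning function. The sketch below is illustrative only: `ServiceSpec`, its fields, and the action strings are hypothetical, and comparing a config hash and an image digest is an assumed way of detecting the "config changed" and "new image" rows.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class ServiceSpec:
    """Hypothetical summary of one service; not Holden's data model."""
    image_digest: str
    config_hash: str
    has_health_check: bool = False


def plan_actions(
    desired: Dict[str, ServiceSpec],
    running: Dict[str, ServiceSpec],
) -> List[Tuple[str, str]]:
    """Return (service, action) pairs following the table above."""
    actions: List[Tuple[str, str]] = []
    for name, want in desired.items():
        have = running.get(name)
        if have is None:
            actions.append((name, "create"))  # new service: pull image, create
        elif (have.config_hash != want.config_hash
              or have.image_digest != want.image_digest):
            # Config or image changed: the update strategy depends on
            # whether a health check is defined.
            strategy = ("zero-downtime-update" if want.has_health_check
                        else "stop-and-recreate")
            actions.append((name, strategy))
        # Otherwise nothing changed and no action is planned.
    for name in running:
        if name not in desired:
            actions.append((name, "remove"))  # service removed from config
    return actions
```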
Needs First
Needs containers (postgres, valkey, etc.) are reconciled first and must pass health checks before services start. Holden waits up to 45 seconds for each needs container to become healthy; if the timeout expires, the deploy fails. This ensures databases are ready before your app tries to connect.
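A health wait with a deadline might look like the following sketch. Only the 45-second limit comes from the text; the one-second polling interval, the function name, and the injectable `clock` and `sleep` parameters are assumptions added so the loop can be exercised without real waiting.

```python
import time
from typing import Callable

NEEDS_HEALTH_TIMEOUT_SECONDS = 45.0  # limit stated in the text above


def wait_until_healthy(
    is_healthy: Callable[[], bool],
    timeout: float = NEEDS_HEALTH_TIMEOUT_SECONDS,
    interval: float = 1.0,                       # assumed polling interval
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll is_healthy until it returns True or the timeout expires.

    Returns True if the container became healthy in time, False otherwise.
    A caller would fail the deploy on False.
    """
    deadline = clock() + timeout
    while True:
        if is_healthy():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

Using a monotonic clock keeps the deadline stable if the system wall clock is adjusted during the wait.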
Restart Policy
All containers are created with `restart: unless-stopped`, and this isn’t configurable. Holden manages services that should be running; the restart policy handles what Docker does between reconciliation runs.
For one-shot tasks, use `docker run` directly.
Queue Isolation
The queue worker processes apps sequentially. If app A takes a long time (slow image pull, hanging health check), apps B, C, D wait their turn.
If a webhook or poll timer pushes app A again while it’s still being processed, the push is a no-op (queue is deduplicated). Once the current reconciliation finishes, app A won’t be in the queue unless something pushed it again after completion.
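One way to get this behavior is a FIFO queue paired with a set of app names that are either waiting or currently being processed. The class below is a sketch under that assumption; `DedupQueue` and its method names are hypothetical and say nothing about how Holden's queue is actually built.

```python
from collections import deque
from typing import Deque, Optional, Set


class DedupQueue:
    """FIFO queue where pushing a queued or in-flight app is a no-op."""

    def __init__(self) -> None:
        self._pending: Deque[str] = deque()
        self._known: Set[str] = set()  # waiting or currently being processed

    def push(self, app: str) -> bool:
        """Enqueue app; return False if the push was deduplicated."""
        if app in self._known:
            return False
        self._known.add(app)
        self._pending.append(app)
        return True

    def take(self) -> Optional[str]:
        """Return the next app to process, or None if the queue is empty."""
        return self._pending.popleft() if self._pending else None

    def done(self, app: str) -> None:
        """Mark processing finished so later pushes are accepted again."""
        self._known.discard(app)
```

In this sketch an app stays in the known set from `push` until `done`, so a push that arrives mid-reconcile is dropped and only pushes after completion re-queue the app.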
Crash Recovery
If Holden crashes mid-deploy, it may leave a `-next` container behind (from a zero-downtime deployment). On restart:
- Holden boots and queues all apps
- When the app is processed, reconciliation sees the `-next` container
- Removes it as a leftover
- Proceeds with a fresh deploy
The old container keeps running throughout — no downtime from the crash. The deploy just restarts from the beginning.
Polling
Holden re-queues all apps every `HOLDEN_POLL_INTERVAL` seconds (default: 300). Set to 0 to disable.
For users with webhooks, polling acts as a safety net — catching anything webhooks might have missed (network issues, GitHub outages) and detecting drift from external changes (someone manually stopped a container).
For users without webhooks, polling is the primary trigger for detecting changes.
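As a small illustration of the interval setting, the sketch below reads `HOLDEN_POLL_INTERVAL` with the documented default of 300 and treats 0 as disabled. The function name and the rejection of negative values are assumptions; the text does not say how Holden handles invalid input.

```python
import os
from typing import Mapping, Optional

DEFAULT_POLL_INTERVAL = 300  # seconds, the documented default


def poll_interval_seconds(env: Mapping[str, str] = os.environ) -> Optional[int]:
    """Return the polling interval in seconds, or None when polling is disabled."""
    raw = env.get("HOLDEN_POLL_INTERVAL", str(DEFAULT_POLL_INTERVAL))
    seconds = int(raw)  # raises ValueError for non-numeric input
    if seconds < 0:
        # Assumption of this sketch: negative intervals are rejected.
        raise ValueError("HOLDEN_POLL_INTERVAL must be 0 or greater")
    return seconds or None  # 0 disables polling
```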