
Multi-instance & auto-scaling

resources.instances and [resources.autoscale]

Run multiple containers behind a single Caddy load balancer with active health checks. There are two ways to configure this in percher.toml:

Static fan-out

[resources]
instances = 2   # runs 2 containers, load-balanced

Plan caps: free=1, starter=1, maker=2, pro=4. Values above the cap are clamped to the plan maximum, and the clamp is noted in the build log.
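The clamping rule above can be sketched as a small function. This is an illustrative sketch, not Percher's actual code: the function name and return shape are assumptions, while the per-plan caps come from the docs.

```python
# Per-plan instance caps, as documented: free=1, starter=1, maker=2, pro=4.
PLAN_CAPS = {"free": 1, "starter": 1, "maker": 2, "pro": 4}

def effective_instances(requested, plan):
    """Return the instance count actually deployed, plus a build-log
    note when the requested count exceeded the plan cap (else None).
    Hypothetical helper; names are illustrative."""
    cap = PLAN_CAPS[plan]
    if requested > cap:
        return cap, f"instances={requested} exceeds {plan} cap; clamped to {cap}"
    return requested, None
```

For example, requesting `instances = 5` on the maker plan would deploy 2 containers and record a note in the build log.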

CPU-based autoscaling

[resources.autoscale]
min = 1
max = 4

Percher samples CPU every 30s, evaluates every 50s, and scales by ±1 when the sustained average crosses a threshold (default: scale up at >80% for 2 min, scale down at <20% for 10 min). Scaling is conservative: each action adds or removes a single instance, and a cooldown between actions prevents thrashing.
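The decision rule above can be sketched roughly as follows. This is a minimal illustration assuming the documented defaults (up at >80% average over 2 min, down at <20% average over 10 min, one instance per action); the function and parameter names are hypothetical, and the cooldown is omitted for brevity.

```python
def decide_scale(samples, now, current, min_n, max_n,
                 up_pct=80.0, up_window=120, down_pct=20.0, down_window=600):
    """samples: list of (unix_ts, cpu_percent) taken every ~30s.
    Returns the new instance count, moving by at most one instance."""
    def avg_over(window):
        recent = [cpu for ts, cpu in samples if now - ts <= window]
        return sum(recent) / len(recent) if recent else None

    up_avg = avg_over(up_window)      # 2-minute average
    down_avg = avg_over(down_window)  # 10-minute average
    if up_avg is not None and up_avg > up_pct and current < max_n:
        return current + 1            # sustained high CPU: add one instance
    if down_avg is not None and down_avg < down_pct and current > min_n:
        return current - 1            # sustained low CPU: remove one instance
    return current                    # otherwise hold steady
```

With 90% CPU sustained over the last two minutes and `max = 4`, a two-instance deploy would step up to three; it would not jump straight to four.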

Active Caddy health checks route traffic around any unhealthy instance. A single-instance crash in a multi-instance deploy stays at severity warning, since the app keeps serving from the remaining instances; the app is only marked crashed when all instances are down simultaneously. Per-deploy scale history appears on the deploy detail card in the dashboard.
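The severity rule above reduces to a simple mapping from per-instance health to a deploy-level status. A minimal sketch, with hypothetical names and status strings taken from the prose above:

```python
def deploy_severity(instance_healthy):
    """Map a list of per-instance health booleans to a deploy severity.
    Illustrative only; not Percher's actual API."""
    if all(instance_healthy):
        return None        # everything up: no alert
    if any(instance_healthy):
        return "warning"   # app still serving via the healthy instances
    return "crashed"       # all instances down simultaneously
```

So a deploy with instances `[up, down]` reports a warning, while `[down, down]` is a crash.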
