PostgreSQL backup, done right.
pg_hardstorage is an enterprise-grade PostgreSQL backup tool — a
single static Go binary, PostgreSQL 15–18, Apache 2.0.
The core idea: continuous WAL streaming is the always-on data
plane. A long-running pg_hardstorage wal stream process holds a
physical replication slot and ships every byte of WAL into the
repository as PostgreSQL writes it. Periodic base backups are just
the anchor the stream rolls forward from. Daily base backup + a 24/7
stream = byte-precise point-in-time recovery with no gaps.
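To make the recovery model concrete: a point-in-time restore picks the newest base backup that finished at or before the target, then replays streamed WAL from that anchor up to the target. A minimal Python sketch of the selection step, assuming backups are identified only by their completion timestamps. The function name and data shape are illustrative, not pg_hardstorage internals:

```python
from bisect import bisect_right
from datetime import datetime, timezone


def pick_base_backup(backup_end_times, target):
    """Return the newest base backup completion time that is <= target."""
    ordered = sorted(backup_end_times)
    idx = bisect_right(ordered, target)
    if idx == 0:
        raise ValueError("no base backup precedes the recovery target")
    return ordered[idx - 1]


# Hypothetical daily backups; WAL replay covers the span from the anchor to the target.
backups = [datetime(2026, 1, d, 2, 0, tzinfo=timezone.utc) for d in (1, 2, 3)]
target = datetime(2026, 1, 2, 17, 45, tzinfo=timezone.utc)
anchor = pick_base_backup(backups, target)
```

Here `anchor` is the 2 January backup: the 3 January backup is newer than the target, so it cannot serve as the starting point.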
Everything happens over the PostgreSQL replication protocol on an ordinary libpq connection — the agent never needs OS access to the database host, and the same binary scales from a laptop to a 100 TB production fleet.
Maintained by CYBERTEC PostgreSQL International GmbH.
curl -sSL https://get.pghardstorage.org | sh
pg_hardstorage version
brew install cybertec-postgresql/tap/pg_hardstorage
./compile.sh # build bin/pg_hardstorage (needs Go 1.26+)
Bring up a temporary PostgreSQL 18 and run the full backup/restore/verify flow in Docker:
pg_hardstorage demo
The demo prints progress and a result summary. No existing PostgreSQL or pg_hardstorage configuration is needed, just a running Docker daemon.
pg_hardstorage init --quick
Auto-detects a local PostgreSQL 18, creates a file:// repo, takes the first backup, and prints the next steps. Zero prompts. Zero decisions.
make build-simple # build the interactive helper
./bin/pg_hardstorage_simple
No flags, no subcommands. Pick a number and answer the prompts:
What would you like to do?
1. Set up backups for a database I haven't backed up before
2. Take a backup right now
3. Start continuous protection (base backup + WAL streaming)
4. See what's in my repository
5. Verify a backup is restorable
6. Restore a backup
q. quit
docker compose up
Brings up PostgreSQL 18 + pg_hardstorage agent + MinIO (S3-compatible repo) + Prometheus + Grafana. Full evaluation environment in one command.
The agent is started with --metrics-listen 0.0.0.0:9187, so the
pg_hardstorage_* metrics are reachable and scraped out of the box:
| Service | URL | Notes |
|---|---|---|
| Agent /metrics | http://localhost:9187/metrics | Prometheus text exposition |
| Prometheus | http://localhost:9090 | scrapes the agent every 15s |
| Grafana | http://localhost:3000 | admin/admin; Prometheus datasource + a pg_hardstorage overview dashboard are provisioned at boot |
| MinIO console | http://localhost:9001 | minioadmin/minioadmin |
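For a quick look at the agent's metrics without opening Prometheus, the text exposition can be filtered by the pg_hardstorage_ prefix. A minimal Python sketch, assuming samples carry no trailing timestamp; the metric name in the sample text is made up for illustration and is not a documented pg_hardstorage metric:

```python
def parse_metrics(text, prefix="pg_hardstorage_"):
    """Return {series: value} for samples whose name starts with `prefix`."""
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and # HELP / # TYPE comments
        series, _, value = line.rpartition(" ")
        if series.startswith(prefix):
            samples[series] = float(value)
    return samples


# Hypothetical exposition text; the metric names below are invented.
sample = """\
# HELP pg_hardstorage_example_bytes_total Illustrative counter.
# TYPE pg_hardstorage_example_bytes_total counter
pg_hardstorage_example_bytes_total{deployment="db1"} 1024
go_goroutines 12
"""
metrics = parse_metrics(sample)
```

Against the compose environment, the same function could be fed the body of http://localhost:9187/metrics, for example via `urllib.request.urlopen`.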
No PostgreSQL handy? Bring up a throwaway PG + agent + repo entirely on your local Docker daemon:
./scripts/devcluster.sh up
Prefer the explicit path? The canonical loop is init the repo → start the streamer → take a base backup → restore:
# 1. create the repository
pg_hardstorage repo init file:///srv/pg-backups/db1
# 2. start continuous WAL streaming (long-running — supervise with systemd)
pg_hardstorage wal stream db1 \
--pg-connection "host=10.0.0.10 user=replicator dbname=postgres" \
--repo file:///srv/pg-backups/db1
# 3. take a base backup (runs concurrently with the streamer)
pg_hardstorage backup db1 \
--pg-connection "host=10.0.0.10 user=replicator dbname=postgres" \
--repo file:///srv/pg-backups/db1
# 4. verify it, then prove it restores
pg_hardstorage verify db1 latest --repo file:///srv/pg-backups/db1
pg_hardstorage restore db1 latest --repo file:///srv/pg-backups/db1 \
--target /var/lib/postgresql/restored
pg_hardstorage init runs the same connect → init-repo → first-backup flow as a guided wizard. The getting-started tutorial walks the whole round-trip end to end.
- Native WAL streaming over the PostgreSQL replication protocol — the headline feature; everything else exists to serve it.
- Base backups that interleave with the live stream: the streamer keeps running while backup executes.
- Point-in-time recovery to a time, an LSN, or a named restore point, with a --preview dry-run.
- Patroni-aware failover: four cooperating mechanisms keep the stream gap-free across leader switches.
- Drop-in compatibility with pgBackRest, Barman and WAL-G — the same CLI surface, plus a config translator.
- Kubernetes — runs as a CronJob / Deployment; verified end-to-end against CloudNativePG.
- Content-addressed deduplication (FastCDC, page-aligned splits) — no incremental chains to break.
- AES-256-GCM envelope encryption; a FIPS / BoringCrypto build variant is available.
- 4 Tier-1 KMS providers — AWS KMS · GCP KMS · Azure Key Vault · HashiCorp Vault Transit. (A PKCS#11 / HSM provider is in progress.)
- 6 Tier-1 storage backends — filesystem · S3 · GCS · Azure Blob · SFTP · SCP.
- LLM-assisted operator surface for the 3am restore — read-only by default, every command previewed before it runs.
- Structured output everywhere (--output json|ndjson|yaml|…, 11 renderers) and 14 notification sinks.
- Schema-versioned wire formats with a 24-month backward-compatibility commitment.
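The "no incremental chains to break" claim follows from content addressing: each backup is a self-contained list of chunk digests, and identical chunks are stored once. The Python sketch below illustrates the idea with fixed-size, page-aligned chunks and SHA-256. That is a deliberate simplification: it is not FastCDC, and the chunk size, hash choice and in-memory store are assumptions rather than pg_hardstorage's actual format.

```python
import hashlib

PAGE_SIZE = 8192  # PostgreSQL's default block size


def store_backup(data, store, pages_per_chunk=4):
    """Split data into page-aligned chunks, store each once, return the digest list."""
    chunk_size = PAGE_SIZE * pages_per_chunk
    refs = []
    for offset in range(0, len(data), chunk_size):
        chunk = data[offset:offset + chunk_size]
        digest = hashlib.sha256(chunk).hexdigest()
        store.setdefault(digest, chunk)  # identical content is stored only once
        refs.append(digest)
    return refs


def restore_backup(refs, store):
    """Rebuild a backup from its digest list alone; no other backup is consulted."""
    return b"".join(store[d] for d in refs)


store = {}
chunk = PAGE_SIZE * 4
monday = b"".join(bytes([i]) * chunk for i in range(4))
# Tuesday differs from Monday in exactly one chunk.
tuesday = monday[:2 * chunk] + bytes([9]) * chunk + monday[3 * chunk:]
monday_refs = store_backup(monday, store)
tuesday_refs = store_backup(tuesday, store)
```

After both runs the store holds five chunks instead of eight, and either backup restores from its own digest list without depending on the other.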
Works against any PostgreSQL that exposes the replication protocol — bare metal, VMs, Patroni clusters, and PostgreSQL behind a Kubernetes operator. Fully-managed DBaaS offerings (RDS, Cloud SQL, …) that do not expose the replication protocol are not supported.
In production you run two pg_hardstorage processes side by
side, against the same repository:
        ┌────────── PostgreSQL ──────────┐
        │ data dir + pg_wal/             │
        └──┬──────────────────────────┬──┘
           │ replication protocol     │ replication protocol
           ▼ (BASE_BACKUP)            ▼ (START_REPLICATION SLOT … PHYSICAL)
┌─────────────────────────┐  ┌────────────────────────────┐
│ pg_hardstorage backup   │  │ pg_hardstorage wal stream  │
│ scheduled (e.g. daily)  │  │ always-on, never stops     │
└────────────┬────────────┘  └──────────────┬─────────────┘
             │                              │
             └─────► same repo URL ◄────────┘
The streamer holds a physical slot created with RESERVE_WAL, so
PostgreSQL retains every WAL segment from restart_lsn onwards from
the moment the slot exists — not from the moment the first stream
byte flows. A crash of the streamer is just a restart; no gaps. The
base backup runs concurrently; the two processes never coordinate
beyond both pointing at the same repository.
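Because the slot pins WAL from restart_lsn onwards, the amount of WAL PostgreSQL is holding for the streamer is simply the distance between the slot's restart_lsn (from the pg_replication_slots view) and the current write position (from pg_current_wal_lsn()). An LSN such as 0/3000028 is two hexadecimal halves of a 64-bit byte position. A small Python sketch of that arithmetic; the helper names are illustrative:

```python
def parse_lsn(text):
    """Convert a PostgreSQL LSN like '0/3000028' to a 64-bit byte position."""
    high, low = text.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def retained_wal_bytes(restart_lsn, current_lsn):
    """Bytes of WAL the server must keep for a slot at `restart_lsn`."""
    return parse_lsn(current_lsn) - parse_lsn(restart_lsn)


# Example values: a slot 40 bytes behind the current write position.
lag = retained_wal_bytes("0/3000000", "0/3000028")
```

If the streamer is down for a while, this number grows until it reconnects, which is why the preflight warns about settings that cap slot retention.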
Before opening the stream, the agent runs a configuration
preflight — wal_level, max_replication_slots,
max_wal_senders, the role's REPLICATION attribute, plus
max_slot_wal_keep_size / idle_replication_slot_timeout warnings —
and a start-LSN safety check against the slot's restart_lsn. The
same preflight is reachable standalone via pg_hardstorage wal preflight <deployment> for setup runbooks and CI gates.
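The checks listed above can be pictured as a pure function over the server's settings. The Python sketch below is an illustration of that shape, not the agent's implementation: the thresholds, the message wording, and the choice to warn whenever max_slot_wal_keep_size or idle_replication_slot_timeout is set to a non-default value are assumptions.

```python
def preflight(settings, role_can_replicate):
    """Return (errors, warnings) for a dict of setting name -> string value."""
    errors, warnings = [], []
    if settings.get("wal_level") not in ("replica", "logical"):
        errors.append("wal_level must be replica or logical")
    for name in ("max_replication_slots", "max_wal_senders"):
        if int(settings.get(name, "0")) < 1:
            errors.append(f"{name} must be at least 1")
    if not role_can_replicate:
        errors.append("role lacks the REPLICATION attribute")
    # Assumed policy: any cap on slot retention can invalidate a lagging slot.
    if int(settings.get("max_slot_wal_keep_size", "-1")) != -1:
        warnings.append("max_slot_wal_keep_size caps WAL retained for slots")
    if int(settings.get("idle_replication_slot_timeout", "0")) != 0:
        warnings.append("idle_replication_slot_timeout can invalidate an idle slot")
    return errors, warnings


good = {"wal_level": "replica", "max_replication_slots": "10", "max_wal_senders": "10"}
errors, warnings = preflight(good, role_can_replicate=True)
```

The setting values would typically come from a query against pg_settings, and the role flag from pg_roles.rolreplication.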
pg_hardstorage is built for highly-available clusters. When a
Patroni leader switch happens, four cooperating mechanisms keep the
WAL stream gap-free:
- permanent slots — the replication slot exists on every node, including the one that becomes the new leader;
- PG 17+ synced slots — PostgreSQL itself keeps the slot in sync across the cluster;
- recreate-on-detection — the streamer reconnects through Patroni's REST API (never a stale hostname) and recreates the slot if it has to;
- gap auditor: a periodic check emits a wal.gap_detected event if any of the above is misconfigured.
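What a gap check looks for can be sketched in a few lines. The Python example below assumes WAL is tracked as standard 24-character segment names with the default 16 MiB segment size (256 segments per log file ID) and ignores the timeline component. How pg_hardstorage actually lays out WAL in the repository, and how its auditor works, is not specified here, so treat this as an illustration of the idea only.

```python
def segment_name(timeline, log, seg):
    """Build a standard WAL segment file name from its three components."""
    return f"{timeline:08X}{log:08X}{seg:08X}"


def segment_number(name, segments_per_log=256):
    """Map a WAL segment name to a monotonically increasing integer."""
    log, seg = int(name[8:16], 16), int(name[16:24], 16)
    return log * segments_per_log + seg


def find_gaps(names):
    """Return (first_missing, last_missing) segment-number ranges."""
    numbers = sorted({segment_number(n) for n in names})
    return [(a + 1, b - 1) for a, b in zip(numbers, numbers[1:]) if b - a > 1]


# Segments 1, 2 and 4 are present; segment 3 is missing.
archived = [segment_name(1, 0, s) for s in (1, 2, 4)]
gaps = find_gaps(archived)
```

A non-empty result is the condition under which an auditor would raise an event such as wal.gap_detected.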
pg_hardstorage patroni gives a read-only view of a Patroni-managed
cluster. See the Patroni tutorial
and the failover deep-dive.
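Resolving the leader through Patroni rather than a fixed hostname amounts to reading the cluster view and picking the running leader. The Python sketch below works on a document shaped like the response of Patroni's GET /cluster endpoint (a "members" list whose entries carry "role", "state", "host" and "port"). The sample data is invented, and whether pg_hardstorage uses this endpoint or another one is an assumption.

```python
def leader_endpoint(cluster):
    """Return (host, port) of the running leader, or None if there is none."""
    for member in cluster.get("members", []):
        if member.get("role") == "leader" and member.get("state") == "running":
            return member["host"], member["port"]
    return None


# Hypothetical cluster view, e.g. parsed with json.loads() from the REST response.
cluster = {
    "members": [
        {"name": "node1", "role": "replica", "state": "streaming", "host": "10.0.0.10", "port": 5432},
        {"name": "node2", "role": "leader", "state": "running", "host": "10.0.0.11", "port": 5432},
    ]
}
endpoint = leader_endpoint(cluster)
```

During a switchover there may briefly be no running leader, so a caller should treat `None` as "retry shortly" rather than as a fatal error.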
Migrating off another tool does not mean rewriting your automation.
make build-compat produces multicall shim binaries that present the
same command-line surface as the tool they replace:
| Shim | Replaces |
|---|---|
| pg-hardstorage-pgbackrest | pgbackrest |
| pg-hardstorage-barman / pg-hardstorage-barman-wal-archive | barman / barman-wal-archive |
| pg-hardstorage-walg | wal-g |
Drop the shim in where the old tool was — in an archive_command, a
cron job, or an operator's container image — and pg_hardstorage
runs underneath. pg_hardstorage compat translate --from <tool> <config-path> reads an existing pgbackrest.conf / barman.conf and
emits a ready-to-review pg_hardstorage.yml. See the
migration how-to guides.
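"Multicall" usually means one executable that chooses its behaviour from the name it was invoked under, in the style of BusyBox. The Python sketch below shows that dispatch pattern using the shim names from the table above. Whether the real shims dispatch exactly this way is an assumption, and the "native" fallback label is hypothetical.

```python
import os

# Shim name -> the tool whose command-line surface it presents.
SHIMS = {
    "pg-hardstorage-pgbackrest": "pgbackrest",
    "pg-hardstorage-barman": "barman",
    "pg-hardstorage-barman-wal-archive": "barman-wal-archive",
    "pg-hardstorage-walg": "wal-g",
}


def personality(argv0):
    """Pick the CLI surface to emulate from the invoked program name."""
    return SHIMS.get(os.path.basename(argv0), "native")


# An archive_command calling the shim by its installed path.
mode = personality("/usr/local/bin/pg-hardstorage-walg")
```

The appeal of the pattern is that existing scripts keep their argument lists; only the binary behind the name changes.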
PostgreSQL behind a Kubernetes operator is ordinary PostgreSQL in a
pod — pg_hardstorage backs it up over the replication protocol like
any other instance:
- Run it in its own pod — a CronJob for scheduled base backups, a Deployment for continuous WAL streaming — pointed at the database Service. No sidecar, no operator plugin required.
- Verified end-to-end against CloudNativePG: backup → verify → restore round-trips against a CNPG cluster. The reproducible script lives at test/k8s/.
- An in-tree Helm chart (charts/pg-hardstorage-sidecar) deploys the agent as a StatefulSet.
- The pgBackRest / Barman / WAL-G compat shims slot into operator images that expect those tools.
A native CloudNativePG-I provider is on the roadmap; today the CronJob / Deployment model above is the verified path. See the Kubernetes how-to guides and the CNPG tutorial.
pg_hardstorage is a single static binary. Requires Go 1.26+ to
build.
./compile.sh # downloads deps, builds bin/pg_hardstorage
./compile.sh --testkit # also build bin/pg_hardstorage_testkit
./compile.sh --fips # FIPS / BoringCrypto variant (Linux/amd64)
./compile.sh --pkcs11 # PKCS#11 / HSM variant (cgo + libpkcs11; in progress)
./compile.sh --firecracker # microVM verifier-sandbox variant
./compile.sh --help # full options
Or via the canonical Makefile:
make # build bin/pg_hardstorage + bin/pg_hardstorage_testkit
make build-simple # the interactive quick-start helper
make build-compat # the pgBackRest / Barman / WAL-G shims
make build-fips # FIPS / BoringCrypto variant
make build-firecracker # microVM verifier-sandbox variant
make test # go test -race -count=1 ./...
make test-integration # adds -tags=integration; requires Docker
Other ways to run it:
- Containers: build from deploy/docker/Dockerfile (distroless, runs as nonroot); see the Kubernetes guides.
- systemd: deploy/systemd/ ships pg_hardstorage.service plus a pg_hardstorage@<deployment>.service template for multi-instance hosts.
- Linux packages: the release pipeline produces signed .deb and .rpm artefacts; see the packaging guide.
Release artefacts are cosign-signed (keyless / Sigstore) and ship an SPDX SBOM.
The most-cited commands. See the operator guide for the full reference.
# Validate PG is ready to stream (also runs automatically inside `wal stream`)
pg_hardstorage wal preflight db1 --pg-connection ...
# Stream WAL continuously (long-running — supervise with systemd)
pg_hardstorage wal stream db1 --pg-connection ... --repo ...
# Take a base backup right now
pg_hardstorage backup db1 --pg-connection ... --repo ...
# Restore latest, or PITR by time / LSN; --preview for a dry-run
pg_hardstorage restore db1 latest --repo ... --target /var/lib/postgresql/restored
pg_hardstorage restore db1 latest --repo ... --target /tmp/r --to "5 minutes ago"
pg_hardstorage restore db1 latest --repo ... --target /tmp/r --to-lsn 0/3000028
pg_hardstorage restore db1 latest --repo ... --target /tmp/r --preview
# Inspect + verify
pg_hardstorage status # all deployments
pg_hardstorage list db1 --repo ...
pg_hardstorage verify db1 latest --repo ...
pg_hardstorage doctor # self-diagnosis
# Retention (dry-run by default; --apply to delete)
pg_hardstorage rotate db1 --repo ... --policy gfs --apply
Every command supports --output json / --output ndjson. The schema is pg_hardstorage.v1 with a 24-month back-compat commitment.
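NDJSON output is convenient for automation because each line is a complete JSON object that can be parsed as it arrives. A minimal Python sketch of a consumer; the sample lines and their field names ("schema", "event") are invented for illustration, so consult the output event schema catalogue for the real fields:

```python
import json


def read_events(stream_text):
    """Parse NDJSON: one JSON object per non-empty line."""
    return [json.loads(line) for line in stream_text.splitlines() if line.strip()]


# Hypothetical output; these field names are assumptions, not the documented schema.
sample = (
    '{"schema": "pg_hardstorage.v1", "event": "example.started"}\n'
    '{"schema": "pg_hardstorage.v1", "event": "example.finished"}\n'
)
events = read_events(sample)
unexpected = [e for e in events if e.get("schema") != "pg_hardstorage.v1"]
```

Checking the schema identifier before acting on an event is a cheap guard for scripts that are meant to outlive several releases.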
The doc site is Diátaxis-organised — every page is one of tutorial / how-to / reference / explanation.
| Quadrant | What lives there |
|---|---|
| Tutorials | Learn-by-doing: getting started, first backup + restore, PITR, encryption, Patroni, Kubernetes, plugin authoring, LLM incident response |
| How-to guides | Task-oriented recipes: adding repos / KMS / sinks; operating; Kubernetes; air-gapped; packaging; migration |
| Reference | CLI (200+ auto-generated pages), REST API, plugin contracts, schema catalogues (manifest / output event / KEKRef / storage URL / exit codes / error codes / metrics) |
| Explanation | Conceptual deep-dives: design principles, the WAL pipeline, Patroni failover, envelope encryption, the audit chain, the LLM safety stack, threat model, comparison vs pgBackRest / WAL-G / Barman |
Plus the operations handbook, the compliance docs, the 3am-operator runbooks, the FAQ and the glossary.
Read CONTRIBUTING.md and
docs/CONTRIBUTING-DOCS.md for the
authoring conventions. Bug reports land best with a runnable testkit
scenario — the testkit binary is built via make build-testkit.
Security disclosures: SECURITY.md.
Apache 2.0. See LICENSE.
Copyright © 2026 CYBERTEC PostgreSQL International GmbH.