Point-In-Time Recovery (PITR) in PostgreSQL together with pgbackrest allows you to restore a database to a specific point in time.
This examples shows how to do an "interactive" backup, the first checkpoint is
2024-06-17 09:18:22+00
, multiple records inserted after that time, but we
want to go back to that specific time to find if there is the required data
if timezone is omitted, local time is assumed, for UTC use +00
Stop Postgresql
First you need to stop postgres:
systemctl stop postgresql.service
this varies depending on your operating system here I am using Debian 12
Remove PGDATA
Now remove the PGDATA directory, in my case:
rm -rf /db/16
Restore (--target-action=pause)
Perform the initially recovery with pgbackrest restore, restore to the
desired point in time, but it is important to set --target-action=pause
so
recovery pauses once the target is reached instead of promoting the node:
pgbackrest --stanza=standalone \
--delta \
--log-level-console=info \
--type=time \
--target="2024-06-17 09:18:22+00" \
--target-action=pause restore
LSN
In case you want to restore to an LSN (output of select pg_switch_wal();) you can use:
pgbackrest --stanza=standalone \
--delta \
--log-level-console=info \
--type=lsn \
--target="19/EC000000" \
--target-action=pause restore
This will output something like this:
$ pgbackrest --stanza=standalone --delta --log-level-console=info --type=time --target="2024-06-17 09:18:22+00" --target-action=pause restore
2024-06-18 16:21:02.983 P00 INFO: restore command begin 2.52: --delta --exec-id=721872-e8f6a592 --log-level-console=info --log-path=/db/pgbackrest/log --pg1-path=/db/16 --process-max=3 --repo1-cipher-pass=<redacted> --repo1-cipher-type=aes-256-cbc --repo1-path=/repo1 --repo1-s3-bucket=pg16 --repo1-s3-endpoint=my-s3-endpoint --repo1-s3-key=<redacted> --repo1-s3-key-secret=<redacted> --repo1-s3-region=region --no-repo1-storage-verify-tls --repo1-type=s3 --spool-path=/db/pgbackrest/spool --stanza=standalone --target="2024-06-17 09:18:22+00" --target-action=pause --type=time
WARN: --delta or --force specified but unable to find 'PG_VERSION' or 'backup.manifest' in '/db/16' to confirm that this is a valid $PGDATA directory. --delta and --force have been disabled and if any files exist in the destination directories the restore will be aborted.
2024-06-18 16:21:03.398 P00 INFO: repo1: restore backup set 20240612-190741F_20240615-101611I, recovery will start at 2024-06-15 10:16:11
2024-06-18 16:21:46.235 P00 INFO: write updated /db/16/postgresql.auto.conf
2024-06-18 16:21:46.240 P00 INFO: restore global/pg_control (performed last to ensure aborted restores cannot be started)
2024-06-18 16:21:46.240 P00 INFO: restore size = 29.6MB, file total = 1270
2024-06-18 16:21:46.241 P00 INFO: restore command end: completed successfully (43260ms)
pg_ctl
Start using pg_ctl -D PGDATA start
/usr/lib/postgresql/16/bin/pg_ctl -D /db/16 start
In the Postgres logs, you can check that WAL replay paused once the configured recovery target was reached:
024-06-18 16:24:19.279 P00 INFO: found 00000001000000000000008B in the archive asynchronously
2024-06-18 16:24:19.279 P00 INFO: archive-get command end: completed successfully (2ms)
time=2024-06-18 16:24:19 UTC, pid=721892 LOG: restored log file "00000001000000000000008B" from archive
time=2024-06-18 16:24:19 UTC, pid=721892 LOG: recovery stopping before commit of transaction 2996, time 2024-06-17 14:24:18.691936+00
time=2024-06-18 16:24:19 UTC, pid=721892 LOG: pausing at the end of recovery
time=2024-06-18 16:24:19 UTC, pid=721892 HINT: Execute pg_wal_replay_resume() to promote.
You can also verify WAL reply is paused by checking the output of the function pg_is_wal_replay_paused
, for example:
$ psql -c "SELECT pg_is_wal_replay_paused();"
pg_is_wal_replay_paused
-------------------------
t
(1 row)
Connect and check if your data is at the desired state, if not, set a new recovery target recovery_target_time
, then restart Postgres.
To check the current recovery_target_time:
$ psql -c "show recovery_target_time"
recovery_target_time
------------------------
2024-06-17 09:18:22+00
(1 row)
(1 row)
recovery_target_time
To set a new target:
psql -c "ALTER SYSTEM SET recovery_target_time TO '2024-06-17 09:30:00+00'"
This will modify the postgresql.auto.conf
:
$ cat /db/16/postgresql.auto.conf
# Do not edit this file manually!
# It will be overwritten by the ALTER SYSTEM command.
restore_command = 'pgbackrest --stanza=standalone archive-get %f "%p"'
recovery_target_time = '2024-06-17 09:30:00+00'
Now, restart postgres:
$ /usr/lib/postgresql/16/bin/pg_ctl -D /db/16 restart
waiting for server to shut down.... done
server stopped
waiting for server to start....time=2024-06-18 17:13:37 UTC, pid=723863 LOG: redirecting log output to logging collector process
time=2024-06-18 17:13:37 UTC, pid=723863 HINT: Future log output will appear in directory "/db/log".
. done
server started
If data is not at the desired point, set a new recovery_target_time
and repeat previous steps.
pg_promote
When data is at the desired point, promote Postgres to primary:
$ psql -c "SELECT pg_promote();"
pg_promote
------------
t
(1 row)
And you can now stop postgres with:
/usr/lib/postgresql/16/bin/pg_ctl -D /db/16 stop
Start Postgresql
Start now using:
systemctl start postgresql.service
Patroni
Stop all nodes in your cluster and follow the restore procedure in node 1 after you are at the desired point promote the server and then start Patroni so that it can take the management over.
Archive
In postgresql.conf
check this options:
archive_command = 'pgbackrest --config=/etc/pgbackrest/pgbackrest.conf --stanza=<stanza_name> archive-push %p'
archive_mode = on
archive_timeout = 300
To archive manually switching WAL files ensures that the current WAL file is completed and archived promptly:
psql -c "select *, current_timestamp from pg_switch_wal()";
timelines
If you restore multiple times it could happen that you are in a different
timeline, to check the current timeline, you can use pgbcakrest info
or also pgbackres repo-ls archive
, for example this is the output of pgbackrest info
:
full backup: 20250119-125503F
timestamp start/stop: 2025-01-19 12:55:03+01 / 2025-01-19 12:56:26+01
wal start/stop: 0000002D000001160000006C / 0000002D000001160000006D
database size: 28.8GB, database backup size: 28.8GB
repo1: backup size: 5.0GB
diff backup: 20250119-125503F_20250120-125503D
timestamp start/stop: 2025-01-20 12:55:03+01 / 2025-01-20 12:56:07+01
wal start/stop: 0000002D000001170000009A / 0000002D000001170000009B
database size: 28.8GB, database backup size: 3.4GB
repo1: backup size: 365.3MB
backup reference total: 1 full
diff backup: 20250119-125503F_20250121-125503D
timestamp start/stop: 2025-01-21 12:55:03+01 / 2025-01-21 12:56:13+01
wal start/stop: 0000002D00000118000000B5 / 0000002D00000118000000B6
database size: 28.9GB, database backup size: 3.4GB
repo1: backup size: 418.9MB
backup reference total: 1 full
diff backup: 20250119-125503F_20250122-015503D
timestamp start/stop: 2025-01-22 01:55:03+01 / 2025-01-22 01:57:28+01
wal start/stop: 0000002E000001190000004E / 0000002E000001190000004F
database size: 28.9GB, database backup size: 3.5GB
repo1: backup size: 661.8MB
backup reference total: 1 full
If you would like to restore to 2025-01-22 09:55:22.716307+00
the timeline is 2E
and the LSN is 0000002E000001190000004F
, but to use it in pgbackrest you need to use 46
the decimal value, so you could use something like:
pgbackrest --stanza=standalone \
--delta \
--log-level-console=info \
--type=time \
--target="2024-06-17 09:18:22+00" \
--target-timeline=46 \
--target-action=pause restore
the
--target=timeline=46
is the decimal value of the timeline2E