DBRE.xyz | Point-In-Time Recovery

Point-In-Time Recovery (PITR) in PostgreSQL together with pgbackrest allows you to restore a database to a specific point in time.

This examples shows how to do an "interactive" backup, the first checkpoint is 2024-06-17 09:18:22+00, multiple records inserted after that time, but we want to go back to that specific time to find if there is the required data

if timezone is omitted, local time is assumed, for UTC use +00

Stop Postgresql 

First you need to stop postgres:

systemctl stop postgresql.service

this varies depending on your operating system here I am using Debian 12

Remove PGDATA 

Now remove the PGDATA directory, in my case:

rm -rf /db/16

Restore (--target-action=pause) 

Perform the initially recovery with pgbackrest restore, restore to the desired point in time, but it is important to set --target-action=pause so recovery pauses once the target is reached instead of promoting the node:

pgbackrest --stanza=standalone \
--delta \
--log-level-console=info \
--type=time \
--target="2024-06-17 09:18:22+00" \
--target-action=pause restore

LSN 

In case you want to restore to an LSN (output of select pg_switch_wal();) you can use:

pgbackrest --stanza=standalone \
--delta \
--log-level-console=info \
--type=lsn \
--target="19/EC000000" \
--target-action=pause restore

This will output something like this:

$ pgbackrest --stanza=standalone --delta --log-level-console=info --type=time --target="2024-06-17 09:18:22+00" --target-action=pause restore
2024-06-18 16:21:02.983 P00   INFO: restore command begin 2.52: --delta --exec-id=721872-e8f6a592 --log-level-console=info --log-path=/db/pgbackrest/log --pg1-path=/db/16 --process-max=3 --repo1-cipher-pass=<redacted> --repo1-cipher-type=aes-256-cbc --repo1-path=/repo1 --repo1-s3-bucket=pg16 --repo1-s3-endpoint=my-s3-endpoint --repo1-s3-key=<redacted> --repo1-s3-key-secret=<redacted> --repo1-s3-region=region --no-repo1-storage-verify-tls --repo1-type=s3 --spool-path=/db/pgbackrest/spool --stanza=standalone --target="2024-06-17 09:18:22+00" --target-action=pause --type=time
WARN: --delta or --force specified but unable to find 'PG_VERSION' or 'backup.manifest' in '/db/16' to confirm that this is a valid $PGDATA directory. --delta and --force have been disabled and if any files exist in the destination directories the restore will be aborted.
2024-06-18 16:21:03.398 P00   INFO: repo1: restore backup set 20240612-190741F_20240615-101611I, recovery will start at 2024-06-15 10:16:11
2024-06-18 16:21:46.235 P00   INFO: write updated /db/16/postgresql.auto.conf
2024-06-18 16:21:46.240 P00   INFO: restore global/pg_control (performed last to ensure aborted restores cannot be started)
2024-06-18 16:21:46.240 P00   INFO: restore size = 29.6MB, file total = 1270
2024-06-18 16:21:46.241 P00   INFO: restore command end: completed successfully (43260ms)

pg_ctl 

Start using pg_ctl -D PGDATA start

/usr/lib/postgresql/16/bin/pg_ctl -D /db/16 start

In the Postgres logs, you can check that WAL replay paused once the configured recovery target was reached:

024-06-18 16:24:19.279 P00   INFO: found 00000001000000000000008B in the archive asynchronously
2024-06-18 16:24:19.279 P00   INFO: archive-get command end: completed successfully (2ms)
time=2024-06-18 16:24:19 UTC, pid=721892 LOG:  restored log file "00000001000000000000008B" from archive
time=2024-06-18 16:24:19 UTC, pid=721892 LOG:  recovery stopping before commit of transaction 2996, time 2024-06-17 14:24:18.691936+00
time=2024-06-18 16:24:19 UTC, pid=721892 LOG:  pausing at the end of recovery
time=2024-06-18 16:24:19 UTC, pid=721892 HINT:  Execute pg_wal_replay_resume() to promote.

You can also verify WAL reply is paused by checking the output of the function pg_is_wal_replay_paused, for example:

$ psql -c "SELECT pg_is_wal_replay_paused();"
 pg_is_wal_replay_paused
-------------------------
 t
(1 row)

Connect and check if your data is at the desired state, if not, set a new recovery target recovery_target_time, then restart Postgres.

To check the current recovery_target_time:

$ psql -c "show recovery_target_time"
  recovery_target_time
------------------------
 2024-06-17 09:18:22+00
(1 row)

(1 row)

recovery_target_time 

To set a new target:

psql -c "ALTER SYSTEM SET recovery_target_time TO '2024-06-17 09:30:00+00'"

This will modify the postgresql.auto.conf:

$ cat /db/16/postgresql.auto.conf
# Do not edit this file manually!
# It will be overwritten by the ALTER SYSTEM command.
restore_command = 'pgbackrest --stanza=standalone archive-get %f "%p"'
recovery_target_time = '2024-06-17 09:30:00+00'

Now, restart postgres:

$ /usr/lib/postgresql/16/bin/pg_ctl -D /db/16 restart
waiting for server to shut down.... done
server stopped
waiting for server to start....time=2024-06-18 17:13:37 UTC, pid=723863 LOG:  redirecting log output to logging collector process
time=2024-06-18 17:13:37 UTC, pid=723863 HINT:  Future log output will appear in directory "/db/log".
. done
server started

If data is not at the desired point, set a new recovery_target_time and repeat previous steps.

pg_promote 

When data is at the desired point, promote Postgres to primary:

$ psql -c "SELECT pg_promote();"
 pg_promote
------------
 t
(1 row)

And you can now stop postgres with:

/usr/lib/postgresql/16/bin/pg_ctl -D /db/16 stop

Start Postgresql 

Start now using:

systemctl start postgresql.service

Patroni 

Stop all nodes in your cluster and follow the restore procedure in node 1 after you are at the desired point promote the server and then start Patroni so that it can take the management over.

Archive 

In postgresql.conf check this options:

archive_command = 'pgbackrest --config=/etc/pgbackrest/pgbackrest.conf --stanza=<stanza_name> archive-push %p'
archive_mode = on
archive_timeout = 300

To archive manually switching WAL files ensures that the current WAL file is completed and archived promptly:

psql -c "select *, current_timestamp from pg_switch_wal()";

timelines 

If you restore multiple times it could happen that you are in a different timeline, to check the current timeline, you can use pgbcakrest info or also pgbackres repo-ls archive, for example this is the output of pgbackrest info:

full backup: 20250119-125503F
     timestamp start/stop: 2025-01-19 12:55:03+01 / 2025-01-19 12:56:26+01
     wal start/stop: 0000002D000001160000006C / 0000002D000001160000006D
     database size: 28.8GB, database backup size: 28.8GB
     repo1: backup size: 5.0GB

 diff backup: 20250119-125503F_20250120-125503D
     timestamp start/stop: 2025-01-20 12:55:03+01 / 2025-01-20 12:56:07+01
     wal start/stop: 0000002D000001170000009A / 0000002D000001170000009B
     database size: 28.8GB, database backup size: 3.4GB
     repo1: backup size: 365.3MB
     backup reference total: 1 full

 diff backup: 20250119-125503F_20250121-125503D
     timestamp start/stop: 2025-01-21 12:55:03+01 / 2025-01-21 12:56:13+01
     wal start/stop: 0000002D00000118000000B5 / 0000002D00000118000000B6
     database size: 28.9GB, database backup size: 3.4GB
     repo1: backup size: 418.9MB
     backup reference total: 1 full

 diff backup: 20250119-125503F_20250122-015503D
     timestamp start/stop: 2025-01-22 01:55:03+01 / 2025-01-22 01:57:28+01
     wal start/stop: 0000002E000001190000004E / 0000002E000001190000004F
     database size: 28.9GB, database backup size: 3.5GB
     repo1: backup size: 661.8MB
     backup reference total: 1 full

If you would like to restore to 2025-01-22 09:55:22.716307+00 the timeline is 2E and the LSN is 0000002E000001190000004F, but to use it in pgbackrest you need to use 46 the decimal value, so you could use something like:

pgbackrest --stanza=standalone \
--delta \
--log-level-console=info \
--type=time \
--target="2024-06-17 09:18:22+00" \
--target-timeline=46 \
--target-action=pause restore

the --target=timeline=46 is the decimal value of the timeline 2E