Diffstat (limited to 'doc/src')
-rw-r--r-- | doc/src/sgml/protocol.sgml | 37
-rw-r--r-- | doc/src/sgml/ref/allfiles.sgml | 1
-rw-r--r-- | doc/src/sgml/ref/pg_basebackup.sgml | 64
-rw-r--r-- | doc/src/sgml/ref/pg_validatebackup.sgml | 291
-rw-r--r-- | doc/src/sgml/reference.sgml | 1
5 files changed, 393 insertions, 1 deletion
diff --git a/doc/src/sgml/protocol.sgml b/doc/src/sgml/protocol.sgml index f139ba02312..536de9a698e 100644 --- a/doc/src/sgml/protocol.sgml +++ b/doc/src/sgml/protocol.sgml @@ -2466,7 +2466,7 @@ The commands accepted in replication mode are: </varlistentry> <varlistentry id="protocol-replication-base-backup" xreflabel="BASE_BACKUP"> - <term><literal>BASE_BACKUP</literal> [ <literal>LABEL</literal> <replaceable>'label'</replaceable> ] [ <literal>PROGRESS</literal> ] [ <literal>FAST</literal> ] [ <literal>WAL</literal> ] [ <literal>NOWAIT</literal> ] [ <literal>MAX_RATE</literal> <replaceable>rate</replaceable> ] [ <literal>TABLESPACE_MAP</literal> ] [ <literal>NOVERIFY_CHECKSUMS</literal> ] + <term><literal>BASE_BACKUP</literal> [ <literal>LABEL</literal> <replaceable>'label'</replaceable> ] [ <literal>PROGRESS</literal> ] [ <literal>FAST</literal> ] [ <literal>WAL</literal> ] [ <literal>NOWAIT</literal> ] [ <literal>MAX_RATE</literal> <replaceable>rate</replaceable> ] [ <literal>TABLESPACE_MAP</literal> ] [ <literal>NOVERIFY_CHECKSUMS</literal> ] [ <literal>MANIFEST</literal> <replaceable>manifest_option</replaceable> ] [ <literal>MANIFEST_CHECKSUMS</literal> <replaceable>checksum_algorithm</replaceable> ] <indexterm><primary>BASE_BACKUP</primary></indexterm> </term> <listitem> @@ -2576,6 +2576,41 @@ The commands accepted in replication mode are: </para> </listitem> </varlistentry> + + <varlistentry> + <term><literal>MANIFEST</literal></term> + <listitem> + <para> + When this option is specified with a value of <literal>yes</literal> + or <literal>force-escape</literal>, a backup manifest is created + and sent along with the backup. The manifest is a list of every + file present in the backup with the exception of any WAL files that + may be included. It also stores the size, last modification time, and + an optional checksum for each file. 
+ A value of <literal>force-escape</literal> forces all filenames + to be hex-encoded; otherwise, this type of encoding is performed only + for files whose names are non-UTF8 octet sequences. + <literal>force-escape</literal> is intended primarily for testing + purposes, to be sure that clients which read the backup manifest + can handle this case. For compatibility with previous releases, + the default is <literal>MANIFEST 'no'</literal>. + </para> + </listitem> + </varlistentry> + + <varlistentry> + <term><literal>MANIFEST_CHECKSUMS</literal></term> + <listitem> + <para> + Specifies the algorithm that should be applied to each file included + in the backup manifest. Currently, the available + algorithms are <literal>NONE</literal>, <literal>CRC32C</literal>, + <literal>SHA224</literal>, <literal>SHA256</literal>, + <literal>SHA384</literal>, and <literal>SHA512</literal>. + The default is <literal>CRC32C</literal>. + </para> + </listitem> + </varlistentry> </variablelist> </para> <para> diff --git a/doc/src/sgml/ref/allfiles.sgml b/doc/src/sgml/ref/allfiles.sgml index 8d91f3529e6..ab71176cdf3 100644 --- a/doc/src/sgml/ref/allfiles.sgml +++ b/doc/src/sgml/ref/allfiles.sgml @@ -211,6 +211,7 @@ Complete list of usable sgml source files in this directory. 
<!ENTITY pgResetwal SYSTEM "pg_resetwal.sgml"> <!ENTITY pgRestore SYSTEM "pg_restore.sgml"> <!ENTITY pgRewind SYSTEM "pg_rewind.sgml"> +<!ENTITY pgValidateBackup SYSTEM "pg_validatebackup.sgml"> <!ENTITY pgtestfsync SYSTEM "pgtestfsync.sgml"> <!ENTITY pgtesttiming SYSTEM "pgtesttiming.sgml"> <!ENTITY pgupgrade SYSTEM "pgupgrade.sgml"> diff --git a/doc/src/sgml/ref/pg_basebackup.sgml b/doc/src/sgml/ref/pg_basebackup.sgml index c8e040bacfc..d9c981cebb9 100644 --- a/doc/src/sgml/ref/pg_basebackup.sgml +++ b/doc/src/sgml/ref/pg_basebackup.sgml @@ -561,6 +561,70 @@ PostgreSQL documentation </para> </listitem> </varlistentry> + + <varlistentry> + <term><option>--no-manifest</option></term> + <listitem> + <para> + Disables generation of a backup manifest. If this option is not + specified, the server will generate and send a backup manifest + which can be verified using <xref linkend="app-pgvalidatebackup" />. + The manifest is a list of every file present in the backup with the + exception of any WAL files that may be included. It also stores the + size, last modification time, and an optional checksum for each file. + </para> + </listitem> + </varlistentry> + + <varlistentry> + <term><option>--manifest-force-encode</option></term> + <listitem> + <para> + Forces all filenames in the backup manifest to be hex-encoded. + If this option is not specified, only non-UTF8 filenames are + hex-encoded. This option is mostly intended to test that tools which + read a backup manifest file properly handle this case. + </para> + </listitem> + </varlistentry> + + <varlistentry> + <term><option>--manifest-checksums=<replaceable class="parameter">algorithm</replaceable></option></term> + <listitem> + <para> + Specifies the checksum algorithm that should be applied to each file + included in the backup manifest. 
Currently, the available + algorithms are <literal>NONE</literal>, <literal>CRC32C</literal>, + <literal>SHA224</literal>, <literal>SHA256</literal>, + <literal>SHA384</literal>, and <literal>SHA512</literal>. + The default is <literal>CRC32C</literal>. + </para> + <para> + If <literal>NONE</literal> is selected, the backup manifest will + not contain any checksums. Otherwise, it will contain a checksum + of each file in the backup using the specified algorithm. In addition, + the manifest will always contain a <literal>SHA256</literal> + checksum of its own contents. The <literal>SHA</literal> algorithms + are significantly more CPU-intensive than <literal>CRC32C</literal>, + so selecting one of them may increase the time required to complete + the backup. + </para> + <para> + Using a SHA hash function provides a cryptographically secure digest + of each file for users who wish to verify that the backup has not been + tampered with, while the CRC32C algorithm provides a checksum which is + much faster to calculate and good at catching errors due to accidental + changes but is not resistant to targeted modifications. Note that, to + be useful against an adversary who has access to the backup, the backup + manifest would need to be stored securely elsewhere or otherwise + verified not to have been modified since the backup was taken. + </para> + <para> + <xref linkend="app-pgvalidatebackup" /> can be used to check the + integrity of a backup against the backup manifest. 
+ </para> + </listitem> + </varlistentry> </variablelist> </para> diff --git a/doc/src/sgml/ref/pg_validatebackup.sgml b/doc/src/sgml/ref/pg_validatebackup.sgml new file mode 100644 index 00000000000..19888dc1966 --- /dev/null +++ b/doc/src/sgml/ref/pg_validatebackup.sgml @@ -0,0 +1,291 @@ +<!-- +doc/src/sgml/ref/pg_validatebackup.sgml +PostgreSQL documentation +--> + +<refentry id="app-pgvalidatebackup"> + <indexterm zone="app-pgvalidatebackup"> + <primary>pg_validatebackup</primary> + </indexterm> + + <refmeta> + <refentrytitle>pg_validatebackup</refentrytitle> + <manvolnum>1</manvolnum> + <refmiscinfo>Application</refmiscinfo> + </refmeta> + + <refnamediv> + <refname>pg_validatebackup</refname> + <refpurpose>verify the integrity of a base backup of a + <productname>PostgreSQL</productname> cluster</refpurpose> + </refnamediv> + + <refsynopsisdiv> + <cmdsynopsis> + <command>pg_validatebackup</command> + <arg rep="repeat"><replaceable>option</replaceable></arg> + </cmdsynopsis> + </refsynopsisdiv> + + <refsect1> + <title> + Description + </title> + <para> + <application>pg_validatebackup</application> is used to check the + integrity of a database cluster backup taken using + <command>pg_basebackup</command> against a + <literal>backup_manifest</literal> generated by the server at the time + of the backup. The backup must be stored in the "plain" + format; a "tar" format backup can be checked after extracting it. + </para> + + <para> + It is important to note that the validation which is performed by + <application>pg_validatebackup</application> does not and cannot include + every check which will be performed by a running server when attempting + to make use of the backup. Even if you use this tool, you should still + perform test restores and verify that the resulting databases work as + expected and that they appear to contain the correct data.
However, + <application>pg_validatebackup</application> can detect many problems + that commonly occur due to storage problems or user error. + </para> + + <para> + Backup verification proceeds in four stages. First, + <literal>pg_validatebackup</literal> reads the + <literal>backup_manifest</literal> file. If that file + does not exist, cannot be read, is malformed, or fails verification + against its own internal checksum, <literal>pg_validatebackup</literal> + will terminate with a fatal error. + </para> + + <para> + Second, <literal>pg_validatebackup</literal> will attempt to verify that + the data files currently stored on disk are exactly the same as the data + files which the server intended to send, with some exceptions that are + described below. Extra and missing files will be detected, with a few + exceptions. This step will ignore the presence or absence of, or any + modifications to, <literal>postgresql.auto.conf</literal>, + <literal>standby.signal</literal>, and <literal>recovery.signal</literal>, + because it is expected that these files may have been created or modified + as part of the process of taking the backup. It also won't complain about + a <literal>backup_manifest</literal> file in the target directory or + about anything inside <literal>pg_wal</literal>, even though these + files won't be listed in the backup manifest. Only files are checked; + the presence or absence of directories is not verified, except + indirectly: if a directory is missing, any files it should have contained + will necessarily also be missing. + </para> + + <para> + Next, <literal>pg_validatebackup</literal> will checksum all the files, + compare the checksums against the values in the manifest, and emit errors + for any files for which the computed checksum does not match the + checksum stored in the manifest. This step is not performed for any files + which produced errors in the previous step, since they are already known + to have problems.
Also, files which were ignored in the previous step are + ignored in this step. + </para> + + <para> + Finally, <literal>pg_validatebackup</literal> will use the manifest to + verify that the write-ahead log records which will be needed to recover + the backup are present and that they can be read and parsed. The + <literal>backup_manifest</literal> contains information about which + write-ahead log records will be needed, and + <literal>pg_validatebackup</literal> will use that information to + invoke <literal>pg_waldump</literal> to parse those write-ahead log + records. The <literal>--quiet</literal> flag will be used, so that + <literal>pg_waldump</literal> will only report errors, without producing + any other output. While this level of verification is sufficient to + detect obvious problems such as a missing file or one whose internal + checksums do not match, it isn't extensive enough to detect every + possible problem that might occur when attempting to recover. For + instance, a server bug that produces write-ahead log records that have + the correct checksums but specify nonsensical actions can't be detected + by this method. + </para> + + <para> + Note that if extra WAL files which are not required to recover the backup + are present, they will not be checked by this tool, although + a separate invocation of <literal>pg_waldump</literal> could be used for + that purpose. Also note that WAL verification is version-specific: you + must use the version of <literal>pg_validatebackup</literal>, and thus of + <literal>pg_waldump</literal>, which pertains to the backup being checked. + In contrast, the data file integrity checks should work with any version + of the server that generates a <literal>backup_manifest</literal> file. + </para> + </refsect1> + + <refsect1> + <title>Options</title> + + <para> + The following command-line options control the behavior.
+ + <variablelist> + <varlistentry> + <term><option>-e</option></term> + <term><option>--exit-on-error</option></term> + <listitem> + <para> + Exit as soon as a problem with the backup is detected. If this option + is not specified, <literal>pg_validatebackup</literal> will continue + checking the backup even after a problem has been detected, and will + report all problems detected as errors. + </para> + </listitem> + </varlistentry> + + <varlistentry> + <term><option>-i <replaceable class="parameter">path</replaceable></option></term> + <term><option>--ignore=<replaceable class="parameter">path</replaceable></option></term> + <listitem> + <para> + Ignore the specified file or directory, which should be expressed + as a relative pathname, when comparing the list of data files + actually present in the backup to those listed in the + <literal>backup_manifest</literal> file. If a directory is + specified, this option affects the entire subtree rooted at that + location. Complaints about extra files, missing files, file size + differences, or checksum mismatches will be suppressed if the + relative pathname matches the specified pathname. This option + can be specified multiple times. + </para> + </listitem> + </varlistentry> + + <varlistentry> + <term><option>-m <replaceable class="parameter">path</replaceable></option></term> + <term><option>--manifest-path=<replaceable class="parameter">path</replaceable></option></term> + <listitem> + <para> + Use the manifest file at the specified path, rather than one located + in the root of the backup directory. + </para> + </listitem> + </varlistentry> + + <varlistentry> + <term><option>-n</option></term> + <term><option>--no-parse-wal</option></term> + <listitem> + <para> + Don't attempt to parse write-ahead log data that will be needed + to recover from this backup.
+ </para> + </listitem> + </varlistentry> + + <varlistentry> + <term><option>-q</option></term> + <term><option>--quiet</option></term> + <listitem> + <para> + Don't print anything when a backup is successfully validated. + </para> + </listitem> + </varlistentry> + + <varlistentry> + <term><option>-s</option></term> + <term><option>--skip-checksums</option></term> + <listitem> + <para> + Do not validate data file checksums. The presence or absence of + files and the sizes of those files will still be checked. This is + much faster, because the files themselves do not need to be read. + </para> + </listitem> + </varlistentry> + + <varlistentry> + <term><option>-w <replaceable class="parameter">path</replaceable></option></term> + <term><option>--wal-directory=<replaceable class="parameter">path</replaceable></option></term> + <listitem> + <para> + Try to parse WAL files stored in the specified directory, rather than + in <literal>pg_wal</literal>. This may be useful if the backup is + stored in a separate location from the WAL archive. + </para> + </listitem> + </varlistentry> + </variablelist> + </para> + + <para> + Other options are also available: + + <variablelist> + <varlistentry> + <term><option>-V</option></term> + <term><option>--version</option></term> + <listitem> + <para> + Print the <application>pg_validatebackup</application> version and exit. + </para> + </listitem> + </varlistentry> + + <varlistentry> + <term><option>-?</option></term> + <term><option>--help</option></term> + <listitem> + <para> + Show help about <application>pg_validatebackup</application> command + line arguments, and exit. 
+ </para> + </listitem> + </varlistentry> + + </variablelist> + </para> + + </refsect1> + + <refsect1> + <title>Examples</title> + + <para> + To create a base backup of the server at <literal>mydbserver</literal> and + validate the integrity of the backup: +<screen> +<prompt>$</prompt> <userinput>pg_basebackup -h mydbserver -D /usr/local/pgsql/data</userinput> +<prompt>$</prompt> <userinput>pg_validatebackup /usr/local/pgsql/data</userinput> +</screen> + </para> + + <para> + To create a base backup of the server at <literal>mydbserver</literal>, move + the manifest somewhere outside the backup directory, and validate the + backup: +<screen> +<prompt>$</prompt> <userinput>pg_basebackup -h mydbserver -D /usr/local/pgsql/backup1234</userinput> +<prompt>$</prompt> <userinput>mv /usr/local/pgsql/backup1234/backup_manifest /my/secure/location/backup_manifest.1234</userinput> +<prompt>$</prompt> <userinput>pg_validatebackup -m /my/secure/location/backup_manifest.1234 /usr/local/pgsql/backup1234</userinput> +</screen> + </para> + + <para> + To validate a backup while ignoring a file that was added manually to the + backup directory, and also skipping checksum verification: +<screen> +<prompt>$</prompt> <userinput>pg_basebackup -h mydbserver -D /usr/local/pgsql/data</userinput> +<prompt>$</prompt> <userinput>edit /usr/local/pgsql/data/note.to.self</userinput> +<prompt>$</prompt> <userinput>pg_validatebackup --ignore=note.to.self --skip-checksums /usr/local/pgsql/data</userinput> +</screen> + </para> + + </refsect1> + + <refsect1> + <title>See Also</title> + + <simplelist type="inline"> + <member><xref linkend="app-pgbasebackup"/></member> + </simplelist> + </refsect1> + +</refentry> diff --git a/doc/src/sgml/reference.sgml b/doc/src/sgml/reference.sgml index cef09dd38b3..d25a77b13c8 100644 --- a/doc/src/sgml/reference.sgml +++ b/doc/src/sgml/reference.sgml @@ -255,6 +255,7 @@ &pgReceivewal; &pgRecvlogical; &pgRestore; + &pgValidateBackup; &psqlRef; &reindexdb; &vacuumdb; |
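The second and third verification stages described in the new pg_validatebackup reference page (file presence and size, then per-file checksums) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the tool's implementation: the in-memory `manifest` mapping, the `verify_backup`, `is_ignored`, and `sha256_of` names, and the use of SHA-256 for every file are hypothetical, and the on-disk format of `backup_manifest` is not shown in this diff. The `extra_ignores` and `skip_checksums` parameters only loosely mirror `--ignore` and `--skip-checksums`.

```python
import hashlib
import os

# Top-level files that the documentation says are exempt from comparison.
IGNORED_FILES = {
    "postgresql.auto.conf",
    "standby.signal",
    "recovery.signal",
    "backup_manifest",
}
# Directories whose whole subtree is exempt.
IGNORED_DIRS = ("pg_wal",)


def is_ignored(relpath, extra_ignores=()):
    """Return True if relpath is exempt from verification."""
    if relpath in IGNORED_FILES:
        return True
    for prefix in tuple(IGNORED_DIRS) + tuple(extra_ignores):
        # An ignored path covers itself and everything beneath it.
        if relpath == prefix or relpath.startswith(prefix + "/"):
            return True
    return False


def sha256_of(path):
    """Hash a file in chunks so large files are not read into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 16), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_backup(backup_dir, manifest, extra_ignores=(), skip_checksums=False):
    """Compare a plain-format backup directory against a manifest.

    manifest is a hypothetical dict: relative path -> (size, sha256 hex digest).
    Returns a list of error strings; an empty list means no problems were found.
    """
    errors = []

    # Collect every non-ignored regular file actually present on disk.
    on_disk = {}
    for root, _dirs, files in os.walk(backup_dir):
        for name in files:
            full = os.path.join(root, name)
            rel = os.path.relpath(full, backup_dir).replace(os.sep, "/")
            if not is_ignored(rel, extra_ignores):
                on_disk[rel] = full

    # Stage 2: extra files, missing files, and size mismatches.
    failed = set()
    for rel in sorted(on_disk):
        if rel not in manifest:
            errors.append(f"{rel}: present on disk but not in the manifest")
    for rel, (size, _checksum) in sorted(manifest.items()):
        if is_ignored(rel, extra_ignores):
            continue
        if rel not in on_disk:
            errors.append(f"{rel}: missing from the backup")
            failed.add(rel)
        elif os.path.getsize(on_disk[rel]) != size:
            errors.append(f"{rel}: size differs from the manifest")
            failed.add(rel)

    # Stage 3: checksums, skipping files that already produced an error.
    if not skip_checksums:
        for rel, (_size, checksum) in sorted(manifest.items()):
            if is_ignored(rel, extra_ignores) or rel in failed:
                continue
            if sha256_of(on_disk[rel]) != checksum:
                errors.append(f"{rel}: checksum mismatch")

    return errors
```

Collecting errors in a list rather than stopping at the first one corresponds to the documented default behavior; returning as soon as the list becomes non-empty would approximate `--exit-on-error`.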