aboutsummaryrefslogtreecommitdiff
path: root/doc/src/sgml/ref/pg_combinebackup.sgml
blob: 330a598f7013e9e79141544622423fc44c317349 (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
<!--
doc/src/sgml/ref/pg_combinebackup.sgml
PostgreSQL documentation
-->

<refentry id="app-pgcombinebackup">
 <indexterm zone="app-pgcombinebackup">
  <primary>pg_combinebackup</primary>
 </indexterm>

 <refmeta>
  <refentrytitle><application>pg_combinebackup</application></refentrytitle>
  <manvolnum>1</manvolnum>
  <refmiscinfo>Application</refmiscinfo>
 </refmeta>

 <refnamediv>
  <refname>pg_combinebackup</refname>
  <refpurpose>reconstruct a full backup from an incremental backup and dependent backups</refpurpose>
 </refnamediv>

 <refsynopsisdiv>
  <cmdsynopsis>
   <command>pg_combinebackup</command>
   <arg rep="repeat"><replaceable>option</replaceable></arg>
   <arg rep="repeat"><replaceable>backup_directory</replaceable></arg>
  </cmdsynopsis>
 </refsynopsisdiv>

 <refsect1>
  <title>Description</title>
  <para>
   <application>pg_combinebackup</application> is used to reconstruct a
   synthetic full backup from an
   <link linkend="backup-incremental-backup">incremental backup</link> and the
   earlier backups upon which it depends.
  </para>

  <para>
   Specify all of the required backups on the command line from oldest to newest.
   That is, the first backup directory should be the path to the full backup, and
   the last should be the path to the final incremental backup
   that you wish to restore. The reconstructed backup will be written to the
   output directory specified by the <option>-o</option> option.
  </para>

  <para>
   <application>pg_combinebackup</application> will attempt to verify
   that the backups you specify form a legal backup chain from which a correct
   full backup can be reconstructed. However, it is not designed to help you
   keep track of which backups depend on which other backups. If you remove
   one or more of the previous backups upon which your incremental
   backup relies, you will not be able to restore it. Moreover,
   <application>pg_combinebackup</application> only attempts to verify that the
   backups have the correct relationship to each other, not that each
   individual backup is intact; for that, use
   <xref linkend="app-pgverifybackup" />.
  </para>

  <para>
   Since the output of <application>pg_combinebackup</application> is a
   synthetic full backup, it can be used as an input to a future invocation of
   <application>pg_combinebackup</application>. The synthetic full backup would
   be specified on the command line in lieu of the chain of backups from which
   it was reconstructed.
  </para>
 </refsect1>

 <refsect1>
  <title>Options</title>

   <para>
    <variablelist>
     <varlistentry>
      <term><option>-d</option></term>
      <term><option>--debug</option></term>
      <listitem>
       <para>
        Print lots of debug logging output on <filename>stderr</filename>.
       </para>
      </listitem>
     </varlistentry>

     <varlistentry>
      <term><option>-k</option></term>
      <term><option>--link</option></term>
      <listitem>
       <para>
        Use hard links instead of copying files to the synthetic backup.
        Reconstruction of the synthetic backup might be faster (no file copying)
        and use less disk space, but care must be taken when using the output
        directory, because any modifications to that directory (for example,
        starting the server) can also affect the input directories. Likewise,
        changes to the input directories (for example, starting the server on
        the full backup) could affect the output directory. Thus, this option
        is best used when the input directories are only copies that will be
        removed after <application>pg_combinebackup</application> has completed.
       </para>

       <para>
        Requires that the input backups and the output directory are in the
        same file system.
       </para>

       <para>
        If a backup manifest is not available or does not contain checksum of
        the right type, hard links will still be created, but the file will be
        also read block-by-block for the checksum calculation.
       </para>
      </listitem>
     </varlistentry>

     <varlistentry>
      <term><option>-n</option></term>
      <term><option>--dry-run</option></term>
      <listitem>
       <para>
        The <option>-n</option>/<option>--dry-run</option> option instructs
        <command>pg_combinebackup</command> to figure out what would be done
        without actually creating the target directory or any output files.
        It is particularly useful in combination with <option>--debug</option>.
       </para>
      </listitem>
     </varlistentry>

     <varlistentry>
      <term><option>-N</option></term>
      <term><option>--no-sync</option></term>
      <listitem>
       <para>
        By default, <command>pg_combinebackup</command> will wait for all files
        to be written safely to disk.  This option causes
        <command>pg_combinebackup</command> to return without waiting, which is
        faster, but means that a subsequent operating system crash can leave
        the output backup corrupt.  Generally, this option is useful for testing
        but should not be used when creating a production installation.
       </para>
      </listitem>
     </varlistentry>

     <varlistentry>
      <term><option>-o <replaceable class="parameter">outputdir</replaceable></option></term>
      <term><option>--output=<replaceable class="parameter">outputdir</replaceable></option></term>
      <listitem>
       <para>
        Specifies the output directory to which the synthetic full backup
        should be written. Currently, this argument is required.
       </para>
      </listitem>
     </varlistentry>

     <varlistentry>
      <term><option>-T <replaceable class="parameter">olddir</replaceable>=<replaceable class="parameter">newdir</replaceable></option></term>
      <term><option>--tablespace-mapping=<replaceable class="parameter">olddir</replaceable>=<replaceable class="parameter">newdir</replaceable></option></term>
      <listitem>
       <para>
        Relocates the tablespace in directory <replaceable>olddir</replaceable>
        to <replaceable>newdir</replaceable> during the backup.
        <replaceable>olddir</replaceable> is the absolute path of the tablespace
        as it exists in the final backup specified on the command line,
        and <replaceable>newdir</replaceable> is the absolute path to use for the
        tablespace in the reconstructed backup.  If either path needs to contain
        an equal sign (<literal>=</literal>), precede that with a backslash.
        This option can be specified multiple times for multiple tablespaces.
       </para>
      </listitem>
     </varlistentry>

     <varlistentry>
      <term><option>--clone</option></term>
      <listitem>
       <para>
        Use efficient file cloning (also known as <quote>reflinks</quote> on
        some systems) instead of copying files to the new data directory,
        which can result in near-instantaneous copying of the data files.
       </para>

       <para>
        If a backup manifest is not available or does not contain checksum of
        the right type, file cloning will be used to copy the file, but the
        file will be also read block-by-block for the checksum calculation.
       </para>

       <para>
        File cloning is only supported on some operating systems and file
        systems.  If it is selected but not supported, the
        <application>pg_combinebackup</application> run will error.  At present,
        it is supported on Linux (kernel 4.5 or later) with Btrfs and XFS (on
        file systems created with reflink support), and on macOS with APFS.
       </para>
      </listitem>
     </varlistentry>

     <varlistentry>
      <term><option>--copy</option></term>
      <listitem>
       <para>
        Perform regular file copy.  This is the default.  (See also
        <option>--copy-file-range</option>, <option>--clone</option>, and
        <option>-k</option>/<option>--link</option>.)
       </para>
      </listitem>
     </varlistentry>

     <varlistentry>
      <term><option>--copy-file-range</option></term>
      <listitem>
       <para>
        Use the <function>copy_file_range</function> system call for efficient
        copying.  On some file systems this gives results similar to
        <option>--clone</option>, sharing physical disk blocks, while on others
        it may still copy blocks, but do so via an optimized path.  At present,
        it is supported on Linux and FreeBSD.
       </para>

       <para>
        If a backup manifest is not available or does not contain checksum of
        the right type, <function>copy_file_range</function> will be used to
        copy the file, but the file will be also read block-by-block for the
        checksum calculation.
       </para>
      </listitem>
     </varlistentry>

     <varlistentry>
      <term><option>--manifest-checksums=<replaceable class="parameter">algorithm</replaceable></option></term>
      <listitem>
       <para>
        Like <xref linkend="app-pgbasebackup"/>,
        <application>pg_combinebackup</application> writes a backup manifest
        in the output directory. This option specifies the checksum algorithm
        that should be applied to each file included in the backup manifest.
        Currently, the available algorithms are <literal>NONE</literal>,
        <literal>CRC32C</literal>, <literal>SHA224</literal>,
        <literal>SHA256</literal>, <literal>SHA384</literal>,
        and <literal>SHA512</literal>.  The default is <literal>CRC32C</literal>.
       </para>
      </listitem>
     </varlistentry>

     <varlistentry>
      <term><option>--no-manifest</option></term>
      <listitem>
       <para>
        Disables generation of a backup manifest. If this option is not
        specified, a backup manifest for the reconstructed backup will be
        written to the output directory.
       </para>
      </listitem>
     </varlistentry>

     <varlistentry>
      <term><option>--sync-method=<replaceable class="parameter">method</replaceable></option></term>
      <listitem>
       <para>
        When set to <literal>fsync</literal>, which is the default,
        <command>pg_combinebackup</command> will recursively open and synchronize
        all files in the backup directory.  When the plain format is used, the
        search for files will follow symbolic links for the WAL directory and
        each configured tablespace.
       </para>
       <para>
        On Linux, <literal>syncfs</literal> may be used instead to ask the
        operating system to synchronize the whole file system that contains the
        backup directory.  When the plain format is used,
        <command>pg_combinebackup</command> will also synchronize the file systems
        that contain the WAL files and each tablespace.  See
        <xref linkend="guc-recovery-init-sync-method"/> for information about
        the caveats to be aware of when using <literal>syncfs</literal>.
       </para>
       <para>
        This option has no effect when <option>--no-sync</option> is used.
       </para>
      </listitem>
     </varlistentry>

     <varlistentry>
       <term><option>-V</option></term>
       <term><option>--version</option></term>
       <listitem>
       <para>
        Prints the <application>pg_combinebackup</application> version and
        exits.
       </para>
       </listitem>
     </varlistentry>

     <varlistentry>
       <term><option>-?</option></term>
       <term><option>--help</option></term>
       <listitem>
       <para>
        Shows help about <application>pg_combinebackup</application> command
        line arguments, and exits.
       </para>
       </listitem>
     </varlistentry>

    </variablelist>
   </para>

 </refsect1>

 <refsect1 id="app-pgcombinebackup-limitations">
  <title>Limitations</title>

  <para>
   <literal>pg_combinebackup</literal> does not recompute page checksums when
   writing the output directory. Therefore, if any of the backups used for
   reconstruction were taken with checksums disabled, but the final backup was
   taken with checksums enabled, the resulting directory may contain pages
   with invalid checksums.
  </para>

  <para>
   To avoid this problem, taking a new full backup after changing the checksum
   state of the cluster using <xref linkend="app-pgchecksums "/> is
   recommended. Otherwise, you can disable and then optionally reenable
   checksums on the directory produced by <literal>pg_combinebackup</literal>
   in order to correct the problem.
  </para>
 </refsect1>

 <refsect1>
  <title>Environment</title>

  <para>
   This utility, like most other <productname>PostgreSQL</productname> utilities,
   uses the environment variables supported by <application>libpq</application>
   (see <xref linkend="libpq-envars"/>).
  </para>

  <para>
   The environment variable <envar>PG_COLOR</envar> specifies whether to use
   color in diagnostic messages. Possible values are
   <literal>always</literal>, <literal>auto</literal> and
   <literal>never</literal>.
  </para>
 </refsect1>

 <refsect1>
  <title>See Also</title>

  <simplelist type="inline">
   <member><xref linkend="app-pgbasebackup"/></member>
  </simplelist>
 </refsect1>

</refentry>