Diffstat (limited to 'doc/src')
-rw-r--r-- | doc/src/sgml/catalogs.sgml | 123 |
-rw-r--r-- | doc/src/sgml/filelist.sgml | 1 |
-rw-r--r-- | doc/src/sgml/func.sgml | 201 |
-rw-r--r-- | doc/src/sgml/logicaldecoding.sgml | 35 |
-rw-r--r-- | doc/src/sgml/postgres.sgml | 1 |
-rw-r--r-- | doc/src/sgml/replication-origins.sgml | 93 |
6 files changed, 448 insertions, 6 deletions
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml index 898865eea19..4b79958b357 100644 --- a/doc/src/sgml/catalogs.sgml +++ b/doc/src/sgml/catalogs.sgml @@ -239,6 +239,16 @@ </row> <row> + <entry><link linkend="catalog-pg-replication-origin"><structname>pg_replication_origin</structname></link></entry> + <entry>registered replication origins</entry> + </row> + + <row> + <entry><link linkend="catalog-pg-replication-origin-status"><structname>pg_replication_origin_status</structname></link></entry> + <entry>information about replication origins, including replication progress</entry> + </row> + + <row> <entry><link linkend="catalog-pg-replication-slots"><structname>pg_replication_slots</structname></link></entry> <entry>replication slot information</entry> </row> @@ -5337,6 +5347,119 @@ </sect1> + <sect1 id="catalog-pg-replication-origin"> + <title><structname>pg_replication_origin</structname></title> + + <indexterm zone="catalog-pg-replication-origin"> + <primary>pg_replication_origin</primary> + </indexterm> + + <para> + The <structname>pg_replication_origin</structname> catalog contains + all replication origins created. For more on replication origins + see <xref linkend="replication-origins">. + </para> + + <table> + + <title><structname>pg_replication_origin</structname> Columns</title> + + <tgroup cols="4"> + <thead> + <row> + <entry>Name</entry> + <entry>Type</entry> + <entry>References</entry> + <entry>Description</entry> + </row> + </thead> + + <tbody> + <row> + <entry><structfield>roident</structfield></entry> + <entry><type>Oid</type></entry> + <entry></entry> + <entry>A unique, cluster-wide identifier for the replication + origin. 
Should never leave the system.</entry> + </row> + + <row> + <entry><structfield>roname</structfield></entry> + <entry><type>text</type></entry> + <entry></entry> + <entry>The external, user-defined name of a replication + origin.</entry> + </row> + </tbody> + </tgroup> + </table> + </sect1> + + <sect1 id="catalog-pg-replication-origin-status"> + <title><structname>pg_replication_origin_status</structname></title> + + <indexterm zone="catalog-pg-replication-origin-status"> + <primary>pg_replication_origin_status</primary> + </indexterm> + + <para> + The <structname>pg_replication_origin_status</structname> view + contains information about how far replay for a certain origin has + progressed. For more on replication origins + see <xref linkend="replication-origins">. + </para> + + <table> + + <title><structname>pg_replication_origin_status</structname> Columns</title> + + <tgroup cols="4"> + <thead> + <row> + <entry>Name</entry> + <entry>Type</entry> + <entry>References</entry> + <entry>Description</entry> + </row> + </thead> + + <tbody> + <row> + <entry><structfield>local_id</structfield></entry> + <entry><type>Oid</type></entry> + <entry><literal><link linkend="catalog-pg-replication-origin"><structname>pg_replication_origin</structname></link>.roident</literal></entry> + <entry>internal node identifier</entry> + </row> + + <row> + <entry><structfield>external_id</structfield></entry> + <entry><type>text</type></entry> + <entry><literal><link linkend="catalog-pg-replication-origin"><structname>pg_replication_origin</structname></link>.roname</literal></entry> + <entry>external node identifier</entry> + </row> + + <row> + <entry><structfield>remote_lsn</structfield></entry> + <entry><type>pg_lsn</type></entry> + <entry></entry> + <entry>The origin node's LSN up to which data has been replicated.</entry> + </row> + + + <row> + <entry><structfield>local_lsn</structfield></entry> + <entry><type>pg_lsn</type></entry> + <entry></entry> + <entry>This node's LSN at
+ which <literal>remote_lsn</literal> has been replicated. Used to + flush commit records before persisting data to disk when using + asynchronous commits.</entry> + </row> + </tbody> + </tgroup> + </table> + </sect1> + <sect1 id="catalog-pg-replication-slots"> <title><structname>pg_replication_slots</structname></title> diff --git a/doc/src/sgml/filelist.sgml b/doc/src/sgml/filelist.sgml index 26aa7ee50ee..6268d5496bd 100644 --- a/doc/src/sgml/filelist.sgml +++ b/doc/src/sgml/filelist.sgml @@ -95,6 +95,7 @@ <!ENTITY fdwhandler SYSTEM "fdwhandler.sgml"> <!ENTITY custom-scan SYSTEM "custom-scan.sgml"> <!ENTITY logicaldecoding SYSTEM "logicaldecoding.sgml"> +<!ENTITY replication-origins SYSTEM "replication-origins.sgml"> <!ENTITY protocol SYSTEM "protocol.sgml"> <!ENTITY sources SYSTEM "sources.sgml"> <!ENTITY storage SYSTEM "storage.sgml"> diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml index 0053d7d4101..dcade93e439 100644 --- a/doc/src/sgml/func.sgml +++ b/doc/src/sgml/func.sgml @@ -16879,11 +16879,13 @@ postgres=# SELECT * FROM pg_xlogfile_name_offset(pg_stop_backup()); <title>Replication Functions</title> <para> - The functions shown in <xref linkend="functions-replication-table"> are - for controlling and interacting with replication features. - See <xref linkend="streaming-replication"> - and <xref linkend="streaming-replication-slots"> for information about the - underlying features. Use of these functions is restricted to superusers. + The functions shown + in <xref linkend="functions-replication-table"> are for + controlling and interacting with replication features. + See <xref linkend="streaming-replication">, + <xref linkend="streaming-replication-slots">, <xref linkend="replication-origins"> + for information about the underlying features. Use of these + functions is restricted to superusers. </para> <para> @@ -17040,6 +17042,195 @@ postgres=# SELECT * FROM pg_xlogfile_name_offset(pg_stop_backup()); on future calls. 
</entry> </row> + + <row id="pg-replication-origin-create"> + <entry> + <indexterm> + <primary>pg_replication_origin_create</primary> + </indexterm> + <literal><function>pg_replication_origin_create(<parameter>node_name</parameter> <type>text</type>)</function></literal> + </entry> + <entry> + <parameter>internal_id</parameter> <type>oid</type> + </entry> + <entry> + Create a replication origin with the passed in external + name, and create an internal id for it. + </entry> + </row> + + <row id="pg-replication-origin-drop"> + <entry> + <indexterm> + <primary>pg_replication_origin_drop</primary> + </indexterm> + <literal><function>pg_replication_origin_drop(<parameter>node_name</parameter> <type>text</type>)</function></literal> + </entry> + <entry> + void + </entry> + <entry> + Delete a previously created replication origin, including the + associated replay progress. + </entry> + </row> + + <row> + <entry> + <indexterm> + <primary>pg_replication_origin_oid</primary> + </indexterm> + <literal><function>pg_replication_origin_oid(<parameter>node_name</parameter> <type>text</type>)</function></literal> + </entry> + <entry> + <parameter>internal_id</parameter> <type>oid</type> + </entry> + <entry> + Look up a replication origin by name and return the internal + oid. If no corresponding replication origin is found, an error + is thrown. + </entry> + </row> + + <row id="pg-replication-origin-session-setup"> + <entry> + <indexterm> + <primary>pg_replication_origin_session_setup</primary> + </indexterm> + <literal><function>pg_replication_origin_session_setup(<parameter>node_name</parameter> <type>text</type>)</function></literal> + </entry> + <entry> + void + </entry> + <entry> + Configure the current session to be replaying from the passed in + origin, allowing replay progress to be tracked. Use + <function>pg_replication_origin_session_reset</function> to revert. + Can only be used if no previous origin is configured.
+ </entry> + </row> + + <row> + <entry> + <indexterm> + <primary>pg_replication_origin_session_reset</primary> + </indexterm> + <literal><function>pg_replication_origin_session_reset()</function></literal> + </entry> + <entry> + void + </entry> + <entry> + Cancel the effects + of <function>pg_replication_origin_session_setup()</function>. + </entry> + </row> + + <row> + <entry> + <indexterm> + <primary>pg_replication_origin_session_is_setup</primary> + </indexterm> + <literal><function>pg_replication_origin_session_is_setup()</function></literal> + </entry> + <entry> + bool + </entry> + <entry> + Has a replication origin been configured in the current session? + </entry> + </row> + + <row id="pg-replication-origin-session-progress"> + <entry> + <indexterm> + <primary>pg_replication_origin_session_progress</primary> + </indexterm> + <literal><function>pg_replication_origin_session_progress(<parameter>flush</parameter> <type>bool</type>)</function></literal> + </entry> + <entry> + pg_lsn + </entry> + <entry> + Return the replay position for the replication origin configured in + the current session. The parameter <parameter>flush</parameter> + determines whether the corresponding local transaction will be + guaranteed to have been flushed to disk or not. + </entry> + </row> + + <row id="pg-replication-origin-xact-setup"> + <entry> + <indexterm> + <primary>pg_replication_origin_xact_setup</primary> + </indexterm> + <literal><function>pg_replication_origin_xact_setup(<parameter>origin_lsn</parameter> <type>pg_lsn</type>, <parameter>origin_timestamp</parameter> <type>timestamptz</type>)</function></literal> + </entry> + <entry> + void + </entry> + <entry> + Mark the current transaction to be replaying a transaction that has + committed at the passed in <acronym>LSN</acronym> and timestamp. Can + only be called when a replication origin has previously been + configured using + <function>pg_replication_origin_session_setup()</function>.
+ </entry> + </row> + + <row id="pg-replication-origin-xact-reset"> + <entry> + <indexterm> + <primary>pg_replication_origin_xact_reset</primary> + </indexterm> + <literal><function>pg_replication_origin_xact_reset()</function></literal> + </entry> + <entry> + void + </entry> + <entry> + Cancel the effects of + <function>pg_replication_origin_xact_setup()</function>. + </entry> + </row> + + <row> + <entry> + <indexterm> + <primary>pg_replication_origin_advance</primary> + </indexterm> + <literal><function>pg_replication_origin_advance(<parameter>node_name</parameter> <type>text</type>, <parameter>pos</parameter> <type>pg_lsn</type>)</function></literal> + </entry> + <entry> + void + </entry> + <entry> + Set replication progress for the passed in node to the passed in + position. This is primarily useful for setting up the initial position + or a new position after configuration changes and similar. Be aware + that careless use of this function can lead to inconsistently + replicated data. + </entry> + </row> + + <row id="pg-replication-origin-progress"> + <entry> + <indexterm> + <primary>pg_replication_origin_progress</primary> + </indexterm> + <literal><function>pg_replication_origin_progress(<parameter>node_name</parameter> <type>text</type>, <parameter>flush</parameter> <type>bool</type>)</function></literal> + </entry> + <entry> + pg_lsn + </entry> + <entry> + Return the replay position for the passed in replication origin. The + parameter <parameter>flush</parameter> determines whether the + corresponding local transaction will be guaranteed to have been + flushed to disk or not.
+ </entry> + </row> + </tbody> </tgroup> </table> diff --git a/doc/src/sgml/logicaldecoding.sgml b/doc/src/sgml/logicaldecoding.sgml index 0810a2d1f97..f817af3ea8a 100644 --- a/doc/src/sgml/logicaldecoding.sgml +++ b/doc/src/sgml/logicaldecoding.sgml @@ -363,6 +363,7 @@ typedef struct OutputPluginCallbacks LogicalDecodeBeginCB begin_cb; LogicalDecodeChangeCB change_cb; LogicalDecodeCommitCB commit_cb; + LogicalDecodeFilterByOriginCB filter_by_origin_cb; LogicalDecodeShutdownCB shutdown_cb; } OutputPluginCallbacks; @@ -370,7 +371,8 @@ typedef void (*LogicalOutputPluginInit)(struct OutputPluginCallbacks *cb); </programlisting> The <function>begin_cb</function>, <function>change_cb</function> and <function>commit_cb</function> callbacks are required, - while <function>startup_cb</function> + while <function>startup_cb</function>, + <function>filter_by_origin_cb</function> and <function>shutdown_cb</function> are optional. </para> </sect2> @@ -569,6 +571,37 @@ typedef void (*LogicalDecodeChangeCB) ( </para> </note> </sect3> + + <sect3 id="logicaldecoding-output-plugin-filter-by-origin"> + <title>Origin Filter Callback</title> + + <para> + The optional <function>filter_by_origin_cb</function> callback + is called to determine wheter data that has been replayed + from <parameter>origin_id</parameter> is of interest to the + output plugin. +<programlisting> +typedef bool (*LogicalDecodeChangeCB) ( + struct LogicalDecodingContext *ctx, + RepNodeId origin_id +); +</programlisting> + The <parameter>ctx</parameter> parameter has the same contents + as for the other callbacks. No information but the origin is + available. To signal that changes originating on the passed in + node are irrelevant, return true, causing them to be filtered + away; false otherwise. The other callbacks will not be called + for transactions and changes that have been filtered away. + </para> + <para> + This is useful when implementing cascading or multi directional + replication solutions. 
Filtering by the origin makes it possible to + avoid replicating the same changes back and forth in such + setups. While transactions and changes also carry information + about the origin, filtering via this callback is noticeably + more efficient. + </para> + </sect3> </sect2> <sect2 id="logicaldecoding-output-plugin-output"> diff --git a/doc/src/sgml/postgres.sgml b/doc/src/sgml/postgres.sgml index e378d6978d0..4a45138bf72 100644 --- a/doc/src/sgml/postgres.sgml +++ b/doc/src/sgml/postgres.sgml @@ -220,6 +220,7 @@ &spi; &bgworker; &logicaldecoding; + &replication-origins; </part> diff --git a/doc/src/sgml/replication-origins.sgml b/doc/src/sgml/replication-origins.sgml new file mode 100644 index 00000000000..c5310229119 --- /dev/null +++ b/doc/src/sgml/replication-origins.sgml @@ -0,0 +1,93 @@ +<!-- doc/src/sgml/replication-origins.sgml --> +<chapter id="replication-origins"> + <title>Replication Progress Tracking</title> + <indexterm zone="replication-origins"> + <primary>Replication Progress Tracking</primary> + </indexterm> + <indexterm zone="replication-origins"> + <primary>Replication Origins</primary> + </indexterm> + + <para> + Replication origins are intended to make it easier to implement + logical replication solutions on top + of <xref linkend="logicaldecoding">. They provide a solution to two + common problems: + <itemizedlist> + <listitem><para>How to safely keep track of replication progress</para></listitem> + <listitem><para>How to change replication behavior, based on the + origin of a row; e.g. to avoid loops in bi-directional replication + setups</para></listitem> + </itemizedlist> + </para> + + <para> + Replication origins consist of a name and an oid. The name, which + is what should be used to refer to the origin across systems, is + free-form text. It should be used in a way that makes conflicts + between replication origins created by different replication + solutions unlikely; e.g. by prefixing the replication solution's + name to it.
The oid is used only to avoid having to store the long + version in situations where space efficiency is important. It should + never be shared between systems. + </para> + + <para> + Replication origins can be created using + <link linkend="pg-replication-origin-create"><function>pg_replication_origin_create()</function></link>; + dropped using + <link linkend="pg-replication-origin-drop"><function>pg_replication_origin_drop()</function></link>; + and seen in the + <link linkend="catalog-pg-replication-origin"><structname>pg_replication_origin</structname></link> + catalog. + </para> + + <para> + When replicating from one system to another (independent of the fact that + those two might be in the same cluster, or even same database) one + nontrivial part of building a replication solution is to keep track of + replay progress in a safe manner. When the applying process, or the whole + cluster, dies, it needs to be possible to find out up to where data has + successfully been replicated. Naive solutions to this, like updating a row in + a table for every replayed transaction, have problems like runtime overhead + and bloat. + </para> + + <para> + Using the replication origin infrastructure a session can be + marked as replaying from a remote node (using the + <link linkend="pg-replication-origin-session-setup"><function>pg_replication_origin_session_setup()</function></link> + function). Additionally the <acronym>LSN</acronym> and commit + timestamp of every source transaction can be configured on a + per-transaction basis using + <link linkend="pg-replication-origin-xact-setup"><function>pg_replication_origin_xact_setup()</function></link>. + If that's done replication progress will be persisted in a crash-safe + manner. Replay progress for all replication origins can be seen in the + <link linkend="catalog-pg-replication-origin-status"> + <structname>pg_replication_origin_status</structname> + </link> view. An individual origin's progress, e.g.
when resuming + replication, can be acquired using + <link linkend="pg-replication-origin-progress"><function>pg_replication_origin_progress()</function></link> + for any origin or + <link linkend="pg-replication-origin-session-progress"><function>pg_replication_origin_session_progress()</function></link> + for the origin configured in the current session. + </para> + + <para> + In more complex replication topologies than replication from exactly one + system to one other, another problem can be that it is hard to avoid + replicating replayed rows again. That can lead both to cycles in the + replication and inefficiencies. Replication origins provide an optional + mechanism to recognize and prevent that. When configured using the functions + referenced in the previous paragraph, every change and transaction passed to + output plugin callbacks (see <xref linkend="logicaldecoding-output-plugin">) + generated by the session is tagged with the replication origin of the + generating session. This allows them to be treated differently in the output + plugin, e.g. ignoring all but locally originating rows. Additionally + the <link linkend="logicaldecoding-output-plugin-filter-by-origin"> + <function>filter_by_origin_cb</function></link> callback can be used + to filter the logical decoding change stream based on the + source. While less flexible, filtering via that callback is + considerably more efficient. + </para> +</chapter>
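
The origin filter callback documented in the logicaldecoding.sgml hunk above can be illustrated with a minimal sketch. It is not taken from the patch: the stand-in type definitions exist only so the example compiles outside the PostgreSQL source tree, and `ExampleFilterState`, `example_filter_by_origin`, the `only_local` flag and the convention that origin id 0 marks locally originated changes are assumptions made for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Stand-ins so the sketch compiles on its own (assumptions). A real output
 * plugin would get these definitions from the PostgreSQL server headers.
 */
typedef uint16_t RepNodeId;
#define InvalidRepNodeId 0	/* assumed: 0 means "no origin", i.e. a local change */

struct LogicalDecodingContext
{
	void	   *output_plugin_private;	/* plugin-owned state */
};

/* Callback signature as described in the Origin Filter Callback section. */
typedef bool (*LogicalDecodeFilterByOriginCB) (
	struct LogicalDecodingContext *ctx,
	RepNodeId origin_id
);

/* Hypothetical per-plugin state, normally filled in by the startup callback. */
typedef struct ExampleFilterState
{
	bool		only_local;		/* decode only locally originated changes? */
} ExampleFilterState;

/*
 * Return true to filter the data away, false to keep it. Only the origin id
 * is available here; no transaction or row contents can be inspected.
 */
bool
example_filter_by_origin(struct LogicalDecodingContext *ctx, RepNodeId origin_id)
{
	ExampleFilterState *state = ctx->output_plugin_private;

	if (state != NULL && state->only_local && origin_id != InvalidRepNodeId)
		return true;			/* replayed from another node: skip it */

	return false;				/* local change, or filtering disabled */
}
```

In an output plugin's init function, such a callback would presumably be registered by assigning it to the `filter_by_origin_cb` member of `OutputPluginCallbacks`. Since the callback is optional, leaving the member unset keeps the previous behavior of decoding changes from every origin.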