author | Tom Lane <tgl@sss.pgh.pa.us> | 2010-06-03 22:17:32 +0000 |
---|---|---|
committer | Tom Lane <tgl@sss.pgh.pa.us> | 2010-06-03 22:17:32 +0000 |
commit | 0cc59cc1f38d91587d52b14789e20bdd4c1af70a (patch) | |
tree | cb7b46dbdfb470b2b66f2e12206e6c06257e309a /doc/src | |
parent | 572ec5a2760dfa12b74001428431a5f7d9027e27 (diff) | |
Add current WAL end (as seen by walsender, ie, GetWriteRecPtr() result)
and current server clock time to SR data messages. These are not currently
used on the slave side but seem likely to be useful in future, and it'd be
better not to change the SR protocol after release. Per discussion.

Also do some minor code review and cleanup on walsender.c, and improve the
protocol documentation.
Diffstat (limited to 'doc/src')
-rw-r--r-- | doc/src/sgml/protocol.sgml | 283 |
1 files changed, 168 insertions, 115 deletions
diff --git a/doc/src/sgml/protocol.sgml b/doc/src/sgml/protocol.sgml
index 70762671610..b88833c8ee2 100644
--- a/doc/src/sgml/protocol.sgml
+++ b/doc/src/sgml/protocol.sgml
@@ -1,4 +1,4 @@
-<!-- $PostgreSQL: pgsql/doc/src/sgml/protocol.sgml,v 1.87 2010/04/03 07:22:55 petere Exp $ -->
+<!-- $PostgreSQL: pgsql/doc/src/sgml/protocol.sgml,v 1.88 2010/06/03 22:17:32 tgl Exp $ -->
 
 <chapter id="protocol">
 <title>Frontend/Backend Protocol</title>
@@ -1284,6 +1284,173 @@
 </sect2>
 </sect1>
 
+<sect1 id="protocol-replication">
+<title>Streaming Replication Protocol</title>
+
+<para>
+To initiate streaming replication, the frontend sends the
+<literal>replication</> parameter in the startup message. This tells the
+backend to go into walsender mode, wherein a small set of replication commands
+can be issued instead of SQL statements. Only the simple query protocol can be
+used in walsender mode.
+
+The commands accepted in walsender mode are:
+
+<variablelist>
+ <varlistentry>
+ <term>IDENTIFY_SYSTEM</term>
+ <listitem>
+ <para>
+ Requests the server to identify itself. Server replies with a result
+ set of a single row, containing two fields:
+ </para>
+
+ <para>
+ <variablelist>
+ <varlistentry>
+ <term>
+ systemid
+ </term>
+ <listitem>
+ <para>
+ The unique system identifier identifying the cluster. This
+ can be used to check that the base backup used to initialize the
+ slave came from the same cluster.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>
+ timeline
+ </term>
+ <listitem>
+ <para>
+ Current TimelineID. Also useful to check that the slave is
+ consistent with the master.
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>START_REPLICATION <replaceable>XXX</>/<replaceable>XXX</></term>
+ <listitem>
+ <para>
+ Instructs server to start streaming WAL, starting at
+ WAL position <replaceable>XXX</>/<replaceable>XXX</>.
+ The server can reply with an error, e.g. if the requested section of WAL
+ has already been recycled. On success, server responds with a
+ CopyOutResponse message, and then starts to stream WAL to the frontend.
+ WAL will continue to be streamed until the connection is broken;
+ no further commands will be accepted.
+ </para>
+
+ <para>
+ WAL data is sent as a series of CopyData messages. (This allows
+ other information to be intermixed; in particular the server can send
+ an ErrorResponse message if it encounters a failure after beginning
+ to stream.) The payload in each CopyData message follows this format:
+ </para>
+
+ <para>
+ <variablelist>
+ <varlistentry>
+ <term>
+ XLogData (B)
+ </term>
+ <listitem>
+ <para>
+ <variablelist>
+ <varlistentry>
+ <term>
+ Byte1('w')
+ </term>
+ <listitem>
+ <para>
+ Identifies the message as WAL data.
+ </para>
+ </listitem>
+ </varlistentry>
+ <varlistentry>
+ <term>
+ Byte8
+ </term>
+ <listitem>
+ <para>
+ The starting point of the WAL data in this message, given in
+ XLogRecPtr format.
+ </para>
+ </listitem>
+ </varlistentry>
+ <varlistentry>
+ <term>
+ Byte8
+ </term>
+ <listitem>
+ <para>
+ The current end of WAL on the server, given in
+ XLogRecPtr format.
+ </para>
+ </listitem>
+ </varlistentry>
+ <varlistentry>
+ <term>
+ Byte8
+ </term>
+ <listitem>
+ <para>
+ The server's system clock at the time of transmission,
+ given in TimestampTz format.
+ </para>
+ </listitem>
+ </varlistentry>
+ <varlistentry>
+ <term>
+ Byte<replaceable>n</replaceable>
+ </term>
+ <listitem>
+ <para>
+ A section of the WAL data stream.
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ </para>
+ <para>
+ A single WAL record is never split across two CopyData messages.
+ When a WAL record crosses a WAL page boundary, and is therefore
+ already split using continuation records, it can be split at the page
+ boundary. In other words, the first main WAL record and its
+ continuation records can be sent in different CopyData messages.
+ </para>
+ <para>
+ Note that all fields within the WAL data and the above-described header
+ will be in the sending server's native format. Endianness, and the
+ format for the timestamp, are unpredictable unless the receiver has
+ verified that the sender's system identifier matches its own
+ <filename>pg_control</> contents.
+ </para>
+ <para>
+ If the WAL sender process is terminated normally (during postmaster
+ shutdown), it will send a CommandComplete message before exiting.
+ This might not happen during an abnormal shutdown, of course.
+ </para>
+ </listitem>
+ </varlistentry>
+</variablelist>
+
+</para>
+
+</sect1>
+
 <sect1 id="protocol-message-types">
 <title>Message Data Types</title>
 
@@ -4137,120 +4304,6 @@ not line breaks.
 
 </sect1>
 
-<sect1 id="protocol-replication">
-<title>Streaming Replication Protocol</title>
-
-<para>
-To initiate streaming replication, the frontend sends the "replication"
-parameter in the startup message. This tells the backend to go into
-walsender mode, where a small set of replication commands can be issued
-instead of SQL statements. Only the simple query protocol can be used in
-walsender mode.
-
-The commands accepted in walsender mode are:
-
-<variablelist>
- <varlistentry>
- <term>IDENTIFY_SYSTEM</term>
- <listitem>
- <para>
- Requests the server to identify itself. Server replies with a result
- set of a single row, and two fields:
-
- systemid: The unique system identifier identifying the cluster. This
- can be used to check that the base backup used to initialize the
- slave came from the same cluster.
-
- timeline: Current TimelineID. Also used to check that the slave is
- consistent with the master.
- </para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>START_REPLICATION XXX/XXX</term>
- <listitem>
- <para>
- Instructs backend to start streaming WAL, starting at point XXX/XXX.
- Server can reply with an error e.g if the requested piece of WAL has
- already been recycled. On success, server responds with a
- CopyOutResponse message, and backend starts to stream WAL as CopyData
- messages.
- The payload in CopyData message consists of the following format.
- </para>
-
- <para>
- <variablelist>
- <varlistentry>
- <term>
- XLogData (B)
- </term>
- <listitem>
- <para>
- <variablelist>
- <varlistentry>
- <term>
- Byte1('w')
- </term>
- <listitem>
- <para>
- Identifies the message as WAL data.
- </para>
- </listitem>
- </varlistentry>
- <varlistentry>
- <term>
- Int32
- </term>
- <listitem>
- <para>
- The log file number of the LSN, indicating the starting point of
- the WAL in the message.
- </para>
- </listitem>
- </varlistentry>
- <varlistentry>
- <term>
- Int32
- </term>
- <listitem>
- <para>
- The byte offset of the LSN, indicating the starting point of
- the WAL in the message.
- </para>
- </listitem>
- </varlistentry>
- <varlistentry>
- <term>
- Byte<replaceable>n</replaceable>
- </term>
- <listitem>
- <para>
- Data that forms part of WAL data stream.
- </para>
- </listitem>
- </varlistentry>
- </variablelist>
- </para>
- </listitem>
- </varlistentry>
- </variablelist>
- </para>
- <para>
- A single WAL record is never split across two CopyData messages. When
- a WAL record crosses a WAL page boundary, however, and is therefore
- already split using continuation records, it can be split at the page
- boundary. In other words, the first main WAL record and its
- continuation records can be split across different CopyData messages.
- </para>
- </listitem>
- </varlistentry>
-</variablelist>
-
-</para>
-
-</sect1>
-
 <sect1 id="protocol-changes">
 <title>Summary of Changes since Protocol 2.0</title>
 
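
Reading aid (not part of the commit): the XLogData layout added by this patch is a 1-byte tag, three 8-byte fields, then WAL bytes. The Python sketch below shows one way a receiver might split a CopyData payload along those boundaries. It rests on assumptions that the patch text does not spell out: XLogRecPtr is treated as a pair of unsigned 32-bit integers (log file number, byte offset), following the two Int32 fields the old format used, and the caller supplies the byte order because the fields are in the sender's native format. The timestamp is left as raw bytes, since its encoding depends on how the sending server was built. The names `parse_xlogdata` and `XLogData` are hypothetical.

```python
import struct
from typing import NamedTuple, Tuple

# Header: Byte1('w') + Byte8 (start) + Byte8 (WAL end) + Byte8 (send time)
HEADER_LEN = 1 + 8 + 8 + 8


class XLogData(NamedTuple):
    start: Tuple[int, int]    # assumed (log file number, byte offset)
    wal_end: Tuple[int, int]  # assumed (log file number, byte offset)
    send_time_raw: bytes      # TimestampTz, left undecoded on purpose
    wal: bytes                # a section of the WAL data stream


def parse_xlogdata(payload: bytes, byte_order: str = "<") -> XLogData:
    """Split one CopyData payload into the XLogData header fields and WAL data.

    byte_order is a struct prefix ("<" little-endian, ">" big-endian); the
    caller must already know the sender's endianness.
    """
    if len(payload) < HEADER_LEN:
        raise ValueError("payload is shorter than the XLogData header")
    if payload[0:1] != b"w":
        raise ValueError("payload is not an XLogData message")
    # Two XLogRecPtr values, each assumed to be two unsigned 32-bit integers.
    s_id, s_off, e_id, e_off = struct.unpack_from(byte_order + "4I", payload, 1)
    send_time_raw = payload[17:25]
    return XLogData((s_id, s_off), (e_id, e_off), send_time_raw, payload[HEADER_LEN:])
```

A receiver would only trust a chosen `byte_order` after checking the sender's system identifier, as the new documentation text notes.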