aboutsummaryrefslogtreecommitdiff
path: root/doc/src/sgml/postgres-fdw.sgml
diff options
context:
space:
mode:
Diffstat (limited to 'doc/src/sgml/postgres-fdw.sgml')
-rw-r--r--doc/src/sgml/postgres-fdw.sgml325
1 files changed, 325 insertions, 0 deletions
diff --git a/doc/src/sgml/postgres-fdw.sgml b/doc/src/sgml/postgres-fdw.sgml
new file mode 100644
index 00000000000..61b77774aee
--- /dev/null
+++ b/doc/src/sgml/postgres-fdw.sgml
@@ -0,0 +1,325 @@
+<!-- doc/src/sgml/postgres-fdw.sgml -->
+
+<sect1 id="postgres-fdw" xreflabel="postgres_fdw">
+ <title>postgres_fdw</title>
+
+ <indexterm zone="postgres-fdw">
+ <primary>postgres_fdw</primary>
+ </indexterm>
+
+ <para>
+ The <filename>postgres_fdw</> module provides the foreign-data wrapper
+ <literal>postgres_fdw</literal>, which can be used to access data
+ stored in external <productname>PostgreSQL</productname> servers.
+ </para>
+
+ <para>
+ The functionality provided by this module overlaps substantially
+ with the functionality of the older <xref linkend="dblink"> module.
+ But <filename>postgres_fdw</> provides more transparent and
+ standards-compliant syntax for accessing remote tables, and can give
+ better performance in many cases.
+ </para>
+
+ <para>
+ To prepare for remote access using <filename>postgres_fdw</>:
+ <orderedlist spacing="compact">
+ <listitem>
+ <para>
+ Install the <filename>postgres_fdw</> extension using <xref
+ linkend="sql-createextension">.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ Create a foreign server object, using <xref linkend="sql-createserver">,
+ to represent each remote database you want to connect to.
+ Specify connection information, except <literal>user</literal> and
+ <literal>password</literal>, as options of the server object.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ Create a user mapping, using <xref linkend="sql-createusermapping">, for
+ each database user you want to allow to access each foreign server.
+ Specify the remote user name and password to use as
+ <literal>user</literal> and <literal>password</literal> options of the
+ user mapping.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ Create a foreign table, using <xref linkend="sql-createforeigntable">,
+ for each remote table you want to access. The columns of the foreign
+ table must match the referenced remote table. You can, however, use
+ table and/or column names different from the remote table's, if you
+ specify the correct remote names as options of the foreign table object.
+ </para>
+ </listitem>
+ </orderedlist>
+ </para>
+
+ <para>
+ Now you need only <command>SELECT</> from a foreign table to access
+ the data stored in its underlying remote table.
+ </para>
+
+ <para>
+ It is generally recommended that the columns of a foreign table be declared
+ with exactly the same data types, and collations if applicable, as the
+ referenced columns of the remote table. Although <filename>postgres_fdw</>
+ is currently rather forgiving about performing data type conversions at
+ need, surprising semantic anomalies may arise when types or collations do
+ not match, due to the remote server interpreting <literal>WHERE</> clauses
+ slightly differently from the local server.
+ </para>
+
+ <para>
+ Note that a foreign table can be declared with fewer columns, or with a
+ different column order, than its underlying remote table has. Matching
+ of columns to the remote table is by name, not position.
+ </para>
+
+ <sect2>
+ <title>FDW Options of postgres_fdw</title>
+
+ <sect3>
+ <title>Connection Options</title>
+
+ <para>
+ A foreign server using the <filename>postgres_fdw</> foreign data wrapper
+ can have the same options that <application>libpq</> accepts in
+ connection strings, as described in <xref linkend="libpq-paramkeywords">,
+ except that these options are not allowed:
+
+ <itemizedlist spacing="compact">
+ <listitem>
+ <para>
+ <literal>user</literal> and <literal>password</literal> (specify these
+ for a user mapping, instead)
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ <literal>client_encoding</> (this is automatically set from the local
+ server encoding)
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ <literal>fallback_application_name</> (always set to
+ <literal>postgres_fdw</>)
+ </para>
+ </listitem>
+ </itemizedlist>
+ </para>
+
+ <para>
+ Only superusers may connect to foreign servers without password
+ authentication, so always specify the <literal>password</literal> option
+ for user mappings belonging to non-superusers.
+ </para>
+ </sect3>
+
+ <sect3>
+ <title>Object Name Options</title>
+
+ <para>
+ These options can be used to control the names used in SQL statements
+ sent to the remote <productname>PostgreSQL</productname> server. These
+ options are needed when a foreign table is created with names different
+ from the underlying remote table's names.
+ </para>
+
+ <variablelist>
+
+ <varlistentry>
+ <term><literal>schema_name</literal></term>
+ <listitem>
+ <para>
+ This option, which can be specified for a foreign table, gives the
+ schema name to use for the foreign table on the remote server. If this
+ option is omitted, the name of the foreign table's schema is used.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><literal>table_name</literal></term>
+ <listitem>
+ <para>
+ This option, which can be specified for a foreign table, gives the
+ table name to use for the foreign table on the remote server. If this
+ option is omitted, the foreign table's name is used.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><literal>column_name</literal></term>
+ <listitem>
+ <para>
+ This option, which can be specified for a column of a foreign table,
+ gives the column name to use for the column on the remote server.
+ If this option is omitted, the column's name is used.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ </variablelist>
+
+ </sect3>
+
+ <sect3>
+ <title>Cost Estimation Options</title>
+
+ <para>
+ <filename>postgres_fdw</> retrieves remote data by executing queries
+ against remote servers, so ideally the estimated cost of scanning a
+ foreign table should be whatever it costs to be done on the remote
+ server, plus some overhead for communication. The most reliable way to
+ get such an estimate is to ask the remote server and then add something
+ for overhead &mdash; but for simple queries, it may not be worth the cost
+ of an additional remote query to get a cost estimate.
+ So <filename>postgres_fdw</> provides the following options to control
+ how cost estimation is done:
+ </para>
+
+ <variablelist>
+
+ <varlistentry>
+ <term><literal>use_remote_estimate</literal></term>
+ <listitem>
+ <para>
+ This option, which can be specified for a foreign table or a foreign
+ server, controls whether <filename>postgres_fdw</> issues remote
+ <command>EXPLAIN</command> commands to obtain cost estimates.
+ A setting for a foreign table overrides any setting for its server,
+ but only for that table.
+ The default is <literal>false</literal>.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><literal>fdw_startup_cost</literal></term>
+ <listitem>
+ <para>
+ This option, which can be specified for a foreign server, is a numeric
+ value that is added to the estimated startup cost of any foreign-table
+ scan on that server. This represents the additional overhead of
+ establishing a connection, parsing and planning the query on the
+ remote side, etc.
+ The default value is <literal>100</literal>.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><literal>fdw_tuple_cost</literal></term>
+ <listitem>
+ <para>
+ This option, which can be specified for a foreign server, is a numeric
+ value that is used as extra cost per-tuple for foreign-table
+ scans on that server. This represents the additional overhead of
+ data transfer between servers. You might increase or decrease this
+ number to reflect higher or lower network delay to the remote server.
+ The default value is <literal>0.01</literal>.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ </variablelist>
+
+ <para>
+ When <literal>use_remote_estimate</literal> is true,
+ <filename>postgres_fdw</> obtains rowcount and cost estimates from the
+ remote server and then adds <literal>fdw_startup_cost</literal> and
+ <literal>fdw_tuple_cost</literal> to the cost estimates. When
+ <literal>use_remote_estimate</literal> is false,
+ <filename>postgres_fdw</> performs local rowcount and cost estimation
+ and then adds <literal>fdw_startup_cost</literal> and
+ <literal>fdw_tuple_cost</literal> to the cost estimates. This local
+ estimation is unlikely to be very accurate unless local copies of the
+ remote table's statistics are available. Running
+ <xref linkend="sql-analyze"> on the foreign table is the way to update
+ the local statistics; this will perform a scan of the remote table and
+ then calculate and store statistics just as though the table were local.
+ Keeping local statistics can be a useful way to reduce per-query planning
+ overhead for a remote table &mdash; but if the remote table is
+ frequently updated, the local statistics will soon be obsolete.
+ </para>
+
+ </sect3>
+ </sect2>
+
+ <sect2>
+ <title>Connection Management</title>
+
+ <para>
+ <filename>postgres_fdw</filename> establishes a connection to a
+ foreign server during the first query that uses a foreign table
+ associated with the foreign server. This connection is kept and
+ re-used for subsequent queries in the same session. However, if
+ multiple user identities (user mappings) are used to access the foreign
+ server, a connection is established for each user mapping.
+ </para>
+ </sect2>
+
+ <sect2>
+ <title>Transaction Management</title>
+
+ <para>
+ During a query that references any remote tables on a foreign server,
+ <filename>postgres_fdw</filename> opens a transaction on the
+ remote server if one is not already open corresponding to the current
+ local transaction. The remote transaction is committed or aborted when
+ the local transaction commits or aborts. Savepoints are similarly
+ managed by creating corresponding remote savepoints.
+ </para>
+
+ <para>
+ The remote transaction uses <literal>SERIALIZABLE</>
+ isolation level when the local transaction has <literal>SERIALIZABLE</>
+ isolation level; otherwise it uses <literal>REPEATABLE READ</>
+ isolation level. This choice ensures that if a query performs multiple
+ table scans on the remote server, it will get snapshot-consistent results
+ for all the scans. A consequence is that successive queries within a
+ single transaction will see the same data from the remote server, even if
+ concurrent updates are occurring on the remote server due to other
+ activities. That behavior would be expected anyway if the local
+ transaction uses <literal>SERIALIZABLE</> or <literal>REPEATABLE READ</>
+ isolation level, but it might be surprising for a <literal>READ
+ COMMITTED</> local transaction. A future
+ <productname>PostgreSQL</productname> release might modify these rules.
+ </para>
+ </sect2>
+
+ <sect2>
+ <title>Remote Query Optimization</title>
+
+ <para>
+ <filename>postgres_fdw</> attempts to optimize remote queries to reduce
+ the amount of data transferred from foreign servers. This is done by
+ sending query <literal>WHERE</> clauses to the remote server for
+ execution, and by not retrieving table columns that are not needed for
+ the current query. To reduce the risk of misexecution of queries,
+ <literal>WHERE</> clauses are not sent to the remote server unless they use
+ only built-in data types, operators, and functions. Operators and
+ functions in the clauses must be <literal>IMMUTABLE</> as well.
+ </para>
+
+ <para>
+ The query that is actually sent to the remote server for execution can
+ be examined using <command>EXPLAIN VERBOSE</>.
+ </para>
+ </sect2>
+
+ <sect2>
+ <title>Author</title>
+ <para>
+ Shigeru Hanada <email>shigeru.hanada@gmail.com</email>
+ </para>
+ </sect2>
+
+</sect1>