diff options
Diffstat (limited to 'doc/src/sgml/postgres-fdw.sgml')
-rw-r--r-- | doc/src/sgml/postgres-fdw.sgml | 325 |
1 files changed, 325 insertions, 0 deletions
diff --git a/doc/src/sgml/postgres-fdw.sgml b/doc/src/sgml/postgres-fdw.sgml new file mode 100644 index 00000000000..61b77774aee --- /dev/null +++ b/doc/src/sgml/postgres-fdw.sgml @@ -0,0 +1,325 @@ +<!-- doc/src/sgml/postgres-fdw.sgml --> + +<sect1 id="postgres-fdw" xreflabel="postgres_fdw"> + <title>postgres_fdw</title> + + <indexterm zone="postgres-fdw"> + <primary>postgres_fdw</primary> + </indexterm> + + <para> + The <filename>postgres_fdw</> module provides the foreign-data wrapper + <literal>postgres_fdw</literal>, which can be used to access data + stored in external <productname>PostgreSQL</productname> servers. + </para> + + <para> + The functionality provided by this module overlaps substantially + with the functionality of the older <xref linkend="dblink"> module. + But <filename>postgres_fdw</> provides more transparent and + standards-compliant syntax for accessing remote tables, and can give + better performance in many cases. + </para> + + <para> + To prepare for remote access using <filename>postgres_fdw</>: + <orderedlist spacing="compact"> + <listitem> + <para> + Install the <filename>postgres_fdw</> extension using <xref + linkend="sql-createextension">. + </para> + </listitem> + <listitem> + <para> + Create a foreign server object, using <xref linkend="sql-createserver">, + to represent each remote database you want to connect to. + Specify connection information, except <literal>user</literal> and + <literal>password</literal>, as options of the server object. + </para> + </listitem> + <listitem> + <para> + Create a user mapping, using <xref linkend="sql-createusermapping">, for + each database user you want to allow to access each foreign server. + Specify the remote user name and password to use as + <literal>user</literal> and <literal>password</literal> options of the + user mapping. + </para> + </listitem> + <listitem> + <para> + Create a foreign table, using <xref linkend="sql-createforeigntable">, + for each remote table you want to access. The columns of the foreign + table must match the referenced remote table. You can, however, use + table and/or column names different from the remote table's, if you + specify the correct remote names as options of the foreign table object. + </para> + </listitem> + </orderedlist> + </para> + + <para> + Now you need only <command>SELECT</> from a foreign table to access + the data stored in its underlying remote table. + </para> + + <para> + It is generally recommended that the columns of a foreign table be declared + with exactly the same data types, and collations if applicable, as the + referenced columns of the remote table. Although <filename>postgres_fdw</> + is currently rather forgiving about performing data type conversions at + need, surprising semantic anomalies may arise when types or collations do + not match, due to the remote server interpreting <literal>WHERE</> clauses + slightly differently from the local server. + </para> + + <para> + Note that a foreign table can be declared with fewer columns, or with a + different column order, than its underlying remote table has. Matching + of columns to the remote table is by name, not position. + </para> + + <sect2> + <title>FDW Options of postgres_fdw</title> + + <sect3> + <title>Connection Options</title> + + <para> + A foreign server using the <filename>postgres_fdw</> foreign data wrapper + can have the same options that <application>libpq</> accepts in + connection strings, as described in <xref linkend="libpq-paramkeywords">, + except that these options are not allowed: + + <itemizedlist spacing="compact"> + <listitem> + <para> + <literal>user</literal> and <literal>password</literal> (specify these + for a user mapping, instead) + </para> + </listitem> + <listitem> + <para> + <literal>client_encoding</> (this is automatically set from the local + server encoding) + </para> + </listitem> + <listitem> + <para> + <literal>fallback_application_name</> (always set to + <literal>postgres_fdw</>) + </para> + </listitem> + </itemizedlist> + </para> + + <para> + Only superusers may connect to foreign servers without password + authentication, so always specify the <literal>password</literal> option + for user mappings belonging to non-superusers. + </para> + </sect3> + + <sect3> + <title>Object Name Options</title> + + <para> + These options can be used to control the names used in SQL statements + sent to the remote <productname>PostgreSQL</productname> server. These + options are needed when a foreign table is created with names different + from the underlying remote table's names. + </para> + + <variablelist> + + <varlistentry> + <term><literal>schema_name</literal></term> + <listitem> + <para> + This option, which can be specified for a foreign table, gives the + schema name to use for the foreign table on the remote server. If this + option is omitted, the name of the foreign table's schema is used. + </para> + </listitem> + </varlistentry> + + <varlistentry> + <term><literal>table_name</literal></term> + <listitem> + <para> + This option, which can be specified for a foreign table, gives the + table name to use for the foreign table on the remote server. If this + option is omitted, the foreign table's name is used. + </para> + </listitem> + </varlistentry> + + <varlistentry> + <term><literal>column_name</literal></term> + <listitem> + <para> + This option, which can be specified for a column of a foreign table, + gives the column name to use for the column on the remote server. + If this option is omitted, the column's name is used. + </para> + </listitem> + </varlistentry> + + </variablelist> + + </sect3> + + <sect3> + <title>Cost Estimation Options</title> + + <para> + <filename>postgres_fdw</> retrieves remote data by executing queries + against remote servers, so ideally the estimated cost of scanning a + foreign table should be whatever it costs to be done on the remote + server, plus some overhead for communication. The most reliable way to + get such an estimate is to ask the remote server and then add something + for overhead — but for simple queries, it may not be worth the cost + of an additional remote query to get a cost estimate. + So <filename>postgres_fdw</> provides the following options to control + how cost estimation is done: + </para> + + <variablelist> + + <varlistentry> + <term><literal>use_remote_estimate</literal></term> + <listitem> + <para> + This option, which can be specified for a foreign table or a foreign + server, controls whether <filename>postgres_fdw</> issues remote + <command>EXPLAIN</command> commands to obtain cost estimates. + A setting for a foreign table overrides any setting for its server, + but only for that table. + The default is <literal>false</literal>. + </para> + </listitem> + </varlistentry> + + <varlistentry> + <term><literal>fdw_startup_cost</literal></term> + <listitem> + <para> + This option, which can be specified for a foreign server, is a numeric + value that is added to the estimated startup cost of any foreign-table + scan on that server. This represents the additional overhead of + establishing a connection, parsing and planning the query on the + remote side, etc. + The default value is <literal>100</literal>. + </para> + </listitem> + </varlistentry> + + <varlistentry> + <term><literal>fdw_tuple_cost</literal></term> + <listitem> + <para> + This option, which can be specified for a foreign server, is a numeric + value that is used as extra cost per-tuple for foreign-table + scans on that server. This represents the additional overhead of + data transfer between servers. You might increase or decrease this + number to reflect higher or lower network delay to the remote server. + The default value is <literal>0.01</literal>. + </para> + </listitem> + </varlistentry> + + </variablelist> + + <para> + When <literal>use_remote_estimate</literal> is true, + <filename>postgres_fdw</> obtains rowcount and cost estimates from the + remote server and then adds <literal>fdw_startup_cost</literal> and + <literal>fdw_tuple_cost</literal> to the cost estimates. When + <literal>use_remote_estimate</literal> is false, + <filename>postgres_fdw</> performs local rowcount and cost estimation + and then adds <literal>fdw_startup_cost</literal> and + <literal>fdw_tuple_cost</literal> to the cost estimates. This local + estimation is unlikely to be very accurate unless local copies of the + remote table's statistics are available. Running + <xref linkend="sql-analyze"> on the foreign table is the way to update + the local statistics; this will perform a scan of the remote table and + then calculate and store statistics just as though the table were local. + Keeping local statistics can be a useful way to reduce per-query planning + overhead for a remote table — but if the remote table is + frequently updated, the local statistics will soon be obsolete. + </para> + + </sect3> + </sect2> + + <sect2> + <title>Connection Management</title> + + <para> + <filename>postgres_fdw</filename> establishes a connection to a + foreign server during the first query that uses a foreign table + associated with the foreign server. This connection is kept and + re-used for subsequent queries in the same session. However, if + multiple user identities (user mappings) are used to access the foreign + server, a connection is established for each user mapping. + </para> + </sect2> + + <sect2> + <title>Transaction Management</title> + + <para> + During a query that references any remote tables on a foreign server, + <filename>postgres_fdw</filename> opens a transaction on the + remote server if one is not already open corresponding to the current + local transaction. The remote transaction is committed or aborted when + the local transaction commits or aborts. Savepoints are similarly + managed by creating corresponding remote savepoints. + </para> + + <para> + The remote transaction uses <literal>SERIALIZABLE</> + isolation level when the local transaction has <literal>SERIALIZABLE</> + isolation level; otherwise it uses <literal>REPEATABLE READ</> + isolation level. This choice ensures that if a query performs multiple + table scans on the remote server, it will get snapshot-consistent results + for all the scans. A consequence is that successive queries within a + single transaction will see the same data from the remote server, even if + concurrent updates are occurring on the remote server due to other + activities. That behavior would be expected anyway if the local + transaction uses <literal>SERIALIZABLE</> or <literal>REPEATABLE READ</> + isolation level, but it might be surprising for a <literal>READ + COMMITTED</> local transaction. A future + <productname>PostgreSQL</productname> release might modify these rules. + </para> + </sect2> + + <sect2> + <title>Remote Query Optimization</title> + + <para> + <filename>postgres_fdw</> attempts to optimize remote queries to reduce + the amount of data transferred from foreign servers. This is done by + sending query <literal>WHERE</> clauses to the remote server for + execution, and by not retrieving table columns that are not needed for + the current query. To reduce the risk of misexecution of queries, + <literal>WHERE</> clauses are not sent to the remote server unless they use + only built-in data types, operators, and functions. Operators and + functions in the clauses must be <literal>IMMUTABLE</> as well. + </para> + + <para> + The query that is actually sent to the remote server for execution can + be examined using <command>EXPLAIN VERBOSE</>. + </para> + </sect2> + + <sect2> + <title>Author</title> + <para> + Shigeru Hanada <email>shigeru.hanada@gmail.com</email> + </para> + </sect2> + +</sect1> |