author: Tom Lane <tgl@sss.pgh.pa.us> 2011-02-20 00:17:18 -0500
committer: Tom Lane <tgl@sss.pgh.pa.us> 2011-02-20 00:18:14 -0500
commit: bb742407947ad1cbf19355d24282380d576e7654
tree: ac377ed05d85d9cbd0b33127f4d59750b6e60cda /doc/src
parent: d5813488a4ccc78ec3a4ad0d5da4e6e844af75e8
Implement an API to let foreign-data wrappers actually be functional.

This commit provides the core code and documentation needed. A contrib module test case will follow shortly.

Shigeru Hanada, Jan Urbanski, Heikki Linnakangas
Diffstat (limited to 'doc/src')
-rw-r--r-- doc/src/sgml/ddl.sgml | 47
-rw-r--r-- doc/src/sgml/fdwhandler.sgml | 212
-rw-r--r-- doc/src/sgml/filelist.sgml | 1
-rw-r--r-- doc/src/sgml/postgres.sgml | 1
-rw-r--r-- doc/src/sgml/ref/create_foreign_data_wrapper.sgml | 13
-rw-r--r-- doc/src/sgml/ref/create_foreign_table.sgml | 4
6 files changed, 267 insertions, 11 deletions
diff --git a/doc/src/sgml/ddl.sgml b/doc/src/sgml/ddl.sgml
index a65b4bcd338..12f7c3706e8 100644
--- a/doc/src/sgml/ddl.sgml
+++ b/doc/src/sgml/ddl.sgml
@@ -2986,6 +2986,53 @@ ANALYZE measurement;
</sect2>
</sect1>
+ <sect1 id="ddl-foreign-data">
+ <title>Foreign Data</title>
+
+ <indexterm>
+ <primary>foreign data</primary>
+ </indexterm>
+ <indexterm>
+ <primary>foreign table</primary>
+ </indexterm>
+
+ <para>
+ <productname>PostgreSQL</productname> implements portions of the SQL/MED
+ specification, allowing you to access data that resides outside
+ PostgreSQL using regular SQL queries. Such data is referred to as
+ <firstterm>foreign data</>. (Note that this usage is not to be confused
+ with foreign keys, which are a type of constraint within the database.)
+ </para>
+
+ <para>
+ Foreign data is accessed with help from a
+ <firstterm>foreign data wrapper</firstterm>. A foreign data wrapper is a
+ library that can communicate with an external data source, hiding the
+ details of connecting to the data source and fetching data from it. There
+ are several foreign data wrappers available, which can for example read
+ plain data files residing on the server, or connect to another PostgreSQL
+ instance. If none of the existing foreign data wrappers suit your needs,
+ you can write your own; see <xref linkend="fdwhandler">.
+ </para>
+
+ <para>
+ To access foreign data, you need to create a <firstterm>foreign server</>
+ object, which defines how to connect to a particular external data source,
+ according to the set of options used by a particular foreign data
+ wrapper. Then you need to create one or more <firstterm>foreign
+ tables</firstterm>, which define the structure of the remote data. A
+ foreign table can be used in queries just like a normal table, but a
+ foreign table has no storage in the PostgreSQL server. Whenever it is
+ used, PostgreSQL asks the foreign data wrapper to fetch the data from the
+ external source.
+ </para>
+
+ <para>
+ Currently, foreign tables are read-only. This limitation may be fixed
+ in a future release.
+ </para>
+ </sect1>
+
<sect1 id="ddl-others">
<title>Other Database Objects</title>
diff --git a/doc/src/sgml/fdwhandler.sgml b/doc/src/sgml/fdwhandler.sgml
new file mode 100644
index 00000000000..fc07f129b79
--- /dev/null
+++ b/doc/src/sgml/fdwhandler.sgml
@@ -0,0 +1,212 @@
+<!-- doc/src/sgml/fdwhandler.sgml -->
+
+ <chapter id="fdwhandler">
+ <title>Writing A Foreign Data Wrapper</title>
+
+ <indexterm zone="fdwhandler">
+ <primary>foreign data wrapper</primary>
+ <secondary>handler for</secondary>
+ </indexterm>
+
+ <para>
+ All operations on a foreign table are handled through its foreign data
+ wrapper, which consists of a set of functions that the planner and
+ executor call. The foreign data wrapper is responsible for fetching
+ data from the remote data source and returning it to the
+ <productname>PostgreSQL</productname> executor. This chapter outlines how
+ to write a new foreign data wrapper.
+ </para>
+
+ <para>
+ The FDW author needs to implement a handler function, and optionally
+ a validator function. Both functions must be written in a compiled
+ language such as C, using the version-1 interface.
+ For details on C language calling conventions and dynamic loading,
+ see <xref linkend="xfunc-c">.
+ </para>
+
+ <para>
+ The handler function simply returns a struct of function pointers to
+ callback functions that will be called by the planner and executor.
+ Most of the effort in writing an FDW is in implementing these callback
+ functions.
+ The handler function must be registered with
+ <productname>PostgreSQL</productname> as taking no arguments and returning
+ the special pseudo-type <type>fdw_handler</type>.
+ The callback functions are plain C functions and are not visible or
+ callable at the SQL level.
+ </para>
+
+ <para>
+ The validator function is responsible for validating options given in the
+ <command>CREATE FOREIGN DATA WRAPPER</command>, <command>CREATE
+ SERVER</command> and <command>CREATE FOREIGN TABLE</command> commands.
+ The validator function must be registered as taking two arguments, a text
+ array containing the options to be validated, and an OID representing the
+ type of object the options are associated with (in the form of the OID
+ of the system catalog the object would be stored in). If no validator
+ function is supplied, the options are not checked at object creation time.
+ </para>
+
+ <para>
+ The foreign data wrappers included in the standard distribution are good
+ references when trying to write your own. Look into the
+ <filename>contrib/file_fdw</> subdirectory of the source tree.
+ The <xref linkend="sql-createforeigndatawrapper"> reference page also has
+ some useful details.
+ </para>
+
+ <note>
+ <para>
+ The SQL standard specifies an interface for writing foreign data wrappers.
+ However, PostgreSQL does not implement that API, because the effort to
+ accommodate it into PostgreSQL would be large, and the standard API hasn't
+ gained wide adoption anyway.
+ </para>
+ </note>
+
+ <sect1 id="fdw-routines">
+ <title>Foreign Data Wrapper Callback Routines</title>
+
+ <para>
+ The FDW handler function returns a palloc'd <structname>FdwRoutine</>
+ struct containing pointers to the following callback functions:
+ </para>
+
+ <para>
+<programlisting>
+FdwPlan *
+PlanForeignScan (Oid foreigntableid,
+ PlannerInfo *root,
+ RelOptInfo *baserel);
+</programlisting>
+
+ Plan a scan on a foreign table. This is called when a query is planned.
+ <literal>foreigntableid</> is the <structname>pg_class</> OID of the
+ foreign table. <literal>root</> is the planner's global information
+ about the query, and <literal>baserel</> is the planner's information
+ about this table.
+ The function must return a palloc'd struct that contains cost estimates
+ plus any FDW-private information that is needed to execute the foreign
+ scan at a later time. (Note that the private information must be
+ represented in a form that <function>copyObject</> knows how to copy.)
+ </para>
+
+ <para>
+ The information in <literal>root</> and <literal>baserel</> can be used
+ to reduce the amount of information that has to be fetched from the
+ foreign table (and therefore reduce the cost estimate).
+ <literal>baserel-&gt;baserestrictinfo</> is particularly interesting, as
+ it contains restriction quals (<literal>WHERE</> clauses) that can be
+ used to filter the rows to be fetched. (The FDW is not required to
+ enforce these quals, as the finished plan will recheck them anyway.)
+ <literal>baserel-&gt;reltargetlist</> can be used to determine which
+ columns need to be fetched.
+ </para>
+
+ <para>
+ In addition to returning cost estimates, the function should update
+ <literal>baserel-&gt;rows</> to be the expected number of rows returned
+ by the scan, after accounting for the filtering done by the restriction
+ quals. The initial value of <literal>baserel-&gt;rows</> is just a
+ constant default estimate, which should be replaced if at all possible.
+ The function may also choose to update <literal>baserel-&gt;width</> if
+ it can compute a better estimate of the average result row width.
+ </para>
+
+ <para>
+<programlisting>
+void
+ExplainForeignScan (ForeignScanState *node,
+ ExplainState *es);
+</programlisting>
+
+ Print additional <command>EXPLAIN</> output for a foreign table scan.
+ This can just return if there is no need to print anything.
+ Otherwise, it should call <function>ExplainPropertyText</> and
+ related functions to add fields to the <command>EXPLAIN</> output.
+ The flag fields in <literal>es</> can be used to determine what to
+ print, and the state of the <structname>ForeignScanState</> node
+ can be inspected to provide runtime statistics in the <command>EXPLAIN
+ ANALYZE</> case.
+ </para>
+
+ <para>
+<programlisting>
+void
+BeginForeignScan (ForeignScanState *node,
+ int eflags);
+</programlisting>
+
+ Begin executing a foreign scan. This is called during executor startup.
+ It should perform any initialization needed before the scan can start.
+ The <structname>ForeignScanState</> node has already been created, but
+ its <structfield>fdw_state</> field is still NULL. Information about
+ the table to scan is accessible through the
+ <structname>ForeignScanState</> node (in particular, from the underlying
+ <structname>ForeignScan</> plan node, which contains a pointer to the
+ <structname>FdwPlan</> structure returned by
+ <function>PlanForeignScan</>).
+ </para>
+
+ <para>
+ Note that when <literal>(eflags &amp; EXEC_FLAG_EXPLAIN_ONLY)</> is
+ true, this function should not perform any externally-visible actions;
+ it should only do the minimum required to make the node state valid
+ for <function>ExplainForeignScan</> and <function>EndForeignScan</>.
+ </para>
+
+ <para>
+<programlisting>
+TupleTableSlot *
+IterateForeignScan (ForeignScanState *node);
+</programlisting>
+
+ Fetch one row from the foreign source, returning it in a tuple table slot
+ (the node's <structfield>ScanTupleSlot</> should be used for this
+ purpose). Return NULL if no more rows are available. The tuple table
+ slot infrastructure allows either a physical or virtual tuple to be
+ returned; in most cases the latter choice is preferable from a
+ performance standpoint. Note that this is called in a short-lived memory
+ context that will be reset between invocations. Create a memory context
+ in <function>BeginForeignScan</> if you need longer-lived storage, or use
+ the <structfield>es_query_cxt</> of the node's <structname>EState</>.
+ </para>
+
+ <para>
+ The rows returned must match the column signature of the foreign table
+ being scanned. If you choose to optimize away fetching columns that
+ are not needed, you should insert nulls in those column positions.
+ </para>
+
+ <para>
+<programlisting>
+void
+ReScanForeignScan (ForeignScanState *node);
+</programlisting>
+
+ Restart the scan from the beginning. Note that any parameters the
+ scan depends on may have changed value, so the new scan does not
+ necessarily return exactly the same rows.
+ </para>
+
+ <para>
+<programlisting>
+void
+EndForeignScan (ForeignScanState *node);
+</programlisting>
+
+ End the scan and release resources. It is normally not important
+ to release palloc'd memory, but for example open files and connections
+ to remote servers should be cleaned up.
+ </para>
+
+ <para>
+ The <structname>FdwRoutine</> and <structname>FdwPlan</> struct types
+ are declared in <filename>src/include/foreign/fdwapi.h</>, which see
+ for additional details.
+ </para>
+
+ </sect1>
+
+ </chapter>
diff --git a/doc/src/sgml/filelist.sgml b/doc/src/sgml/filelist.sgml
index b9d4ea59b1a..659bcba7c78 100644
--- a/doc/src/sgml/filelist.sgml
+++ b/doc/src/sgml/filelist.sgml
@@ -86,6 +86,7 @@
<!entity indexam SYSTEM "indexam.sgml">
<!entity nls SYSTEM "nls.sgml">
<!entity plhandler SYSTEM "plhandler.sgml">
+<!entity fdwhandler SYSTEM "fdwhandler.sgml">
<!entity protocol SYSTEM "protocol.sgml">
<!entity sources SYSTEM "sources.sgml">
<!entity storage SYSTEM "storage.sgml">
diff --git a/doc/src/sgml/postgres.sgml b/doc/src/sgml/postgres.sgml
index 4d32f7db259..98d19a5c733 100644
--- a/doc/src/sgml/postgres.sgml
+++ b/doc/src/sgml/postgres.sgml
@@ -238,6 +238,7 @@
&sources;
&nls;
&plhandler;
+ &fdwhandler;
&geqo;
&indexam;
&gist;
diff --git a/doc/src/sgml/ref/create_foreign_data_wrapper.sgml b/doc/src/sgml/ref/create_foreign_data_wrapper.sgml
index 711f32b118b..3093ebcb4ac 100644
--- a/doc/src/sgml/ref/create_foreign_data_wrapper.sgml
+++ b/doc/src/sgml/ref/create_foreign_data_wrapper.sgml
@@ -119,18 +119,13 @@ CREATE FOREIGN DATA WRAPPER <replaceable class="parameter">name</replaceable>
<title>Notes</title>
<para>
- At the moment, the foreign-data wrapper functionality is very
- rudimentary. The purpose of foreign-data wrappers, foreign
- servers, and user mappings is to store this information in a
- standard way so that it can be queried by interested applications.
- One such application is <application>dblink</application>;
- see <xref linkend="dblink">. The functionality to actually query
- external data through a foreign-data wrapper library does not exist
- yet.
+ At the moment, the foreign-data wrapper functionality is rudimentary.
+ There is no support for updating a foreign table, and optimization of
+ queries is primitive (and mostly left to the wrapper, too).
</para>
<para>
- There is currently one foreign-data wrapper validator function
+ There is one built-in foreign-data wrapper validator function
provided:
<filename>postgresql_fdw_validator</filename>, which accepts
options corresponding to <application>libpq</> connection
diff --git a/doc/src/sgml/ref/create_foreign_table.sgml b/doc/src/sgml/ref/create_foreign_table.sgml
index ac2e1393e38..77c62140f28 100644
--- a/doc/src/sgml/ref/create_foreign_table.sgml
+++ b/doc/src/sgml/ref/create_foreign_table.sgml
@@ -131,8 +131,8 @@ CREATE FOREIGN TABLE [ IF NOT EXISTS ] <replaceable class="PARAMETER">table_name
<para>
Options to be associated with the new foreign table.
The allowed option names and values are specific to each foreign
- data wrapper and are validated using the foreign-data wrapper
- library. Option names must be unique.
+ data wrapper and are validated using the foreign-data wrapper's
+ validator function. Option names must be unique.
</para>
</listitem>
</varlistentry>
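
Reader's note, not part of the patch: the new fdwhandler.sgml chapter describes a handler that returns a struct of function pointers, which the executor then drives through begin, iterate, rescan, and end. The standalone C sketch below illustrates only that calling pattern. Every name in it (DemoRoutine, DemoScanState, DemoRow, demo_handler, demo_run, and the counter_* callbacks) is a hypothetical stand-in invented for illustration. It uses malloc rather than palloc, includes no PostgreSQL headers, and is not the FdwRoutine API declared in src/include/foreign/fdwapi.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for illustration; NOT the PostgreSQL types. */
typedef struct DemoRow
{
    int         value;
} DemoRow;

typedef struct DemoScanState
{
    void       *fdw_state;      /* wrapper-private state; NULL until begin */
    DemoRow     slot;           /* plays the role of the scan tuple slot */
} DemoScanState;

/* Table of callbacks, analogous in spirit to the struct the handler returns. */
typedef struct DemoRoutine
{
    void        (*begin_scan) (DemoScanState *node);
    DemoRow    *(*iterate_scan) (DemoScanState *node);
    void        (*rescan) (DemoScanState *node);
    void        (*end_scan) (DemoScanState *node);
} DemoRoutine;

/* Private state of a toy "wrapper" that yields the integers 0..limit-1. */
typedef struct CounterState
{
    int         next;
    int         limit;
} CounterState;

static void
counter_begin(DemoScanState *node)
{
    CounterState *st;

    /* The caller creates the node before begin; private state is unset. */
    assert(node->fdw_state == NULL);
    st = malloc(sizeof(CounterState));
    if (st == NULL)
        abort();
    st->next = 0;
    st->limit = 3;
    node->fdw_state = st;
}

static DemoRow *
counter_iterate(DemoScanState *node)
{
    CounterState *st = node->fdw_state;

    if (st->next >= st->limit)
        return NULL;            /* no more rows */
    node->slot.value = st->next++;
    return &node->slot;         /* row is returned in the node's own slot */
}

static void
counter_rescan(DemoScanState *node)
{
    CounterState *st = node->fdw_state;

    st->next = 0;               /* restart the scan from the beginning */
}

static void
counter_end(DemoScanState *node)
{
    free(node->fdw_state);      /* release scan resources */
    node->fdw_state = NULL;
}

/* Stand-in "handler": only builds and returns the callback table. */
DemoRoutine *
demo_handler(void)
{
    DemoRoutine *routine = malloc(sizeof(DemoRoutine));

    if (routine == NULL)
        abort();
    routine->begin_scan = counter_begin;
    routine->iterate_scan = counter_iterate;
    routine->rescan = counter_rescan;
    routine->end_scan = counter_end;
    return routine;
}

/*
 * Stand-in "executor": scans to exhaustion, rescans, scans again, then ends.
 * Returns the total number of rows seen and stores the sum of their values.
 */
int
demo_run(const DemoRoutine *routine, int *sum_out)
{
    DemoScanState node = {NULL, {0}};
    DemoRow    *row;
    int         rows = 0;
    int         sum = 0;

    routine->begin_scan(&node);
    while ((row = routine->iterate_scan(&node)) != NULL)
    {
        rows++;
        sum += row->value;
    }
    routine->rescan(&node);
    while ((row = routine->iterate_scan(&node)) != NULL)
    {
        rows++;
        sum += row->value;
    }
    routine->end_scan(&node);
    *sum_out = sum;
    return rows;
}
```

A caller would obtain the table once with demo_handler(), pass it to demo_run(), and free it afterwards. The handler does no scanning work itself, which mirrors the chapter's point that most of the effort in writing an FDW lies in the callbacks rather than in the handler.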