field | value | date |
---|---|---|
author | Tom Lane <tgl@sss.pgh.pa.us> | 2011-02-20 00:17:18 -0500 |
committer | Tom Lane <tgl@sss.pgh.pa.us> | 2011-02-20 00:18:14 -0500 |
commit | bb742407947ad1cbf19355d24282380d576e7654 | |
tree | ac377ed05d85d9cbd0b33127f4d59750b6e60cda /doc/src | |
parent | d5813488a4ccc78ec3a4ad0d5da4e6e844af75e8 | |
Implement an API to let foreign-data wrappers actually be functional.
This commit provides the core code and documentation needed. A contrib
module test case will follow shortly.
Shigeru Hanada, Jan Urbanski, Heikki Linnakangas
Diffstat (limited to 'doc/src')
-rw-r--r-- | doc/src/sgml/ddl.sgml | 47 |
-rw-r--r-- | doc/src/sgml/fdwhandler.sgml | 212 |
-rw-r--r-- | doc/src/sgml/filelist.sgml | 1 |
-rw-r--r-- | doc/src/sgml/postgres.sgml | 1 |
-rw-r--r-- | doc/src/sgml/ref/create_foreign_data_wrapper.sgml | 13 |
-rw-r--r-- | doc/src/sgml/ref/create_foreign_table.sgml | 4 |
6 files changed, 267 insertions, 11 deletions
diff --git a/doc/src/sgml/ddl.sgml b/doc/src/sgml/ddl.sgml index a65b4bcd338..12f7c3706e8 100644 --- a/doc/src/sgml/ddl.sgml +++ b/doc/src/sgml/ddl.sgml @@ -2986,6 +2986,53 @@ ANALYZE measurement; </sect2> </sect1> + <sect1 id="ddl-foreign-data"> + <title>Foreign Data</title> + + <indexterm> + <primary>foreign data</primary> + </indexterm> + <indexterm> + <primary>foreign table</primary> + </indexterm> + + <para> + <productname>PostgreSQL</productname> implements portions of the SQL/MED + specification, allowing you to access data that resides outside + PostgreSQL using regular SQL queries. Such data is referred to as + <firstterm>foreign data</>. (Note that this usage is not to be confused + with foreign keys, which are a type of constraint within the database.) + </para> + + <para> + Foreign data is accessed with help from a + <firstterm>foreign data wrapper</firstterm>. A foreign data wrapper is a + library that can communicate with an external data source, hiding the + details of connecting to the data source and fetching data from it. There + are several foreign data wrappers available, which can for example read + plain data files residing on the server, or connect to another PostgreSQL + instance. If none of the existing foreign data wrappers suit your needs, + you can write your own; see <xref linkend="fdwhandler">. + </para> + + <para> + To access foreign data, you need to create a <firstterm>foreign server</> + object, which defines how to connect to a particular external data source, + according to the set of options used by a particular foreign data + wrapper. Then you need to create one or more <firstterm>foreign + tables</firstterm>, which define the structure of the remote data. A + foreign table can be used in queries just like a normal table, but a + foreign table has no storage in the PostgreSQL server. Whenever it is + used, PostgreSQL asks the foreign data wrapper to fetch the data from the + external source. 
+ </para> + + <para> + Currently, foreign tables are read-only. This limitation may be fixed + in a future release. + </para> + </sect1> + <sect1 id="ddl-others"> <title>Other Database Objects</title> diff --git a/doc/src/sgml/fdwhandler.sgml b/doc/src/sgml/fdwhandler.sgml new file mode 100644 index 00000000000..fc07f129b79 --- /dev/null +++ b/doc/src/sgml/fdwhandler.sgml @@ -0,0 +1,212 @@ +<!-- doc/src/sgml/fdwhandler.sgml --> + + <chapter id="fdwhandler"> + <title>Writing A Foreign Data Wrapper</title> + + <indexterm zone="fdwhandler"> + <primary>foreign data wrapper</primary> + <secondary>handler for</secondary> + </indexterm> + + <para> + All operations on a foreign table are handled through its foreign data + wrapper, which consists of a set of functions that the planner and + executor call. The foreign data wrapper is responsible for fetching + data from the remote data source and returning it to the + <productname>PostgreSQL</productname> executor. This chapter outlines how + to write a new foreign data wrapper. + </para> + + <para> + The FDW author needs to implement a handler function, and optionally + a validator function. Both functions must be written in a compiled + language such as C, using the version-1 interface. + For details on C language calling conventions and dynamic loading, + see <xref linkend="xfunc-c">. + </para> + + <para> + The handler function simply returns a struct of function pointers to + callback functions that will be called by the planner and executor. + Most of the effort in writing an FDW is in implementing these callback + functions. + The handler function must be registered with + <productname>PostgreSQL</productname> as taking no arguments and returning + the special pseudo-type <type>fdw_handler</type>. + The callback functions are plain C functions and are not visible or + callable at the SQL level. 
+ </para> + + <para> + The validator function is responsible for validating options given in the + <command>CREATE FOREIGN DATA WRAPPER</command>, <command>CREATE + SERVER</command> and <command>CREATE FOREIGN TABLE</command> commands. + The validator function must be registered as taking two arguments, a text + array containing the options to be validated, and an OID representing the + type of object the options are associated with (in the form of the OID + of the system catalog the object would be stored in). If no validator + function is supplied, the options are not checked at object creation time. + </para> + + <para> + The foreign data wrappers included in the standard distribution are good + references when trying to write your own. Look into the + <filename>contrib/file_fdw</> subdirectory of the source tree. + The <xref linkend="sql-createforeigndatawrapper"> reference page also has + some useful details. + </para> + + <note> + <para> + The SQL standard specifies an interface for writing foreign data wrappers. + However, PostgreSQL does not implement that API, because the effort to + accommodate it into PostgreSQL would be large, and the standard API hasn't + gained wide adoption anyway. + </para> + </note> + + <sect1 id="fdw-routines"> + <title>Foreign Data Wrapper Callback Routines</title> + + <para> + The FDW handler function returns a palloc'd <structname>FdwRoutine</> + struct containing pointers to the following callback functions: + </para> + + <para> +<programlisting> +FdwPlan * +PlanForeignScan (Oid foreigntableid, + PlannerInfo *root, + RelOptInfo *baserel); +</programlisting> + + Plan a scan on a foreign table. This is called when a query is planned. + <literal>foreigntableid</> is the <structname>pg_class</> OID of the + foreign table. <literal>root</> is the planner's global information + about the query, and <literal>baserel</> is the planner's information + about this table. 
+ The function must return a palloc'd struct that contains cost estimates + plus any FDW-private information that is needed to execute the foreign + scan at a later time. (Note that the private information must be + represented in a form that <function>copyObject</> knows how to copy.) + </para> + + <para> + The information in <literal>root</> and <literal>baserel</> can be used + to reduce the amount of information that has to be fetched from the + foreign table (and therefore reduce the cost estimate). + <literal>baserel->baserestrictinfo</> is particularly interesting, as + it contains restriction quals (<literal>WHERE</> clauses) that can be + used to filter the rows to be fetched. (The FDW is not required to + enforce these quals, as the finished plan will recheck them anyway.) + <literal>baserel->reltargetlist</> can be used to determine which + columns need to be fetched. + </para> + + <para> + In addition to returning cost estimates, the function should update + <literal>baserel->rows</> to be the expected number of rows returned + by the scan, after accounting for the filtering done by the restriction + quals. The initial value of <literal>baserel->rows</> is just a + constant default estimate, which should be replaced if at all possible. + The function may also choose to update <literal>baserel->width</> if + it can compute a better estimate of the average result row width. + </para> + + <para> +<programlisting> +void +ExplainForeignScan (ForeignScanState *node, + ExplainState *es); +</programlisting> + + Print additional <command>EXPLAIN</> output for a foreign table scan. + This can just return if there is no need to print anything. + Otherwise, it should call <function>ExplainPropertyText</> and + related functions to add fields to the <command>EXPLAIN</> output. 
+ The flag fields in <literal>es</> can be used to determine what to + print, and the state of the <structname>ForeignScanState</> node + can be inspected to provide runtime statistics in the <command>EXPLAIN + ANALYZE</> case. + </para> + + <para> +<programlisting> +void +BeginForeignScan (ForeignScanState *node, + int eflags); +</programlisting> + + Begin executing a foreign scan. This is called during executor startup. + It should perform any initialization needed before the scan can start. + The <structname>ForeignScanState</> node has already been created, but + its <structfield>fdw_state</> field is still NULL. Information about + the table to scan is accessible through the + <structname>ForeignScanState</> node (in particular, from the underlying + <structname>ForeignScan</> plan node, which contains a pointer to the + <structname>FdwPlan</> structure returned by + <function>PlanForeignScan</>). + </para> + + <para> + Note that when <literal>(eflags & EXEC_FLAG_EXPLAIN_ONLY)</> is + true, this function should not perform any externally-visible actions; + it should only do the minimum required to make the node state valid + for <function>ExplainForeignScan</> and <function>EndForeignScan</>. + </para> + + <para> +<programlisting> +TupleTableSlot * +IterateForeignScan (ForeignScanState *node); +</programlisting> + + Fetch one row from the foreign source, returning it in a tuple table slot + (the node's <structfield>ScanTupleSlot</> should be used for this + purpose). Return NULL if no more rows are available. The tuple table + slot infrastructure allows either a physical or virtual tuple to be + returned; in most cases the latter choice is preferable from a + performance standpoint. Note that this is called in a short-lived memory + context that will be reset between invocations. Create a memory context + in <function>BeginForeignScan</> if you need longer-lived storage, or use + the <structfield>es_query_cxt</> of the node's <structname>EState</>. 
+ </para> + + <para> + The rows returned must match the column signature of the foreign table + being scanned. If you choose to optimize away fetching columns that + are not needed, you should insert nulls in those column positions. + </para> + + <para> +<programlisting> +void +ReScanForeignScan (ForeignScanState *node); +</programlisting> + + Restart the scan from the beginning. Note that any parameters the + scan depends on may have changed value, so the new scan does not + necessarily return exactly the same rows. + </para> + + <para> +<programlisting> +void +EndForeignScan (ForeignScanState *node); +</programlisting> + + End the scan and release resources. It is normally not important + to release palloc'd memory, but for example open files and connections + to remote servers should be cleaned up. + </para> + + <para> + The <structname>FdwRoutine</> and <structname>FdwPlan</> struct types + are declared in <filename>src/include/foreign/fdwapi.h</>, which see + for additional details. 
+ </para> + + </sect1> + + </chapter> diff --git a/doc/src/sgml/filelist.sgml b/doc/src/sgml/filelist.sgml index b9d4ea59b1a..659bcba7c78 100644 --- a/doc/src/sgml/filelist.sgml +++ b/doc/src/sgml/filelist.sgml @@ -86,6 +86,7 @@ <!entity indexam SYSTEM "indexam.sgml"> <!entity nls SYSTEM "nls.sgml"> <!entity plhandler SYSTEM "plhandler.sgml"> +<!entity fdwhandler SYSTEM "fdwhandler.sgml"> <!entity protocol SYSTEM "protocol.sgml"> <!entity sources SYSTEM "sources.sgml"> <!entity storage SYSTEM "storage.sgml"> diff --git a/doc/src/sgml/postgres.sgml b/doc/src/sgml/postgres.sgml index 4d32f7db259..98d19a5c733 100644 --- a/doc/src/sgml/postgres.sgml +++ b/doc/src/sgml/postgres.sgml @@ -238,6 +238,7 @@ &sources; &nls; &plhandler; + &fdwhandler; &geqo; &indexam; &gist; diff --git a/doc/src/sgml/ref/create_foreign_data_wrapper.sgml b/doc/src/sgml/ref/create_foreign_data_wrapper.sgml index 711f32b118b..3093ebcb4ac 100644 --- a/doc/src/sgml/ref/create_foreign_data_wrapper.sgml +++ b/doc/src/sgml/ref/create_foreign_data_wrapper.sgml @@ -119,18 +119,13 @@ CREATE FOREIGN DATA WRAPPER <replaceable class="parameter">name</replaceable> <title>Notes</title> <para> - At the moment, the foreign-data wrapper functionality is very - rudimentary. The purpose of foreign-data wrappers, foreign - servers, and user mappings is to store this information in a - standard way so that it can be queried by interested applications. - One such application is <application>dblink</application>; - see <xref linkend="dblink">. The functionality to actually query - external data through a foreign-data wrapper library does not exist - yet. + At the moment, the foreign-data wrapper functionality is rudimentary. + There is no support for updating a foreign table, and optimization of + queries is primitive (and mostly left to the wrapper, too). 
+ </para> <para> - There is currently one foreign-data wrapper validator function + There is one built-in foreign-data wrapper validator function provided: <filename>postgresql_fdw_validator</filename>, which accepts options corresponding to <application>libpq</> connection diff --git a/doc/src/sgml/ref/create_foreign_table.sgml b/doc/src/sgml/ref/create_foreign_table.sgml index ac2e1393e38..77c62140f28 100644 --- a/doc/src/sgml/ref/create_foreign_table.sgml +++ b/doc/src/sgml/ref/create_foreign_table.sgml @@ -131,8 +131,8 @@ CREATE FOREIGN TABLE [ IF NOT EXISTS ] <replaceable class="PARAMETER">table_name <para> Options to be associated with the new foreign table. The allowed option names and values are specific to each foreign data wrapper and are validated using the foreign-data wrapper - library. Option names must be unique. + data wrapper and are validated using the foreign-data wrapper's + validator function. Option names must be unique. </para> </listitem> </varlistentry>
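
The callback contract documented in fdwhandler.sgml above (a handler that returns a struct of function pointers, followed by begin, iterate-until-NULL, rescan, and end phases) can be modeled in plain C. What follows is only a minimal sketch under stated assumptions: it does not include or link against any PostgreSQL header, every `Demo*` and `demo_*` name is hypothetical, `malloc` stands in for `palloc`, and a row is reduced to a single `int` where the real API uses a `TupleTableSlot`. It shows the shape of the control flow, not the actual `FdwRoutine` definition, which lives in `src/include/foreign/fdwapi.h`.

```c
/*
 * Stand-alone model of the callback-table pattern described in the patch.
 * NOT the PostgreSQL API: all Demo* / demo_* names are hypothetical.
 */
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Stand-in for ForeignScanState: only the private-state pointer is modeled. */
typedef struct DemoScanState {
    void *fdw_state;            /* NULL until the begin callback runs */
} DemoScanState;

/* Stand-in for FdwRoutine: a table of callbacks filled in by the handler. */
typedef struct DemoRoutine {
    void (*begin)(DemoScanState *node);
    const int *(*iterate)(DemoScanState *node);  /* NULL means no more rows */
    void (*rescan)(DemoScanState *node);
    void (*end)(DemoScanState *node);
} DemoRoutine;

/* The "foreign" data source: three fixed rows. */
static const int demo_rows[] = {10, 20, 30};
#define DEMO_NROWS (sizeof demo_rows / sizeof demo_rows[0])

/* Private per-scan state kept in fdw_state. */
typedef struct DemoCursor {
    size_t next;                /* index of the next row to return */
} DemoCursor;

static void demo_begin(DemoScanState *node)
{
    DemoCursor *cur = malloc(sizeof *cur);

    assert(cur != NULL);
    cur->next = 0;
    node->fdw_state = cur;
}

static const int *demo_iterate(DemoScanState *node)
{
    DemoCursor *cur = node->fdw_state;

    if (cur->next >= DEMO_NROWS)
        return NULL;            /* end of data */
    return &demo_rows[cur->next++];
}

static void demo_rescan(DemoScanState *node)
{
    DemoCursor *cur = node->fdw_state;

    cur->next = 0;              /* restart from the first row */
}

static void demo_end(DemoScanState *node)
{
    free(node->fdw_state);      /* release per-scan resources */
    node->fdw_state = NULL;
}

/* Stand-in for the handler function: returns an allocated callback table. */
DemoRoutine *demo_handler(void)
{
    DemoRoutine *routine = malloc(sizeof *routine);

    assert(routine != NULL);
    routine->begin = demo_begin;
    routine->iterate = demo_iterate;
    routine->rescan = demo_rescan;
    routine->end = demo_end;
    return routine;
}

/*
 * Stand-in for the executor: begin, fetch rows until NULL, optionally
 * rescan and fetch again, then end.  Returns the sum of all rows fetched.
 */
int demo_run_scan(const DemoRoutine *routine, int rescan_once)
{
    DemoScanState node = { NULL };
    const int *row;
    int sum = 0;

    routine->begin(&node);
    while ((row = routine->iterate(&node)) != NULL)
        sum += *row;
    if (rescan_once)
    {
        routine->rescan(&node);
        while ((row = routine->iterate(&node)) != NULL)
            sum += *row;
    }
    routine->end(&node);
    return sum;
}
```

A caller would obtain the table once with `demo_handler()`, pass it to `demo_run_scan()`, and `free()` it afterwards. With the three rows above, a single pass sums to 60 and a pass followed by one rescan sums to 120. Keeping the cursor behind `fdw_state` mirrors the documented advice to allocate longer-lived scan state in the begin callback instead of relying on per-call memory.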