From b663a4136331de6c7364226e3dbf7c88bfee7145 Mon Sep 17 00:00:00 2001 From: Tomas Vondra Date: Wed, 20 Jan 2021 23:05:46 +0100 Subject: Implement support for bulk inserts in postgres_fdw Extends the FDW API to allow batching inserts into foreign tables. That is usually much more efficient than inserting individual rows, due to high latency for each round-trip to the foreign server. It was possible to implement something similar in the regular FDW API, but it was inconvenient and there were issues with reporting the number of actually inserted rows etc. This extends the FDW API with two new functions: * GetForeignModifyBatchSize - allows the FDW picking optimal batch size * ExecForeignBatchInsert - inserts a batch of rows at once Currently, only INSERT queries support batching. Support for DELETE and UPDATE may be added in the future. This also implements batching for postgres_fdw. The batch size may be specified using "batch_size" option both at the server and table level. The initial patch version was written by me, but it was rewritten and improved in many ways by Takayuki Tsunakawa. Author: Takayuki Tsunakawa Reviewed-by: Tomas Vondra, Amit Langote Discussion: https://postgr.es/m/20200628151002.7x5laxwpgvkyiu3q@development --- doc/src/sgml/fdwhandler.sgml | 89 +++++++++++++++++++++++++++++++++++++++--- doc/src/sgml/postgres-fdw.sgml | 13 ++++++ 2 files changed, 96 insertions(+), 6 deletions(-) (limited to 'doc/src') diff --git a/doc/src/sgml/fdwhandler.sgml b/doc/src/sgml/fdwhandler.sgml index 9c9293414c5..854913ae5fc 100644 --- a/doc/src/sgml/fdwhandler.sgml +++ b/doc/src/sgml/fdwhandler.sgml @@ -523,8 +523,9 @@ BeginForeignModify(ModifyTableState *mtstate, Begin executing a foreign table modification operation. This routine is called during executor startup. It should perform any initialization needed prior to the actual table modifications. Subsequently, - ExecForeignInsert, ExecForeignUpdate or - ExecForeignDelete will be called for each tuple to be + ExecForeignInsert/ExecForeignBatchInsert, + ExecForeignUpdate or + ExecForeignDelete will be called for tuple(s) to be inserted, updated, or deleted. @@ -614,6 +615,81 @@ ExecForeignInsert(EState *estate, +TupleTableSlot ** +ExecForeignBatchInsert(EState *estate, + ResultRelInfo *rinfo, + TupleTableSlot **slots, + TupleTableSlot *planSlots, + int *numSlots); + + + Insert multiple tuples in bulk into the foreign table. + The parameters are the same for ExecForeignInsert + except slots and planSlots contain + multiple tuples and *numSlots> specifies the number of + tuples in those arrays. + + + + The return value is an array of slots containing the data that was + actually inserted (this might differ from the data supplied, for + example as a result of trigger actions.) + The passed-in slots can be re-used for this purpose. + The number of successfully inserted tuples is returned in + *numSlots. + + + + The data in the returned slot is used only if the INSERT + statement involves a view + WITH CHECK OPTION; or if the foreign table has + an AFTER ROW trigger. Triggers require all columns, + but the FDW could choose to optimize away returning some or all columns + depending on the contents of the + WITH CHECK OPTION constraints. + + + + If the ExecForeignBatchInsert or + GetForeignModifyBatchSize pointer is set to + NULL, attempts to insert into the foreign table will + use ExecForeignInsert. + This function is not used if the INSERT has the + RETURNING> clause. + + + + Note that this function is also called when inserting routed tuples into + a foreign-table partition. See the callback functions + described below that allow the FDW to support that. + + + + +int +GetForeignModifyBatchSize(ResultRelInfo *rinfo); + + + Report the maximum number of tuples that a single + ExecForeignBatchInsert call can handle for + the specified foreign table. That is, The executor passes at most + the number of tuples that this function returns to + ExecForeignBatchInsert. + rinfo is the ResultRelInfo struct describing + the target foreign table. + The FDW is expected to provide a foreign server and/or foreign + table option for the user to set this value, or some hard-coded value. + + + + If the ExecForeignBatchInsert or + GetForeignModifyBatchSize pointer is set to + NULL, attempts to insert into the foreign table will + use ExecForeignInsert. + + + + TupleTableSlot * ExecForeignUpdate(EState *estate, ResultRelInfo *rinfo, @@ -741,8 +817,9 @@ BeginForeignInsert(ModifyTableState *mtstate, in both cases when it is the partition chosen for tuple routing and the target specified in a COPY FROM command. It should perform any initialization needed prior to the actual insertion. - Subsequently, ExecForeignInsert will be called for - each tuple to be inserted into the foreign table. + Subsequently, ExecForeignInsert or + ExecForeignBatchInsert will be called for + tuple(s) to be inserted into the foreign table. @@ -773,8 +850,8 @@ BeginForeignInsert(ModifyTableState *mtstate, Note that if the FDW does not support routable foreign-table partitions and/or executing COPY FROM on foreign tables, this - function or ExecForeignInsert subsequently called - must throw error as needed. + function or ExecForeignInsert/ExecForeignBatchInsert + subsequently called must throw error as needed. diff --git a/doc/src/sgml/postgres-fdw.sgml b/doc/src/sgml/postgres-fdw.sgml index 9adc8d12a9d..fb4c22ac69f 100644 --- a/doc/src/sgml/postgres-fdw.sgml +++ b/doc/src/sgml/postgres-fdw.sgml @@ -354,6 +354,19 @@ OPTIONS (ADD password_required 'false'); + + batch_size + + + This option specifies the number of rows postgres_fdw + should insert in each insert operation. It can be specified for a + foreign table or a foreign server. The option specified on a table + overrides an option specified for the server. + The default is 1. + + + + -- cgit v1.2.3