Doc: show how to get the equivalent of LIMIT for UPDATE/DELETE.

Add examples showing use of a CTE and a self-join to perform partial UPDATEs and DELETEs. Corey Huinker, reviewed by Laurenz Albe Discussion: https://postgr.es/m/CADkLM=caNEQsUwPWnfi2jR4ix99E0EJM_3jtcE-YjnEQC7Rssw@mail.gmail.com
author: Tom Lane <tgl@sss.pgh.pa.us> 2024-04-07 16:26:47 -0400
committer: Tom Lane <tgl@sss.pgh.pa.us> 2024-04-07 16:26:47 -0400
commit: 2daeba6a4e49f2ac4cc202fc9352c2a59e928a6c (patch)
tree: 1bfda8e54b3f7c645dd1119707fb0c24f2eee80e /doc/src
parent: 1973d9fb318316df16b0da89e6fe5f2c53473b38 (diff)
download: postgresql-2daeba6a4e49f2ac4cc202fc9352c2a59e928a6c.tar.gz
postgresql-2daeba6a4e49f2ac4cc202fc9352c2a59e928a6c.zip
2 files changed, 61 insertions, 3 deletions
diff --git a/doc/src/sgml/ref/delete.sgml b/doc/src/sgml/ref/delete.sgml
index 1b81b4e7d74..0b6fa005123 100644
--- a/doc/src/sgml/ref/delete.sgml
+++ b/doc/src/sgml/ref/delete.sgml
@@ -260,12 +260,32 @@ DELETE FROM tasks WHERE status = 'DONE' RETURNING *;
 </programlisting>
   </para>
 
-   <para>
+  <para>
    Delete the row of <structname>tasks</structname> on which the cursor
    <literal>c_tasks</literal> is currently positioned:
 <programlisting>
 DELETE FROM tasks WHERE CURRENT OF c_tasks;
-</programlisting></para>
+</programlisting>
+  </para>
+
+  <para>
+   While there is no <literal>LIMIT</literal> clause
+   for <command>DELETE</command>, it is possible to get a similar effect
+   using the same method described in <link linkend="update-limit">the
+   documentation of <command>UPDATE</command></link>:
+<programlisting>
+WITH delete_batch AS (
+  SELECT l.ctid FROM user_logs AS l
+    WHERE l.status = 'archived'
+    ORDER BY l.creation_date
+    FOR UPDATE
+    LIMIT 10000
+)
+DELETE FROM user_logs AS dl
+  USING delete_batch AS del
+  WHERE dl.ctid = del.ctid;
+</programlisting>
+  </para>
  </refsect1>
 
  <refsect1>
diff --git a/doc/src/sgml/ref/update.sgml b/doc/src/sgml/ref/update.sgml
index 2ab24b0523e..babb34fa511 100644
--- a/doc/src/sgml/ref/update.sgml
+++ b/doc/src/sgml/ref/update.sgml
@@ -441,7 +441,45 @@ COMMIT;
    <literal>c_films</literal> is currently positioned:
 <programlisting>
 UPDATE films SET kind = 'Dramatic' WHERE CURRENT OF c_films;
-</programlisting></para>
+</programlisting>
+  </para>
+
+  <para id="update-limit">
+   Updates affecting many rows can have negative effects on system
+   performance, such as table bloat, increased replica lag, and increased
+   lock contention.  In such situations it can make sense to perform the
+   operation in smaller batches, possibly with a <command>VACUUM</command>
+   operation on the table between batches.  While there is
+   no <literal>LIMIT</literal> clause for <command>UPDATE</command>, it is
+   possible to get a similar effect through the use of
+   a <link linkend="queries-with">Common Table Expression</link> and a
+   self-join.  With the standard <productname>PostgreSQL</productname>
+   table access method, a self-join on the system
+   column <link linkend="ddl-system-columns-ctid">ctid</link> is very
+   efficient:
+<programlisting>
+WITH exceeded_max_retries AS (
+  SELECT w.ctid FROM work_item AS w
+    WHERE w.status = 'active' AND w.num_retries &gt; 10
+    ORDER BY w.retry_timestamp
+    FOR UPDATE
+    LIMIT 5000
+)
+UPDATE work_item SET status = 'failed'
+  FROM exceeded_max_retries AS emr
+  WHERE work_item.ctid = emr.ctid;
+</programlisting>
+   This command will need to be repeated until no rows remain to be updated.
+   Use of an <literal>ORDER BY</literal> clause allows the command to
+   prioritize which rows will be updated; it can also prevent deadlock
+   with other update operations if they use the same ordering.
+   If lock contention is a concern, then <literal>SKIP LOCKED</literal>
+   can be added to the <acronym>CTE</acronym> to prevent multiple commands
+   from updating the same row.  However, then a
+   final <command>UPDATE</command> without <literal>SKIP LOCKED</literal>
+   or <literal>LIMIT</literal> will be needed to ensure that no matching
+   rows were overlooked.
+  </para>
  </refsect1>
 
  <refsect1>
author	Tom Lane <tgl@sss.pgh.pa.us>	2024-04-07 16:26:47 -0400
committer	Tom Lane <tgl@sss.pgh.pa.us>	2024-04-07 16:26:47 -0400
commit	2daeba6a4e49f2ac4cc202fc9352c2a59e928a6c (patch)
tree	1bfda8e54b3f7c645dd1119707fb0c24f2eee80e /doc/src
parent	1973d9fb318316df16b0da89e6fe5f2c53473b38 (diff)
download	postgresql-2daeba6a4e49f2ac4cc202fc9352c2a59e928a6c.tar.gz postgresql-2daeba6a4e49f2ac4cc202fc9352c2a59e928a6c.zip