aboutsummaryrefslogtreecommitdiff
path: root/doc/src
diff options
context:
space:
mode:
authorTom Lane <tgl@sss.pgh.pa.us>2024-04-03 17:41:54 -0400
committerTom Lane <tgl@sss.pgh.pa.us>2024-04-03 17:41:57 -0400
commit06286709ee0637ec7376329a5aa026b7682dcfe2 (patch)
treef1c4f4b606b28227c401be2722b1d08c924586e9 /doc/src
parent97ce821e3e171ce99fa7c398889ac08432cd0264 (diff)
downloadpostgresql-06286709ee0637ec7376329a5aa026b7682dcfe2.tar.gz
postgresql-06286709ee0637ec7376329a5aa026b7682dcfe2.zip
Invent SERIALIZE option for EXPLAIN.
EXPLAIN (ANALYZE, SERIALIZE) allows collection of statistics about the volume of data emitted by a query, as well as the time taken to convert the data to the on-the-wire format. Previously there was no way to investigate this without actually sending the data to the client, in which case network transmission costs might swamp what you wanted to see. In particular this feature allows investigating the costs of de-TOASTing compressed or out-of-line data during formatting. Stepan Rutz and Matthias van de Meent, reviewed by Tomas Vondra and myself Discussion: https://postgr.es/m/ca0adb0e-fa4e-c37e-1cd7-91170b18cae1@gmx.de
Diffstat (limited to 'doc/src')
-rw-r--r--doc/src/sgml/perform.sgml24
-rw-r--r--doc/src/sgml/ref/explain.sgml29
2 files changed, 47 insertions, 6 deletions
diff --git a/doc/src/sgml/perform.sgml b/doc/src/sgml/perform.sgml
index 4b831a62066..2d0097f121a 100644
--- a/doc/src/sgml/perform.sgml
+++ b/doc/src/sgml/perform.sgml
@@ -144,11 +144,11 @@ EXPLAIN SELECT * FROM tenk1;
It's important to understand that the cost of an upper-level node includes
the cost of all its child nodes. It's also important to realize that
the cost only reflects things that the planner cares about.
- In particular, the cost does not consider the time spent transmitting
- result rows to the client, which could be an important
- factor in the real elapsed time; but the planner ignores it because
- it cannot change it by altering the plan. (Every correct plan will
- output the same row set, we trust.)
+ In particular, the cost does not consider the time spent to convert
+ output values to text form or to transmit them to the client, which
+ could be important factors in the real elapsed time; but the planner
+ ignores those costs because it cannot change them by altering the
+ plan. (Every correct plan will output the same row set, we trust.)
</para>
<para>
@@ -956,6 +956,17 @@ EXPLAIN UPDATE parent SET f2 = f2 + 1 WHERE f1 = 101;
<command>EXPLAIN ANALYZE</command>.
</para>
+ <para>
+ The time shown for the top-level node does not include any time needed
+ to convert the query's output data into displayable form or to send it
+ to the client. While <command>EXPLAIN ANALYZE</command> will never
+ send the data to the client, it can be told to convert the query's
+ output data to displayable form and measure the time needed for that,
+ by specifying the <literal>SERIALIZE</literal> option. That time will
+ be shown separately, and it's also included in the
+ total <literal>Execution time</literal>.
+ </para>
+
</sect2>
<sect2 id="using-explain-caveats">
@@ -965,7 +976,8 @@ EXPLAIN UPDATE parent SET f2 = f2 + 1 WHERE f1 = 101;
There are two significant ways in which run times measured by
<command>EXPLAIN ANALYZE</command> can deviate from normal execution of
the same query. First, since no output rows are delivered to the client,
- network transmission costs and I/O conversion costs are not included.
+ network transmission costs are not included. I/O conversion costs are
+ not included either unless <literal>SERIALIZE</literal> is specified.
Second, the measurement overhead added by <command>EXPLAIN
ANALYZE</command> can be significant, especially on machines with slow
<function>gettimeofday()</function> operating-system calls. You can use the
diff --git a/doc/src/sgml/ref/explain.sgml b/doc/src/sgml/ref/explain.sgml
index a4b6564bdb3..db9d3a8549a 100644
--- a/doc/src/sgml/ref/explain.sgml
+++ b/doc/src/sgml/ref/explain.sgml
@@ -41,6 +41,7 @@ EXPLAIN [ ( <replaceable class="parameter">option</replaceable> [, ...] ) ] <rep
SETTINGS [ <replaceable class="parameter">boolean</replaceable> ]
GENERIC_PLAN [ <replaceable class="parameter">boolean</replaceable> ]
BUFFERS [ <replaceable class="parameter">boolean</replaceable> ]
+ SERIALIZE [ { NONE | TEXT | BINARY } ]
WAL [ <replaceable class="parameter">boolean</replaceable> ]
TIMING [ <replaceable class="parameter">boolean</replaceable> ]
SUMMARY [ <replaceable class="parameter">boolean</replaceable> ]
@@ -207,6 +208,34 @@ ROLLBACK;
</varlistentry>
<varlistentry>
+ <term><literal>SERIALIZE</literal></term>
+ <listitem>
+ <para>
+ Include information on the cost
+ of <firstterm>serializing</firstterm> the query's output data, that
+ is converting it to text or binary format to send to the client.
+ This can be a significant part of the time required for regular
+ execution of the query, if the datatype output functions are
+ expensive or if <acronym>TOAST</acronym>ed values must be fetched
+ from out-of-line storage. <command>EXPLAIN</command>'s default
+ behavior, <literal>SERIALIZE NONE</literal>, does not perform these
+ conversions. If <literal>SERIALIZE TEXT</literal>
+ or <literal>SERIALIZE BINARY</literal> is specified, the appropriate
+ conversions are performed, and the time spent doing so is measured
+ (unless <literal>TIMING OFF</literal> is specified). If
+ the <literal>BUFFERS</literal> option is also specified, then any
+ buffer accesses involved in the conversions are counted too.
+ In no case, however, will <command>EXPLAIN</command> actually send
+ the resulting data to the client; hence network transmission costs
+ cannot be investigated this way.
+ Serialization may only be enabled when <literal>ANALYZE</literal> is
+ also enabled. If <literal>SERIALIZE</literal> is written without an
+ argument, <literal>TEXT</literal> is assumed.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
<term><literal>WAL</literal></term>
<listitem>
<para>