author | Michael Paquier <michael@paquier.xyz> | 2023-07-24 13:48:22 +0900 |
---|---|---|
committer | Michael Paquier <michael@paquier.xyz> | 2023-07-24 13:48:22 +0900 |
commit | e35cc3b3f2d081e2de3a0e077715d12b3580cc74 (patch) | |
tree | f2147a1ac59d2adf0b3ca3d9deacd0cb578d268c /doc/src | |
parent | 29836df323d752d534deb7b922cd48f08132e044 (diff) | |
download | postgresql-e35cc3b3f2d081e2de3a0e077715d12b3580cc74.tar.gz postgresql-e35cc3b3f2d081e2de3a0e077715d12b3580cc74.zip |
pgbench: Use COPY for client-side data generation
This commit switches the client-side data generation from INSERT queries
to COPY for the two tables pgbench_branches and pgbench_tellers.
pgbench_accounts was already using COPY.
COPY is a better interface for bulk loading or high latency connections
(this point can be countered with the option for server-side data
generation, still client-side is the default), and measurements have
proved that using it for these two other tables can lead to improvements
during initialization. I did not notice slowdowns at large scale
numbers on a local setup, either, most of the work happening for the
accounts table.
Previously COPY was only used for the pgbench_accounts table because its
amount of data is much larger than that of the two other tables. The code is
refactored so that all three tables use the same code path to execute the
COPY queries, with a callback to build data rows.
Author: Tristan Partin
Discussion: https://postgr.es/m/CSTU5P82ONZ1.19XFUGHMXHBRY@c3po
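The callback-based structure described in the commit message can be illustrated with a minimal sketch. This is not the code from the commit: the names `row_builder`, `build_branch_row`, `build_teller_row`, and `copy_rows` are hypothetical, the row layouts are simplified, and rows are appended to an in-memory buffer rather than sent to a server over a COPY connection. It only shows how one shared loop can serve several tables when each table supplies its own row-building function.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Assumption for this sketch: 10 tellers per branch, as in the pgbench docs. */
#define TELLERS_PER_BRANCH 10

/* Hypothetical callback type: format row number `row` (0-based) as one
 * tab-separated, newline-terminated text line into `buf`. */
typedef void (*row_builder) (char *buf, size_t len, long row);

/* Simplified branches row: bid, bbalance. */
static void
build_branch_row(char *buf, size_t len, long row)
{
    snprintf(buf, len, "%ld\t0\n", row + 1);
}

/* Simplified tellers row: tid, bid, tbalance. */
static void
build_teller_row(char *buf, size_t len, long row)
{
    snprintf(buf, len, "%ld\t%ld\t0\n", row + 1, row / TELLERS_PER_BRANCH + 1);
}

/*
 * Shared code path: call `build` once per row and append the result to `out`.
 * Returns the number of rows written; stops early if `out` would overflow.
 * A real client would hand each line to the server instead of a buffer.
 */
static long
copy_rows(char *out, size_t outlen, long nrows, row_builder build)
{
    char    line[64];
    size_t  used = 0;
    long    row;

    assert(build != NULL && outlen > 0);
    out[0] = '\0';

    for (row = 0; row < nrows; row++)
    {
        size_t  n;

        build(line, sizeof line, row);
        n = strlen(line);
        if (used + n + 1 > outlen)
            break;              /* no room left for this row */
        memcpy(out + used, line, n + 1);
        used += n;
    }
    return row;
}
```

Calling `copy_rows(out, sizeof out, 2, build_branch_row)` fills `out` with two branch lines; passing `build_teller_row` instead reuses the same loop for a different table, which is the point of the callback design.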
Diffstat (limited to 'doc/src')
-rw-r--r-- | doc/src/sgml/ref/pgbench.sgml | 9 |
1 files changed, 5 insertions, 4 deletions
diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
index 850028557d3..6c5c8afa6d4 100644
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -231,10 +231,11 @@ pgbench <optional> <replaceable>options</replaceable> </optional> <replaceable>d
 extensively through a <command>COPY</command>.
 <command>pgbench</command> uses the FREEZE option with version 14 or later
 of <productname>PostgreSQL</productname> to speed up
-subsequent <command>VACUUM</command>, unless partitions are enabled.
-Using <literal>g</literal> causes logging to print one message
-every 100,000 rows while generating data for the
-<structname>pgbench_accounts</structname> table.
+subsequent <command>VACUUM</command>, except on the
+<literal>pgbench_accounts</literal> table if partitions are
+enabled. Using <literal>g</literal> causes logging to
+print one message every 100,000 rows while generating data for all
+tables.
 </para>
 <para>
 With <literal>G</literal> (server-side data generation),