author: Tom Lane <tgl@sss.pgh.pa.us> 2024-09-30 17:57:12 -0400
committer: Tom Lane <tgl@sss.pgh.pa.us> 2024-09-30 17:57:12 -0400
commit: 7702337489810f645b3501d99215c2b525c5abca
tree: 6f7491e11685972ddb619064e31efa2538ab8436 /doc/src
parent: a19f83f87966f763991cc76404f8e42a36e7e842
Do not treat \. as an EOF marker in CSV mode for COPY IN.

Since backslash is (typically) not special in CSV data, we should not be treating \. as special either. The server historically did this to keep CSV and TEXT modes more alike and to support V2 protocol; but V2 protocol is long dead, and the inconsistency with CSV standards is annoying. Remove that behavior in CopyReadLineText, and make some minor consequent code simplifications.

On the client side, we need to fix psql so that it does not check for \. except when reading data from STDIN (that is, the script source). We must do that regardless of TEXT/CSV mode or there is no way to end the COPY short of script EOF. Also, be careful not to send the \. to the server in that case.

This is a small compatibility break in that other applications beside psql may need similar adjustment. Also, using an older version of psql with a v18 server may result in misbehavior during CSV-mode COPY IN.

Daniel Vérité, reviewed by vignesh C, Robert Haas, and myself

Discussion: https://postgr.es/m/ed659f37-a9dd-42a7-82b9-0da562cc4006@manitou-mail.org
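As a rough illustration of the client-side rule described above, the following Python sketch shows how an application other than psql might adjust. It is not psql's actual code: the function name `copy_data_lines` and the flag `from_script_source` are hypothetical, and the marker comparison is simplified. The idea is to look for a line containing only \. solely when the data is in-line in the script source, in both text and CSV mode, and to consume that line without forwarding it to the server.

```python
from typing import Iterable, Iterator

END_MARKER = "\\."  # backslash-period


def copy_data_lines(lines: Iterable[str], from_script_source: bool) -> Iterator[str]:
    """Yield the lines a client would forward to the server as COPY data.

    Hypothetical helper: the marker is checked only for in-line data,
    and in that case it applies to text and CSV mode alike.
    """
    for line in lines:
        if from_script_source and line.rstrip("\r\n") == END_MARKER:
            return  # consume the marker; do not forward it
        yield line


# In-line data in a script: the COPY ends at the marker, and the rest of
# the script is left unread for normal command processing.
script = iter(["1,alpha\n", "2,beta\n", "\\.\n", "SELECT 1;\n"])
sent = list(copy_data_lines(script, from_script_source=True))
remaining = list(script)
```

Here `sent` holds the two data rows and `remaining` holds only the `SELECT 1;` line. When the same lines come from a separate file, passing `from_script_source=False` forwards every line, including the \. line, and the end of the file ends the data.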
Diffstat (limited to 'doc/src')
-rw-r--r-- doc/src/sgml/libpq.sgml | 5
-rw-r--r-- doc/src/sgml/protocol.sgml | 5
-rw-r--r-- doc/src/sgml/ref/copy.sgml | 36
-rw-r--r-- doc/src/sgml/ref/psql-ref.sgml | 7
4 files changed, 33 insertions, 20 deletions
diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml
index 783e8e750bb..4a727d44997 100644
--- a/doc/src/sgml/libpq.sgml
+++ b/doc/src/sgml/libpq.sgml
@@ -7381,8 +7381,9 @@ int PQputline(PGconn *conn,
<literal>\.</literal> as a final line to indicate to the server that it had
finished sending <command>COPY</command> data. While this still works, it is deprecated and the
special meaning of <literal>\.</literal> can be expected to be removed in a
- future release. It is sufficient to call <xref linkend="libpq-PQendcopy"/> after
- having sent the actual data.
+ future release. (It already will misbehave in <literal>CSV</literal>
+ mode.) It is sufficient to call <xref linkend="libpq-PQendcopy"/>
+ after having sent the actual data.
</para>
</note>
</listitem>
diff --git a/doc/src/sgml/protocol.sgml b/doc/src/sgml/protocol.sgml
index 11b64567797..2d2481bb8b8 100644
--- a/doc/src/sgml/protocol.sgml
+++ b/doc/src/sgml/protocol.sgml
@@ -7606,8 +7606,9 @@ psql "dbname=postgres replication=database" -c "IDENTIFY_SYSTEM;"
is a well-defined way to recover from errors during <command>COPY</command>. The special
<quote><literal>\.</literal></quote> last line is not needed anymore, and is not sent
during <command>COPY OUT</command>.
- (It is still recognized as a terminator during <command>COPY IN</command>, but its use is
- deprecated and will eventually be removed.) Binary <command>COPY</command> is supported.
+ (It is still recognized as a terminator during text-mode <command>COPY
+ IN</command>, but not in CSV mode. The text-mode behavior is
+ deprecated and may eventually be removed.) Binary <command>COPY</command> is supported.
The CopyInResponse and CopyOutResponse messages include fields indicating
the number of columns and the format of each column.
</para>
diff --git a/doc/src/sgml/ref/copy.sgml b/doc/src/sgml/ref/copy.sgml
index 1518af8a045..fdbd20bc50b 100644
--- a/doc/src/sgml/ref/copy.sgml
+++ b/doc/src/sgml/ref/copy.sgml
@@ -646,11 +646,16 @@ COPY <replaceable class="parameter">count</replaceable>
</para>
<para>
- End of data can be represented by a single line containing just
+ End of data can be represented by a line containing just
backslash-period (<literal>\.</literal>). An end-of-data marker is
not necessary when reading from a file, since the end of file
- serves perfectly well; it is needed only when copying data to or from
- client applications using pre-3.0 client protocol.
+ serves perfectly well; in that context this provision exists only for
+ backward compatibility. However, <application>psql</application>
+ uses <literal>\.</literal> to terminate a <literal>COPY FROM
+ STDIN</literal> operation (that is, reading
+ in-line <command>COPY</command> data in a SQL script). In that
+ context the rule is needed to be able to end the operation before the
+ end of the script.
</para>
<para>
@@ -811,18 +816,27 @@ COPY <replaceable class="parameter">count</replaceable>
<para>
Because backslash is not a special character in the <literal>CSV</literal>
- format, <literal>\.</literal>, the end-of-data marker, could also appear
- as a data value. To avoid any misinterpretation, a <literal>\.</literal>
- data value appearing as a lone entry on a line is automatically
- quoted on output, and on input, if quoted, is not interpreted as the
- end-of-data marker. If you are loading a file created by another
- application that has a single unquoted column and might have a
- value of <literal>\.</literal>, you might need to quote that value in the
- input file.
+ format, the end-of-data marker used in text mode (<literal>\.</literal>)
+ is not normally treated as special when reading <literal>CSV</literal>
+ data. An exception is that <application>psql</application> will terminate
+ a <literal>COPY FROM STDIN</literal> operation (that is, reading
+ in-line <command>COPY</command> data in a SQL script) at a line containing
+ only <literal>\.</literal>, whether it is text or <literal>CSV</literal>
+ mode.
</para>
<note>
<para>
+ <productname>PostgreSQL</productname> versions before v18 always
+ recognized unquoted <literal>\.</literal> as an end-of-data marker,
+ even when reading from a separate file. For compatibility with older
+ versions, <command>COPY TO</command> will quote <literal>\.</literal>
+ when it's alone on a line, even though this is no longer necessary.
+ </para>
+ </note>
+
+ <note>
+ <para>
In <literal>CSV</literal> format, all characters are significant. A quoted value
surrounded by white space, or any characters other than
<literal>DELIMITER</literal>, will include those characters. This can cause
diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml
index 3fd9959ed16..b825ca96a23 100644
--- a/doc/src/sgml/ref/psql-ref.sgml
+++ b/doc/src/sgml/ref/psql-ref.sgml
@@ -1135,7 +1135,8 @@ SELECT $1 \parse stmt1
<para>
For <literal>\copy ... from stdin</literal>, data rows are read from the same
- source that issued the command, continuing until <literal>\.</literal>
+ source that issued the command, continuing until a line containing
+ only <literal>\.</literal>
is read or the stream reaches <acronym>EOF</acronym>. This option is useful
for populating tables in-line within an SQL script file.
For <literal>\copy ... to stdout</literal>, output is sent to the same place
@@ -1179,10 +1180,6 @@ SELECT $1 \parse stmt1
destination, because all data must pass through the client/server
connection. For large amounts of data the <acronym>SQL</acronym>
command might be preferable.
- Also, because of this pass-through method, <literal>\copy
- ... from</literal> in <acronym>CSV</acronym> mode will erroneously
- treat a <literal>\.</literal> data value alone on a line as an
- end-of-input marker.
</para>
</tip>
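
The copy.sgml hunk above rests on the point that backslash is not special in CSV. A minimal sketch with Python's standard `csv` module (a generic CSV parser, not PostgreSQL's COPY implementation) shows the same reading: a line containing only \. is an ordinary one-column row, and the quoted form that COPY TO still emits for compatibility parses to the same value.

```python
import csv
import io

# Four lines: a plain value, an unquoted \. , a quoted "\." , a plain value.
data = 'first\n\\.\n"\\."\nlast\n'

# The default dialect has no escape character, so backslash is literal.
rows = list(csv.reader(io.StringIO(data)))
```

Both the unquoted and the quoted line come back as the single field `\.`, so quoting the value on output costs nothing for readers that follow ordinary CSV rules.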