| Commit message | Author | Age |
| |
Lifetime of the input string is not right, so create a separate copy.
Author: Amit Langote
Discussion: https://postgr.es/m/a2773420-50d1-0a42-3396-fe42b0921134@lab.ntt.co.jp
|
| |
In commit f0e4475, GetIndexOpClass was renamed to ResolveOpClass, but
the comment in typecmds.c didn't get the memo.
In objectaddress.c, missing 'of' in a comment.
Both noticed by Vik Fearing, patch is mine though.
|
| |
EventTriggerTableRewrite crashed if there were table_rewrite triggers
present, but there had not been when the calling command started.
EventTriggerDDLCommandEnd called ddl_command_end triggers if present,
even if there had been no such triggers when the calling command started,
which would lead to a failure in pg_event_trigger_ddl_commands.
In both cases, fix by doing nothing; it's better to wait till the next
command when things will be properly initialized.
In passing, remove an elog(DEBUG1) call that might have seemed interesting
four years ago but surely isn't today.
We found this because of intermittent failures in the buildfarm. Thanks
to Alvaro Herrera and Andrew Gierth for analysis.
Back-patch to 9.5; some of this code existed before that, but the specific
hazards we need to guard against don't.
Discussion: https://postgr.es/m/5767.1523995174@sss.pgh.pa.us
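For context, a minimal sketch of the kind of trigger involved (names are
hypothetical; pg_event_trigger_table_rewrite_oid() is the real reporting
function):
CREATE FUNCTION log_rewrite() RETURNS event_trigger
LANGUAGE plpgsql AS $$
BEGIN
  RAISE NOTICE 'rewriting %', pg_event_trigger_table_rewrite_oid()::regclass;
END $$;
CREATE EVENT TRIGGER rewrite_notice ON table_rewrite
  EXECUTE PROCEDURE log_rewrite();
-- A trigger like this appearing after the calling command had already
-- started was the crash scenario fixed here.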
|
| |
Commit 54eff5311 did not account for the possibility that we'd have
a transaction snapshot due to default_transaction_isolation being
set high enough to require one. The transaction snapshot is enough
to hold back our advertised xmin and thus risk deadlock anyway.
The only way to get rid of that snap is to start a new transaction,
so let's do that instead. Also throw in an assert checking that we
really have gotten to a state where no xmin is being advertised.
Back-patch to 9.4, like the previous commit.
Discussion: https://postgr.es/m/CAMkU=1ztk3TpQdcUNbxq93pc80FrXUjpDWLGMeVBDx71GHNwZQ@mail.gmail.com
|
| |
Also add regression test cases for detecting infinite recursion in the
view-locking tests, plus some documentation enhancements. Patch by Yugo Nagata.
|
| |
There's been a massive addition of partitioning code in PostgreSQL 11,
with little oversight on its placement, resulting in a
catalog/partition.c with poorly defined boundaries and responsibilities.
This commit establishes a couple of distinct modules to separate things
a little bit. There are no code changes here, only code movement.
There are three new files:
src/backend/utils/cache/partcache.c
src/include/partitioning/partdefs.h
src/include/utils/partcache.h
The previous arrangement of #including catalog/partition.h almost
everywhere is no more.
Authors: Amit Langote and Álvaro Herrera
Discussion: https://postgr.es/m/98e8d509-790a-128c-be7f-e48a5b2d8d97@lab.ntt.co.jp
https://postgr.es/m/11aa0c50-316b-18bb-722d-c23814f39059@lab.ntt.co.jp
https://postgr.es/m/143ed9a4-6038-76d4-9a55-502035815e68@lab.ntt.co.jp
https://postgr.es/m/20180413193503.nynq7bnmgh6vs5vm@alvherre.pgsql
|
| |
We need to call expand_function_arguments() to expand named and default
arguments.
In PL/pgSQL, we also need to deal with named and default INOUT arguments
when receiving the output values into variables.
Author: Pavel Stehule <pavel.stehule@gmail.com>
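A minimal sketch of what now works (procedure and names are hypothetical):
CREATE PROCEDURE p(a int, b int DEFAULT 10)
LANGUAGE plpgsql AS $$
BEGIN
  RAISE NOTICE 'a = %, b = %', a, b;
END $$;
CALL p(1);               -- b falls back to its default
CALL p(b => 2, a => 1);  -- named-argument notation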
|
| |
I lowered the lock level for partitions being scanned from
AccessExclusive to ShareLock in the course of 72cf7f310c07, but that was
bogus, as pointed out by Robert Haas. Revert that bit. Doing this is
possible, but requires more work.
Discussion: https://postgr.es/m/CA+TgmobV7Nfmqv+TZXcdSsb9Bjc-OL-Anv6BNmCbfJVZLYPE4Q@mail.gmail.com
|
| |
The bug was caused by the original IndexStmt that DefineIndex receives
being overwritten when processing the INCLUDE columns. Use a separate list
of index params to propagate to child tables. Add tests covering this case.
Amit Langote and Alexander Korotkov.
Re-commit of 5c6110c6a960ad6fe1b0d0fec6ae36ef4eb913f5, which uncovered a bug
since fixed in c266ed31a8a3beed3533e6a78faeca78234cbd43
Discussion: https://www.postgresql.org/message-id/CAJGNTeO%3DBguEyG8wxMpU_Vgvg3nGGzy71zUQ0RpzEn_mb0bSWA%40mail.gmail.com
|
| |
- Explicitly forbid opclass, collation and indoptions (like DESC/ASC etc.)
for included columns. Throw an error if the user specifies one.
- Truncate storage arrays for such attributes to store only key attributes,
and add assertion checks.
- Do not check opfamily and collation for included columns in
CompareIndexInfo()
Discussion: https://www.postgresql.org/message-id/5ee72852-3c4e-ee35-e2ed-c1d053d45c08@sigaev.ru
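A sketch of the now-rejected cases (table and opclass are hypothetical
placeholders; exact error wording may differ):
CREATE TABLE t (a int, b text);
CREATE INDEX ON t (a) INCLUDE (b DESC);               -- ERROR
CREATE INDEX ON t (a) INCLUDE (b COLLATE "C");        -- ERROR
CREATE INDEX ON t (a) INCLUDE (b text_pattern_ops);   -- ERROR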
|
| |
This reverts commits d204ef63776b8a00ca220adec23979091564e465,
83454e3c2b28141c0db01c7d2027e01040df5249 and a few more commits thereafter
(complete list at the end) related to the MERGE feature.
While the feature was fully functional, with sufficient test coverage and
necessary documentation, it was felt that some parts of the executor and
parse-analyzer could use a different design, and it wasn't possible to do
that in the available time. So it was decided to revert the patch for PG11
and retry again in the future.
Thanks again to all reviewers and bug reporters.
List of commits reverted, in reverse chronological order:
f1464c5380 Improve parse representation for MERGE
ddb4158579 MERGE syntax diagram correction
530e69e59b Allow cpluspluscheck to pass by renaming variable
01b88b4df5 MERGE minor errata
3af7b2b0d4 MERGE fix variable warning in non-assert builds
a5d86181ec MERGE INSERT allows only one VALUES clause
4b2d44031f MERGE post-commit review
4923550c20 Tab completion for MERGE
aa3faa3c7a WITH support in MERGE
83454e3c2b New files for MERGE
d204ef6377 MERGE SQL Command following SQL:2016
Author: Pavan Deolasee
Reviewed-by: Michael Paquier
|
| |
Rename ii_KeyAttrNumbers to ii_IndexAttrNumbers to prevent confusion with
ii_NumIndexAttrs/ii_NumIndexKeyAttrs. ii_IndexAttrNumbers contains
all attributes, including INCLUDE columns, not only key attributes.
Discussion: https://www.postgresql.org/message-id/13123421-1d52-d0e4-c95c-6d69011e0595%40sigaev.ru
|
| |
Oversight in commit 8b08f7d4820f: pg_class.relispartition was not
being set for index partitions, which is a bit odd, and was also causing
the code to unnecessarily call has_superclass() when simply checking the
flag was enough.
Author: Álvaro Herrera
Reported-by: Amit Langote
Discussion: https://postgr.es/m/12085bc4-0bc6-0f3a-4c43-57fe0681772b@lab.ntt.co.jp
|
| |
If the table being attached contained values that contradict the default
partition's partition constraint, it would fail to complain, because
CommandCounterIncrement changes in 4dba331cb3dc coupled with some bogus
coding in the existing ValidatePartitionConstraints prevented the
partition constraint from being validated after all -- or rather, it
caused the constraint to become an empty one, always succeeding.
Fix by not re-reading the OID of the default partition in
ATExecAttachPartition. To forestall similar problems, revise the
existing code:
* rename routine from ValidatePartitionConstraints() to
QueuePartitionConstraintValidation, to better represent what it
actually does.
* add an Assert() to make sure that when queueing a constraint for a
partition we're not overwriting a constraint previously queued.
* add an Assert() that we don't try to invoke the special-purpose
validation of the default partition when attaching the default
partition itself.
While at it, change some loops to obtain partition OIDs from
partdesc->oids rather than find_all_inheritors; reduce the lock level
of partitions being scanned from AccessExclusiveLock to ShareLock;
rewrite QueuePartitionConstraintValidation in a recursive fashion rather
than repetitive.
Author: Álvaro Herrera. Tests written by Amit Langote
Reported-by: Rushabh Lathia
Diagnosed-by: Kyotaro HORIGUCHI, who also provided the initial fix.
Reviewed-by: Kyotaro HORIGUCHI, Amit Langote, Jeevan Ladhe
Discussion: https://postgr.es/m/CAGPqQf0W+v-Ci_qNV_5R3A=Z9LsK4+jO7LzgddRncpp_rrnJqQ@mail.gmail.com
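A sketch of the previously missed failure (hypothetical names):
CREATE TABLE parent (a int) PARTITION BY LIST (a);
CREATE TABLE part1 PARTITION OF parent FOR VALUES IN (1);
CREATE TABLE part_def (LIKE parent);
INSERT INTO part_def VALUES (1);   -- belongs in part1, not the default
ALTER TABLE parent ATTACH PARTITION part_def DEFAULT;
-- must fail validation: before this fix, the constraint being checked
-- had degenerated to an empty (always-true) one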
|
| |
It uncovered one more bug, in CompareIndexInfo(), which should be fixed first.
|
| |
The bug was caused by the original IndexStmt that DefineIndex receives
being overwritten when processing the INCLUDE columns. Use a separate list
of index params to propagate to child tables. Add tests covering this case.
Amit Langote and Alexander Korotkov.
|
| |
The HeapFetches counter was using a simple value in IndexOnlyScanState,
which fails to propagate values from parallel workers; so the counts are
wrong when IndexOnlyScan runs in parallel. Move it to Instrumentation,
like all the other counters.
While at it, change INSERT ON CONFLICT conflicting tuple counter to use
the new ntuples2 instead of nfiltered2, which is a blatant misuse.
Discussion: https://postgr.es/m/20180409215851.idwc75ct2bzi6tea@alvherre.pgsql
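For illustration (hypothetical table and plan), the affected counter is
the one EXPLAIN ANALYZE reports as "Heap Fetches":
EXPLAIN (ANALYZE, COSTS OFF)
SELECT count(*) FROM big WHERE id < 100000;
--   ->  Parallel Index Only Scan using big_pkey on big
--         Heap Fetches: 2310
-- With this fix the number aggregates the counts from all parallel
-- workers instead of reporting only the leader's share.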
|
| |
In commit 9c0a0de4c, I'd failed to notice that catalog/catalog.h
should also be considered a frontend-unsafe header, because it includes
(and needs) the full form of pg_class.h, not to mention relcache.h.
However, various frontend code was depending on it to get
TABLESPACE_VERSION_DIRECTORY, so refactoring of some sort is called for.
The cleanest answer seems to be to move TABLESPACE_VERSION_DIRECTORY,
as well as the OIDCHARS symbol, to common/relpath.h. Do that, and mop up
inclusions as necessary. (I found that quite a few current users of
catalog/catalog.h don't seem to need it at all anymore, apparently as a
result of the refactorings that created common/relpath.[hc]. And
initdb.c needed it only as a route to pg_class_d.h.)
Discussion: https://postgr.es/m/6629.1523294509@sss.pgh.pa.us
|
| |
CheckIndexCompatible() misused ComputeIndexAttrs() by not bothering
to fill ii_NumIndexAttrs and ii_NumIndexKeyAttrs in the passed
IndexInfo. Omission of ii_NumIndexAttrs was previously unimportant,
but now this matters because ComputeIndexAttrs depends on
ii_NumIndexKeyAttrs to decide how many columns it needs to report on.
(BTW, the fact that this oversight wasn't detected earlier implies
that we have no regression test verifying whether CheckIndexCompatible
ever succeeds. Bad dog. Not the job of this patch to fix it, though.)
Also, change the API of ComputeIndexAttrs so that it fills the opclass
output array for all column positions, as it does for the options output
array; positions for non-key index columns are filled with zeroes.
This isn't directly fixing any bug, but it seems like a good idea.
Per valgrind failure reports from buildfarm.
Alexander Korotkov, tweaked a bit by me
Discussion: https://postgr.es/m/CAPpHfduWrysrT-qAhn+3Ea5+Mg6Vhc-oA6o2Z-hRCPRdvf3tiw@mail.gmail.com
|
| |
Traditionally, include/catalog/pg_foo.h contains extern declarations
for functions in backend/catalog/pg_foo.c, in addition to its function
as the authoritative definition of the pg_foo catalog's rowtype.
In some cases, we'd been forced to split out those extern declarations
into separate pg_foo_fn.h headers so that the catalog definitions
could be #include'd by frontend code. That problem is gone as of
commit 9c0a0de4c, so let's undo the splits to make things less
confusing.
Discussion: https://postgr.es/m/23690.1523031777@sss.pgh.pa.us
|
| |
Consolidate directory and file create permissions for tools which work
with the PG data directory by adding a new module (common/file_perm.c)
that contains variables (pg_file_create_mode, pg_dir_create_mode) and
constants to initialize them (0600 for files and 0700 for directories).
Convert mkdir() calls in the backend to MakePGDirectory() if the
original call used default permissions (always the case for regular PG
directories).
Add tests to make sure permissions in PGDATA are set correctly by the
tools which modify the PG data directory.
Authors: David Steele <david@pgmasters.net>,
Adam Brightwell <adam.brightwell@crunchydata.com>
Reviewed-By: Michael Paquier, with discussion amongst many others.
Discussion: https://postgr.es/m/ad346fe6-b23e-59f1-ecb7-0e08390ad629%40pgmasters.net
|
| |
Existing partition pruning is only able to work at plan time, for query
quals that appear in the parsed query. This is good but limiting, as
there can be parameters that appear later that can be usefully used to
further prune partitions.
This commit adds support for pruning subnodes of Append which cannot
possibly contain any matching tuples, during execution, by evaluating
Params to determine the minimum set of subnodes that can possibly match.
We support more than just simple Params in WHERE clauses. Support
additionally includes:
1. Parameterized Nested Loop Joins: The parameter from the outer side of the
join can be used to determine the minimum set of inner side partitions to
scan.
2. Initplans: Once an initplan has been executed we can then determine which
partitions match the value from the initplan.
Partition pruning is performed in two ways. When Params external to the plan
are found to match the partition key we attempt to prune away unneeded Append
subplans during the initialization of the executor. This allows us to bypass
the initialization of non-matching subplans, meaning they won't appear in
the EXPLAIN or EXPLAIN ANALYZE output.
For parameters whose values are only known during actual execution,
the pruning of these subplans must wait. Subplans which are
eliminated during this stage of pruning are still visible in the EXPLAIN
output. To determine whether pruning has actually taken place, the
EXPLAIN ANALYZE output must be viewed. If a certain Append subplan was
never executed due to the elimination of the partition, the execution
timing area will state "(never executed)". In the case of parameterized
nested loops, for example, the loop count shown in the EXPLAIN ANALYZE
output for certain subplans may appear lower than for others because
those subplans were scanned fewer times. This is because the list of
matching subnodes must be re-evaluated whenever a parameter that was
found to match the partition key changes.
This commit required some additional infrastructure that permits the
building of a data structure which is able to perform the translation of
the matching partition IDs, as returned by get_matching_partitions, into
the list index of a subpaths list, as exists in node types such as
Append, MergeAppend and ModifyTable. This allows us to translate a list
of clauses into a Bitmapset of all the subpath indexes which must be
included to satisfy the clause list.
Author: David Rowley, based on an earlier effort by Beena Emerson
Reviewers: Amit Langote, Robert Haas, Amul Sul, Rajkumar Raghuwanshi,
Jesper Pedersen
Discussion: https://postgr.es/m/CAOG9ApE16ac-_VVZVvv0gePSgkg_BwYEV1NBqZFqDR2bBE0X0A@mail.gmail.com
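A sketch of the executor-stage pruning (hypothetical tables; the qual's
value arrives via an initplan, so pruning must happen at execution time):
CREATE TABLE m (k int, v text) PARTITION BY LIST (k);
CREATE TABLE m1 PARTITION OF m FOR VALUES IN (1);
CREATE TABLE m2 PARTITION OF m FOR VALUES IN (2);
EXPLAIN (ANALYZE, COSTS OFF)
SELECT * FROM m WHERE k = (SELECT 2);
-- Append
--   InitPlan 1 (returns $0)
--     ->  Result
--   ->  Seq Scan on m1 (never executed)   <- pruned once $0 is known
--   ->  Seq Scan on m2
--         Filter: (k = $0)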
|
| |
When an update moves a row between partitions (supported since
2f178441044b), our normal logic for following update chains in READ
COMMITTED mode doesn't work anymore. Cross partition updates are
modeled as a delete from the old and an insert into the new
partition. No ctid chain exists across partitions, and there's no
convenient space to introduce that link.
Not throwing an error in a partitioned context when one would have
been thrown without partitioning is obviously problematic. This commit
introduces infrastructure to detect when a tuple has been moved, not
just plainly deleted. That allows us to throw an error when
encountering a deletion that's actually a move while attempting to
follow a ctid chain.
The row deleted as part of a cross partition update is marked by
pointing its t_ctid to an invalid block, instead of to itself as a
normal update would. That was deemed the least invasive and most
future-proof way to represent the knowledge, given how few infomask
bits there are to be recycled (there are also some locking issues with
using infomask bits).
External code following ctid chains should be updated to check for
moved tuples. The most likely consequence of not doing so is a missed
error.
Author: Amul Sul, editorialized by me
Reviewed-By: Amit Kapila, Pavan Deolasee, Andres Freund, Robert Haas
Discussion: http://postgr.es/m/CAAJ_b95PkwojoYfz0bzXU8OokcTVGzN6vYGCNVUukeUDrnF3dw@mail.gmail.com
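A sketch of the user-visible effect (hypothetical names; error wording
approximate), with two concurrent READ COMMITTED sessions:
-- session 1
BEGIN;
UPDATE p SET k = 2 WHERE k = 1;  -- moves the row from partition p1 to p2
-- session 2 blocks on the same row; after session 1 commits:
UPDATE p SET v = 'x' WHERE k = 1;
-- ERROR: tuple to be locked was moved to another partition due to
-- concurrent update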
|
| |
This patch introduces the INCLUDE clause in index definitions. This clause
specifies a list of columns which will be included as a non-key part of
the index. The INCLUDE columns exist solely to allow more queries to
benefit from index-only scans. Also, such columns don't need to have
appropriate operator classes. Expressions are not supported as INCLUDE
columns since they cannot be used in index-only scans.
Index access methods supporting INCLUDE are indicated by the amcaninclude
flag in IndexAmRoutine. For now, only B-tree indexes support the INCLUDE
clause. In B-tree indexes, INCLUDE columns are truncated from pivot index
tuples (tuples located in non-leaf pages and high keys). Therefore, B-tree
indexes might now have a variable number of attributes. This patch also
provides a generic facility to support that: pivot tuples contain the
number of their attributes in t_tid.ip_posid, and the free 13th bit of
t_info is used to indicate that. This facility will simplify further
support for index suffix truncation.
The above changes are backward-compatible; pg_upgrade doesn't need special
handling of B-tree indexes.
Bump catalog version.
Author: Anastasia Lubennikova, with contributions by Alexander Korotkov and me
Reviewed by: Peter Geoghegan, Tomas Vondra, Antonin Houska, Jeff Janes,
David Rowley, Alexander Korotkov
Discussion: https://www.postgresql.org/message-id/flat/56168952.4010101@postgrespro.ru
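A minimal sketch of the new clause (hypothetical table):
CREATE TABLE orders (id int, total numeric);
CREATE UNIQUE INDEX orders_id_idx ON orders (id) INCLUDE (total);
-- "total" is stored in the index but is not part of the key, so:
SELECT id, total FROM orders WHERE id = 42;  -- can use an index-only scan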
|
| |
Update the built-in logical replication system to make use of the
previously added logical decoding for TRUNCATE support. Add the
required truncate callback to pgoutput and a new logical replication
protocol message.
Publications get a new attribute to determine whether to replicate
truncate actions. When updating a publication via pg_dump from an older
version, this is not set, thus preserving the previous behavior.
Author: Simon Riggs <simon@2ndquadrant.com>
Author: Marco Nenciarini <marco.nenciarini@2ndquadrant.it>
Author: Peter Eisentraut <peter.eisentraut@2ndquadrant.com>
Reviewed-by: Petr Jelinek <petr.jelinek@2ndquadrant.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Alvaro Herrera <alvherre@alvh.no-ip.org>
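A sketch of the new publication attribute (publication and table names
are hypothetical):
CREATE PUBLICATION mypub FOR TABLE t
  WITH (publish = 'insert, update, delete, truncate');
-- or, to opt out of replicating truncations:
ALTER PUBLICATION mypub SET (publish = 'insert, update, delete');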
|
| |
Add a new WAL record type for TRUNCATE, which is only used when
wal_level >= logical. (For physical replication, TRUNCATE is already
replicated via SMGR records.) Add new callback for logical decoding
output plugins to receive TRUNCATE actions.
Author: Simon Riggs <simon@2ndquadrant.com>
Author: Marco Nenciarini <marco.nenciarini@2ndquadrant.it>
Author: Peter Eisentraut <peter.eisentraut@2ndquadrant.com>
Reviewed-by: Petr Jelinek <petr.jelinek@2ndquadrant.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Alvaro Herrera <alvherre@alvh.no-ip.org>
|
| |
Also enable this for postgres_fdw.
Etsuro Fujita, based on an earlier patch by Amit Langote. The larger
patch series of which this is a part has been reviewed by Amit
Langote, David Fetter, Maksim Milyutin, Álvaro Herrera, Stephen Frost,
and me. Minor documentation changes to the final version by me.
Discussion: http://postgr.es/m/29906a26-da12-8c86-4fb9-d8f88442f2b9@lab.ntt.co.jp
|
| |
This patch adds new default roles named 'pg_read_server_files',
'pg_write_server_files', 'pg_execute_server_program' which
allow an administrator to GRANT to a non-superuser role the ability to
access server-side files or run programs through PostgreSQL (as the user
the database is running as). Having one of these roles allows a
non-superuser to use server-side COPY to read, to write, or to run a
program, and to use file_fdw (if installed by a superuser and GRANT'd
USAGE on it) to read from files or run a program.
The existing misc file functions are also changed to allow a user with
the 'pg_read_server_files' default role to read any files on the
filesystem, matching the privileges given to that role through COPY and
file_fdw from above.
Reviewed-By: Michael Paquier
Discussion: https://postgr.es/m/20171231191939.GR2416%40tamriel.snowman.net
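A sketch of the intended usage (the role "auditor" and the path are
hypothetical):
GRANT pg_read_server_files TO auditor;
-- auditor, though not a superuser, may now run e.g.:
COPY staging FROM '/var/log/postgresql/audit.csv' WITH (FORMAT csv);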
|
| |
MemoryContextCopySetIdentifier -> MemoryContextCopyAndSetIdentifier
Discussion: https://www.postgresql.org/message-id/6421.1522194949@sss.pgh.pa.us
|
| |
The previous coding inadvertently checked the constraints for the
partitioned table rather than the target partition, which could
lead to data in a partition that fails to satisfy some constraint
on that partition. This problem seems to date back to when
table partitioning was introduced; prior to that, there was only
one target table for a COPY, so the problem didn't occur, and the
code just didn't get updated.
Etsuro Fujita, reviewed by Amit Langote and Ashutosh Bapat
Discussion: https://postgr.es/m/5ABA4074.1090500%40lab.ntt.co.jp
|
| |
We don't actually need the insert-or-update logic, so it's clearer to
have separate functions for the inserting and updating.
Author: Petr Jelinek <petr.jelinek@2ndquadrant.com>
Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
|
| |
Review comments from Andres Freund
* Consolidate code into AfterTriggerGetTransitionTable()
* Rename nodeMerge.c to execMerge.c
* Rename nodeMerge.h to execMerge.h
* Move MERGE handling in ExecInitModifyTable()
into ExecInitMerge() in execMerge.c
* Move mt_merge_subcommands flags into execMerge.h
* Rename opt_and_condition to opt_merge_when_and_condition
* Wordsmith various comments
Author: Pavan Deolasee
Reviewer: Simon Riggs
|
| |
Author: Álvaro Herrera
Discussion: https://postgr.es/m/20171231194359.cvojcour423ulha4@alvherre.pgsql
Reviewed-by: Peter Eisentraut
|
| |
Trigger cloning to partitions was supposed to occur for user-visible
triggers only, but during development the protection that prevented it
from occurring to internal triggers was lost. Reinstate it, as well as
add a test case to ensure internal triggers (in the tested case,
triggers implementing a deferred unique constraint) are not cloned.
Without the code fix, the partitions in the test end up with different
numbers of triggers, which is clearly wrong ...
Bug in 86f575948c77.
Discussion: https://postgr.es/m/20180403214903.ozfagwjcpk337uw7@alvherre.pgsql
|
| |
MERGE performs actions that modify rows in the target table
using a source table or query. MERGE provides a single SQL
statement that can conditionally INSERT/UPDATE/DELETE rows,
a task that would otherwise require multiple PL statements.
e.g.
MERGE INTO target AS t
USING source AS s
ON t.tid = s.sid
WHEN MATCHED AND t.balance > s.delta THEN
UPDATE SET balance = t.balance - s.delta
WHEN MATCHED THEN
DELETE
WHEN NOT MATCHED AND s.delta > 0 THEN
INSERT VALUES (s.sid, s.delta)
WHEN NOT MATCHED THEN
DO NOTHING;
MERGE works with regular and partitioned tables, including
column and row security enforcement, as well as support for
row, statement and transition triggers.
MERGE is optimized for OLTP and is parameterizable, though
also useful for large scale ETL/ELT. MERGE is not intended
to be used in preference to existing single SQL commands
for INSERT, UPDATE or DELETE since there is some overhead.
MERGE can be used statically from PL/pgSQL.
MERGE does not yet support inheritance, write rules,
RETURNING clauses, updatable views or foreign tables.
MERGE follows SQL Standard per the most recent SQL:2016.
Includes full tests and documentation, including full
isolation tests to demonstrate the concurrent behavior.
This version written from scratch in 2017 by Simon Riggs,
using docs and tests originally written in 2009. Later work
from Pavan Deolasee has been both complex and deep, leaving
the lead author credit now in his hands.
Extensive discussion of concurrency from Peter Geoghegan,
with thanks for the time and effort contributed.
Various issues reported via sqlsmith by Andreas Seltenreich
Authors: Pavan Deolasee, Simon Riggs
Reviewer: Peter Geoghegan, Amit Langote, Tomas Vondra, Simon Riggs
Discussion:
https://postgr.es/m/CANP8+jKitBSrB7oTgT9CY2i1ObfOt36z0XMraQc+Xrz8QB0nXA@mail.gmail.com
https://postgr.es/m/CAH2-WzkJdBuxj9PO=2QaO9-3h3xGbQPZ34kJH=HukRekwM-GZg@mail.gmail.com
|
| |
This reverts commit 354f13855e6381d288dfaa52bcd4f2cb0fd4a5eb.
|
| |
LockViewRecurse() obtains the view relation using heap_open() and passes
it to get_view_query() to get the view info. It immediately closes the
relation, then uses the returned view info by calling
LockViewRecurse_walker(). Since get_view_query() returns a pointer
within the relcache, the relcache entry should be kept until
LockViewRecurse_walker() returns; otherwise the relation could point
to a garbage memory area.
The fix is to move the heap_close() call to after LockViewRecurse_walker().
Problem reported by Tom Lane (the buildfarm is unhappy, especially prion,
since it enables the -DRELCACHE_FORCE_RELEASE cpp flag); fix by me.
|
| |
A followup patch will add a SKIP_LOCKED option. To avoid introducing
ever more arguments, breaking existing callers each time, introduce a
flags argument. This'll no doubt break a few external users...
Also change the MISSING_OK behaviour so that a DEBUG1 message is
emitted when a relation is not found.
Author: Nathan Bossart
Reviewed-By: Michael Paquier and Andres Freund
Discussion: https://postgr.es/m/20180306005349.b65whmvj7z6hbe2y@alap3.anarazel.de
|
| |
Now all tables used in view definitions can be recursively locked by a
LOCK command.
Author: Yugo Nagata
Reviewed by Robert Haas, Thomas Munro and me.
Discussion: https://postgr.es/m/20171011183629.eb2817b3.nagata%40sraoss.co.jp
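A minimal sketch (hypothetical names):
CREATE TABLE t (a int);
CREATE VIEW v AS SELECT * FROM t;
BEGIN;
LOCK TABLE v IN ACCESS EXCLUSIVE MODE;  -- now also locks the underlying t
COMMIT;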
|
| |
Reported-by: Michael Paquier, Abhijit Menon-Sen
Discussion: https://postgr.es/m/20180305045854.GB2266@paquier.xyz
Author: Michael Paquier, Abhijit Menon-Sen, me
|
| |
VACUUM updates leaf-level FSM entries immediately after cleaning the
corresponding heap blocks. fsmpage.c updates the intra-page search trees
on the leaf-level FSM pages when this happens, but it does not touch the
upper-level FSM pages, so that the released space might not actually be
findable by searchers. Previously, updating the upper-level pages happened
only at the conclusion of the VACUUM run, in a single FreeSpaceMapVacuum()
call. This is bad because the VACUUM might get canceled before ever
reaching that point, so that from the point of view of searchers no space
has been freed at all, leading to table bloat.
We can improve matters by updating the upper pages immediately after each
cycle of index-cleaning and heap-cleaning, processing just the FSM pages
corresponding to the range of heap blocks we have now fully cleaned.
This adds a small amount of extra work, since the FSM pages leading down
to each range boundary will be touched twice, but it's pretty negligible
compared to everything else going on in a large VACUUM.
If there are no indexes, VACUUM doesn't work in cycles but just cleans
each heap page on first visit. In that case we just arbitrarily update
upper FSM pages after each 8GB of heap. That maintains the goal of not
letting all this work slide until the very end, and it doesn't seem worth
expending extra complexity on a case that so seldom occurs in practice.
In either case, the FSM is fully up to date before any attempt is made
to truncate the relation, so that the most likely scenario for VACUUM
cancellation no longer results in out-of-date upper FSM pages. When
we do successfully truncate, adjusting the FSM to reflect that is now
fully handled within FreeSpaceMapTruncateRel.
Claudio Freire, reviewed by Masahiko Sawada and Jing Wang, some additional
tweaks by me
Discussion: https://postgr.es/m/CAGTBQpYR0uJCNTt3M5GOzBRHo+-GccNO1nCaQ8yEJmZKSW5q1A@mail.gmail.com
|
| |
This just shows a few details about JITing, e.g. how many functions
have been JITed, and how long that took. To avoid noise in regression
tests with functions sometimes being JITed in --with-llvm builds,
disable display when COSTS OFF is specified.
Author: Andres Freund
Discussion: https://postgr.es/m/20170901064131.tazjxwus3k2w3ybh@alap3.anarazel.de
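A sketch of the new output (hypothetical query; the JIT block appears
only in --with-llvm builds with JIT enabled, and field details elided):
SET jit = on;
SET jit_above_cost = 0;
EXPLAIN (ANALYZE) SELECT sum(a) FROM big;
-- ...
-- JIT:
--   Functions: 3
--   ... timing details ...
-- EXPLAIN (COSTS OFF) suppresses the block, keeping test output stable.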
|
| |
Currently adding a column to a table with a non-NULL default results in
a rewrite of the table. For large tables this can be both expensive and
disruptive. This patch removes the need for the rewrite as long as the
default value is not volatile. The default expression is evaluated at
the time of the ALTER TABLE and the result stored in a new column
(attmissingval) in pg_attribute, and a new column (atthasmissing) is set
to true. Any existing row when fetched will be supplied with the
attmissingval. New rows will have the supplied value or the default and
so will never need the attmissingval.
Any time the table is rewritten all the atthasmissing and attmissingval
settings for the attributes are cleared, as they are no longer needed.
The most visible code change from this is in heap_attisnull, which
acquires a third TupleDesc argument, allowing it to detect a missing
value if there is one. In many cases where it is known that there will
not be any (e.g. catalog relations) NULL can be passed for this
argument.
Andrew Dunstan, heavily modified from an original patch from Serge
Rielau.
Reviewed by Tom Lane, Andres Freund, Tomas Vondra and David Rowley.
Discussion: https://postgr.es/m/31e2e921-7002-4c27-59f5-51f08404c858@2ndQuadrant.com
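A sketch of the behavior change (hypothetical table):
ALTER TABLE big ADD COLUMN flag boolean DEFAULT false;  -- no rewrite
ALTER TABLE big ADD COLUMN ts timestamptz DEFAULT now();
  -- no rewrite either: now() is stable and is evaluated once, at ALTER time
ALTER TABLE big ADD COLUMN r float8 DEFAULT random();
  -- volatile default: still rewrites the table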
|
| |
Originally, we treated memory context names as potentially variable in
all cases, and therefore always copied them into the context header.
Commit 9fa6f00b1 rethought this a little bit and invented a distinction
between fixed and variable names, skipping the copy step for the former.
But we can make things both simpler and more useful by instead allowing
there to be two parts to a context's identification, a fixed "name" and
an optional, variable "ident". The name supplied in the context create
call is now required to be a compile-time-constant string in all cases,
as it is never copied but just pointed to. The "ident" string, if
wanted, is supplied later. This is needed because typically we want
the ident to be stored inside the context so that it's cleaned up
automatically on context deletion; that means it has to be copied into
the context before we can set the pointer.
The cost of this approach is basically just an additional pointer field
in struct MemoryContextData, which isn't much overhead, and is bought
back entirely in the AllocSet case by not needing a headerSize field
anymore, since we no longer have to cope with variable header length.
In addition, we can simplify the internal interfaces for memory context
creation still further, saving a few cycles there. And it's no longer
true that a custom identifier disqualifies a context from participating
in aset.c's freelist scheme, so possibly there's some win on that end.
All the places that were using non-compile-time-constant context names
are adjusted to put the variable info into the "ident" instead. This
allows more effective identification of those contexts in many cases;
for example, subsidiary contexts of relcache entries are now identified
by both type (e.g. "index info") and relname, where before you got only
one or the other. Contexts associated with PL function cache entries
are now identified more fully and uniformly, too.
I also arranged for plancache contexts to use the query source string
as their identifier. This is basically free for CachedPlanSources, as
they contained a copy of that string already. We pay an extra pstrdup
to do it for CachedPlans. That could perhaps be avoided, but it would
make things more fragile (since the CachedPlanSource is sometimes
destroyed first). I suspect future improvements in error reporting will
require CachedPlans to have a copy of that string anyway, so it's not
clear that it's worth moving mountains to avoid it now.
This also changes the APIs for context statistics routines so that the
context-specific routines no longer assume that output goes straight
to stderr, nor do they know all details of the output format. This
is useful immediately to reduce code duplication, and it also allows
for external code to do something with stats output that's different
from printing to stderr.
The reason for pushing this now rather than waiting for v12 is that
it rethinks some of the API changes made by commit 9fa6f00b1. Seems
better for extension authors to endure just one round of API changes,
not two.
Discussion: https://postgr.es/m/CAB=Je-FdtmFZ9y9REHD7VsSrnCkiBhsA4mdsLKSPauwXtQBeNA@mail.gmail.com
|
| |
Previously, FOR EACH ROW triggers were not allowed in partitioned
tables. Now we allow AFTER triggers on them, and on trigger creation we
cascade to create an identical trigger in each partition. We also clone
the triggers to each partition that is created or attached later.
This means that deferred unique keys are allowed on partitioned tables,
too.
Author: Álvaro Herrera
Reviewed-by: Peter Eisentraut, Simon Riggs, Amit Langote, Robert Haas,
Thomas Munro
Discussion: https://postgr.es/m/20171229225319.ajltgss2ojkfd3kp@alvherre.pgsql
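A minimal sketch (hypothetical names):
CREATE TABLE p (a int) PARTITION BY LIST (a);
CREATE TABLE p1 PARTITION OF p FOR VALUES IN (1);
CREATE FUNCTION audit() RETURNS trigger
LANGUAGE plpgsql AS $$ BEGIN RETURN NEW; END $$;
CREATE TRIGGER p_audit AFTER INSERT ON p
  FOR EACH ROW EXECUTE PROCEDURE audit();
-- an identical trigger is created on p1 and will be cloned onto any
-- partition created or attached later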
|
| |
VACUUM thought that reltuples represents the total number of tuples in
the relation, while ANALYZE counted only live tuples. This can cause
"flapping" in the value when background vacuums and analyzes happen
separately. The planner's use of reltuples essentially assumes that
it's the count of live (visible) tuples, so let's standardize on having
it mean live tuples.
Another issue is that the definition of "live tuple" isn't totally clear;
what should be done with INSERT_IN_PROGRESS or DELETE_IN_PROGRESS tuples?
ANALYZE's choices in this regard are made on the assumption that if the
originating transaction commits at all, it will happen after ANALYZE
finishes, so we should ignore the effects of the in-progress transaction
--- unless it is our own transaction, and then we should count it.
Let's propagate this definition into VACUUM, too.
Likewise propagate this definition into CREATE INDEX, and into
contrib/pgstattuple's pgstattuple_approx() function.
Tomas Vondra, reviewed by Haribabu Kommi, some corrections by me
Discussion: https://postgr.es/m/16db4468-edfa-830a-f921-39a50498e77e@2ndquadrant.com
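For illustration (hypothetical table), the value both commands now agree on:
DELETE FROM t WHERE a < 1000;  -- leaves dead tuples behind
VACUUM t;
SELECT reltuples FROM pg_class WHERE oid = 't'::regclass;
-- reports the live-tuple count, consistent with what a subsequent
-- ANALYZE would store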
|
| |
Previously, a value was included in the MCV list if its frequency was
25% larger than the estimated average frequency of all nonnull values
in the table. For uniform distributions, that can lead to values
being included in the MCV list and significantly overestimated on the
basis of relatively few (sometimes just 2) instances being seen in the
sample. For non-uniform distributions, it can lead to too few values
being included in the MCV list, since the overall average frequency
may be dominated by a small number of very common values, while the
remaining values may still have a large spread of frequencies, causing
both substantial overestimation and underestimation of the remaining
values. Furthermore, increasing the statistics target may have little
effect because the overall average frequency will remain relatively
unchanged.
Instead, populate the MCV list with the largest set of common values
that are statistically significantly more common than the average
frequency of the remaining values. This takes into account the
variance of the sample counts, which depends on the counts themselves
and on the proportion of the table that was sampled. As a result, it
constrains the relative standard error of estimates based on the
frequencies of values in the list, reducing the chances of too many
values being included. At the same time, it allows more values to be
included, since the MCVs need only be more common than the remaining
non-MCVs, rather than the overall average. Thus it tends to produce
fewer MCVs than the previous code for uniform distributions, and more
for non-uniform distributions, reducing estimation errors in both
cases. In addition, the algorithm responds better to increasing the
statistics target, allowing more values to be included in the MCV list
when more of the table is sampled.
Jeff Janes, substantially modified by me. Reviewed by John Naylor and
Tomas Vondra.
Discussion: https://postgr.es/m/CAMkU=1yvdGvW9TmiLAhz2erFnvnPFYHbOZuO+a=4DVkzpuQ2tw@mail.gmail.com
|
| |
My commit 4dba331cb3dc that moved around CommandCounterIncrement calls
in partitioning DDL code unearthed a problem with the relcache handling
for the 'default' partition: the construction of a correct relcache
entry for the partitioned table was at the mercy of lack of CCI calls in
non-trivial amounts of code. This was prone to creating problems later
on, as the code develops. This was visible as a test failure in a
compile with RELCACHE_FORCE_RELEASE (buildfarm member prion).
The problem is that after the mentioned commit it was possible to create
a relcache entry that had incomplete information regarding the default
partition because I introduced a CCI between adding the catalog entries
for the default partition (StorePartitionBound) and the update of
pg_partitioned_table entry for its parent partitioned table
(update_default_partition_oid). It seems the best fix is to move the
latter so that it occurs inside the former; the purposeful lack of
intervening CCI should be more obvious, and harder to break.
I also remove a check in RelationBuildPartitionDesc that returns NULL if
the key is not set. I couldn't find any place that needs this hack
anymore; probably it was required because of bugs that have since been
fixed.
Fix a few typos I noticed while reviewing the code involved.
Discussion: https://postgr.es/m/20180320182659.nyzn3vqtjbbtfgwq@alvherre.pgsql
|
| |
Logical decoding should not publish anything about tables created as
part of a heap rewrite during DDL. Those tables don't exist externally,
so consumers of logical decoding cannot do anything sensible with that
information. In ab28feae2bd3d4629bd73ae3548e671c57d785f0, we worked
around this for built-in logical replication, but that was a hack.
This is a more proper fix: We mark such transient heaps using the new
field pg_class.relwrite, linking to the original relation OID. By
default, we ignore them in logical decoding before they get to the
output plugin. Optionally, a plugin can register their interest in
getting such changes, if they handle DDL specially, in which case the
new field will help them get information about the actual table.
Reviewed-by: Craig Ringer <craig@2ndquadrant.com>
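A sketch of how such transient heaps can now be identified (a catalog
query; relwrite is the new column, zero for ordinary relations):
SELECT oid::regclass AS transient_heap, relwrite::regclass AS original
FROM pg_class
WHERE relwrite <> 0;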
|