Introduce an RTE for the grouping step

If there are subqueries in the grouping expressions, each of these subqueries in the targetlist and HAVING clause is expanded into a distinct SubPlan node. As a result, only one of these SubPlan nodes would be converted to a reference to the grouping key column output by the Agg node; the others would have to be evaluated afresh. This is inefficient, and with grouping sets it can produce wrong results: a re-evaluated expression yields a value in rows where it should be NULL because the row comes from a grouping set that does not include that expression. Furthermore, during re-evaluation, these SubPlan nodes might use column values that grouping sets have already nulled, which is not correct.

This issue is not limited to subqueries. Other types of expressions that are part of grouping items may fail to match lower target items if they are transformed into another form during preprocessing. This can also lead to wrong results with grouping sets.

To fix this issue, we introduce a new kind of RTE representing the output of the grouping step, with columns that are the Vars or expressions being grouped on.

In the parser, we replace the grouping expressions in the targetlist and HAVING clause with Vars referencing this new RTE, so that the output of the parser directly expresses the semantic requirement that the grouping expressions be taken from the grouping output rather than computed some other way.

In the planner, we first preprocess all the columns of this new RTE and then replace any Vars in the targetlist and HAVING clause that reference this new RTE with the underlying grouping expressions, so that we will have only one instance of a SubPlan node for each subquery contained in the grouping expressions.

Bump catversion because this changes the querytree produced by the parser.

Thanks to Tom Lane for the idea to invent a new kind of RTE.

Per reports from Geoff Winkless, Tobias Wendorff, and Richard Guo in various threads.

Author: Richard Guo
Reviewed-by: Ashutosh Bapat, Sutou Kouhei
Discussion: https://postgr.es/m/CAMbWs4_dp7e7oTwaiZeBX8+P1rXw4ThkZxh1QG81rhu9Z47VsQ@mail.gmail.com
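
The parser and planner steps described in the message can be illustrated with a small standalone model. This is only a sketch under loud assumptions: `ToyNode`, `ToyGroupRTE`, and every `toy_*` name are hypothetical and do not exist in PostgreSQL, expressions are compared as plain strings rather than with the real node-equality machinery, and "preprocessing" is reduced to a counter. It shows only the shape of the idea: matching expressions become Vars pointing at the grouping RTE, each grouping column is preprocessed once, and every Var then resolves to that single preprocessed instance.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-ins for parse nodes; none of these types exist in PostgreSQL. */
typedef enum { TOY_EXPR, TOY_GROUP_VAR } ToyKind;

typedef struct ToyNode
{
	ToyKind		kind;
	const char *text;		/* expression text, used when kind == TOY_EXPR */
	int			varattno;	/* 1-based group RTE column, for TOY_GROUP_VAR */
} ToyNode;

/* Hypothetical model of the grouping RTE: one column per grouping expression. */
typedef struct ToyGroupRTE
{
	int			ncols;
	ToyNode		groupexprs[8];
} ToyGroupRTE;

/* Counts how many grouping columns have been "preprocessed". */
int			toy_preprocess_calls = 0;

/*
 * "Parser" step: if the node matches a grouping expression, turn it into a
 * Var that references the corresponding column of the grouping RTE.
 */
void
toy_parser_substitute(const ToyGroupRTE *rte, ToyNode *node)
{
	for (int i = 0; i < rte->ncols; i++)
	{
		if (node->kind == TOY_EXPR &&
			strcmp(node->text, rte->groupexprs[i].text) == 0)
		{
			node->kind = TOY_GROUP_VAR;
			node->text = NULL;
			node->varattno = i + 1;
			return;
		}
	}
}

/*
 * "Planner" step 1: preprocess each grouping column exactly once.  The
 * counter stands in for work such as building one SubPlan per subquery.
 */
void
toy_preprocess_group_rte(const ToyGroupRTE *rte)
{
	for (int i = 0; i < rte->ncols; i++)
		toy_preprocess_calls++;
}

/*
 * "Planner" step 2: resolve a group Var back to the single preprocessed
 * expression it references; other nodes are returned unchanged.
 */
const ToyNode *
toy_flatten_group_var(const ToyGroupRTE *rte, const ToyNode *node)
{
	if (node->kind == TOY_GROUP_VAR)
	{
		assert(node->varattno >= 1 && node->varattno <= rte->ncols);
		return &rte->groupexprs[node->varattno - 1];
	}
	return node;
}
```

In this model, a targetlist entry and a HAVING entry that both contain the same grouped subquery end up pointing at the same `groupexprs` element after flattening, and the preprocessing counter advances once per grouping column no matter how many times the expression appeared.
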
author: Richard Guo <rguo@postgresql.org> 2024-09-10 12:35:34 +0900
committer: Richard Guo <rguo@postgresql.org> 2024-09-10 12:35:34 +0900
commit: 247dea89f7616fdf06b7272b74abafc29e8e5860 (patch)
tree: 8f4f7c0acb5b9842df4517019805d7531aec1b46 /src/backend/parser/parse_relation.c
parent: fba49d5293b4455b25485450baf02af42bf543d7 (diff)
1 file changed, 77 insertions, 2 deletions
diff --git a/src/backend/parser/parse_relation.c b/src/backend/parser/parse_relation.c
index 2f64eaf0e37..8075b1b8a1b 100644
--- a/src/backend/parser/parse_relation.c
+++ b/src/backend/parser/parse_relation.c
@@ -2557,6 +2557,79 @@ addRangeTableEntryForENR(ParseState *pstate,
 									tupdesc);
 }
 
+/*
+ * Add an entry for grouping step to the pstate's range table (p_rtable).
+ * Then, construct and return a ParseNamespaceItem for the new RTE.
+ */
+ParseNamespaceItem *
+addRangeTableEntryForGroup(ParseState *pstate,
+						   List *groupClauses)
+{
+	RangeTblEntry *rte = makeNode(RangeTblEntry);
+	Alias	   *eref;
+	List	   *groupexprs;
+	List	   *coltypes,
+			   *coltypmods,
+			   *colcollations;
+	ListCell   *lc;
+	ParseNamespaceItem *nsitem;
+
+	Assert(pstate != NULL);
+
+	rte->rtekind = RTE_GROUP;
+	rte->alias = NULL;
+
+	eref = makeAlias("*GROUP*", NIL);
+
+	/* fill in any unspecified alias columns, and extract column type info */
+	groupexprs = NIL;
+	coltypes = coltypmods = colcollations = NIL;
+	foreach(lc, groupClauses)
+	{
+		TargetEntry *te = (TargetEntry *) lfirst(lc);
+		char	   *colname = te->resname ? pstrdup(te->resname) : "?column?";
+
+		eref->colnames = lappend(eref->colnames, makeString(colname));
+
+		groupexprs = lappend(groupexprs, copyObject(te->expr));
+
+		coltypes = lappend_oid(coltypes,
+							   exprType((Node *) te->expr));
+		coltypmods = lappend_int(coltypmods,
+								 exprTypmod((Node *) te->expr));
+		colcollations = lappend_oid(colcollations,
+									exprCollation((Node *) te->expr));
+	}
+
+	rte->eref = eref;
+	rte->groupexprs = groupexprs;
+
+	/*
+	 * Set flags.
+	 *
+	 * The grouping step is never checked for access rights, so no need to
+	 * perform addRTEPermissionInfo().
+	 */
+	rte->lateral = false;
+	rte->inFromCl = false;
+
+	/*
+	 * Add completed RTE to pstate's range table list, so that we know its
+	 * index.  But we don't add it to the join list --- caller must do that if
+	 * appropriate.
+	 */
+	pstate->p_rtable = lappend(pstate->p_rtable, rte);
+
+	/*
+	 * Build a ParseNamespaceItem, but don't add it to the pstate's namespace
+	 * list --- caller must do that if appropriate.
+	 */
+	nsitem = buildNSItemFromLists(rte, list_length(pstate->p_rtable),
+								  coltypes, coltypmods, colcollations);
+
+	return nsitem;
+}
+
 
 /*
  * Has the specified refname been selected FOR UPDATE/FOR SHARE?
@@ -3003,6 +3076,7 @@ expandRTE(RangeTblEntry *rte, int rtindex, int sublevels_up,
 			}
 			break;
 		case RTE_RESULT:
+		case RTE_GROUP:
 			/* These expose no columns, so nothing to do */
 			break;
 		default:
@@ -3317,10 +3391,11 @@ get_rte_attribute_is_dropped(RangeTblEntry *rte, AttrNumber attnum)
 		case RTE_TABLEFUNC:
 		case RTE_VALUES:
 		case RTE_CTE:
+		case RTE_GROUP:
 
 			/*
-			 * Subselect, Table Functions, Values, CTE RTEs never have dropped
-			 * columns
+			 * Subselect, Table Functions, Values, CTE, GROUP RTEs never have
+			 * dropped columns
 			 */
 			result = false;
 			break;