docs(aggregations): use .alias() on grouping(), drop obsolete workaround

timsaucer · claude · timsaucer · commit 91a1b044b955 · 2026-06-13T18:06:37.000+02:00
apache/datafusion#21411 is resolved: `.alias()` now works directly on a `grouping()` expression.

Removed the note describing the limitation and the `with_column_renamed` workaround in the rollup and grouping_sets examples, aliasing the grouping columns inline instead.

Verified on the current branch: the aliased aggregates execute and produce the named columns.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
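
Background for reviewers: the docs touched here describe `grouping()` as returning `0` when the column is a grouping key for a row and `1` when it is aggregated across. The plain-Python sketch below mimics that flag for a single-column rollup, to show why the aliased "Is Total" column is useful. It is not DataFusion code; the sample rows and the `rollup_with_grouping` helper are hypothetical and exist only for illustration.

```python
from collections import defaultdict

# Hypothetical sample rows: (type_1, speed). Not the dataset used in the docs.
rows = [("Fire", 60), ("Fire", 100), ("Water", 40), (None, 50)]


def rollup_with_grouping(rows):
    """Emulate a rollup over type_1: one row per type plus a grand-total row."""
    groups = defaultdict(list)
    for type_1, speed in rows:
        groups[type_1].append(speed)

    out = []
    for type_1, speeds in groups.items():
        # type_1 is a grouping key for this row, so the flag is 0.
        out.append({"Type 1": type_1, "Count": len(speeds), "Is Total": 0})

    # type_1 is aggregated across for the grand total, so the flag is 1.
    out.append({"Type 1": None, "Count": len(rows), "Is Total": 1})
    return out


result = rollup_with_grouping(rows)
for row in result:
    print(row)
```

Two output rows carry `None` in "Type 1": the group of rows whose type is genuinely missing (flag `0`) and the grand total (flag `1`). The flag column is what distinguishes them, which is why giving it a readable name matters.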
diff --git a/docs/source/user-guide/common-operations/aggregations.md b/docs/source/user-guide/common-operations/aggregations.md
@@ -262,29 +262,15 @@ tell a grand-total `null` apart from a Pokemon that genuinely has no type? The
 {py:func}`~datafusion.functions.grouping` function returns `0` when the column is a grouping key
 for that row and `1` when it is aggregated across.
 
-:::{note}
-Due to an upstream DataFusion limitation
-([apache/datafusion#21411](https://github.com/apache/datafusion/issues/21411)),
-`.alias()` cannot be applied directly to a `grouping()` expression — it will raise an
-error at execution time. Instead, use
-{py:meth}`~datafusion.dataframe.DataFrame.with_column_renamed` on the result DataFrame to
-give the column a readable name. Once the upstream issue is resolved, you will be able to
-use `.alias()` directly and the workaround below will no longer be necessary.
-:::
-
-The raw column name generated by `grouping()` contains internal identifiers, so we use
-{py:meth}`~datafusion.dataframe.DataFrame.with_column_renamed` to clean it up:
+Apply `.alias()` to the `grouping()` expression to give the column a readable name:
 
 ```{code-cell} ipython3
 result = df.aggregate(
     [GroupingSet.rollup(col_type_1)],
     [f.count(col_speed).alias("Count"),
      f.avg(col_speed).alias("Avg Speed"),
-     f.grouping(col_type_1)]
+     f.grouping(col_type_1).alias("Is Total")]
 )
-for field in result.schema():
-    if field.name.startswith("grouping("):
-        result = result.with_column_renamed(field.name, "Is Total")
 result.sort(col_type_1.sort(ascending=True, nulls_first=True))
 ```
 
@@ -357,13 +343,9 @@ result = df.aggregate(
     [GroupingSet.grouping_sets([col_type_1], [col_type_2])],
     [f.count(col_speed).alias("Count"),
      f.avg(col_speed).alias("Avg Speed"),
-     f.grouping(col_type_1),
-     f.grouping(col_type_2)]
+     f.grouping(col_type_1).alias("grouping(Type 1)"),
+     f.grouping(col_type_2).alias("grouping(Type 2)")]
 )
-for field in result.schema():
-    if field.name.startswith("grouping("):
-        clean = field.name.split(".")[-1].rstrip(")")
-        result = result.with_column_renamed(field.name, f"grouping({clean})")
 result.sort(
     col_type_1.sort(ascending=True, nulls_first=True),
     col_type_2.sort(ascending=True, nulls_first=True)