@@ -124,6 +124,12 @@ internal interface GroupByDocs {
* `| `__`.`__[**`last`**][GroupBy.last]` \[ `**` { `**`rowCondition: `[`RowFilter`][RowFilter]**` } `**`]`
*
* {@include [Indent]}
+ * `| `__`.`__[**`medianBy`**][GroupBy.medianBy]**` { `**`rowExpression: `[`RowExpression`][RowExpression]**` }`**
+ *
+ * {@include [Indent]}
+ * `| `__`.`__[**`percentileBy`**][GroupBy.percentileBy]**`(`**`percentile: `[`Double`][Double]**`) { `**`rowExpression: `[`RowExpression`][RowExpression]**` }`**
+ *
+ * {@include [Indent]}
* __`.`__[**`concat`**][ReducedGroupBy.concat]**`() `**
*
* {@include [Indent]}
@@ -251,8 +257,8 @@ internal interface GroupByDocs {
* (optionally, the first or last one that satisfies a predicate) of each group;
* * [minBy][GroupBy.minBy] / [maxBy][GroupBy.maxBy] — take the row with the minimum or maximum value
* of the given [RowExpression] calculated on rows within each group;
- * * [medianBy][GroupBy.medianBy] / [percentileBy][GroupBy.percentileBy] — take the row with
- * the median or specific percentile value of the given [RowExpression] calculated on rows within each group;
+ * * [medianBy][GroupBy.medianBy] / [percentileBy][GroupBy.percentileBy] — take the row at the position closest
+ * to the estimated median/percentile index of the [RowExpression]'s results calculated on rows within each group.
*
* These functions return a [ReducedGroupBy], which can then be transformed into a new [DataFrame]
* containing the reduced rows (either original or transformed) using one of the following methods:
@@ -86,10 +86,10 @@ internal interface PivotDocs {
* `| `__`.`__[**`last`**][Pivot.last]` \[ `**`{ `**`rowCondition: `[`RowFilter`][RowFilter]**` } `**`]`
*
* {@include [Indent]}
- * `| `__`.`__[**`medianBy`**][Pivot.medianBy]**` { `**`column: `[`RowExpression`][RowExpression]**` }`**
+ * `| `__`.`__[**`medianBy`**][Pivot.medianBy]**` { `**`rowExpression: `[`RowExpression`][RowExpression]**` }`**
Collaborator Author

Should the parameter be named as in the function declaration (`rowExpression` instead of `column`)?

*
* {@include [Indent]}
- * `| `__`.`__[**`percentileBy`**][Pivot.percentileBy]**`(`**`percentile: `[`Double`][Double]**`) { `**`column: `[`RowExpression`][RowExpression]**` }`**
+ * `| `__`.`__[**`percentileBy`**][Pivot.percentileBy]**`(`**`percentile: `[`Double`][Double]**`) { `**`rowExpression: `[`RowExpression`][RowExpression]**` }`**
*
* {@include [Indent]}
* __`.`__[**`with`**][Pivot.with]**` { `**`rowExpression: `[`RowExpression`][RowExpression]**` }`**
Collaborator Author

Should this link point to `ReducedPivot` instead? That would fit if we mean the function used in the second step of reducing, which transforms a `ReducedPivot` into a `DataRow`.

An overload on `Pivot` also exists, though, so I'm not completely sure.

@@ -150,8 +150,8 @@ internal interface PivotDocs {
* (optionally, the first or last one that satisfies a predicate) of each group;
* * [minBy][Pivot.minBy] / [maxBy][Pivot.maxBy] — take the row with the minimum or maximum value
* of the given [RowExpression] evaluated on rows within each group;
- * * [medianBy][Pivot.medianBy] / [percentileBy][Pivot.percentileBy] — take the row with
- * the median or a specific percentile value of the given [RowExpression] evaluated on rows within each group.
+ * * [medianBy][Pivot.medianBy] / [percentileBy][Pivot.percentileBy] — take the row at the position closest
+ * to the estimated median/percentile index of the [RowExpression]'s results calculated on rows within each group.
Collaborator

yes! :)

*
* These functions return a [ReducedPivot], which can then be transformed into a new [DataFrame]
* containing a single combined row (either using the original reduced rows or their transformed versions)
2 changes: 1 addition & 1 deletion docs/StardustDocs/topics/groupBy.md
@@ -372,7 +372,7 @@ To perform a reducing operation, use the following functions:
* [`minBy`](minBy.md) / [`maxBy`](maxBy.md) – to get from each group the row with the smallest / largest result
of the [`row expression`](DataRow.md#row-expressions) supplied to the function.

- * [`medianBy`](median.md) / [`percentileBy`](percentile.md) – to get the row with the value closest to the estimated
+ * [`medianBy`](median.md) / [`percentileBy`](percentile.md) – to get the row at the position closest to the estimated
Collaborator Author

@Jolanrensen I think last time I incorrectly implemented what you meant. Does it sound better now? :)

Collaborator

yes :) hopefully people do still understand it, haha. It's quite a difficult explanation, but so is the concept

median/percentile index of the [`row expression`](DataRow.md#row-expressions)'s results calculated on rows within each group.

These functions return an instance of `ReducedGroupBy`, which is a class serving as a transitional step
6 changes: 4 additions & 2 deletions docs/StardustDocs/topics/pivot.md
@@ -219,8 +219,10 @@ Reducing is a specific case of [`aggregation`](pivot.md#aggregation).
### Step 1: use a reducing method
Use the following functions to collapse each group in a [`Pivot`](pivot.md) into a single row:
* [`first`](first.md) / [`last`](last.md) — take the first or last row (optionally, the first or last one that satisfies a predicate) of each group;
- * [`minBy`](minBy.md) / [`maxBy`](maxBy.md) — take the row with the minimum or maximum value of the given `RowExpression` evaluated on rows within each group;
- * [`medianBy`](median.md) / [`percentileBy`](percentile.md) — take the row with the median or a specific percentile value of the given `RowExpression` evaluated on rows within each group.
+ * [`minBy`](minBy.md) / [`maxBy`](maxBy.md) — take the row with the minimum or maximum value
+ of the given [`row expression`](DataRow.md#row-expressions) evaluated on rows within each group;
+ * [`medianBy`](median.md) / [`percentileBy`](percentile.md) — take the row at the position closest to the estimated
+ median/percentile index of the [`row expression`](DataRow.md#row-expressions)'s results calculated on rows within each group.

These functions return an instance of `ReducedPivot`, which is a class serving as a transitional step between performing a reduction on [`Pivot`](pivot.md) groups
and specifying how the resulting reduced rows should be represented in a resulting [`DataRow`](DataRow.md).
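
Since the new wording ("the row at the position closest to the estimated median/percentile index") is admittedly hard to parse, a small standalone sketch may help reviewers check that the sentence says what is intended. It uses plain Kotlin collections rather than the DataFrame API. The `Person` type, the `percentileRowBy` helper, the 0..100 percentile scale, the index formula `percentile / 100 * (n - 1)`, and the rounding rule are all assumptions made for illustration; the library's actual estimation method may differ.

```kotlin
import kotlin.math.roundToInt

// Hypothetical row type, used only for this illustration.
data class Person(val name: String, val age: Int)

// Sketch of "take the row at the position closest to the estimated percentile index":
// 1. sort the rows of a group by the row expression's result,
// 2. estimate a (possibly fractional) index for the requested percentile,
// 3. return the row at the nearest integer position.
// ASSUMPTION: the index formula and tie rounding are illustrative, not the library's confirmed behavior.
fun <T, R : Comparable<R>> List<T>.percentileRowBy(percentile: Double, rowExpression: (T) -> R): T? {
    require(percentile in 0.0..100.0) { "percentile must be in 0..100" }
    if (isEmpty()) return null
    val sorted = sortedBy(rowExpression)
    val estimatedIndex = percentile / 100.0 * (sorted.size - 1)
    val position = estimatedIndex.roundToInt().coerceIn(0, sorted.lastIndex)
    return sorted[position]
}

fun main() {
    val group = listOf(
        Person("Alice", 15),
        Person("Bob", 45),
        Person("Charlie", 20),
        Person("Dan", 30),
    )
    // The median is treated as the 50th percentile in this sketch.
    println(group.percentileRowBy(50.0) { it.age })
    println(group.percentileRowBy(25.0) { it.age })
}
```

The point of the sketch is that the result is always an existing row of the group, selected by position in the sorted order, rather than a row whose value is numerically closest to an interpolated median. That distinction is what the reworded documentation is trying to convey.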