diff --git a/core/src/main/kotlin/org/jetbrains/kotlinx/dataframe/api/groupBy.kt b/core/src/main/kotlin/org/jetbrains/kotlinx/dataframe/api/groupBy.kt
index 33e1b1d6c0..32c1c5bf81 100644
--- a/core/src/main/kotlin/org/jetbrains/kotlinx/dataframe/api/groupBy.kt
+++ b/core/src/main/kotlin/org/jetbrains/kotlinx/dataframe/api/groupBy.kt
@@ -112,10 +112,10 @@ internal interface GroupByDocs {
  * ### Reduce [GroupBy] into [DataFrame]
  *
  * {@include [Indent]}
- * [GroupBy][GroupBy]`.`[**`minBy`**][GroupBy.minBy]**` { `**`column: `[`ColumnSelector`][ColumnSelector]**` }`**
+ * [GroupBy][GroupBy]`.`[**`minBy`**][GroupBy.minBy]**` { `**`rowExpression: `[`RowExpression`][RowExpression]**` }`**
  *
  * {@include [Indent]}
- * `| `__`.`__[**`maxBy`**][GroupBy.maxBy]**` { `**`column: `[`ColumnSelector`][ColumnSelector]**` }`**
+ * `| `__`.`__[**`maxBy`**][GroupBy.maxBy]**` { `**`rowExpression: `[`RowExpression`][RowExpression]**` }`**
  *
  * {@include [Indent]}
  * `| `__`.`__[**`first`**][GroupBy.first]` \[ `**` { `**`rowCondition: `[`RowFilter`][RowFilter]**` } `**`]`
@@ -124,6 +124,12 @@ internal interface GroupByDocs {
  * `| `__`.`__[**`last`**][GroupBy.last]` \[ `**` { `**`rowCondition: `[`RowFilter`][RowFilter]**` } `**`]`
  *
  * {@include [Indent]}
+ * `| `__`.`__[**`medianBy`**][GroupBy.medianBy]**` { `**`rowExpression: `[`RowExpression`][RowExpression]**` }`**
+ *
+ * {@include [Indent]}
+ * `| `__`.`__[**`percentileBy`**][GroupBy.percentileBy]**`(`**`percentile: `[`Double`][Double]**`) { `**`rowExpression: `[`RowExpression`][RowExpression]**` }`**
+ *
+ * {@include [Indent]}
  * __`.`__[**`concat`**][ReducedGroupBy.concat]**`() `**
  *
  * {@include [Indent]}
@@ -251,8 +257,8 @@ internal interface GroupByDocs {
  * (optionally, the first or last one that satisfies a predicate) of each group;
  * * [minBy][GroupBy.minBy] / [maxBy][GroupBy.maxBy] — take the row with the minimum or maximum value
  * of the given [RowExpression] calculated on rows within each group;
- * * [medianBy][GroupBy.medianBy] / [percentileBy][GroupBy.percentileBy] — take the row with
- * the median or specific percentile value of the given [RowExpression] calculated on rows within each group;
+ * * [medianBy][GroupBy.medianBy] / [percentileBy][GroupBy.percentileBy] — take the row at the position closest
+ * to the estimated median/percentile index of the [RowExpression]'s results calculated on rows within each group.
  *
  * These functions return a [ReducedGroupBy], which can then be transformed into a new [DataFrame]
  * containing the reduced rows (either original or transformed) using one of the following methods:
diff --git a/core/src/main/kotlin/org/jetbrains/kotlinx/dataframe/api/pivot.kt b/core/src/main/kotlin/org/jetbrains/kotlinx/dataframe/api/pivot.kt
index af49b39c49..f428683a64 100644
--- a/core/src/main/kotlin/org/jetbrains/kotlinx/dataframe/api/pivot.kt
+++ b/core/src/main/kotlin/org/jetbrains/kotlinx/dataframe/api/pivot.kt
@@ -75,10 +75,10 @@ internal interface PivotDocs {
  *
  * ### Reduce [Pivot] into [DataRow]
  *
- * [Pivot][Pivot]`.`[**`minBy`**][Pivot.minBy]**` { `**`column: `[`RowExpression`][RowExpression]**` }`**
+ * [Pivot][Pivot]`.`[**`minBy`**][Pivot.minBy]**` { `**`rowExpression: `[`RowExpression`][RowExpression]**` }`**
  *
  * {@include [Indent]}
- * `| `__`.`__[**`maxBy`**][Pivot.maxBy]**` { `**`column: `[`RowExpression`][RowExpression]**` }`**
+ * `| `__`.`__[**`maxBy`**][Pivot.maxBy]**` { `**`rowExpression: `[`RowExpression`][RowExpression]**` }`**
  *
  * {@include [Indent]}
  * `| `__`.`__[**`first`**][Pivot.first]` \[ `**` { `**`rowCondition: `[`RowFilter`][RowFilter]**` } `**`]`
@@ -87,16 +87,16 @@ internal interface PivotDocs {
  * `| `__`.`__[**`last`**][Pivot.last]` \[ `**`{ `**`rowCondition: `[`RowFilter`][RowFilter]**` } `**`]`
  *
  * {@include [Indent]}
- * `| `__`.`__[**`medianBy`**][Pivot.medianBy]**` { `**`column: `[`RowExpression`][RowExpression]**` }`**
+ * `| `__`.`__[**`medianBy`**][Pivot.medianBy]**` { `**`rowExpression: `[`RowExpression`][RowExpression]**` }`**
  *
  * {@include [Indent]}
- * `| `__`.`__[**`percentileBy`**][Pivot.percentileBy]**`(`**`percentile: `[`Double`][Double]**`) { `**`column: `[`RowExpression`][RowExpression]**` }`**
+ * `| `__`.`__[**`percentileBy`**][Pivot.percentileBy]**`(`**`percentile: `[`Double`][Double]**`) { `**`rowExpression: `[`RowExpression`][RowExpression]**` }`**
  *
  * {@include [Indent]}
- * __`.`__[**`with`**][Pivot.with]**` { `**`rowExpression: `[`RowExpression`][RowExpression]**` }`**
+ * __`.`__[**`with`**][ReducedPivot.with]**` { `**`rowExpression: `[`RowExpression`][RowExpression]**` }`**
  *
  * {@include [Indent]}
- * `| `__`.`__[**`values`**][Pivot.values]**` { `**`valueColumns: `[`ColumnsSelector`][ColumnsSelector]**` }`**
+ * `| `__`.`__[**`values`**][ReducedPivot.values]**` { `**`valueColumns: `[`ColumnsSelector`][ColumnsSelector]**` }`**
  *
  * ### Aggregate [Pivot] into [DataRow]
  *
@@ -151,8 +151,8 @@ internal interface PivotDocs {
  * (optionally, the first or last one that satisfies a predicate) of each group;
  * * [minBy][Pivot.minBy] / [maxBy][Pivot.maxBy] — take the row with the minimum or maximum value
  * of the given [RowExpression] evaluated on rows within each group;
- * * [medianBy][Pivot.medianBy] / [percentileBy][Pivot.percentileBy] — take the row with
- * the median or a specific percentile value of the given [RowExpression] evaluated on rows within each group.
+ * * [medianBy][Pivot.medianBy] / [percentileBy][Pivot.percentileBy] — take the row at the position closest
+ * to the estimated median/percentile index of the [RowExpression]'s results calculated on rows within each group.
  *
  * These functions return a [ReducedPivot], which can then be transformed into a new [DataFrame]
  * containing a single combined row (either using the original reduced rows or their transformed versions)
diff --git a/docs/StardustDocs/topics/groupBy.md b/docs/StardustDocs/topics/groupBy.md
index e3ba64a073..b710b53e81 100644
--- a/docs/StardustDocs/topics/groupBy.md
+++ b/docs/StardustDocs/topics/groupBy.md
@@ -372,7 +372,7 @@ To perform a reducing operation, use the following functions:
 * [`minBy`](minBy.md) / [`maxBy`](maxBy.md) – to get from each group the row with the smallest / largest
 result of the [`row expression`](DataRow.md#row-expressions) supplied to the function.
-* [`medianBy`](median.md) / [`percentileBy`](percentile.md) – to get the row with the value closest to the estimated
+* [`medianBy`](median.md) / [`percentileBy`](percentile.md) – to get the row at the position closest to the estimated
 median/percentile index of the [`row expression`](DataRow.md#row-expressions)'s results calculated on rows within each group.
 These functions return an instance of `ReducedGroupBy`, which is a class serving as a transitional step
diff --git a/docs/StardustDocs/topics/pivot.md b/docs/StardustDocs/topics/pivot.md
index 6c3f66fe5a..4e693a51c1 100644
--- a/docs/StardustDocs/topics/pivot.md
+++ b/docs/StardustDocs/topics/pivot.md
@@ -219,10 +219,10 @@ Reducing is a specific case of [`aggregation`](pivot.md#aggregation).
 ### Step 1: use a reducing method
 Use the following functions to collapse each group in a [`Pivot`](pivot.md) into a single row:
 * [`first`](first.md) / [`last`](last.md) — take the first or last row (optionally, the first or last one that satisfies a predicate) of each group;
-* [`minBy`](minBy.md) / [`maxBy`](maxBy.md) — take the row with the minimum or maximum value of the given
-[`row expression`](DataRow.md#row-expressions) evaluated on rows within each group;
-* [`medianBy`](median.md) / [`percentileBy`](percentile.md) — take the row with the median or a specific percentile value
-of the given [`row expression`](DataRow.md#row-expressions) evaluated on rows within each group.
+* [`minBy`](minBy.md) / [`maxBy`](maxBy.md) — take the row with the minimum or maximum value
+of the given [`row expression`](DataRow.md#row-expressions) evaluated on rows within each group;
+* [`medianBy`](median.md) / [`percentileBy`](percentile.md) — take the row at the position closest to the estimated
+median/percentile index of the [`row expression`](DataRow.md#row-expressions)'s results calculated on rows within each group.
 These functions return an instance of `ReducedPivot`, which is a class serving as a transitional step
 between performing a reduction on [`Pivot`](pivot.md) groups and specifying how the resulting reduced rows should be represented
 in a resulting [`DataRow`](DataRow.md).