Commit 2e9029c

Adding trimmedMean()
1 parent 620810e commit 2e9029c

7 files changed

Lines changed: 400 additions & 125 deletions

CHANGELOG.md

Lines changed: 1 addition & 0 deletions
@@ -4,6 +4,7 @@
 - Adding `StreamingStat` class (experimental) for streaming/online computation of mean, variance, stdev, skewness, kurtosis, sum, min, and max with O(1) memory
 - Adding `percentile()` method for computing the value at any percentile (0–100) with linear interpolation
 - Adding `coefficientOfVariation()` method for relative dispersion (CV%), supporting both sample and population modes
+- Adding `trimmedMean()` method for robust central tendency: computes the mean after removing a fixed proportion of the lowest and highest values

 ## 1.2.5 - 2026-02-22
 - Adding `kurtosis()` method for excess kurtosis

README.md

Lines changed: 18 additions & 0 deletions
@@ -64,6 +64,7 @@ The various mathematical statistics are listed below:
 | ---------------------- | ----------- |
 | `mean()` | arithmetic mean or "average" of data |
 | `fmean()` | floating-point arithmetic mean, with optional weighting and precision |
+| `trimmedMean()` | trimmed (truncated) mean: mean after removing a fixed proportion of the lowest and highest values |
 | `median()` | median or "middle value" of data |
 | `medianLow()` | low median of data |
 | `medianHigh()` | high median of data |
@@ -132,6 +133,23 @@ $fmean = Stat::fmean([3.5, 4.0, 5.25], [1, 2, 1], 3);
 If the input is empty, or weights are invalid (e.g., length mismatch or sum is zero), an exception is thrown.
 Use this function when you need floating-point accuracy or to apply custom weighting and rounding to your average.

+#### Stat::trimmedMean( array $data, float $proportionToCut = 0.1, ?int $round = null )
+Return the trimmed (truncated) mean of the data. Computes the mean after removing the lowest and highest fraction of values. This is a robust measure of central tendency, less sensitive to outliers than the regular mean.
+
+The `$proportionToCut` parameter specifies the fraction to trim from **each** side (must be in the range `[0, 0.5)`). For example, `0.1` removes the bottom 10% and top 10%.
+
+```php
+use HiFolks\Statistics\Stat;
+$mean = Stat::trimmedMean([1, 2, 3, 4, 5, 6, 7, 8, 9, 100], 0.1);
+// 5.5 (outlier 100 and lowest value 1 removed)
+
+$mean = Stat::trimmedMean([1, 2, 3, 4, 5, 6, 7, 8, 9, 10], 0.2);
+// 5.5 (removes 2 values from each side)
+
+$mean = Stat::trimmedMean([1, 2, 3, 4, 5], 0.0);
+// 3.0 (no trimming, same as regular mean)
+```
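The trimming rule described in the README text above can be sketched as a standalone function. This is only an illustration, not the code from this commit: `trimmedMeanSketch()` is a hypothetical name, the sketch assumes that `floor(n * $proportionToCut)` values are dropped from each end (which matches the README examples), and it leaves out the `$round` parameter.

```php
<?php

/**
 * Hypothetical helper for illustration; not the library's implementation.
 * Assumption: floor(n * proportion) values are dropped from each end.
 */
function trimmedMeanSketch(array $data, float $proportionToCut = 0.1): float
{
    if ($data === []) {
        throw new InvalidArgumentException('Data must not be empty.');
    }
    if ($proportionToCut < 0.0 || $proportionToCut >= 0.5) {
        throw new InvalidArgumentException('Proportion must be in [0, 0.5).');
    }

    // Sort a re-indexed copy so the extremes sit at the two ends.
    $sorted = array_values($data);
    sort($sorted);

    $n = count($sorted);
    // Assumption: truncate to a whole number of values per side.
    $cut = (int) floor($n * $proportionToCut);

    // Keep the middle $n - 2 * $cut values; at least one value always remains
    // because the proportion is strictly below 0.5.
    $kept = array_slice($sorted, $cut, $n - 2 * $cut);

    return array_sum($kept) / count($kept);
}

echo trimmedMeanSketch([1, 2, 3, 4, 5, 6, 7, 8, 9, 100], 0.1), PHP_EOL; // 5.5
```

With ten values and a proportion of `0.1`, one value is cut from each side, so the mean is taken over `2..9` (sum 44, eight values).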
+
 #### Stat::geometricMean( array $data )
 The geometric mean indicates the central tendency or typical value of the data using the product of the values (as opposed to the arithmetic mean which uses their sum).

TODO.md

Lines changed: 2 additions & 10 deletions
@@ -2,16 +2,14 @@

 ### Descriptive Statistics

-- Trimmed/Truncated mean - mean after removing outliers (top/bottom x%)
 - Weighted median - median with weights (like fmean supports weights, but median doesn't)
 - Standard error of the mean (SEM)
-- Coefficient of variation (CV) - stdev / mean, useful for comparing variability across datasets
 - Mean absolute deviation (MAD)
-- Percentile - arbitrary percentile (e.g., 90th percentile); quantiles() exists but a direct percentile($data, $p) would be convenient
+

 ### Correlation & Regression

-- Spearman rank correlation - non-parametric correlation
+
 - Kendall tau correlation - another rank-based correlation
 - Multiple/polynomial regression
 - R-squared (coefficient of determination)
@@ -42,9 +40,3 @@

 - Rank - assign ranks to data points
 - Percentile rank - what percentile a given value falls at
-
----
-
-### Notes
-
-The most impactful additions would likely be (skewness DONE), (kurtosis DONE), coefficient of variation, percentile, and Spearman correlation; these are commonly needed and align well with the package's existing scope (inspired by Python's statistics module).
