[pull] develop from freqtrade:develop by pull[bot] · Pull Request #1795 · Uncodedtech/freqtrade

pull · 2026-06-21T14:36:18Z

See Commits and Changes for more details.

Created by pull[bot] (v2.0.0-alpha.4)

Can you help keep this open source service alive? 💖 Please sponsor : )

Freqtrade already reports the SQN, which is the t-statistic of the mean per-trade return (sqrt(n) * mean / std), but never tells the user whether that value is statistically distinguishable from zero. This adds a two-sided p-value for the null hypothesis "mean trade return = 0" so users can judge whether a backtest result represents a real edge or just noise.

- `calculate_p_value()` in data/metrics.py computes the one-sample Student's t-test p-value. The Student's t CDF is evaluated via a pure-Python regularized incomplete beta function (continued fraction, Numerical Recipes), so no SciPy dependency is added to the core backtest path (SciPy is only an optional hyperopt extra).
- The value is added to the strategy stats as `profit_p_value` and shown as "Mean profit p-value" in the SUMMARY METRICS table, right after SQN.
- Backward compatible: older stored results without the key render "N/A".
- Tests validate the p-value against scipy.stats.ttest_1samp reference values, scale invariance, and edge cases (n<2, zero variance).

Docs updated with the metric description and its i.i.d. caveat.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
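The approach described in this commit can be sketched roughly as follows. This is a minimal illustration and not the code from data/metrics.py: the function names `_beta_cf`, `regularized_incomplete_beta` and `p_value_mean_zero` are hypothetical, the sample standard deviation (ddof=1) is assumed, and returning NaN for n<2 or zero variance is an assumption about the edge-case handling. It relies on the identity that the two-sided t-test p-value equals I_x(df/2, 1/2) with x = df / (df + t^2).

```python
import math


def _beta_cf(a, b, x, max_iter=300, eps=1e-15):
    """Continued fraction for the incomplete beta function (modified Lentz)."""
    tiny = 1e-300
    qab, qap, qam = a + b, a + 1.0, a - 1.0
    c = 1.0
    d = 1.0 - qab * x / qap
    d = 1.0 / (d if abs(d) > tiny else tiny)
    h = d
    for m in range(1, max_iter + 1):
        m2 = 2 * m
        # Even step of the recurrence.
        aa = m * (b - m) * x / ((qam + m2) * (a + m2))
        d = 1.0 + aa * d
        d = 1.0 / (d if abs(d) > tiny else tiny)
        c = 1.0 + aa / c
        c = c if abs(c) > tiny else tiny
        h *= d * c
        # Odd step of the recurrence.
        aa = -(a + m) * (qab + m) * x / ((a + m2) * (qap + m2))
        d = 1.0 + aa * d
        d = 1.0 / (d if abs(d) > tiny else tiny)
        c = 1.0 + aa / c
        c = c if abs(c) > tiny else tiny
        delta = d * c
        h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h


def regularized_incomplete_beta(a, b, x):
    """I_x(a, b), with domain guards for x <= 0 and x >= 1."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    log_front = (math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
                 + a * math.log(x) + b * math.log1p(-x))
    front = math.exp(log_front)
    # Use the symmetry relation where the continued fraction converges faster.
    if x < (a + 1.0) / (a + b + 2.0):
        return front * _beta_cf(a, b, x) / a
    return 1.0 - front * _beta_cf(b, a, 1.0 - x) / b


def p_value_mean_zero(returns):
    """Two-sided one-sample t-test p-value for H0: mean return = 0."""
    n = len(returns)
    if n < 2:
        return math.nan  # assumption: undefined with fewer than 2 trades
    mean = sum(returns) / n
    var = sum((r - mean) ** 2 for r in returns) / (n - 1)  # ddof=1
    if var == 0.0:
        return math.nan  # assumption: undefined for zero variance
    t = math.sqrt(n) * mean / math.sqrt(var)  # same form as the SQN formula
    df = n - 1
    return regularized_incomplete_beta(df / 2.0, 0.5, df / (df + t * t))
```

Because t depends only on the ratio mean / std, multiplying every return by a positive constant leaves the p-value unchanged, which is the scale invariance the tests check.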

Following review feedback on the backtest report's "Mean profit p-value" metric, trim the legend entry to a crisp definition and add a plain-English note for non-statisticians. The note explains that the p-value is the chance that pure luck would produce an average result at least this far from zero if the strategy had no edge (so p=0.48 means roughly a 48% chance from noise), that a lower value means the result is less likely to be a fluke, and that the usual bar is <0.05. Because of the i.i.d. assumption and multiple testing, the metric flags absence of significance rather than proving an edge.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
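The "chance pure luck would produce a result at least this far from zero" reading can be illustrated with a small simulation. Note that this is a sign-flip randomization test, a different technique from the Student's t-test the PR implements, used here only to build intuition; the function name `sign_flip_p_value` and its parameters are hypothetical.

```python
import random


def sign_flip_p_value(returns, n_draws=20000, seed=0):
    """Estimate how often random sign flips give a mean at least as extreme.

    Under a "no edge" null with returns symmetric around zero, each trade is
    equally likely to have been a win or a loss of the same size.
    """
    rng = random.Random(seed)
    n = len(returns)
    observed = abs(sum(returns) / n)
    hits = 0
    for _ in range(n_draws):
        flipped_sum = sum(r if rng.random() < 0.5 else -r for r in returns)
        # Small tolerance so exact ties count as "at least as extreme".
        if abs(flipped_sum / n) >= observed - 1e-12:
            hits += 1
    return hits / n_draws
```

For the five returns 1, 2, 3, 4, 5 only 2 of the 32 possible sign patterns (all wins or all losses) reach the observed mean, so the estimate should land close to 2/32 = 0.0625.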

Codecov flagged 7 uncovered patch lines, all in the new pure-Python incomplete-beta helpers behind calculate_p_value. Add targeted tests:

- _regularized_incomplete_beta against closed-form values (uniform, x**2, 2x-x**2, 3x**2-2x**3, arcsine) plus the x<=0 / x>=1 domain guards.
- calculate_p_value with break-even (zero-mean) returns -> p-value 1.0, exercising the x>=1 path through the public API.
- _beta_continued_fraction at degenerate points that trip the Lentz underflow guards, asserting the result stays finite.

The two continued-fraction c-underflow guards cannot be reached for the bounded (a, b) argument range this routine uses (verified empirically over ~3.2M argument combinations), so they are marked `# pragma: no cover`. Changed lines are now fully covered.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
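For reference, the closed forms named in the first bullet correspond to standard identities of the regularized incomplete beta function. The commit message does not list the (a, b) pairs, so the pairing below is inferred from the formulas themselves:

```latex
\begin{aligned}
I_x(1, 1) &= x                  && \text{(uniform)} \\
I_x(2, 1) &= x^2                                    \\
I_x(1, 2) &= 2x - x^2                               \\
I_x(2, 2) &= 3x^2 - 2x^3                            \\
I_x\!\left(\tfrac12, \tfrac12\right) &= \tfrac{2}{\pi}\arcsin\sqrt{x} && \text{(arcsine)}
\end{aligned}
\qquad 0 \le x \le 1 .
```

The break-even case follows from the same function: with t = 0 the argument x = df / (df + t^2) equals 1, and I_1(a, b) = 1 for any a, b > 0.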

@xmatthias

Apply @xmatthias's review feedback (PR #13227):

- metrics.py: drop the misleading `# pragma: no cover` markers and the false "cannot be reached" claim on the continued-fraction underflow guards; leave them as honest, plainly-uncovered defensive code.
- Rename the stats key `profit_p_value` -> `p_value` to match the bare sibling keys (sqn, calmar, sharpe, sortino); no prefix, no plan for multiple p-value types.
- docs/backtesting.md: use the actual `.4g`-formatted value (0.4799) instead of the hand-rounded 0.48, and collapse the explanatory admonition (`!!!` -> `???`).
- test_optimize_reports.py: assert the exact computed p-value via pytest.approx instead of the weak 0 <= p <= 1 range check.
- test_metrics.py: compute the reference with scipy.stats.ttest_1samp directly (with an importorskip guard) rather than hard-coding values; drop the misleading "stay self-contained / don't introduce scipy" comment (scipy is in the dev/test env).
- conftest.py: give the macOS torch mock a dummy `Tensor` attribute so SciPy's array-API dispatch stays usable under the mock.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
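As a rough sketch of how the renamed key, the `.4g` formatting and the "N/A" fallback for older stored results might fit together (the helper name `format_p_value` and the plain-dict shape of the stats are assumptions, not the report code from this PR):

```python
def format_p_value(strategy_stats: dict) -> str:
    """Render the "Mean profit p-value" cell for the summary table."""
    # Older stored results predate the metric and have no "p_value" key.
    p_value = strategy_stats.get("p_value")
    if p_value is None:
        return "N/A"
    # ".4g" keeps four significant digits instead of hand-rounding.
    return f"{p_value:.4g}"
```

With this formatting a stored value such as 0.47991 is shown as 0.4799 rather than 0.48.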

Per maintainer feedback on PR #13227:

- Replace hand-rolled betainc / t-distribution implementation with a one-liner: scipy.stats.ttest_1samp. Much simpler and avoids reinventing well-tested numerics.
- Move scipy from requirements-hyperopt.txt to requirements.txt so it is always installed and no random runtime errors occur.
- Update test to derive expected p-values live via scipy rather than hard-coded reference numbers.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
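A minimal sketch of what the SciPy-based version could look like. The signature of `calculate_p_value` and the NaN result for the edge cases (n<2, zero variance) are assumptions for illustration; only the use of `scipy.stats.ttest_1samp` comes from the commit message.

```python
import math

from scipy import stats


def calculate_p_value(returns):
    """Two-sided p-value for H0: mean trade return = 0 (sketch)."""
    # Assumption: return NaN where the t-test is undefined.
    if len(returns) < 2 or max(returns) == min(returns):
        return math.nan
    return float(stats.ttest_1samp(returns, popmean=0.0).pvalue)
```

Guarding the edge cases before the call avoids relying on how SciPy itself reports degenerate input.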

The Mean profit p-value in the backtesting example was hand-computed (0.4799). Regenerated it from the documented backtest command (SampleStrategy on tests/testdata/config.tests.usdt.json, bybit futures, 20250701-20250801, 5m) so the figure is reproducible and verifiable: the actual value is 0.4768. All other metrics in the example are unchanged. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Enables freqUI to display this.

yongzhe2160cs and others added 10 commits June 4, 2026 14:50

Merge branch 'develop' into pr/yongzhe2160cs/13227

f887586

test: reduce test comment verbosity

e1c5357

feat: run p_value test per line

f77facc

Merge pull request #13227 from yongzhe2160cs/feature/backtest-profit-…

064e67c

…pvalue Add mean-trade-return p-value to backtest summary metrics

pull Bot locked and limited conversation to collaborators Jun 21, 2026

pull Bot added the ⤵️ pull label Jun 21, 2026

pull Bot merged commit 064e67c into Uncodedtech:develop Jun 21, 2026

pull[bot] merged 10 commits into Uncodedtech:develop from freqtrade:develop

2 participants
