Commit 8cfe5cf
spec: add diagnostic-regression-panel specification (#5257)
## New Specification: `diagnostic-regression-panel`
Related to #5242
---
### specification.md
# diagnostic-regression-panel: Regression Diagnostic Panel (Four-Plot Display)
## Description
A 2x2 panel of diagnostic plots for evaluating linear regression model
assumptions, replicating the classic output of R's `plot(lm)`. The four
subplots are: (1) Residuals vs Fitted values to detect non-linearity and
heteroscedasticity, (2) Normal Q-Q plot of standardized residuals to
assess normality, (3) Scale-Location plot (square root of the absolute
standardized residuals vs fitted values) to check homoscedasticity, and (4) Residuals
vs Leverage with Cook's distance contours to identify influential
observations. This composite display is a common first step in
regression model validation in statistical practice, teaching, and
regulated industries.
## Applications
- Validating linear regression assumptions before reporting results in
statistical analysis
- Checking for heteroscedasticity, non-linearity, and influential
outliers in fitted models
- Teaching regression diagnostics in academic statistics courses and
textbooks
- Regulatory model validation in finance (credit risk models) and pharma
(dose-response modeling)
## Data
- `fitted` (float) - Fitted/predicted values from the regression model
- `residuals` (float) - Raw residuals (observed minus predicted)
- `std_residuals` (float) - Standardized (or studentized) residuals
- `leverage` (float) - Hat values / leverage for each observation
- `cooks_d` (float) - Cook's distance measuring each observation's
influence
- Size: 50-500 observations
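
As a rough illustration of how the five columns above relate to one another, the sketch below derives them for a simple one-predictor least-squares fit using only the standard library. The function name `regression_diagnostics`, the synthetic data, and the restriction to a single predictor are assumptions made for this example and are not part of the specification; `std_residuals` here means internally studentized residuals.

```python
import math
import random


def regression_diagnostics(x, y):
    """Return the five spec columns for a one-predictor OLS fit (illustrative)."""
    n = len(x)
    p = 2  # estimated coefficients: intercept and slope
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean

    fitted = [intercept + slope * xi for xi in x]
    # Raw residuals: observed minus predicted
    residuals = [yi - fi for yi, fi in zip(y, fitted)]
    # Hat values for simple regression: 1/n + (x - mean)^2 / Sxx
    leverage = [1 / n + (xi - x_mean) ** 2 / sxx for xi in x]
    # Residual variance estimate on n - p degrees of freedom
    sigma2 = sum(e ** 2 for e in residuals) / (n - p)
    # Internally studentized residuals: e / sqrt(sigma^2 * (1 - h))
    std_residuals = [
        e / math.sqrt(sigma2 * (1 - h)) for e, h in zip(residuals, leverage)
    ]
    # Cook's distance: r^2 / p * h / (1 - h)
    cooks_d = [r ** 2 / p * h / (1 - h) for r, h in zip(std_residuals, leverage)]
    return {
        "fitted": fitted,
        "residuals": residuals,
        "std_residuals": std_residuals,
        "leverage": leverage,
        "cooks_d": cooks_d,
    }


# Synthetic data within the 50-500 observation range (values are made up)
rng = random.Random(0)
x = [rng.uniform(0, 10) for _ in range(100)]
y = [1.5 + 2.0 * xi + rng.gauss(0, 1) for xi in x]
diag = regression_diagnostics(x, y)
```

Two quick sanity checks on such output: the raw residuals of a fit with an intercept sum to roughly zero, and the leverages sum to the number of estimated coefficients (two here).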
## Notes
- Four subplots arranged in a 2x2 grid layout with shared figure title
- **Subplot 1 (Residuals vs Fitted):** Scatter of residuals against
fitted values with a horizontal zero-reference line and a LOWESS
smoother to reveal non-linear patterns
- **Subplot 2 (Normal Q-Q):** Standardized residuals plotted against
theoretical normal quantiles with a 45-degree reference line; deviations
indicate non-normality
- **Subplot 3 (Scale-Location):** Square root of absolute standardized
residuals vs fitted values with a LOWESS smoother; a flat line indicates
constant variance
- **Subplot 4 (Residuals vs Leverage):** Standardized residuals vs
leverage with Cook's distance contour lines (e.g., at 0.5 and 1.0) to
highlight influential points
- Label the 2-3 most influential points (highest Cook's distance) with
observation indices in each subplot
- Use consistent point styling across all four subplots
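
The derived quantities behind the notes above can be sketched without any plotting library. The helper names, the plotting positions `(i - 0.5) / n` for the Q-Q plot, and the small input lists are assumptions chosen for illustration; the contour formula follows from solving Cook's distance `D = r^2 / p * h / (1 - h)` for the standardized residual `r`, where `h` is leverage and `p` is the number of estimated coefficients.

```python
import math
from statistics import NormalDist


def qq_theoretical_quantiles(n):
    """Standard normal quantiles at plotting positions (i - 0.5) / n."""
    nd = NormalDist()
    return [nd.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]


def scale_location(std_residuals):
    """Square root of the absolute standardized residuals (subplot 3)."""
    return [math.sqrt(abs(r)) for r in std_residuals]


def cooks_contour(level, leverage_grid, p):
    """Standardized residual at which Cook's distance equals `level`.

    Returns the positive branch; the negative branch is its mirror image.
    """
    return [math.sqrt(level * p * (1 - h) / h) for h in leverage_grid]


def most_influential(cooks_d, k=3):
    """Indices of the k observations with the largest Cook's distance."""
    order = sorted(range(len(cooks_d)), key=lambda i: cooks_d[i], reverse=True)
    return order[:k]


# Made-up illustrative inputs, not data from the specification
std_residuals = [-1.2, 0.3, 2.5, -0.4, 0.8]
cooks_d = [0.05, 0.01, 0.60, 0.02, 0.10]

# Subplot 2: sorted residuals paired with theoretical quantiles
theoretical = qq_theoretical_quantiles(len(std_residuals))
qq_pairs = list(zip(theoretical, sorted(std_residuals)))

# Subplot 3: y-values for the Scale-Location scatter
sqrt_abs_resid = scale_location(std_residuals)

# Subplot 4: contours at the levels 0.5 and 1.0 named in the notes
leverage_grid = [0.1, 0.2, 0.3, 0.4, 0.5]
contours = {level: cooks_contour(level, leverage_grid, p=2) for level in (0.5, 1.0)}

# Observations to annotate in every subplot
labels = most_influential(cooks_d)
```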
---
**Next:** Add `approved` label to the issue to merge this PR.
---
:robot: *[spec-create
workflow](https://github.com/MarkusNeusinger/pyplots/actions/runs/24290826351)*
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
1 parent a57bafa commit 8cfe5cf
2 files changed
Lines changed: 60 additions & 0 deletions