feat: gg_isopro newdata arg — varPro Phase 4b predict.isopro wrapper (+ training-path polarity fix)#96

Merged
ehrlinger merged 11 commits into main from feat/varpro-phase4-predict-isopro
May 26, 2026

Conversation

@ehrlinger

Owner

Summary

Second of three Phase 4 sub-projects. Adds a newdata argument to gg_isopro() so a fitted varPro::isopro model can score new observations into the same tidy gg_isopro frame, with the same plot / print / summary / autoplot S3 companions as the training extractor.

  • gg_isopro(fit, newdata = test_df) returns the same c("gg_isopro", "data.frame") shape: obs / case.depth / howbad.
  • Internally calls predict(fit, newdata, quantiles = FALSE) for case.depth and predict(fit, newdata, quantiles = TRUE) for the quantile, then computes howbad = 1 - quantile so the wrapper convention ("higher = more anomalous") holds across train and test.
  • case.depth keeps varPro's native polarity (lower = more anomalous), so the wrapper isn't hiding the transformation — both polarities are visible. The relationship howbad = 1 - predict(fit, newdata, quantiles = TRUE) is named explicitly in the roxygen.
  • To overlay train and test, bind the two extractor outputs with a method label column; plot.gg_isopro() colour-groups by it (the same column it uses for the rnd / unsupv / auto comparisons in PR #94).
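
The polarity relationship and the train/test overlay described above can be sketched in base R without fitting a model. This is only an illustration under stated assumptions: the depth vectors are made-up stand-ins for what `predict(fit, newdata, quantiles = FALSE)` would return, the quantile is assumed to behave like the empirical CDF of the training depths (how varPro actually computes it is not stated in this PR), and `as_scores()` is a hypothetical helper, not part of the package.

```r
# Hypothetical stand-ins for case depths; no varPro model is fitted here.
train_depth <- seq(6, 10, length.out = 101)
test_depth  <- c(7.5, 8.2, 9.0, 3.5)  # last row is unusually shallow

# Assumption: the native quantile behaves like the training-depth ECDF.
depth_quantile <- ecdf(train_depth)

# Hypothetical helper that mimics the obs / case.depth / howbad frame
# and adds the method label column used for colour-grouping.
as_scores <- function(depth, label) {
  data.frame(
    obs        = seq_along(depth),
    case.depth = depth,                      # native: lower = more anomalous
    howbad     = 1 - depth_quantile(depth),  # flipped: higher = more anomalous
    method     = label
  )
}

# Bind the two frames so a plot method can group by `method`.
overlay   <- rbind(as_scores(train_depth, "train"), as_scores(test_depth, "test"))
test_rows <- overlay[overlay$method == "test", ]

# The shallowest test row is also the one with the largest howbad.
test_rows[which.max(test_rows$howbad), ]
```

Both columns stay in the frame, so a reader can check either polarity directly rather than trusting the flip.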

Polarity bug fix in the training path

Building the test-data sanity check (training-as-newdata top-5 overlap) surfaced a real bug in PR #94's training path. varPro::isopro stores $howbad with lower = more anomalous polarity (it is the quantile of case.depth), but plot.gg_isopro and the docs both assume higher = more anomalous. Train and test scores were anti-correlated (Pearson -1) until this PR's training-path flip (howbad = 1 - object$howbad) brought them into agreement.

After the fix the training-as-newdata top-5 overlap is 5/5 on iris (was 0/5 before). The two vdiffr baselines recorded in PR #94 (gg-isopro-default, gg-isopro-threshold) are now visually flipped relative to the new behaviour — the elbow rises into the anomalous tail instead of falling. CI skips snapshots (VDIFFR_RUN_TESTS = false) so no failure surfaces; re-record with VDIFFR_RUN_TESTS = true when convenient.
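
The symptom reported above (Pearson -1 and a 0/5 top-5 overlap before the fix, 5/5 after) follows directly from comparing a score with its own complement. A minimal sketch, using simulated depths rather than the iris fit from the PR, and assuming a rank-based quantile as a stand-in for varPro's `$howbad`:

```r
set.seed(42)
depth <- runif(150, min = 2, max = 10)          # hypothetical case depths
native_quantile <- rank(depth) / length(depth)  # lower = more anomalous

howbad_train_before <- native_quantile      # training path before the fix
howbad_train_after  <- 1 - native_quantile  # training path after the fix
howbad_newdata      <- 1 - native_quantile  # prediction path (training as newdata)

# Indices of the five most anomalous rows under "higher = more anomalous".
top5    <- function(score) order(score, decreasing = TRUE)[1:5]
overlap <- function(a, b) length(intersect(top5(a), top5(b)))

pearson_before <- cor(howbad_train_before, howbad_newdata)
overlap_before <- overlap(howbad_train_before, howbad_newdata)
overlap_after  <- overlap(howbad_train_after, howbad_newdata)
```

In this sketch `pearson_before` is -1 up to floating-point error, `overlap_before` is 0, and `overlap_after` is 5, which matches the pattern described for the real fit.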

Test plan

  • `devtools::test()`: 7 new tests, all passing; the 16 tests from PR #94 are still green (66 expectations total).
  • `devtools::check(args = '--as-cran')`: 0 errors, 0 warnings, 0 notes.
  • vdiffr snapshot added (`gg-isopro-predict-overlay`); it skips cleanly without `VDIFFR_RUN_TESTS=true`.

Spec: `dev/plans/2026-05-26-varpro-phase4-predict-isopro-design.md`
Plan: `dev/plans/2026-05-26-varpro-phase4-predict-isopro-plan.md`

Next Phase 4 sub-projects: `gg_beta_varpro`, then `gg_ivarpro`.

🤖 Generated with Claude Code

ehrlinger added 10 commits May 26, 2026 16:07
Second sub-project of Phase 4 (gg_beta_varpro and gg_ivarpro come after).
Adds a newdata argument to gg_isopro() so a fitted isopro model can score
new observations into the same tidy gg_isopro frame. The polarity flip
between varPro's predict.isopro (smaller = anomalous) and the package's
howbad (higher = anomalous) is hidden inside the wrapper; the column is
semantically the same whether you score training or test data. Train/test
overlay reuses the existing method-column auto-detect in plot.gg_isopro,
explicitly documented.

After review discussion: rename the 'Polarity reminder' section to
'Polarity: how the wrapper presents both conventions' and rewrite it
so it explicitly names that case.depth keeps varPro's native polarity
while howbad carries the flipped version. Documentation section gains
a concrete-code-form requirement so the implementer writes the
transformation as 'howbad = 1 - predict(fit, newdata, quantiles=TRUE)'
in the roxygen. Same design (Option A), clearer framing.

Adds three sanity tests for the predict.isopro path: newdata type
validation, training-as-newdata top-5 ordering agreement, and the
howbad = 1 - quantile relationship.

The top-5 ordering test caught a real polarity bug in the training
path: gg_isopro.isopro was returning howbad = object$howbad directly,
but varPro's $howbad uses "lower = more anomalous" polarity (it is the
quantile of case.depth, low depth = anomalous). The wrapper convention
is "higher = more anomalous". Flip the training path the same way the
prediction path does (1 - quantile) so train and test scores live on
the same polarity. Also drop backticks from the newdata validation
error so the regex match in the new tests is unambiguous.
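
The newdata type validation and the plain-text error message mentioned in this commit could look roughly like the sketch below. The function name `validate_newdata()` and the exact message wording are hypothetical; the PR does not show the package's actual check.

```r
# Hypothetical validator: reject anything that is not a data.frame.
# The message contains no backticks, so a test can match it literally.
validate_newdata <- function(newdata) {
  if (!is.data.frame(newdata)) {
    stop("newdata must be a data.frame, not ", class(newdata)[1], call. = FALSE)
  }
  invisible(newdata)
}

# Capture the error message the way a test would inspect it.
msg <- tryCatch(validate_newdata(1:3), error = function(e) conditionMessage(e))
matches <- grepl("newdata must be a data.frame", msg)
```

Keeping the message free of markup means the pattern in the test and the message in the code can be compared character for character.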
@ehrlinger ehrlinger force-pushed the feat/varpro-phase4-predict-isopro branch from c0ebf67 to d143075 Compare May 26, 2026 20:14
@codecov

codecov Bot commented May 26, 2026


Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 87.72%. Comparing base (5c78a66) to head (26cf62b).

Additional details and impacted files

@@            Coverage Diff             @@
##             main      #96      +/-   ##
==========================================
+ Coverage   87.62%   87.72%   +0.09%     
==========================================
  Files          40       40              
  Lines        3193     3217      +24     
==========================================
+ Hits         2798     2822      +24     
  Misses        395      395              

| Files with missing lines | Coverage Δ |
|---|---|
| R/gg_isopro.R | 95.74% <100.00%> (+4.44%) ⬆️ |

Copilot AI left a comment

Pull request overview

This PR extends the gg_isopro() varPro wrapper to support scoring new observations via predict.isopro, while also fixing a polarity mismatch in the existing training/extraction path so that howbad consistently means “higher = more anomalous”.

Changes:

  • Add newdata argument to gg_isopro() / gg_isopro.isopro() to score new rows using predict(..., quantiles = FALSE/TRUE) and compute howbad = 1 - quantile.
  • Fix training-path howbad polarity to align with plotting/docs and the new prediction path.
  • Add test coverage + a new (guarded) vdiffr snapshot for train/test overlay usage; update NEWS/docs accordingly.

Reviewed changes

Copilot reviewed 7 out of 8 changed files in this pull request and generated 1 comment.

| File | Description |
|---|---|
| R/gg_isopro.R | Adds newdata scoring path and flips training-path howbad polarity; updates roxygen. |
| man/gg_isopro.Rd | Regenerated documentation reflecting newdata and new “Scoring new data” section. |
| tests/testthat/test_gg_isopro.R | Adds tests for newdata behavior, provenance, polarity relationship, and overlay plotting. |
| tests/testthat/test_snapshots.R | Adds guarded vdiffr snapshot for train/test overlay plot. |
| NEWS.md | Documents the new newdata capability and the training-path polarity fix. |
| DESCRIPTION | Bumps development version. |
| dev/plans/2026-05-26-varpro-phase4-predict-isopro-plan.md | Adds implementation plan documentation for Phase 4b. |
| dev/plans/2026-05-26-varpro-phase4-predict-isopro-design.md | Adds design spec documentation for Phase 4b. |

Files not reviewed (1)
  • man/gg_isopro.Rd: Language not supported

Comment thread R/gg_isopro.R Outdated
Comment on lines 167 to 173
  gg_isopro <- function(object, newdata = NULL, ...) {
    UseMethod("gg_isopro", object)
  }

  #' @export
- gg_isopro.isopro <- function(object, ...) {
+ gg_isopro.isopro <- function(object, newdata = NULL, ...) {
    if (!inherits(object, "isopro")) {

Owner Author


Good catch — fixed in 26cf62b. Moved newdata after ... in both the generic and the isopro method so it's only matched by name. All existing tests already pass newdata = by name, and the roxygen now explicitly notes the by-name requirement and the back-compat rationale. devtools::test(filter = "gg_isopro") clean.
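
The back-compat argument in this reply rests on how R matches arguments: a formal placed after `...` can only be matched by its full name, while a formal placed before `...` will capture the next positional value. A small sketch with toy functions (all three function names are hypothetical and stand in for the signatures discussed in the thread):

```r
# Toy signatures; each returns what ended up in newdata and in `...`.
sig_positional <- function(object, newdata = NULL, ...) {
  list(newdata = newdata, dots = list(...))
}
sig_by_name <- function(object, ..., newdata = NULL) {
  list(newdata = newdata, dots = list(...))
}

# A caller written against the older (object, ...) signature passes a
# second positional value that was meant for `...`.
r1 <- sig_positional("fit", "extra")  # "extra" is captured as newdata
r2 <- sig_by_name("fit", "extra")     # "extra" still lands in `...`

# With newdata after `...`, it is only set when named explicitly.
r3 <- sig_by_name("fit", newdata = "test_df")
```

Here `r1$newdata` is `"extra"`, whereas `r2$newdata` stays `NULL` and the value is passed through `...` untouched, which is the behaviour existing positional callers would expect.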

Address Copilot review on PR #96: placing newdata as the 2nd
positional argument would change positional matching for any
caller of the PR #94 signature gg_isopro(object, ...). Moving
newdata after ... means it can only be matched by name, so
existing positional calls are unaffected. All tests already pass
newdata by name; no test changes needed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@ehrlinger ehrlinger merged commit edbc9b6 into main May 26, 2026
15 checks passed
@ehrlinger ehrlinger deleted the feat/varpro-phase4-predict-isopro branch May 28, 2026 14:33