feat: gg_isopro newdata arg — varPro Phase 4b predict.isopro wrapper (+ training-path polarity fix)#96
Second sub-project of Phase 4 (gg_beta_varpro and gg_ivarpro come after). Adds a newdata argument to gg_isopro() so a fitted isopro model can score new observations into the same tidy gg_isopro frame. The polarity flip between varPro's predict.isopro (smaller = anomalous) and the package's howbad (higher = anomalous) is hidden inside the wrapper; the column is semantically the same whether you score training or test data. Train/test overlay reuses the existing method-column auto-detect in plot.gg_isopro, explicitly documented.
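A minimal sketch of how a train/test overlay frame could be assembled, assuming the overlay is built by binding two `gg_isopro`-shaped frames and labelling each with a `method` column. The frames, their values, and the names `train_scores`, `test_scores`, and `overlay` are hypothetical; the real frames would come from `gg_isopro(fit)` and `gg_isopro(fit, newdata = test_df)`.

```r
# Hypothetical frames in the obs / case.depth / howbad shape; values are made up.
train_scores <- data.frame(
  obs        = 1:4,
  case.depth = c(9.1, 8.7, 3.2, 9.5),
  howbad     = c(0.25, 0.50, 0.75, 0.00)
)
test_scores <- data.frame(
  obs        = 1:2,
  case.depth = c(2.9, 9.0),
  howbad     = c(1.00, 0.50)
)

# Label each frame so a plot method can colour-group by `method`.
train_scores$method <- "train"
test_scores$method  <- "test"

# One long frame; both halves share the higher = more anomalous polarity.
overlay <- rbind(train_scores, test_scores)
```

Passing the combined frame to the plot method is left out here because it needs the package itself; whether `rbind()` is the intended way to combine the frames is an assumption of this sketch.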
After review discussion: rename the 'Polarity reminder' section to 'Polarity: how the wrapper presents both conventions' and rewrite it so it explicitly names that case.depth keeps varPro's native polarity while howbad carries the flipped version. Documentation section gains a concrete-code-form requirement so the implementer writes the transformation as 'howbad = 1 - predict(fit, newdata, quantiles=TRUE)' in the roxygen. Same design (Option A), clearer framing.
Adds three sanity tests for the predict.isopro path: newdata type validation, training-as-newdata top-5 ordering agreement, and the howbad = 1 - quantile relationship. The top-5 ordering test caught a real polarity bug in the training path: gg_isopro.isopro was returning howbad = object$howbad directly, but varPro's $howbad uses "lower = more anomalous" polarity (it is the quantile of case.depth, low depth = anomalous). The wrapper convention is "higher = more anomalous". Flip the training path the same way the prediction path does (1 - quantile) so train and test scores live on the same polarity. Also drop backticks from the newdata validation error so the regex match in the new tests is unambiguous.
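The `howbad = 1 - quantile` relationship and the top-5 ordering check can be illustrated in base R without varPro. This is a sketch under stated assumptions: the depths are simulated, and `ecdf()` stands in for whatever quantile varPro computes on `case.depth`.

```r
# Simulated case depths (assumption): 45 ordinary cases and 5 shallow ones.
set.seed(1)
depth <- c(rnorm(45, mean = 10, sd = 1), rnorm(5, mean = 4, sd = 0.5))

# Quantile of each depth within the sample: low depth gives a low quantile,
# i.e. the "lower = more anomalous" polarity.
depth_quantile <- ecdf(depth)(depth)

# Wrapper convention: flip so that higher = more anomalous.
howbad <- 1 - depth_quantile

# After the flip, the five highest howbad scores are the five shallowest cases.
top5_howbad  <- order(howbad, decreasing = TRUE)[1:5]
top5_shallow <- order(depth)[1:5]
same_top5 <- setequal(top5_howbad, top5_shallow)
```

Without the flip, ranking by the raw quantile in decreasing order would pick the deepest (least anomalous) cases instead, which is the failure mode the top-5 test is described as catching.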
…g-path polarity fix
c0ebf67 to d143075
Codecov Report
✅ All modified and coverable lines are covered by tests.

Coverage Diff

| | main | #96 | +/- |
|---|---|---|---|
| Coverage | 87.62% | 87.72% | +0.09% |
| Files | 40 | 40 | |
| Lines | 3193 | 3217 | +24 |
| Hits | 2798 | 2822 | +24 |
| Misses | 395 | 395 | |
Pull request overview
This PR extends the gg_isopro() varPro wrapper to support scoring new observations via predict.isopro, while also fixing a polarity mismatch in the existing training/extraction path so that howbad consistently means “higher = more anomalous”.
Changes:
- Add `newdata` argument to `gg_isopro()` / `gg_isopro.isopro()` to score new rows using `predict(..., quantiles = FALSE/TRUE)` and compute `howbad = 1 - quantile`.
- Fix training-path `howbad` polarity to align with plotting/docs and the new prediction path.
- Add test coverage + a new (guarded) vdiffr snapshot for train/test overlay usage; update NEWS/docs accordingly.
Reviewed changes
Copilot reviewed 7 out of 8 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| R/gg_isopro.R | Adds newdata scoring path and flips training-path howbad polarity; updates roxygen. |
| man/gg_isopro.Rd | Regenerated documentation reflecting newdata and new “Scoring new data” section. |
| tests/testthat/test_gg_isopro.R | Adds tests for newdata behavior, provenance, polarity relationship, and overlay plotting. |
| tests/testthat/test_snapshots.R | Adds guarded vdiffr snapshot for train/test overlay plot. |
| NEWS.md | Documents the new newdata capability and the training-path polarity fix. |
| DESCRIPTION | Bumps development version. |
| dev/plans/2026-05-26-varpro-phase4-predict-isopro-plan.md | Adds implementation plan documentation for Phase 4b. |
| dev/plans/2026-05-26-varpro-phase4-predict-isopro-design.md | Adds design spec documentation for Phase 4b. |
Files not reviewed (1)
- man/gg_isopro.Rd: Language not supported
      gg_isopro <- function(object, newdata = NULL, ...) {
        UseMethod("gg_isopro", object)
      }

      #' @export
    - gg_isopro.isopro <- function(object, ...) {
    + gg_isopro.isopro <- function(object, newdata = NULL, ...) {
        if (!inherits(object, "isopro")) {
Good catch — fixed in 26cf62b. Moved newdata after ... in both the generic and the isopro method so it's only matched by name. All existing tests already pass newdata = by name, and the roxygen now explicitly notes the by-name requirement and the back-compat rationale. devtools::test(filter = "gg_isopro") clean.
Address Copilot review on PR #96: placing newdata as the 2nd positional argument would change positional matching for any caller of the PR #94 signature gg_isopro(object, ...). Moving newdata after ... means it can only be matched by name, so existing positional calls are unaffected. All tests already pass newdata by name; no test changes needed. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
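The positional-matching concern can be reproduced in base R. The two functions below are hypothetical stand-ins for the two candidate signatures (they are not the package code) and simply report where an unnamed second argument ends up.

```r
# Hypothetical stand-in: newdata placed before `...`.
f_before_dots <- function(object, newdata = NULL, ...) {
  list(newdata = newdata, dots = list(...))
}

# Hypothetical stand-in: newdata placed after `...`.
f_after_dots <- function(object, ..., newdata = NULL) {
  list(newdata = newdata, dots = list(...))
}

# An unnamed second argument is captured by newdata in the first form...
res_before <- f_before_dots("fit", "extra")

# ...but falls through to `...` in the second form, leaving newdata NULL.
res_after <- f_after_dots("fit", "extra")

# Naming the argument works with either placement.
res_named <- f_after_dots("fit", newdata = "test_df")
```

In R, formals that come after `...` are matched only by their full name, which is why existing positional calls to the `gg_isopro(object, ...)` signature keep their meaning once `newdata` is moved behind the dots.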
Summary
Second of three Phase 4 sub-projects. Adds a `newdata` argument to `gg_isopro()` so a fitted `varPro::isopro` model can score new observations into the same tidy `gg_isopro` frame, with the same plot / print / summary / autoplot S3 companions as the training extractor.

- `gg_isopro(fit, newdata = test_df)` returns the same `c("gg_isopro", "data.frame")` shape: obs / case.depth / howbad.
- `predict(fit, newdata, quantiles = FALSE)` supplies `case.depth` and `predict(fit, newdata, quantiles = TRUE)` supplies the quantile; the wrapper then computes `howbad = 1 - quantile` so the wrapper convention ("higher = more anomalous") holds across train and test.
- `case.depth` keeps varPro's native polarity (lower = more anomalous), so the wrapper isn't hiding the transformation: both polarities are visible. The relationship `howbad = 1 - predict(fit, newdata, quantiles = TRUE)` is named explicitly in the roxygen.
- Train/test overlay uses a `method` label column; `plot.gg_isopro()` colour-groups by it (the same column it uses for rnd / unsupv / auto comparisons in PR #94).

Polarity bug fix in the training path
Building the test-data sanity check (training-as-newdata top-5 overlap) surfaced a real bug in PR #94's training path.
`varPro::isopro` stores `$howbad` with lower = more anomalous polarity (it is the quantile of `case.depth`), but `plot.gg_isopro` and the docs both assume higher = more anomalous. Train and test scores were anti-correlated (Pearson -1) until this PR's training-path flip (`howbad = 1 - object$howbad`) brought them into agreement.

After the fix the training-as-newdata top-5 overlap is 5/5 on iris (was 0/5 before). The two `vdiffr` baselines recorded in PR #94 (`gg-isopro-default`, `gg-isopro-threshold`) are now visually flipped relative to the new behaviour: the elbow rises into the anomalous tail instead of falling. CI skips snapshots (`VDIFFR_RUN_TESTS = false`) so no failure surfaces; re-record with `VDIFFR_RUN_TESTS = true` when convenient.

Test plan
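The two polarity checks described above (Pearson correlation and top-5 overlap between training and training-as-newdata scores) can be sketched in base R. This is an illustration, not the package's test code: the depths are simulated for 150 cases, the rank-based quantile is a stand-in for varPro's, and `top5()` is a hypothetical helper.

```r
set.seed(42)
depth <- runif(150)                    # simulated case depths (assumption)
q <- rank(depth) / length(depth)       # quantile of depth: lower = more anomalous

train_unflipped <- q                   # training path before the fix (raw quantile)
test_flipped    <- 1 - q               # prediction path (1 - quantile)

# The two scores are an exact decreasing linear function of each other.
pearson_before <- cor(train_unflipped, test_flipped)

# Hypothetical helper: indices of the five highest scores.
top5 <- function(x) order(x, decreasing = TRUE)[1:5]

overlap_before <- length(intersect(top5(train_unflipped), top5(test_flipped)))

# Apply the same flip to the training path and re-check.
train_fixed   <- 1 - train_unflipped
overlap_after <- length(intersect(top5(train_fixed), top5(test_flipped)))
```

In this sketch the correlation before the fix is -1 and the top-5 overlap moves from 0 to 5 once both paths use the same flip, mirroring the behaviour reported for iris.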
Spec: `dev/plans/2026-05-26-varpro-phase4-predict-isopro-design.md`
Plan: `dev/plans/2026-05-26-varpro-phase4-predict-isopro-plan.md`
Next Phase 4 sub-projects: `gg_beta_varpro`, then `gg_ivarpro`.
🤖 Generated with Claude Code