docs: randomForest finishing work — design spec + implementation plans (PR #87 & #88) by ehrlinger · Pull Request #87 · ehrlinger/ggRandomForests

ehrlinger · 2026-05-21T14:53:28Z

Summary

Design spec (dev/plans/2026-05-21-rf-finishing-work-design.md): covers both PRs, with problem statements, exact before/after code, test cases, files-touched table, and non-goals.
Implementation plan for #87 (dev/plans/2026-05-21-rf-87-gg-variable-classification-plan.md): TDD plan for the gg_variable.randomForest classification fix and smooth bugs, plus closure of stale issues #81 ("gg_roc()/calc_roc() produces a degenerate ROC (~0.5 AUC, 3 points) for randomForest objects") and #82 ("Meta: validate the randomForest engine (plot/ROC correctness + regression coverage)").
Implementation plan for #88 (dev/plans/2026-05-21-rf-88-multiclass-roc-plan.md): TDD plan for the gg_roc per_class=TRUE feature; closes #72 ("Meta: gg_roc enhancements (multi-class + confidence intervals)").

No code changes in this PR — docs/plans only.

🤖 Generated with Claude Code

Covers two v2.8.0 completion PRs:
- PR #87: gg_variable.randomForest classification yhat fix + stale #81/#82 close
- PR #88: multi-class gg_roc with per_class=TRUE (issue #72, no CIs)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…) and PR #88 (per_class ROC)

codecov · 2026-05-21T14:57:51Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 86.39%. Comparing base (4c64eed) to head (3e2eb0b).

Additional details and impacted files

@@           Coverage Diff           @@
##             main      #87   +/-   ##
=======================================
  Coverage   86.39%   86.39%           
=======================================
  Files          38       38           
  Lines        3021     3021           
=======================================
  Hits         2610     2610           
  Misses        411      411

Copilot

Pull request overview

Adds a design spec and two implementation plans for upcoming v2.8.0 “randomForest engine finishing work” (PRs #87 and #88), focusing on gg_variable.randomForest classification output shape and multi-class ROC support in gg_roc.

Changes:

Add a cross-PR design spec describing intended behavior, tests, and files touched for #87 and #88.
Add a step-by-step (TDD-oriented) implementation plan for PR #87 (gg_variable.randomForest classification + plot smooth fixes).
Add a step-by-step (TDD-oriented) implementation plan for PR #88 (multi-class gg_roc via per_class=TRUE + plotting/summary updates).

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 6 comments.

| File | Description |
| --- | --- |
| dev/plans/2026-05-21-rf-finishing-work-design.md | Design spec spanning PR #87/#88, including proposed API changes, behavior, and tests. |
| dev/plans/2026-05-21-rf-87-gg-variable-classification-plan.md | Implementation plan for `gg_variable.randomForest` classification fix + `plot.gg_variable` smooth fixes. |
| dev/plans/2026-05-21-rf-88-multiclass-roc-plan.md | Implementation plan for `gg_roc(per_class=TRUE)` multi-class OvR curves + plot/summary changes. |

Comments suppressed due to low confidence (1)

dev/plans/2026-05-21-rf-finishing-work-design.md:194

In the PR #88 per_class section, the spec says the long-format data frame has columns `fpr`/`tpr`, but the package's ROC data contract is `sens`/`spec`/`pct` (see the return docs in R/calc_roc.R and tests/testthat/test_gg_roc.R). Recommend specifying that per-class output retains `sens`/`spec`/`pct` (plus `class`), and that `fpr = 1 - spec` and `tpr = sens` are computed in the plot method if desired.

- For forests with > 2 classes: compute one-vs-rest ROC for every class *k*
  (scores = `votes[, k]`, positive = `(y == k)`), stack into a long data frame
  with columns `class` (factor), `fpr`, `tpr`.
- `attr(gg, "auc")` becomes a named numeric vector: one entry per class,
  ordered by descending AUC. Class factor levels follow the same order.
- For binary forests: `per_class = TRUE` is a no-op — returns the single-curve
  result with no `class` column and a scalar `auc` attribute (same as
  `per_class = FALSE`).
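
For illustration only, the one-vs-rest stacking described in the excerpt could look roughly like the base-R sketch below. It is not the package's implementation: `ovr_roc_one()` and `ovr_roc()` are hypothetical helper names, the votes matrix is simulated rather than taken from a fitted forest, and treating `pct` as the score threshold is an assumption. The sketch keeps `sens`/`spec`/`pct` columns (plus `class`), as suggested in the comment above, and derives `fpr`/`tpr` only when computing the AUC.

```r
# Hypothetical helper: ROC points for one class against the rest.
# scores: vote fraction for class k; positive: logical vector, TRUE where y == k.
ovr_roc_one <- function(scores, positive) {
  # Thresholds from strictest to loosest; Inf yields the (sens = 0, spec = 1) corner.
  thresholds <- c(Inf, sort(unique(scores), decreasing = TRUE))
  sens <- vapply(thresholds, function(t) mean(scores[positive] >= t), numeric(1))
  spec <- vapply(thresholds, function(t) mean(scores[!positive] < t), numeric(1))
  # Assumption: `pct` holds the threshold that produced each point.
  data.frame(sens = sens, spec = spec, pct = thresholds)
}

# Hypothetical helper: stack one curve per class into a long data frame.
ovr_roc <- function(votes, y) {
  classes <- colnames(votes)
  curves <- lapply(classes, function(k) {
    roc <- ovr_roc_one(votes[, k], y == k)
    roc$class <- k
    roc
  })
  # Trapezoidal AUC, using fpr = 1 - spec and tpr = sens.
  auc <- vapply(curves, function(roc) {
    fpr <- 1 - roc$spec
    tpr <- roc$sens
    sum(diff(fpr) * (head(tpr, -1) + tail(tpr, -1)) / 2)
  }, numeric(1))
  names(auc) <- classes
  auc <- sort(auc, decreasing = TRUE)
  out <- do.call(rbind, curves)
  # Class factor levels follow the descending-AUC order.
  out$class <- factor(out$class, levels = names(auc))
  attr(out, "auc") <- auc
  out
}

# Simulated three-class votes matrix (a stand-in for a forest's vote fractions).
set.seed(1)
y <- factor(rep(c("a", "b", "c"), each = 20))
raw <- matrix(runif(60 * 3), ncol = 3, dimnames = list(NULL, levels(y)))
idx <- cbind(seq_along(y), as.integer(y))
raw[idx] <- raw[idx] + 1          # boost the true class so curves are informative
votes <- raw / rowSums(raw)       # rows sum to 1, like vote fractions

gg <- ovr_roc(votes, y)
```

Keeping the threshold column and deferring the `fpr`/`tpr` transform means a plot method could compute `1 - spec` on the fly without changing the stored columns.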

ehrlinger · 2026-05-21T16:12:33Z

+**New signature:**
+
+```r
+gg_roc(object, which_outcome = "all", per_class = FALSE, ...)