Merged
4 changes: 2 additions & 2 deletions DESCRIPTION
@@ -1,8 +1,8 @@
Package: ggRandomForests
Type: Package
Title: Visually Exploring Random Forests
-Version: 2.7.0.9001
-Date: 2026-03-27
+Version: 2.7.1
+Date: 2026-04-27
Authors@R: person("John", "Ehrlinger",
role = c("aut", "cre"),
email = "john.ehrlinger@gmail.com")
30 changes: 28 additions & 2 deletions NEWS.md
@@ -1,7 +1,33 @@
Package: ggRandomForests
-Version: 2.8.0
+Version: 2.7.1

-ggRandomForests v2.8.0
+ggRandomForests v2.7.1
=====================
+* Fix `gg_partial_rfsrc()` for survival forests: `partial.rfsrc()` was being
+called without `partial.type`, causing a zero-length comparison
+(`if (partial.type == "rel.freq") ...`) inside the C-level prediction
+routine and aborting the call. Survival forests now pass
+`partial.type = "surv"` (default; configurable via the new `partial.type`
+argument accepting `"surv"`, `"chf"`, or `"mort"`). This unblocks the
+`partial-dep` chunk in the survival vignette.
+* Fix `gg_partial_rfsrc()` for survival forests with multiple
+`partial.time` values: `get.partial.plot.data()` returns yhat as an
+`[length(partial.values) x length(partial.time)]` matrix, but the previous
+code assumed a vector and crashed on column-mismatch when assigning
+`time`. The result is now reshaped to long form so each `(x, time)` pair
+is a single row.
+* Improve `plot.gg_partial_rfsrc()` survival layout: predictor value is now
+on the x-axis with one curve per (rounded) time point coloured by `Time`,
+faceted by variable name. The previous default put time on the x-axis
+and one curve per predictor value, producing a saturated legend with
+dozens of nearly-identical lines.
+* Add `tests/testthat/test_plot_layer_data.R`: regression suite that uses
+`ggplot2::layer_data()` to verify each `plot.gg_*()` method renders
+non-empty layers for every supported forest family. Catches the
+empty-figure class of bug (transform/plot column-name mismatch) without
+requiring visual inspection.
+
+ggRandomForests v2.7.0
+=====================
* S3 design overhaul: `gg_partial()`, `gg_partialpro()`, and
`gg_partial_rfsrc()` now stamp their return values with S3 classes
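
For reference, a call exercising the new `partial.type` argument might look like the sketch below. It is not taken from the PR: it assumes the `veteran` data shipped with randomForestSRC and an `xvar.names` argument on `gg_partial_rfsrc()` (the visible hunks do not show the full signature), and the forest size and predictor choice are arbitrary.

```r
library(randomForestSRC)
library(ggRandomForests)

# Small survival forest on the veteran data (ntree kept low for speed).
data(veteran, package = "randomForestSRC")
rf <- rfsrc(Surv(time, status) ~ ., data = veteran, ntree = 100)

# Default: partial.type resolves to "surv" for survival forests.
# NOTE: xvar.names is an assumed argument name, not shown in the diff.
pd_surv <- gg_partial_rfsrc(rf, xvar.names = c("age", "karno"))

# Request the cumulative hazard function instead.
pd_chf <- gg_partial_rfsrc(rf, xvar.names = c("age", "karno"),
                           partial.type = "chf")

# Per the diff, the chosen type travels with the result as an attribute.
attr(pd_chf, "partial.type")
```

If the attribute is present, `plot.gg_partial_rfsrc()` can use it to pick the y-axis label, as the `R/plot.gg_partial.R` changes describe.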
61 changes: 49 additions & 12 deletions R/gg_partial_rfsrc.R
@@ -47,6 +47,12 @@
#' snapped to the nearest entry in \code{rf_model$time.interest} — see the
#' \strong{Survival forests} section below. When \code{NULL} (default),
#' three quartile points of \code{time.interest} are used.
+#' @param partial.type Character; type of predicted value for survival
+#' forests, passed through to \code{\link[randomForestSRC]{partial.rfsrc}}.
+#' One of \code{"surv"} (default), \code{"chf"}, or \code{"mort"}. Ignored
+#' for non-survival forests. \code{partial.rfsrc()} requires a non-\code{NULL}
+#' value for survival families; supplying it here avoids a cryptic
+#' \dQuote{argument is of length zero} error from the underlying C code.
#' @param cat_limit Variables with fewer than \code{cat_limit} unique values in
#' \code{newx} are treated as categorical; all others are continuous.
#' Defaults to 10.
@@ -89,6 +95,7 @@ gg_partial_rfsrc <- function(rf_model,
xvar2.name = NULL,
newx = NULL,
partial.time = NULL,
+partial.type = c("surv", "chf", "mort"),
cat_limit = 10,
n_eval = 25) {
if (is.null(newx)) {
@@ -112,17 +119,28 @@ gg_partial_rfsrc <- function(rf_model,
is_surv <- !is.null(rf_model$family) && grepl("surv", rf_model$family)
if (is_surv) {
partial.time <- snap_partial_time(rf_model, partial.time)
+# partial.rfsrc() requires a non-NULL partial.type for survival forests;
+# NULL triggers a zero-length comparison inside the C code.
+partial.type <- match.arg(partial.type)
+} else {
+partial.type <- NULL
}

if (is.null(xvar2.name)) {
pdta <- partial_no_group(xvar.names, newx, rf_model,
-cat_limit, n_eval, is_surv, partial.time)
+cat_limit, n_eval, is_surv, partial.time,
+partial.type)
} else {
pdta <- partial_with_group(xvar.names, xvar2.name, newx, rf_model,
-cat_limit, n_eval, is_surv, partial.time)
+cat_limit, n_eval, is_surv, partial.time,
+partial.type)
}

-split_partial_result(do.call("rbind", pdta))
+result <- split_partial_result(do.call("rbind", pdta))
+# Carry partial.type so plot.gg_partial_rfsrc() can pick the correct
+# y-axis label (Survival / CHF / Mortality).
+attr(result, "partial.type") <- partial.type
+result
}

## ---- unexported helpers -------------------------------------------------------
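
The argument handling in the hunk above relies on `match.arg()` picking the first element of the default choice vector. The base-R sketch below illustrates that behaviour in isolation; `resolve_partial_type()` is a hypothetical stand-in, not a function in the package.

```r
# Hypothetical stand-in for the argument handling in gg_partial_rfsrc().
resolve_partial_type <- function(is_surv,
                                 partial.type = c("surv", "chf", "mort")) {
  if (is_surv) {
    # With the untouched default vector, match.arg() returns its first element.
    match.arg(partial.type)
  } else {
    # Non-survival forests never forward partial.type.
    NULL
  }
}

default_type <- resolve_partial_type(TRUE)           # first choice, "surv"
chf_type     <- resolve_partial_type(TRUE, "chf")    # explicit choice
non_surv     <- resolve_partial_type(FALSE, "mort")  # NULL, value is ignored

# A value outside the choices raises an error instead of passing through.
bad_type <- tryCatch(resolve_partial_type(TRUE, "hazard"),
                     error = function(e) "error")
```

Validating up front means an unsupported value fails in R with a readable message rather than reaching `partial.rfsrc()`.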
@@ -184,7 +202,7 @@ make_eval_grid <- function(xname, newx, cat_limit, n_eval) {

## Thin wrapper around partial.rfsrc that builds the argument list.
call_partial_rfsrc <- function(rf_model, xname, xval,
-is_surv, partial.time,
+is_surv, partial.time, partial.type,
xvar2.name = NULL, x2val = NULL) {
args <- list(
object = rf_model,
@@ -197,44 +215,62 @@ call_partial_rfsrc <- function(rf_model, xname, xval,
}
if (is_surv) {
args$partial.time <- partial.time
+args$partial.type <- partial.type
}
do.call(randomForestSRC::partial.rfsrc, args)
}

## Process a single predictor variable and return a tidy data.frame (or NULL).
partial_one_var <- function(xname, newx, rf_model,
cat_limit, n_eval, is_surv, partial.time,
+partial.type,
xvar2.name = NULL, x2val = NULL) {
eg <- make_eval_grid(xname, newx, cat_limit, n_eval)
if (is.null(eg)) return(NULL)
xval <- eg$xval
gr <- eg$categorical
partial.obj <- call_partial_rfsrc(rf_model, xname, xval,
-is_surv, partial.time,
+is_surv, partial.time, partial.type,
xvar2.name, x2val)
pout <- randomForestSRC::get.partial.plot.data(partial.obj, granule = gr)
-out_dta <- data.frame(x = pout$x, yhat = pout$yhat)
+# Survival forests with >1 partial.time return yhat as an
+# [length(partial.values) x length(partial.time)] matrix; expand to long form
+# so each (x, time) pair is its own row. For non-survival or single-time
+# cases yhat is already a vector of length(partial.values).
+if (is.matrix(pout$yhat)) {
+pt <- if (!is.null(pout$partial.time)) pout$partial.time else seq_len(ncol(pout$yhat))
+out_dta <- data.frame(
+x = rep(pout$x, times = length(pt)),
+yhat = as.numeric(pout$yhat),
+time = rep(pt, each = length(pout$x))
+)
+} else {
+out_dta <- data.frame(x = pout$x, yhat = pout$yhat)
+if (!is.null(pout$partial.time)) {
+out_dta$time <- pout$partial.time
+}
+}
out_dta$name <- xname
out_dta$type <- c("continuous", "categorical")[gr + 1L]
-if (!is.null(pout$partial.time)) {
-out_dta$time <- pout$partial.time
-}
out_dta
}

## Compute partial dependence across xvar.names (no grouping variable).
partial_no_group <- function(xvar.names, newx, rf_model,
-cat_limit, n_eval, is_surv, partial.time) {
+cat_limit, n_eval, is_surv, partial.time,
+partial.type) {
pdta <- lapply(xvar.names, partial_one_var,
newx = newx, rf_model = rf_model,
cat_limit = cat_limit, n_eval = n_eval,
-is_surv = is_surv, partial.time = partial.time)
+is_surv = is_surv, partial.time = partial.time,
+partial.type = partial.type)
Filter(Negate(is.null), pdta)
}

## Compute partial dependence across xvar.names for each level of xvar2.name.
partial_with_group <- function(xvar.names, xvar2.name, newx, rf_model,
-cat_limit, n_eval, is_surv, partial.time) {
+cat_limit, n_eval, is_surv, partial.time,
+partial.type) {
xv2 <- unique(newx[[xvar2.name]])
xv2 <- xv2[!is.na(xv2)]
if (length(xv2) == 0L) {
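
The matrix-to-long reshape in `partial_one_var()` depends on R storing matrices column by column. The toy example below, with made-up values standing in for the output of `get.partial.plot.data()`, shows why `rep(x, times = ...)`, `as.numeric(yhat)`, and `rep(pt, each = ...)` line up row for row.

```r
# Made-up inputs: 3 predictor values evaluated at 2 time points.
x  <- c(1, 2, 3)
pt <- c(10, 20)

# Rows index predictor values, columns index time points.
yhat <- matrix(c(0.9, 0.8, 0.7,   # column 1: time 10
                 0.6, 0.5, 0.4),  # column 2: time 20
               nrow = length(x), ncol = length(pt))

# as.numeric() stacks the columns, so x cycles fastest and time changes
# once per block of length(x) rows.
long <- data.frame(
  x    = rep(x, times = length(pt)),
  yhat = as.numeric(yhat),
  time = rep(pt, each = length(x))
)

# Each row of `long` matches the corresponding matrix cell.
cell_ok <- long$yhat[long$x == 2 & long$time == 20] == yhat[2, 2]
```

Each `(x, time)` pair ends up on its own row, which is the shape the survival branch of the plot method expects.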
@@ -248,6 +284,7 @@ partial_with_group <- function(xvar.names, xvar2.name, newx, rf_model,
newx = newx, rf_model = rf_model,
cat_limit = cat_limit, n_eval = n_eval,
is_surv = is_surv, partial.time = partial.time,
+partial.type = partial.type,
xvar2.name = xvar2.name, x2val = x2val)
p1dta <- Filter(Negate(is.null), p1dta)
if (length(p1dta) == 0L) return(NULL)
40 changes: 32 additions & 8 deletions R/plot.gg_partial.R
@@ -12,6 +12,19 @@
####
####**********************************************************************
####**********************************************************************
+
+# Map partial.type ("surv" / "chf" / "mort") to a human y-axis label.
+# Falls back to "Predicted Survival" when the attribute is absent (e.g. an
+# object built before this attribute was introduced).
+partial_surv_y_label <- function(partial.type) {
+if (is.null(partial.type)) return("Predicted Survival")
+switch(partial.type,
+surv = "Predicted Survival",
+chf = "Predicted CHF",
+mort = "Predicted Mortality",
+"Predicted Survival")
+}
+
#' Plot a \code{\link{gg_partial}} object
#'
#' Produces ggplot2 partial dependence curves from the named list returned by
@@ -85,8 +98,11 @@ plot.gg_partial <- function(x, ...) {
#' For standard (non-survival) forests: continuous predictors are line plots,
#' categorical predictors are bar charts, both faceted by variable name.
#'
-#' For survival forests (when a \code{time} column is present): each predictor
-#' value is a separate curve over time, faceted by variable name.
+#' For survival forests (when a \code{time} column is present): each evaluation
+#' time point is a separate curve over the predictor's value, faceted by
+#' variable name. The y-axis label adapts to the \code{partial.type} stored on
+#' the object (\dQuote{Predicted Survival}, \dQuote{Predicted CHF}, or
+#' \dQuote{Predicted Mortality}).
#'
#' For two-variable surface plots (when a \code{grp} column is present):
#' each group level is a separate line, faceted by primary predictor name.
@@ -109,19 +125,27 @@ plot.gg_partial_rfsrc <- function(x, ...) {
cont <- gg_dta$continuous

if (!is.null(cont$time)) {
-## Survival forest: predictor value is the grouping variable; x-axis is time
+## Survival forest: predictor value on x-axis, one curve per time point.
+## Group/colour by the *full-precision* time so distinct horizons that
+## happen to round to the same value are not silently merged. The legend
+## is relabelled with rounded values for readability.
+time_levels <- sort(unique(cont$time))
+cont$.time_factor <- factor(cont$time, levels = time_levels)
+legend_labels <- format(round(time_levels, 2), trim = TRUE)
+y_lab <- partial_surv_y_label(attr(gg_dta, "partial.type"))
gg_cont <- ggplot2::ggplot(
cont,
ggplot2::aes(
-x = .data$time,
+x = .data$x,
y = .data$yhat,
-color = factor(.data$x),
-group = factor(.data$x)
+color = .data$.time_factor,
+group = .data$.time_factor
)
) +
ggplot2::geom_line() +
-ggplot2::facet_wrap(~name, scales = "free") +
-ggplot2::labs(x = "Time", y = "Partial Effect", color = "Predictor value")
+ggplot2::facet_wrap(~name, scales = "free_x") +
+ggplot2::scale_color_discrete(labels = legend_labels) +
+ggplot2::labs(x = NULL, y = y_lab, color = "Time")

} else if (!is.null(cont$grp)) {
## Two-variable surface: group is xvar2; x-axis is the primary predictor
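
The comment in the survival branch says curves are grouped by full-precision time and only the legend text is rounded. A small base-R sketch with hypothetical time values shows the difference between the two orderings of "round" and "factor".

```r
# Hypothetical time horizons; the first two differ only past two decimals.
time <- c(1.001, 1.004, 2.5)

# Same steps as the plot method: factor on full precision, round the labels.
time_levels   <- sort(unique(time))
time_factor   <- factor(time, levels = time_levels)
legend_labels <- format(round(time_levels, 2), trim = TRUE)

# Every horizon keeps its own group, even though two labels coincide.
n_groups        <- nlevels(time_factor)
n_unique_labels <- length(unique(legend_labels))

# Rounding before building the factor would merge the two nearby horizons.
n_groups_if_rounded_first <- nlevels(factor(round(time, 2)))
```

Here the grouping keeps three levels while only two distinct label strings remain, so nearby horizons stay separate curves at the cost of possibly repeated legend text.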
29 changes: 18 additions & 11 deletions cran-comments.md
@@ -1,21 +1,28 @@
-This is ggRandomForests package submission v2.7.0
+This is ggRandomForests package submission v2.7.1
-------------------------------------------------------------------------
-This is a bug-fix and code-quality release. Key changes:
+This is a bug-fix release. Key changes:

-* Fix critical visual bug: `aes()` calls throughout `plot.gg_rfsrc` and
-`plot.gg_roc` used bare string literals instead of `.data[[col]]`,
-causing aesthetics to map to constant strings rather than data columns.
-* Fix `bootstrap_survival` CI-band indexing and `gg_rfsrc.randomForest`
-incorrect use of non-existent `object$xvar` field.
-* Fix `seq_len(nvar)` vs `1:nvar` silent bug in `gg_vimp` and `plot.gg_vimp`.
-* Full test suite migration to testthat 3.x API.
-* Improved GitHub Actions CI (lintr enforcement, warnings-as-errors).
+* Fix `gg_partial_rfsrc()` for survival forests: `partial.rfsrc()` is now
+called with `partial.type = "surv"` (default; also accepts `"chf"` /
+`"mort"`). Without this, a zero-length comparison inside the underlying
+C code aborted the call and left the survival-vignette partial-dep chunks
+empty.
+* Fix `gg_partial_rfsrc()` for multiple `partial.time` values: yhat is
+reshaped from the matrix returned by `get.partial.plot.data()` into
+long form so each `(x, time)` pair is one row.
+* Improve `plot.gg_partial_rfsrc()` survival layout: predictor on the
+x-axis with one curve per time point coloured by `Time`, faceted by
+variable name.
+* New regression test file `test_plot_layer_data.R` uses
+`ggplot2::layer_data()` to verify each `plot.gg_*()` method renders
+non-empty layers across all forest families, catching empty-figure
+regressions without visual inspection.

## R CMD check results
0 errors | 0 warnings | 0 notes

## Test environments
-* local R installation (R 4.4, macOS)
+* local R installation (R 4.5, macOS)
* GitHub Actions: ubuntu-latest (R devel)
* GitHub Actions: ubuntu-latest (R release)
* GitHub Actions: ubuntu-latest (R oldrel-1)
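
The `layer_data()` check mentioned in the release notes can be illustrated on a toy plot. This is only a sketch of the idea, not the contents of `test_plot_layer_data.R`: the data frame is made up to mimic the long-form survival layout, and a plain `stopifnot()` stands in for a testthat expectation.

```r
library(ggplot2)

# Made-up stand-in for long-form survival partial data.
toy <- data.frame(
  x    = rep(c(1, 2, 3), times = 2),
  yhat = c(0.9, 0.8, 0.7, 0.6, 0.5, 0.4),
  time = factor(rep(c(10, 20), each = 3))
)

p <- ggplot(toy, aes(x = x, y = yhat, colour = time, group = time)) +
  geom_line()

# layer_data() builds the plot and returns the computed data for one layer,
# so an empty figure shows up as a zero-row data frame.
ld <- layer_data(p, 1)
stopifnot(nrow(ld) > 0)
```

Because the check inspects the built layer rather than the rendered image, it can run in CI without any visual comparison.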
8 changes: 8 additions & 0 deletions man/gg_partial_rfsrc.Rd

Some generated files are not rendered by default.

7 changes: 5 additions & 2 deletions man/plot.gg_partial_rfsrc.Rd

Some generated files are not rendered by default.