Commit 4c64eed

ehrlinger and claude authored
varPro Phase 3: gg_udependent — dependency graph for uvarpro fits (#86)
* docs: add varPro Phase 3 (gg_udependent) design spec

  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs: add varPro Phase 3 (gg_udependent) implementation plan

  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore: open 2.7.3.9004 dev cycle; add ggraph to Suggests

* feat(P3-T1): gg_udependent extractor + print/summary/autoplot (TDD)

  Implements gg_udependent() to extract cross-variable dependency graphs
  from uvarpro fits via get.beta.entropy/sdependent, with full tidy
  edges/nodes/igraph output, provenance attribute, and S3 companions.
  25 tests pass (1 skip: ggraph not installed in dev env).

  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(P3-T1): threshold validation — any positive value is valid (not just (0,1))

* fix(P3-T1): summary.gg_udependent returns invisibly without side-effect print

* refactor(P3-T1): move S3 companions to shared method files; document igraph usage and degree semantics

  - print.gg_udependent, print.summary.gg_udependent → R/print_methods.R
  - summary.gg_udependent → R/summary_methods.R
  - autoplot.gg_udependent → R/autoplot_methods.R (calls plot() not plot.gg_udependent directly)
  - Add @note to gg_udependent roxygen block documenting igraph:: call-site pattern
  - Expand @return $nodes description with directed/undirected degree semantics
  - Add inline comment near deg_vec computation explaining out-degree vs total-degree choice

  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat(P3-T2): plot.gg_udependent ggraph network renderer (TDD)

  Adds plot.gg_udependent S3 method rendering variable dependency graphs
  via ggraph; empty-graph guard fires before ggraph check so it works
  without ggraph installed. 26 pass, 3 skip (ggraph not installed), 0 fail.

  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(P3-T2): add importFrom igraph; use match for edge-weight backfill

* test(P3-T3): add vdiffr snapshot test stubs (ggraph not in dev env)

* docs(P3-T4): update NEWS.md for v2.7.3.9004 / gg_udependent Phase 3

  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(P3-T4): move igraph to Imports, guard donttest example, prune stale snapshots

  - igraph moved from Suggests to Imports (required by importFrom in NAMESPACE)
  - plot.gg_udependent example wrapped in requireNamespace("ggraph") guard
  - @importFrom igraph added to gg_udependent.R; misleading @note removed
  - Stale vdiffr snapshots deleted by devtools::test() cleanup

  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* chore(P3): remove dead requireNamespace(igraph) guard — igraph now in Imports

* fix: CI failures — lint, vdiffr guard, undirected mode

  - Rename single-letter `A` -> `adj_mat` (object_name_linter)
  - Remove trailing blank line (trailing_blank_lines_linter)
  - Fix undirected igraph mode: 'undirected' -> 'max' (igraph >= 1.6.0
    requires symmetric matrix for mode='undirected'; 'max' symmetrises)
  - Move gg_udependent vdiffr tests from test_gg_udependent.R into
    test_snapshots.R under the VDIFFR_RUN_TESTS='true' guard, matching the
    package convention and preventing CI failures on first-run new snapshots

  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix: add gg_udependent and plot.gg_udependent to _pkgdown.yml index

  pkgdown fails with 'topics missing from index' when exported functions
  are not listed in _pkgdown.yml reference section.

  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix: address Copilot review — undirected symmetry, summary API, print guard, snapshots

  Undirected adjacency/weights (R/gg_udependent.R):
  - Symmetrise adj_mat with pmax(adj_mat, t(adj_mat)) for directed=FALSE
    so edge existence matches max(I[i,j], I[j,i]) before igraph build;
    mode can now be "undirected" (no longer requires "max" workaround)
  - Use max(I[i,j], I[j,i]) as edge weight for undirected graphs
  - Set igraph::E(g)$weight in the extractor (order-insensitive key for
    undirected) so the plot method never needs to recompute them

  Plot weight backfill (R/plot.gg_udependent.R):
  - Make legacy backfill order-insensitive for undirected graphs via
    pmin/pmax key matching (guards objects saved before weight was stored)

  summary API (R/summary_methods.R):
  - Rewrite summary.gg_udependent() to use .summary_skel() returning
    c("summary.gg_udependent","summary.gg"), consistent with all other
    summary.gg_*() methods; body carries edges/nodes/threshold lines

  print guard (R/print_methods.R):
  - Add NULL-safe provenance fallback in print.gg_udependent()
  - Replace print.summary.gg_udependent() body with NextMethod() to
    delegate to print.summary.gg() which renders the skel format

  Snapshots (tests/testthat/_snaps/):
  - Restore 30 SVG baselines from origin/main that were absent from the
    branch and appeared as deletions in the PR diff

  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
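The undirected handling described in the last fix can be sketched in base R. This is a minimal illustration, not the package code: the matrix values and the names `vars`, `sym_adj`, and `sym_wt` are invented for the example, and no igraph call is made.

```r
# Hypothetical 3-variable cross-importance matrix; the values are invented
# purely for illustration and do not come from a real uvarpro fit.
vars <- c("a", "b", "c")
imp_mat <- matrix(c(0.0, 0.4, 0.1,
                    0.1, 0.0, 0.3,
                    0.2, 0.1, 0.0),
                  nrow = 3, byrow = TRUE,
                  dimnames = list(vars, vars))
threshold <- 0.25

# Directed adjacency: edge i -> j when imp_mat[i, j] >= threshold.
adj_mat <- (imp_mat >= threshold) * 1
diag(adj_mat) <- 0

# Undirected view: an edge exists if either direction passes the threshold,
# and its weight is the larger of the two directional scores.
sym_adj <- pmax(adj_mat, t(adj_mat))
sym_wt <- pmax(imp_mat, t(imp_mat)) * sym_adj
```

Because `sym_adj` is symmetric by construction, it should satisfy the symmetry requirement that the commit message attributes to igraph >= 1.6.0 for `mode = "undirected"`.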
1 parent 6e64c20 commit 4c64eed

18 files changed

Lines changed: 1906 additions & 3 deletions

DESCRIPTION

Lines changed: 3 additions & 2 deletions
@@ -1,7 +1,7 @@
 Package: ggRandomForests
 Type: Package
 Title: Visually Exploring Random Forests
-Version: 2.7.3.9003
+Version: 2.7.3.9004
 Date: 2026-05-20
 Authors@R: person("John", "Ehrlinger",
     role = c("aut", "cre"),
@@ -23,6 +23,7 @@ Imports:
     randomForestSRC (>= 3.4.0),
     randomForest,
     varPro,
+    igraph,
     survival,
     parallel,
     tidyr,
@@ -45,7 +46,7 @@ Suggests:
     pkgload,
     knitr,
     plotly,
-    igraph,
+    ggraph,
     callr
 VignetteBuilder: quarto
 Config/roxygen2/version: 8.0.0

NAMESPACE

Lines changed: 18 additions & 0 deletions
@@ -9,6 +9,7 @@ S3method(autoplot,gg_partialpro)
 S3method(autoplot,gg_rfsrc)
 S3method(autoplot,gg_roc)
 S3method(autoplot,gg_survival)
+S3method(autoplot,gg_udependent)
 S3method(autoplot,gg_variable)
 S3method(autoplot,gg_varpro)
 S3method(autoplot,gg_vimp)
@@ -38,6 +39,7 @@ S3method(plot,gg_partialpro)
 S3method(plot,gg_rfsrc)
 S3method(plot,gg_roc)
 S3method(plot,gg_survival)
+S3method(plot,gg_udependent)
 S3method(plot,gg_variable)
 S3method(plot,gg_varpro)
 S3method(plot,gg_vimp)
@@ -50,10 +52,12 @@ S3method(print,gg_partialpro)
 S3method(print,gg_rfsrc)
 S3method(print,gg_roc)
 S3method(print,gg_survival)
+S3method(print,gg_udependent)
 S3method(print,gg_variable)
 S3method(print,gg_varpro)
 S3method(print,gg_vimp)
 S3method(print,summary.gg)
+S3method(print,summary.gg_udependent)
 S3method(summary,gg_brier)
 S3method(summary,gg_error)
 S3method(summary,gg_partial)
@@ -63,6 +67,7 @@ S3method(summary,gg_partialpro)
 S3method(summary,gg_rfsrc)
 S3method(summary,gg_roc)
 S3method(summary,gg_survival)
+S3method(summary,gg_udependent)
 S3method(summary,gg_variable)
 S3method(summary,gg_varpro)
 S3method(summary,gg_vimp)
@@ -77,6 +82,7 @@ export(gg_partialpro)
 export(gg_rfsrc)
 export(gg_roc)
 export(gg_survival)
+export(gg_udependent)
 export(gg_variable)
 export(gg_varpro)
 export(gg_vimp)
@@ -104,9 +110,19 @@ importFrom(ggplot2,geom_ribbon)
 importFrom(ggplot2,geom_vline)
 importFrom(ggplot2,ggplot)
 importFrom(ggplot2,labs)
+importFrom(ggplot2,scale_color_manual)
 importFrom(ggplot2,scale_fill_manual)
 importFrom(ggplot2,theme)
 importFrom(ggplot2,theme_minimal)
+importFrom(ggplot2,theme_void)
+importFrom(igraph,E)
+importFrom(igraph,V)
+importFrom(igraph,as_data_frame)
+importFrom(igraph,degree)
+importFrom(igraph,delete_vertices)
+importFrom(igraph,edge_attr)
+importFrom(igraph,graph_from_adjacency_matrix)
+importFrom(igraph,vertex_attr)
 importFrom(parallel,mclapply)
 importFrom(patchwork,wrap_plots)
 importFrom(randomForest,randomForest)
@@ -129,5 +145,7 @@ importFrom(tidyr,all_of)
 importFrom(tidyr,pivot_longer)
 importFrom(utils,head)
 importFrom(utils,tail)
+importFrom(varPro,get.beta.entropy)
 importFrom(varPro,importance)
 importFrom(varPro,partialpro)
+importFrom(varPro,sdependent)

NEWS.md

Lines changed: 12 additions & 1 deletion
@@ -1,8 +1,19 @@
 Package: ggRandomForests
-Version: 2.7.3.9003
+Version: 2.7.3.9004
 
 ggRandomForests v2.8.0 (development) — continued
 =================================================
+* **varPro variable dependency: `gg_udependent()` (Phase 3).**
+  - `gg_udependent()` extracts cross-variable dependency scores from a
+    `uvarpro` fit using `varPro::get.beta.entropy()` +
+    `varPro::sdependent()`, and returns a tidy list with `$edges`
+    (variable_from, variable_to, weight), `$nodes` (variable, degree,
+    selected), and `$graph` (igraph object).
+  - `plot.gg_udependent()` renders the dependency network using ggraph
+    with edge width/opacity scaled by dependency strength and node colour
+    by signal-variable status. Layout is configurable (`"fr"`, `"kk"`,
+    `"stress"`, etc.).
+  - `ggraph` added to `Suggests:`.
 * **varPro variable importance: `gg_varpro()` (#85).**
   - `gg_varpro()` extracts per-tree importance scores from a fitted
     `varpro` object and renders an honest boxplot — hinges at the

R/autoplot_methods.R

Lines changed: 6 additions & 0 deletions
@@ -129,3 +129,9 @@ autoplot.gg_brier <- function(object, ...) {
 autoplot.gg_varpro <- function(object, ...) {
   plot(object, ...)
 }
+
+#' @rdname autoplot.gg
+#' @export
+autoplot.gg_udependent <- function(object, ...) {
+  plot(object, ...)
+}
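The delegation pattern in this hunk (an `autoplot` method that calls the `plot()` generic rather than a specific method) can be sketched with plain S3 dispatch. Everything named `*_demo` or `demo_dep` below is hypothetical and stands in for `ggplot2::autoplot` and the `gg_udependent` class; the sketch only shows the dispatch path.

```r
# Hypothetical generic standing in for ggplot2::autoplot.
autoplot_demo <- function(object, ...) UseMethod("autoplot_demo")

# Hypothetical plot method for an illustrative class "demo_dep"; a real
# method would build and return a plot object instead of a string.
plot.demo_dep <- function(x, ...) {
  paste("plot method for", class(x)[1])
}

# The autoplot method calls the plot() generic, so S3 dispatch selects
# plot.demo_dep without naming it directly.
autoplot_demo.demo_dep <- function(object, ...) {
  plot(object, ...)
}

obj <- structure(list(), class = c("demo_dep", "list"))
res <- autoplot_demo(obj)
```

Calling the generic keeps the two entry points in sync: a subclass with its own `plot` method would be picked up by `autoplot_demo()` with no further change.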

R/gg_udependent.R

Lines changed: 227 additions & 0 deletions
@@ -0,0 +1,227 @@
+##=============================================================================
+#' Variable dependency graph from a uvarpro model
+#'
+#' Extracts cross-variable dependency scores from a fitted \code{uvarpro}
+#' object using \code{\link[varPro]{get.beta.entropy}} and
+#' \code{\link[varPro]{sdependent}}, and returns a tidy list suitable for
+#' \code{plot.gg_udependent}.
+#'
+#' @param object A fitted \code{uvarpro} object (required).
+#' @param threshold Numeric; positive dependency threshold passed to
+#' \code{sdependent()}. An edge \eqn{i \to j} is drawn when
+#' \code{I[i, j] >= threshold}. Default \code{0.25}.
+#' @param q.signal Quantile threshold (0--1) for signal variable selection;
+#' passed to \code{sdependent()}. Default \code{0.75}.
+#' @param directed Logical; \code{TRUE} (default) builds a directed igraph.
+#' @param min.degree Integer or \code{NULL}. When non-\code{NULL}, only nodes
+#' with degree \eqn{\ge} \code{min.degree} are retained in \code{$nodes},
+#' \code{$edges}, and \code{$graph}.
+#' @param ... Additional arguments forwarded to \code{varPro::sdependent()}.
+#'
+#' @return A named list of class \code{"gg_udependent"} with elements:
+#' \describe{
+#' \item{\code{$edges}}{Data frame: \code{variable_from}, \code{variable_to},
+#' \code{weight} (raw cross-importance value).}
+#' \item{\code{$nodes}}{Data frame: \code{variable} (factor, levels by
+#' descending degree), \code{degree} (integer; out-degree when
+#' \code{directed = TRUE}, total degree when \code{directed = FALSE}),
+#' \code{selected} (logical, \code{TRUE} if in \code{sdependent}'s
+#' signal set).}
+#' \item{\code{$graph}}{igraph object. \code{NULL} if no dependencies
+#' detected.}
+#' }
+#' A \code{"provenance"} attribute carries \code{threshold}, \code{q.signal},
+#' \code{directed}, \code{min.degree}, \code{xvar.names}, and \code{n}.
+#'
+#' @seealso \code{\link{plot.gg_udependent}}
+#'
+#' @examples
+#' \donttest{
+#' set.seed(42)
+#' uv <- varPro::uvarpro(iris[, -5], ntree = 50)
+#' gg <- gg_udependent(uv)
+#' print(gg)
+#' }
+#'
+#' @importFrom varPro get.beta.entropy sdependent
+#' @importFrom igraph graph_from_adjacency_matrix degree delete_vertices as_data_frame V
+#' @export
+gg_udependent <- function(object,
+                          threshold = 0.25,
+                          q.signal = 0.75,
+                          directed = TRUE,
+                          min.degree = NULL,
+                          ...) {
+  .validate_udep_inputs(object, threshold, directed)
+
+  ## ---- Compute cross-variable dependency matrix ----------------------------
+  imp_mat <- varPro::get.beta.entropy(object)
+
+  ## ---- Helper: build and return an empty gg_udependent result ---------------
+  .empty_result <- function(msg) {
+    warning("gg_udependent: ", msg,
+            "\nReturning empty structure. Consider lowering threshold.",
+            call. = FALSE)
+    empty_edges <- data.frame(variable_from = character(0),
+                              variable_to = character(0),
+                              weight = numeric(0),
+                              stringsAsFactors = FALSE)
+    empty_nodes <- data.frame(variable = factor(character(0)),
+                              degree = integer(0),
+                              selected = logical(0),
+                              stringsAsFactors = FALSE)
+    result <- structure(
+      list(edges = empty_edges, nodes = empty_nodes, graph = NULL),
+      class = c("gg_udependent", "list")
+    )
+    attr(result, "provenance") <- .udep_provenance(object, threshold, q.signal,
+                                                   directed, min.degree)
+    result
+  }
+
+  ## ---- Build adjacency from threshold; short-circuit if empty --------------
+  adj_mat <- (imp_mat >= threshold) * 1
+  diag(adj_mat) <- 0
+  if (sum(adj_mat) == 0L) {
+    return(.empty_result(
+      paste0("no edges found at threshold=", threshold)
+    ))
+  }
+
+  ## ---- Call sdependent for signal detection --------------------------------
+  sdep <- varPro::sdependent(imp_mat, threshold = threshold,
+                             q.signal = q.signal, directed = directed,
+                             min.degree = min.degree, plot = FALSE, ...)
+
+  ## ---- Handle empty graph (sdependent may also return character) -----------
+  if (is.character(sdep)) {
+    return(.empty_result(sdep))
+  }
+
+  ## ---- Build igraph from adjacency -----------------------------------------
+  ## For undirected, symmetrise first so edge existence = max(I[i,j], I[j,i])
+  ## and mode = "undirected" is valid (igraph >= 1.6.0 requires symmetry).
+  if (!isTRUE(directed)) {
+    adj_mat <- pmax(adj_mat, t(adj_mat))
+  }
+  g <- igraph::graph_from_adjacency_matrix(
+    adj_mat,
+    mode = if (isTRUE(directed)) "directed" else "undirected",
+    diag = FALSE
+  )
+  isolated <- igraph::degree(g, mode = "all") == 0
+  g <- igraph::delete_vertices(g, which(isolated))
+
+  ## ---- Build tidy edge data frame with raw weights -------------------------
+  edge_df <- igraph::as_data_frame(g, what = "edges")
+  if (nrow(edge_df) > 0L) {
+    if (isTRUE(directed)) {
+      edge_df$weight <- mapply(function(i, j) imp_mat[i, j],
+                               edge_df[[1L]], edge_df[[2L]])
+    } else {
+      ## Undirected: weight = max of both directions
+      edge_df$weight <- mapply(
+        function(i, j) max(imp_mat[i, j], imp_mat[j, i]),
+        edge_df[[1L]], edge_df[[2L]])
+    }
+  } else {
+    edge_df$weight <- numeric(0)
+  }
+  names(edge_df)[1:2] <- c("variable_from", "variable_to")
+
+  ## ---- Build tidy node data frame ------------------------------------------
+  vnames <- igraph::V(g)$name
+  ## degree: out-degree for directed (matches sdependent's signal.vars logic),
+  ## total degree for undirected
+  deg_vec <- if (isTRUE(directed)) {
+    igraph::degree(g, mode = "out")[vnames]
+  } else {
+    igraph::degree(g)[vnames]
+  }
+
+  signal_set <- if (is.null(sdep$signal.vars)) character(0) else sdep$signal.vars
+  node_df <- data.frame(
+    variable = factor(vnames, levels = vnames[order(-deg_vec)]),
+    degree = as.integer(deg_vec),
+    selected = vnames %in% signal_set,
+    stringsAsFactors = FALSE,
+    row.names = NULL
+  )
+
+  ## ---- Apply min.degree node filtering (user-requested subsetting) ---------
+  if (!is.null(min.degree)) {
+    keep <- node_df$degree >= min.degree
+    keep_names <- as.character(node_df$variable)[keep]
+    drop_names <- as.character(node_df$variable)[!keep]
+    g <- igraph::delete_vertices(g, drop_names)
+    edge_df <- edge_df[
+      edge_df$variable_from %in% keep_names &
+        edge_df$variable_to %in% keep_names, , drop = FALSE]
+    node_df <- node_df[keep, , drop = FALSE]
+    rownames(edge_df) <- NULL
+    rownames(node_df) <- NULL
+  }
+
+  ## ---- Set igraph node attributes ------------------------------------------
+  if (length(igraph::V(g)) > 0L) {
+    igraph::V(g)$degree <- node_df$degree[
+      match(igraph::V(g)$name, as.character(node_df$variable))]
+    igraph::V(g)$selected <- node_df$selected[
+      match(igraph::V(g)$name, as.character(node_df$variable))]
+  }
+
+  ## ---- Set igraph edge weights (order-insensitive for undirected) -----------
+  if (length(igraph::E(g)) > 0L && nrow(edge_df) > 0L) {
+    el <- igraph::as_data_frame(g, what = "edges")
+    if (isTRUE(directed)) {
+      idx <- match(paste(el$from, el$to),
+                   paste(edge_df$variable_from, edge_df$variable_to))
+    } else {
+      key_g <- paste(pmin(el$from, el$to), pmax(el$from, el$to))
+      key_e <- paste(pmin(edge_df$variable_from, edge_df$variable_to),
+                     pmax(edge_df$variable_from, edge_df$variable_to))
+      idx <- match(key_g, key_e)
+    }
+    igraph::E(g)$weight <- edge_df$weight[idx]
+  }
+
+  ## ---- Assemble result ------------------------------------------------------
+  result <- structure(
+    list(edges = edge_df, nodes = node_df, graph = g),
+    class = c("gg_udependent", "list")
+  )
+  attr(result, "provenance") <- .udep_provenance(object, threshold, q.signal,
+                                                 directed, min.degree)
+  result
+}
+
+## ---- Internal helpers -------------------------------------------------------
+
+#' @keywords internal
+.validate_udep_inputs <- function(object, threshold, directed) {
+  if (missing(object) || is.null(object)) {
+    stop("'object' must be a fitted uvarpro object.", call. = FALSE)
+  }
+  if (!inherits(object, "uvarpro")) {
+    stop("'object' must be a uvarpro fit (class \"uvarpro\").", call. = FALSE)
+  }
+  if (!is.numeric(threshold) || length(threshold) != 1L || threshold <= 0) {
+    stop("'threshold' must be a single positive numeric value.", call. = FALSE)
+  }
+  if (!is.logical(directed) || length(directed) != 1L) {
+    stop("'directed' must be a single logical value.", call. = FALSE)
+  }
+  invisible(NULL)
+}
+
+#' @keywords internal
+.udep_provenance <- function(object, threshold, q.signal, directed, min.degree) {
+  list(
+    threshold = threshold,
+    q.signal = q.signal,
+    directed = directed,
+    min.degree = min.degree,
+    xvar.names = object$xvar.names,
+    n = nrow(object$x)
+  )
+}
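The order-insensitive edge-key matching used for undirected graphs can be tried in isolation with base R. This is a minimal sketch under stated assumptions: the two data frames are invented stand-ins for the igraph edge list and the tidy edge table, and the names `graph_edges`, `tidy_edges`, and `naive_idx` are hypothetical.

```r
# Hypothetical edge list as a graph library might report it: each undirected
# edge is listed once, with endpoints in arbitrary order.
graph_edges <- data.frame(from = c("b", "c"),
                          to = c("a", "b"),
                          stringsAsFactors = FALSE)

# Hypothetical tidy edge table holding the weights, with endpoints stored in
# the opposite order.
tidy_edges <- data.frame(variable_from = c("a", "b"),
                         variable_to = c("b", "c"),
                         weight = c(0.4, 0.3),
                         stringsAsFactors = FALSE)

# Canonical key: the alphabetically smaller endpoint first, then the larger.
key_g <- paste(pmin(graph_edges$from, graph_edges$to),
               pmax(graph_edges$from, graph_edges$to))
key_e <- paste(pmin(tidy_edges$variable_from, tidy_edges$variable_to),
               pmax(tidy_edges$variable_from, tidy_edges$variable_to))
idx <- match(key_g, key_e)
w <- tidy_edges$weight[idx]

# An order-sensitive key fails to match when the endpoints are swapped.
naive_idx <- match(paste(graph_edges$from, graph_edges$to),
                   paste(tidy_edges$variable_from, tidy_edges$variable_to))
```

One caveat of this keying scheme: joining names with a space assumes variable names contain no spaces themselves, otherwise two different pairs could produce the same key.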
