11---
22title: "Group Sequential Design with Graphical Approaches"
3- output: rmarkdown::html_vignette
3+ output:
4+   rmarkdown::html_vignette:
5+     number_sections: true
6+     toc: true
47vignette: >
58  %\VignetteIndexEntry{Group Sequential Design with Graphical Approaches}
69  %\VignetteEngine{knitr::rmarkdown}
@@ -81,7 +84,11 @@ $$P(Z_1 < b_1, \ldots, Z_k < b_k) = 1 - f(\alpha, t_k)$$
8184
8285where $f(\alpha, t_k)$ is the cumulative spending and
8386$(Z_1, \ldots, Z_k)$ follow the canonical joint distribution with
84- $\text{Cor}(Z_i, Z_j) = \sqrt{t_i / t_j}$ for $i \le j$.
87+ $\text{Cor}(Z_i, Z_j) = \sqrt{t_i / t_j}$ for $i \le j$. We consider
88+ one-sided tests with the upper alternative (i.e., larger effects are better).
89+ At analysis $k$, the null hypothesis is rejected if $Z_k \ge b_k$, or
90+ equivalently, if the observed p-value $p_k \le \Phi(-b_k)$ where $\Phi$ is
91+ the standard normal CDF.
8592
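The equivalence between the z-scale and p-value-scale rejection rules can be checked with a minimal base R sketch. The boundary `b_k` and the test statistics `z_obs` below are arbitrary illustrative values, not outputs of the package:

```r
# Hypothetical one-sided efficacy boundary on the z-scale (illustrative only)
b_k <- 2.5

# Nominal p-value boundary: Phi(-b_k)
p_bound <- pnorm(-b_k)

# Hypothetical observed test statistics and their one-sided p-values
z_obs <- c(1.8, 2.5, 3.1)
p_obs <- pnorm(-z_obs)

# The two rejection rules give the same decision for every statistic
reject_z <- z_obs >= b_k
reject_p <- p_obs <= p_bound
identical(reject_z, reject_p)
```
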
8693``` {r boundaries-example}
8794# Compute boundaries for OBF spending at alpha = 0.025 with 3 equally spaced analyses
@@ -101,13 +108,15 @@ knitr::kable(boundary_table, digits = 6,
101108
102109## Repeated and Sequential P-values
103110
104- Two types of p-values are central to the group sequential graphical procedure:
111+ Two types of p-values are central to the group sequential graphical procedure.
112+ Let $\hat{p}_k$ denote the repeated p-value and $\tilde{p}_k$ denote the
113+ sequential p-value at analysis $k$.
105114
106- - **Repeated p-value** at analysis $k$: the minimum significance level at which
115+ - **Repeated p-value** $\hat{p}_k$: the minimum significance level at which
107116  the group sequential boundary *at analysis $k$ specifically* would be crossed.
108117  It only considers the boundary at the current analysis.
109118
110- - **Sequential p-value** at analysis $k$: the minimum significance level at
119+ - **Sequential p-value** $\tilde{p}_k$: the minimum significance level at
111120  which any group sequential boundary *at analyses $1, \ldots, k$* would be
112121  crossed. It equals the cumulative minimum of repeated p-values:
113122  $\tilde{p}_k = \min_{l=1}^{k} \hat{p}_l$.
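
The relationship between the two p-values can be sketched in base R with `cummin()`. The repeated p-values below are made-up numbers for a single hypothesis, chosen only to show the cumulative minimum:

```r
# Hypothetical repeated p-values for one hypothesis at analyses 1, 2, 3
p_repeated <- c(0.030, 0.012, 0.020)

# Sequential p-value at analysis k: minimum of repeated p-values up to k
p_sequential <- cummin(p_repeated)
p_sequential
```

In this toy input the repeated p-value rises again at the third analysis, while the sequential p-value stays at the smallest value seen so far.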
@@ -140,6 +149,16 @@ since it considers all prior analyses. A hypothesis that nearly crossed its
140149boundary at an earlier analysis will have a much smaller sequential p-value
141150than its repeated p-value at the current analysis.
142151
152+ The `graph_test_shortcut_gsd()` function supports two modes controlled by the
153+ `look_back` parameter. When `look_back = FALSE` (the default), rejection
154+ decisions at each analysis are based on repeated p-values only (i.e., only
155+ the boundary at the current analysis is considered). When `look_back = TRUE`,
156+ rejection decisions are based on sequential p-values, which "look back" at
157+ all prior analyses by taking the cumulative minimum of repeated p-values.
158+ This means that strong evidence from an earlier analysis is carried forward
159+ and can contribute to a rejection at a later analysis. Both modes are
160+ illustrated in the case studies below.
161+
143162## Case Study: Maurer and Bretz (2013), Section 4
144163
145164We replicate the numerical example from Section 4 of Maurer and Bretz (2013).
@@ -159,7 +178,8 @@ The trial has four hypotheses:
159178The testing strategy follows the successiveness principle: secondary hypotheses
160179cannot be tested until their parent primary hypothesis is rejected. Both
161180primary hypotheses start with equal weight (0.5 each), and upon rejection,
162- the full weight propagates to the corresponding secondary hypothesis.
181+ the weight is split equally between the other primary hypothesis and the
182+ corresponding secondary hypothesis.
163183
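The weight propagation described above can be sketched in base R. The transition matrix here is an assumption reconstructed from the prose: only the `H1` and `H2` rows follow from the description, and the `H3` and `H4` rows are illustrative guesses. The helper `propagate_weight()` is hypothetical and covers only the weight step; a full graph update would also revise the transition weights.

```r
# Initial hypothesis weights for H1..H4 (from the text: 0.5, 0.5, 0, 0)
weights <- c(H1 = 0.5, H2 = 0.5, H3 = 0, H4 = 0)

# Assumed transition matrix (rows = from, columns = to); the H3 and H4 rows
# are illustrative guesses, not taken from the vignette
transitions <- rbind(
  H1 = c(0,   0.5, 0.5, 0),
  H2 = c(0.5, 0,   0,   0.5),
  H3 = c(0,   1,   0,   0),
  H4 = c(1,   0,   0,   0)
)
colnames(transitions) <- rownames(transitions)

# Hypothetical helper: pass the weight of a rejected hypothesis along its
# outgoing edges, then set its own weight to zero
propagate_weight <- function(weights, transitions, rejected) {
  new_weights <- weights + weights[[rejected]] * transitions[rejected, ]
  new_weights[[rejected]] <- 0
  new_weights
}

propagate_weight(weights, transitions, "H1")
```

Under these assumptions, rejecting `H1` moves half of its weight to `H2` and half to `H3`, and the total weight is unchanged.
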
164184``` {r graph-setup}
165185hypotheses <- c(0.5, 0.5, 0, 0)
@@ -194,26 +214,28 @@ p <- rbind(
194214 H4 = c(0.13, 0.06)
195215)
196216
217+ p_display <- as.data.frame(p)
218+ colnames(p_display) <- paste("Analysis", 1:2)
197219knitr::kable(
198- data.frame(
199- Analysis = 1:2,
200- `Info Fraction` = c("1/3", "2/3"),
201- H1 = p["H1", ], H2 = p["H2", ], H3 = p["H3", ], H4 = p["H4", ],
202- check.names = FALSE
203- ),
204- caption = "Observed nominal p-values (Table 1 of Maurer and Bretz, 2013)"
220+ p_display,
221+ caption = "Observed nominal p-values"
205222)
206223```
207224
208225### Running the Procedure (look_back = FALSE)
209226
210227The default mode is `look_back = FALSE`, which means the procedure does **not**
211- look back at evidence from prior analyses. At each analysis, rejection decisions
212- are based solely on the data observed at that analysis.
228+ look back at test statistics from prior analyses. At each analysis $k$,
229+ rejection decisions are based solely on the repeated p-value $\hat{p}_k$
230+ computed from the test statistic at analysis $k$, without utilizing test
231+ statistics from previous analyses. Note that the test statistic at analysis
232+ $k$ is computed from all data accumulated up to that point, but the
233+ rejection decision at analysis $k$ does not incorporate the test statistics
234+ (or repeated p-values) from analyses $1, \ldots, k-1$.
213235
214236There are two equivalent ways to understand the rejection decisions:
215237
216- 1. **Repeated p-values** (default): A repeated p-value at analysis $k$ is the
238+ 1. **Repeated p-values** (default): The repeated p-value $\hat{p}_k$ is the
217239   minimum significance level at which the group sequential boundary at
218240   analysis $k$ would be crossed. These are passed to the graphical shortcut
219241   procedure (`graph_test_shortcut()`) for multiplicity adjustment.
@@ -301,35 +323,41 @@ with their new (increased) weights, potentially enabling further rejections
301323at the same analysis.
302324
303325``` {r test-values-tables}
304- knitr::kable(result$test_values[[1]], digits = 6,
326+ format_test_values <- function(tv) {
327+ tv$Boundary <- formatC(tv$Boundary, format = "f", digits = 6)
328+ tv
329+ }
330+ knitr::kable(format_test_values(result$test_values[[1]]), digits = 6,
305331 caption = "Analysis 1: nominal boundaries and rejection decisions")
306- knitr::kable(result$test_values[[2]], digits = 6,
332+ knitr::kable(format_test_values(result$test_values[[2]]), digits = 6,
307333 caption = "Analysis 2: nominal boundaries and rejection decisions")
308334```
309335
310336**Analysis 1 (t = 1/3).** The initial weights are $(0.5, 0.5, 0, 0)$. The
311337OBF spending function allocates very little alpha to the first interim
312- analysis: the nominal boundary for $H_1$ and $H_2$ is approximately
313- `r sprintf("%.5f", result$test_values[[1]]$Boundary[1])`. Since both observed
314- p-values ($p_{1,1} = 0.0062$ and $p_{2,1} = 0.017$) exceed this boundary, no
315- hypothesis is rejected.
338+ analysis: as shown in the Analysis 1 table above, the nominal boundary for
339+ $H_1$ and $H_2$ is approximately
340+ `r formatC(result$test_values[[1]]$Boundary[1], format = "f", digits = 6)`.
341+ Since both observed p-values (0.0062 for $H_1$ and 0.017 for $H_2$) exceed
342+ this boundary, no hypothesis is rejected.
316343
317- An important note from the paper: the nominal significance level
318- $\alpha^*_{1,1}(w \cdot \alpha)$ is **not** equal to
319- $w \cdot \alpha^*_{1,1}(\alpha)$:
344+ An important note: the nominal boundary computed at a fraction of alpha is
345+ **not** equal to the same fraction of the boundary computed at the full alpha.
346+ For example, the boundary at the first analysis with the OBF spending function:
320347
321348``` {r key-inequality}
322349b_half <- gs_boundaries(0.0125, c(1/3, 2/3, 1), spending_of)
323350b_full <- gs_boundaries(0.025, c(1/3, 2/3, 1), spending_of)
324351cat(sprintf(
325- "alpha*_1( 0.0125) = %.6f\n0.5 * alpha*_1( 0.025) = %.6f\n",
352+ "Boundary at alpha = 0.0125: %.6f\n0.5 * Boundary at alpha = 0.025: %.6f\n",
326353 b_half$bounds_nominal[1],
327354 0.5 * b_full$bounds_nominal[1]
328355))
329356```
330357
331- This demonstrates why one must evaluate the spending function at
332- $w_i \cdot \alpha$, not apply the weight to boundaries computed at $\alpha$.
358+ This demonstrates why the spending function must be evaluated at the
359+ hypothesis-specific significance level (weight times alpha), rather than
360+ applying the weight to boundaries computed at the full alpha.
333361
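The same inequality can be reproduced without the package. This is a rough base R sketch that assumes the Lan-DeMets O'Brien-Fleming-type spending function in its one-sided form; the helper `sf_obf()` is hypothetical and may not match the `spending_of` object used above. At the first analysis the nominal level equals the cumulative alpha spent, so the spending function alone is enough:

```r
# Assumed Lan-DeMets O'Brien-Fleming-type spending function (one-sided);
# hypothetical helper, it may differ from `spending_of` in the vignette
sf_obf <- function(alpha, t) 2 - 2 * pnorm(qnorm(1 - alpha / 2) / sqrt(t))

alpha <- 0.025
w <- 0.5
t1 <- 1 / 3

# Nominal level at the first analysis = cumulative spend at t1
level_weighted_alpha <- sf_obf(w * alpha, t1)  # spend evaluated at w * alpha
weighted_level <- w * sf_obf(alpha, t1)        # w times spend at full alpha

c(level_weighted_alpha = level_weighted_alpha, weighted_level = weighted_level)
```

Because this spending function is nonlinear in alpha, the two quantities differ; under this assumed form, evaluating at `w * alpha` gives the smaller first-analysis level.
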
334362**Analysis 2 (t = 2/3).** The test_values table above shows the boundary for
335363each hypothesis at the point when it is tested, reflecting sequential graph
@@ -505,14 +533,10 @@ via graph update) at a later analysis. We illustrate this using the same graph
505533but with modified p-values and Pocock spending:
506534
507535``` {r look-back-difference}
508- # Same graph as MB case study, but different p-values and spending function
509- # H3 has strong evidence at analysis 1 (p = 0.0008) but just misses at analysis 2
510- p_modified <- rbind(
511- H1 = c(0.02, 0.0002),
512- H2 = c(0.02, 0.003),
513- H3 = c(0.0008, 0.006),
514- H4 = c(0.3, 0.2)
515- )
536+ # Same graph and p-values as MB case study, except H3's p-values are modified
537+ # H3 has strong evidence at analysis 1 (p = 0.0008) but weaker at analysis 2
538+ p_modified <- p
539+ p_modified["H3", ] <- c(0.0008, 0.006)
516540
517541# look_back = FALSE: only considers repeated p-values at each analysis
518542result_no_lb <- graph_test_shortcut_gsd(
@@ -738,18 +762,34 @@ The transition structure follows the hierarchy: within each population, alpha
738762flows from OS to PFS to ORR, and ORR recycles to OS. Between populations,
739763the all-subjects hypotheses share alpha with the subgroup hypotheses.
740764
741- ``` {r oncology-graph-plot, eval = requireNamespace("igraph", quietly = TRUE), fig.height=8, fig.width=6}
765+ ``` {r oncology-graph-plot, eval = requireNamespace("igraph", quietly = TRUE), fig.height=6, fig.width=6}
742766onc_layout <- rbind(
743-   c(1, 3), # H1_OS_S
744-   c(2, 3), # H2_OS_A
745-   c(1, 2), # H3_PFS_S
746-   c(2.5, 2), # H4_PFS_A (shifted right)
747-   c(1, 1), # H5_ORR_S
748-   c(2, 1) # H6_ORR_A
767+   c(0, 3), # H1_OS_S
768+   c(2, 3), # H2_OS_A
769+   c(0, 1.8), # H3_PFS_S
770+   c(1.3, 1.8), # H4_PFS_A
771+   c(0, 0.5), # H5_ORR_S
772+   c(2, 0.5) # H6_ORR_A
749773)
750- plot(g_onc, layout = onc_layout, vertex.size = 60,
751- edge_curves = c("H6_ORR_A|H2_OS_A" = 0.01,
752- "H4_PFS_A|H5_ORR_S" = 0.01,
774+ # Edge label positions: NA = auto, explicit coords to move specific labels
775+ # Edge order: 1=H6->H1, 2=H1->H2, 3=H6->H2, 4=H2->H3, 5=H2->H4,
776+ # 6=H3->H4, 7=H4->H5, 8=H4->H6, 9=H5->H6
777+ label_x <- rep(NA, 9)
778+ label_y <- rep(NA, 9)
779+ label_x[1] <- 1.35; label_y[1] <- 1.1 # H6->H1: on the curved edge
780+ label_x[7] <- 0.65; label_y[7] <- 1.1 # H4->H5: between nodes, near arrow
781+
782+ plot(g_onc, layout = onc_layout, vertex.size = 60, asp = 1,
783+ vertex.label.cex = 0.7,
784+ rescale = FALSE,
785+ xlim = c(-1.2, 3.5),
786+ ylim = c(-0.2, 3.8),
787+ edge.label.x = label_x,
788+ edge.label.y = label_y,
789+ edge_curves = c("H6_ORR_A|H2_OS_A" = 0,
790+ "H6_ORR_A|H1_OS_S" = 0.2,
791+ "H4_PFS_A|H6_ORR_A" = 0,
792+ "H4_PFS_A|H5_ORR_S" = 0,
753793 "H3_PFS_S|H4_PFS_A" = 0))
754794```
755795
@@ -939,6 +979,19 @@ For this example, both modes produce the same rejection decisions. This is
939979because the evidence at the rejection analyses is strong enough that looking
940980back at earlier analyses does not change the outcome.
941981
982+ **Note on differences from gMCPLite.** This case study is adapted from the
983+ [gMCPLite vignette](https://cran.r-project.org/web/packages/gMCPLite/vignettes/huyett-burnett-example.html).
984+ The rejection decisions (H1, H3, H5 rejected; H2, H4, H6 not rejected) agree
985+ between the two implementations. However, sequential p-values may differ
986+ slightly for some hypotheses. The reason is that gMCPLite (via gsDesign)
987+ separates *spending time* from *information fraction*: for all-subjects
988+ hypotheses (H2 and H4), gMCPLite uses the subgroup event counts as the
989+ spending time while using the all-subjects event counts for the correlation
990+ structure. In contrast, `graphicalMCP` uses `info_frac` for both alpha
991+ spending and the correlation structure. This difference affects the group
992+ sequential boundaries and hence the sequential p-values, but in this example
993+ it does not change which hypotheses are rejected.
994+
942995This case study demonstrates that ` graph_test_shortcut_gsd() ` handles trials
943996where different endpoints have different numbers of analyses — a common
944997situation in oncology trials with OS, PFS, and ORR endpoints.