          #' Do not call this function on its own. Fits cross-validated glmnet model with fixed effect.
134 134   #'
135 135   #'
136     - #' @param X.fixed a data.frame (or tibble) with "numeric" and "factor" columns corresponding to covariates or terms that should be treated as fixed effects in the model.
    136 + #' @param X_fixed a data.frame (or tibble) with "numeric" and "factor" columns corresponding to covariates or terms that should be treated as fixed effects in the model.
137 137   #' @param X original data.frame (or tibble) with "numeric" and "factor" columns only. The number of columns, ncol(X) needs to be > 2.
138 138   #' @param y response vector with \code{length(y) = nrow(X)}. Accepts "numeric" (family="gaussian") or binary "factor" (family="binomial"). Can also be a survival object of class Surv
139 139   #' as obtained from y = survival::Surv(time, status).
          #' @param y response vector with \code{length(y) = nrow(X)}. Accepts "numeric" (family="gaussian") or binary "factor" (family="binomial"). Can also be a survival object of class Surv
196 196   #' as obtained from y = survival::Surv(time, status).
197 197   #' @param type should be "regression" if y is numeric, "classification" if y is a binary factor variable or "survival" if y is a survival object.
198     - #' @param ...
199 198   #'
200 199   #' @return importance scores
201 200   #' @export
202 201   #'
203 202   #' @keywords internal
204     - random_forest_importance_scores<-function(X, y, trt, type="regression", ...){
    203 + random_forest_importance_scores<-function(X, y, trt, type="regression"){
if (is.continuous) warning("Some of the numeric columns of X have suspiciously few distinct values: n_distinct <= 30. Those columns should perhaps not be treated as continuous variables. Please review carefully and read the documentation about the gcm parameter of the knockoff.statistics function.")
          #' @param y response vector with \code{length(y) = nrow(X)}. Accepts "numeric", binary "factor", or survival ("Surv") object.
158 158   #' @param X data.frame (or tibble) with "numeric" and "factor" columns only. The number of columns, ncol(X) needs to be > 2.
159 159   #' @param type should be "regression" if y is numeric, "classification" if y is a binary factor variable or "survival" if y is a survival object.
160     - #' @param M the number of independent knockoff feature statistics that should be calculated.
161 160   #' @param knockoff.method what type of knockoffs to calculate. Defaults to sequential knockoffs, knockoff.method="seq", but other options are "sparseseq" and "mx". The "mx" option only works if all columns of X are continuous.
162 161   #' @param statistic knockoff feature statistic function, defaults to glmnet coefficient difference (statistic="stat_glmnet"; see ?stat_glmnet). Other options include statistic="stat_random_forest" (see ?stat_random_forest), statistic="stat_predictive_glmnet" (see ?stat_predictive_glmnet) or statistic="stat_predictive_causal_forest" (see ?stat_predictive_causal_forest).
163 162   #' @param trt binary treatment (factor) variable required if statistic involves a predictive knockoff filter (i.e. if statistic="stat_predictive_glmnet" or statistic="stat_predictive_causal_forest")
          #' @param y response vector with \code{length(y) = nrow(X)}. Accepts "numeric" (family="gaussian") or binary "factor" (family="binomial"). Can also be a survival object of class "Surv" (type="survival")
316 315   #' as obtained from y = survival::Surv(time, status).
317 316   #' @param type should be "regression" if y is numeric, "classification" if y is a binary factor variable or "survival" if y is a survival object.
318     - #' @param ...
    317 + #' @param ... other parameters passed to \code{random_forest_importance_scores}.
319 318   #'
320 319   #' @return data.frame with knockoff statistics W as column. The number of rows matches the number of columns (variables) of the data.frame X and the variable names are recorded in rownames(W).
File: R/plot.R
Lines changed: 17 additions & 11 deletions
@@ -1,17 +1,25 @@
      1 +
  1   2   #' Heatmap of multiple variable selections ordered by importance
  2   3   #'
  3     - #' @param S data.frame of variable selections from multiple knockoffs (each entry is either 1 if variable is selected and 0 otherwise). Columns correspond to different knockoffs and rows correspond to the underlying variables. row.names(S) records the variable names.
      4 + #' @param x data.frame of variable selections from multiple knockoffs
      5 + #' (each entry is either 1 if variable is selected and 0 otherwise).
      6 + #' Columns correspond to different knockoffs and rows correspond to the
      7 + #' underlying variables. row.names(x) records the variable names.
      8 + #'
      9 + #' @param ... Additional arguments passed to other plot methods (currently ignored).
     10 + #'
  4  11   #' @param nbcocluster bivariate vector c(number of variable clusters, number of selection clusters).
  5     - #' The former number must be specified less than nrow(S) and the latter must be less than ncol(S).
     12 + #' The former number must be specified less than nrow(x) and the latter must be less than ncol(x).
  6  13   #'
  7  14   #' @details To help visualize most important variables we perform clustering both selections and variables.