% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/csv.R
\name{read_cmdstan_csv}
\alias{read_cmdstan_csv}
\alias{as_cmdstan_fit}
\title{Read CmdStan CSV files into R}
\usage{
read_cmdstan_csv(
  files,
  variables = NULL,
  sampler_diagnostics = NULL,
  format = getOption("cmdstanr_draws_format", NULL)
)

as_cmdstan_fit(
  files,
  variables = NULL,
  check_diagnostics = TRUE,
  format = getOption("cmdstanr_draws_format")
)
}
\arguments{
\item{files}{(character vector) The paths to the CmdStan CSV files. These can
be files generated by running CmdStanR or running CmdStan directly.}
\item{variables}{(character vector) Optionally, the names of the variables
(parameters, transformed parameters, and generated quantities) to read in.
\itemize{
\item If \code{NULL} (the default) then all variables are included.
\item If an empty string (\code{variables=""}) then none are included.
\item For non-scalar variables all elements or specific elements can be selected:
\itemize{
\item \code{variables = "theta"} selects all elements of \code{theta};
\item \code{variables = c("theta[1]", "theta[3]")} selects only the 1st and 3rd elements.
}
}}
\item{sampler_diagnostics}{(character vector) Works the same way as
\code{variables} but for sampler diagnostic variables (e.g., \code{"treedepth__"},
\code{"accept_stat__"}, etc.). Ignored if the model was not fit using MCMC.}
\item{format}{(string) The format for storing the draws or point estimates.
The default depends on the method used to fit the model. See
\link[=fit-method-draws]{draws} for details, in particular the note about speed
and memory for models with many parameters.}
\item{check_diagnostics}{(logical) For models fit using MCMC, should
diagnostic checks be performed after reading in the files? The default is
\code{TRUE} but set to \code{FALSE} to avoid checking for problems with divergences
and treedepth.}
}
\value{
\code{as_cmdstan_fit()} returns a \link{CmdStanMCMC}, \link{CmdStanMLE}, \link{CmdStanLaplace} or
\link{CmdStanVB} object. Some methods typically defined for those objects will not
work (e.g. \code{save_data_file()}) but the important methods like \verb{$summary()},
\verb{$draws()}, \verb{$sampler_diagnostics()} and others will work fine.

\code{read_cmdstan_csv()} returns a named list with the following components:
\itemize{
\item \code{metadata}: A list of the meta information from the run that produced the
CSV file(s). See \strong{Examples} below.
}
The other components in the returned list depend on the method that produced
the CSV file(s).

For \link[=model-method-sample]{sampling} the returned list also includes the
following components:
\itemize{
\item \code{time}: Run time information for the individual chains. The returned object
is the same as for the \link[=fit-method-time]{$time()} method except the total run
time can't be inferred from the CSV files (the chains may have been run in
parallel) and is therefore \code{NA}.
\item \code{inv_metric}: A list (one element per chain) of inverse mass matrices
or their diagonals, depending on the type of metric used.
\item \code{step_size}: A list (one element per chain) of the step sizes used.
\item \code{warmup_draws}: If \code{save_warmup} was \code{TRUE} when fitting the model then a
\code{\link[posterior:draws_array]{draws_array}} (or different format if \code{format} is
specified) of warmup draws.
\item \code{post_warmup_draws}: A \code{\link[posterior:draws_array]{draws_array}} (or
different format if \code{format} is specified) of post-warmup draws.
\item \code{warmup_sampler_diagnostics}: If \code{save_warmup} was \code{TRUE} when fitting the
model then a \code{\link[posterior:draws_array]{draws_array}} (or different format if
\code{format} is specified) of warmup draws of the sampler diagnostic variables.
\item \code{post_warmup_sampler_diagnostics}: A
\code{\link[posterior:draws_array]{draws_array}} (or different format if \code{format} is
specified) of post-warmup draws of the sampler diagnostic variables.
}
For \link[=model-method-optimize]{optimization} the returned list also includes the
following components:
\itemize{
\item \code{point_estimates}: Point estimates for the model parameters.
}
For \link[=model-method-laplace]{laplace} and
\link[=model-method-variational]{variational inference} the returned list also
includes the following components:
\itemize{
\item \code{draws}: A \code{\link[posterior:draws_matrix]{draws_matrix}} (or different format
if \code{format} is specified) of draws from the approximate posterior
distribution.
}
For \link[=model-method-generate-quantities]{standalone generated quantities} the
returned list also includes the following components:
\itemize{
\item \code{generated_quantities}: A \code{\link[posterior:draws_array]{draws_array}} of
the generated quantities.
}
}
\description{
\code{read_cmdstan_csv()} is used internally by CmdStanR to read
CmdStan's output CSV files into \R. It can also be used by CmdStan users as
a more flexible and efficient alternative to \code{rstan::read_stan_csv()}. See
the \strong{Value} section for details on the structure of the returned list.

It is also possible to create CmdStanR's fitted model objects directly from
CmdStan CSV files using the \code{as_cmdstan_fit()} function.
}
\examples{
\dontrun{
# Generate some CSV files to use for demonstration
fit1 <- cmdstanr_example("logistic", method = "sample", save_warmup = TRUE)
csv_files <- fit1$output_files()
print(csv_files)
# Creating fitted model objects
# Create a CmdStanMCMC object from the CSV files
fit2 <- as_cmdstan_fit(csv_files)
fit2$print("beta")
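
# A possible variation (a sketch, assuming the same csv_files as above):
# skip the checks for divergences and treedepth after reading the files by
# setting check_diagnostics = FALSE, as described under Arguments
fit3 <- as_cmdstan_fit(csv_files, check_diagnostics = FALSE)
fit3$summary("beta")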
# Using read_cmdstan_csv
#
# Read in everything
x <- read_cmdstan_csv(csv_files)
str(x)
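
# The components described in the Value section can then be accessed by name.
# A brief sketch, assuming the files came from MCMC with save_warmup = TRUE
# (as for fit1 above); the posterior package is used here only for illustration
names(x$metadata)
x$step_size
posterior::variables(x$post_warmup_draws)
posterior::ndraws(x$warmup_draws)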
# Don't read in any of the sampler diagnostic variables
x <- read_cmdstan_csv(csv_files, sampler_diagnostics = "")
# Don't read in any of the parameters or generated quantities
x <- read_cmdstan_csv(csv_files, variables = "")
# Read in only specific parameters and sampler diagnostics
x <- read_cmdstan_csv(
  csv_files,
  variables = c("alpha", "beta[2]"),
  sampler_diagnostics = c("n_leapfrog__", "accept_stat__")
)
# For non-scalar parameters all elements can be selected or only some elements,
# e.g. all of the vector "beta" but only one element of the vector "log_lik"
x <- read_cmdstan_csv(
  csv_files,
  variables = c("beta", "log_lik[3]")
)
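
# The draws can also be stored in a different format via the format argument.
# A sketch assuming "draws_df" (one of the posterior package's draws formats)
# is the desired format; see the draws documentation for the accepted values
x <- read_cmdstan_csv(csv_files, format = "draws_df")
class(x$post_warmup_draws)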
}
}