Commit f15b20d

add more instructions about where to install packages when making Docker image
1 parent 6c2db9c commit f15b20d

6 files changed

Lines changed: 130 additions & 49 deletions

book/r-config.qmd

Lines changed: 20 additions & 0 deletions
@@ -14,6 +14,26 @@ import os
1414
print(os.environ["PATH"])
1515
```
1616

17+
## Installing R packages
18+
19+
By default, R uses a user library in the user's home directory. If the home directory is persistent, then packages installed using
20+
```
21+
install.packages()
22+
```
23+
will by default be installed there and will be persistent.
24+
25+
The 2nd and 3rd paths on `.libPaths()` are in the `/usr` directory. They are recreated each time the Jupyter Hub is restarted, so any package that a user installs there will disappear.
26+
27+
However, this means that if you install R packages in a Docker image, they will by default go to the `/home/jovyan` user library. That library will get wiped out in a Jupyter Hub where the user home is persistent, since whatever is in `/home` during the Docker build is replaced by the user home directory. In a Docker build, make sure to use
28+
```
29+
install.packages(...., lib=file.path(Sys.getenv("R_HOME"), "site-library"))
30+
```
31+
or use the helper script plus an `install.R` file in your Dockerfile:
32+
```
33+
COPY . /tmp2/
34+
RUN /pyrocket_scripts/install-r-packages.sh /tmp2/install.R
35+
```
36+
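A quick guard can catch the `/home` pitfall described above before a build step installs anything. The following is only a minimal sketch: the function name `lib_survives_home_mount` is hypothetical and not part of py-rocket, and it assumes that the only problematic location is `/home` and anything below it.

```shell
# Hypothetical helper (not part of py-rocket): succeed only when the given
# library path lies outside /home, i.e. it would not be hidden when a
# persistent user home directory replaces /home.
lib_survives_home_mount() {
  case "$1" in
    /home|/home/*) return 1 ;;  # replaced by the user home directory
    *) return 0 ;;              # stays part of the image
  esac
}
```

In a build script this could be used as `lib_survives_home_mount "${R_HOME}/site-library" || exit 1`, assuming `R_HOME` is set in that shell.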
1737
## Using R in Jupyter Lab
1838

1939
In Jupyter Lab, you select an R kernel from the upper right. You can then use R code in the notebook. It will use the R installation in py-rocket with all the preloaded libraries.

book/r-packages.qmd

Lines changed: 25 additions & 8 deletions
Original file line numberDiff line numberDiff line change
@@ -1,13 +1,26 @@
11
# R packages
22

3-
To install extra R packages, use `install.R`. This is treated as an R script which is run. For example, you can have a script like
3+
To install extra R packages in your Docker image, use `install.R` and `Rscript` in your Dockerfile.
4+
5+
```
6+
COPY install.R /tmp/install.R
7+
RUN Rscript /tmp/install.R
8+
```
9+
Make sure to install to `"${R_HOME}/site-library"`. By default, `install.packages()` installs to the user library in `/home`, and in a Jupyter Hub with a persistent home directory, whatever was in `/home` at build time is replaced by the user home directory.
410

511
install.R
612
```markdown
713
repo <- "https://p3m.dev/cran/__linux__/jammy/2024-05-13"
8-
list.of.packages <- c("ggplot2","remotes")
9-
install.packages(list.of.packages, repos=repo)
10-
remotes::install_github("hadley/httr@v0.4")
14+
lib <- file.path(Sys.getenv("R_HOME"), "site-library")
15+
list.of.packages <- c("ggplot2","remotes")
16+
install.packages(list.of.packages, repos=repo, lib=lib)
17+
remotes::install_github("hadley/httr@v0.4", lib=lib)
18+
```
19+
20+
You can also use the helper script, which makes sure packages go to the site-library:
21+
```
22+
COPY . /tmp2/
23+
RUN /pyrocket_scripts/install-r-packages.sh /tmp2/install.R
1124
```
1225

1326
### Spatial libraries
@@ -23,11 +36,15 @@ There are a few ways to get around this.
2336
* Install the necessary linux packages via apt-get. This can be hard.
2437
* Install via /rocker_scripts/install_geospatial.sh
2538
To do this include
26-
```markdown
27-
RUN /rocker_scipts/install_geospatial.sh
2839
```
29-
in your Dockerfile.
30-
* Use r2u which has Ubuntu binaries with all the dependencies included.
40+
RUN echo '.libPaths(file.path(Sys.getenv("R_HOME"), "site-library"))' > /tmp/rprofile.site
41+
RUN env R_PROFILE=/tmp/rprofile.site \
42+
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin \
43+
/rocker_scripts/install_geospatial.sh
44+
RUN rm /tmp/rprofile.site
45+
```
46+
in your Dockerfile. The temporary `R_PROFILE` makes sure everything is installed to `"${R_HOME}/site-library"`, and the explicit `PATH` keeps conda off the path, since conda on the PATH would break the needed Linux installs.
47+
* Use [r2u](https://github.com/eddelbuettel/r2u) which has Ubuntu binaries with all the dependencies included.
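
For the `install_geospatial.sh` option, the Dockerfile lines hardcode a conda-free `PATH`. As an alternative sketch only (this is not what py-rocket does), the conda entries could be filtered out of an existing `PATH` instead. The function name `strip_conda_from_path` is hypothetical, and the sketch assumes that every conda directory has `conda` somewhere in its path and that no entry contains glob characters.

```shell
# Hypothetical helper (not part of py-rocket): print a PATH-style string
# with every entry that mentions "conda" removed.
strip_conda_from_path() {
  old_ifs=$IFS
  IFS=:                       # split the argument on colons
  result=""
  for entry in $1; do
    case "$entry" in
      *conda*) ;;             # drop entries that mention conda
      *) result="${result:+$result:}$entry" ;;
    esac
  done
  IFS=$old_ifs                # restore the original field separator
  printf '%s\n' "$result"
}
```

For example, `strip_conda_from_path "/srv/conda/envs/notebook/bin:/usr/local/bin:/usr/bin"` prints `/usr/local/bin:/usr/bin`. Hardcoding the `PATH`, as in the Dockerfile lines, is simpler and does not depend on how the conda directories are named.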
3148

3249
### Default CRAN repository
3350

docs/r-config.html

Lines changed: 31 additions & 18 deletions
Original file line numberDiff line numberDiff line change
@@ -192,14 +192,15 @@
192192
<h2 id="toc-title">Table of contents</h2>
193193

194194
<ul>
195-
<li><a href="#using-r-in-jupyter-lab" id="toc-using-r-in-jupyter-lab" class="nav-link active" data-scroll-target="#using-r-in-jupyter-lab"><span class="header-section-number">8.1</span> Using R in Jupyter Lab</a></li>
196-
<li><a href="#using-python-in-r-rstudio-or-jupyter-lab-with-r-kernel" id="toc-using-python-in-r-rstudio-or-jupyter-lab-with-r-kernel" class="nav-link" data-scroll-target="#using-python-in-r-rstudio-or-jupyter-lab-with-r-kernel"><span class="header-section-number">8.2</span> Using Python in R (RStudio or Jupyter Lab with R kernel)</a>
195+
<li><a href="#installing-r-packages" id="toc-installing-r-packages" class="nav-link active" data-scroll-target="#installing-r-packages"><span class="header-section-number">8.1</span> Installing R packages</a></li>
196+
<li><a href="#using-r-in-jupyter-lab" id="toc-using-r-in-jupyter-lab" class="nav-link" data-scroll-target="#using-r-in-jupyter-lab"><span class="header-section-number">8.2</span> Using R in Jupyter Lab</a></li>
197+
<li><a href="#using-python-in-r-rstudio-or-jupyter-lab-with-r-kernel" id="toc-using-python-in-r-rstudio-or-jupyter-lab-with-r-kernel" class="nav-link" data-scroll-target="#using-python-in-r-rstudio-or-jupyter-lab-with-r-kernel"><span class="header-section-number">8.3</span> Using Python in R (RStudio or Jupyter Lab with R kernel)</a>
197198
<ul class="collapse">
198-
<li><a href="#py_require" id="toc-py_require" class="nav-link" data-scroll-target="#py_require"><span class="header-section-number">8.2.1</span> <code>py_require()</code></a></li>
199-
<li><a href="#using-a-conda-environment" id="toc-using-a-conda-environment" class="nav-link" data-scroll-target="#using-a-conda-environment"><span class="header-section-number">8.2.2</span> Using a conda environment</a></li>
199+
<li><a href="#py_require" id="toc-py_require" class="nav-link" data-scroll-target="#py_require"><span class="header-section-number">8.3.1</span> <code>py_require()</code></a></li>
200+
<li><a href="#using-a-conda-environment" id="toc-using-a-conda-environment" class="nav-link" data-scroll-target="#using-a-conda-environment"><span class="header-section-number">8.3.2</span> Using a conda environment</a></li>
200201
</ul></li>
201-
<li><a href="#dealing-with-ssl-mismatch-errors" id="toc-dealing-with-ssl-mismatch-errors" class="nav-link" data-scroll-target="#dealing-with-ssl-mismatch-errors"><span class="header-section-number">8.3</span> Dealing with SSL mismatch errors</a></li>
202-
<li><a href="#developers" id="toc-developers" class="nav-link" data-scroll-target="#developers"><span class="header-section-number">8.4</span> Developers</a></li>
202+
<li><a href="#dealing-with-ssl-mismatch-errors" id="toc-dealing-with-ssl-mismatch-errors" class="nav-link" data-scroll-target="#dealing-with-ssl-mismatch-errors"><span class="header-section-number">8.4</span> Dealing with SSL mismatch errors</a></li>
203+
<li><a href="#developers" id="toc-developers" class="nav-link" data-scroll-target="#developers"><span class="header-section-number">8.5</span> Developers</a></li>
203204
</ul>
204205
<div class="toc-actions"><ul><li><a href="https://github.com/nmfs-opensci/py-rocket-base/edit/main/r-config.qmd" class="toc-action"><i class="bi bi-github"></i>Edit this page</a></li><li><a href="https://github.com/nmfs-opensci/py-rocket-base/blob/main/r-config.qmd" class="toc-action"><i class="bi empty"></i>View source</a></li><li><a href="https://github.com/nmfs-opensci/py-rocket-base/issues/new" class="toc-action"><i class="bi empty"></i>Report an issue</a></li></ul></div></nav>
205206
</div>
@@ -232,15 +233,27 @@ <h1 class="title"><span class="chapter-number">8</span>&nbsp; <span class="chapt
232233
<p>Try this in a Jupyter Notebook in Jupyter Lab:</p>
233234
<pre><code>import os
234235
print(os.environ["PATH"])</code></pre>
235-
<section id="using-r-in-jupyter-lab" class="level2" data-number="8.1">
236-
<h2 data-number="8.1" class="anchored" data-anchor-id="using-r-in-jupyter-lab"><span class="header-section-number">8.1</span> Using R in Jupyter Lab</h2>
236+
<section id="installing-r-packages" class="level2" data-number="8.1">
237+
<h2 data-number="8.1" class="anchored" data-anchor-id="installing-r-packages"><span class="header-section-number">8.1</span> Installing R packages</h2>
238+
<p>By default, R uses a user library in the user’s home directory. If the home directory is persistent, then packages installed using</p>
239+
<pre><code>install.packages()</code></pre>
240+
<p>will by default be installed there and will be persistent.</p>
241+
<p>The 2nd and 3rd paths on <code>.libPaths()</code> are in the <code>/usr</code> directory. They are recreated each time the Jupyter Hub is restarted, so any package that a user installs there will disappear.</p>
242+
<p>However, this means that if you install R packages in a Docker image, they will by default go to the <code>/home/jovyan</code> user library. That library will get wiped out in a Jupyter Hub where the user home is persistent, since whatever is in <code>/home</code> during the Docker build is replaced by the user home directory. In a Docker build, make sure to use</p>
243+
<pre><code>install.packages(...., lib=file.path(Sys.getenv("R_HOME"), "site-library"))</code></pre>
244+
<p>or use the helper script plus an <code>install.R</code> file in your Dockerfile:</p>
245+
<pre><code>COPY . /tmp2/
246+
RUN /pyrocket_scripts/install-r-packages.sh /tmp2/install.R</code></pre>
247+
</section>
248+
<section id="using-r-in-jupyter-lab" class="level2" data-number="8.2">
249+
<h2 data-number="8.2" class="anchored" data-anchor-id="using-r-in-jupyter-lab"><span class="header-section-number">8.2</span> Using R in Jupyter Lab</h2>
237250
<p>In Jupyter Lab, you select an R kernel from the upper right. You can then use R code in the notebook. It will use the R installation in py-rocket with all the preloaded libraries.</p>
238251
</section>
239-
<section id="using-python-in-r-rstudio-or-jupyter-lab-with-r-kernel" class="level2" data-number="8.2">
240-
<h2 data-number="8.2" class="anchored" data-anchor-id="using-python-in-r-rstudio-or-jupyter-lab-with-r-kernel"><span class="header-section-number">8.2</span> Using Python in R (RStudio or Jupyter Lab with R kernel)</h2>
252+
<section id="using-python-in-r-rstudio-or-jupyter-lab-with-r-kernel" class="level2" data-number="8.3">
253+
<h2 data-number="8.3" class="anchored" data-anchor-id="using-python-in-r-rstudio-or-jupyter-lab-with-r-kernel"><span class="header-section-number">8.3</span> Using Python in R (RStudio or Jupyter Lab with R kernel)</h2>
241254
<p>The following behavior is specific to R, not the GUI (RStudio or Jupyter Lab with R kernel) that you are using to interact with it.</p>
242-
<section id="py_require" class="level3" data-number="8.2.1">
243-
<h3 data-number="8.2.1" class="anchored" data-anchor-id="py_require"><span class="header-section-number">8.2.1</span> <code>py_require()</code></h3>
255+
<section id="py_require" class="level3" data-number="8.3.1">
256+
<h3 data-number="8.3.1" class="anchored" data-anchor-id="py_require"><span class="header-section-number">8.3.1</span> <code>py_require()</code></h3>
244257
<p>To use Python, you use the <code>reticulate</code> library. If you only need a handful of Python packages, it will simplify things if you use <code>py_require()</code>. Like this</p>
245258
<pre><code>library(reticulate)
246259
py_require("xarray")</code></pre>
@@ -249,8 +262,8 @@ <h3 data-number="8.2.1" class="anchored" data-anchor-id="py_require"><span class
249262
<pre><code>rm ~/.cache/R/reticulate</code></pre>
250263
<p>in a terminal to get reticulate to allow me to use <code>use_conda("notebook")</code> in another R session.</p>
251264
</section>
252-
<section id="using-a-conda-environment" class="level3" data-number="8.2.2">
253-
<h3 data-number="8.2.2" class="anchored" data-anchor-id="using-a-conda-environment"><span class="header-section-number">8.2.2</span> Using a conda environment</h3>
265+
<section id="using-a-conda-environment" class="level3" data-number="8.3.2">
266+
<h3 data-number="8.3.2" class="anchored" data-anchor-id="using-a-conda-environment"><span class="header-section-number">8.3.2</span> Using a conda environment</h3>
254267
<p>You can also use the conda environment with reticulate with all the pre-installed packages.</p>
255268
<pre><code>library(reticulate)
256269
use_condaenv("notebook")</code></pre>
@@ -265,14 +278,14 @@ <h3 data-number="8.2.2" class="anchored" data-anchor-id="using-a-conda-environme
265278
<p>When we use a conda environment, the PATH is altered so that the conda environment directory appears first on the PATH. Any R packages that need a particular system package that is also in conda (like GDAL) are likely to throw mismatch errors.</p>
266279
</section>
267280
</section>
268-
<section id="dealing-with-ssl-mismatch-errors" class="level2" data-number="8.3">
269-
<h2 data-number="8.3" class="anchored" data-anchor-id="dealing-with-ssl-mismatch-errors"><span class="header-section-number">8.3</span> Dealing with SSL mismatch errors</h2>
281+
<section id="dealing-with-ssl-mismatch-errors" class="level2" data-number="8.4">
282+
<h2 data-number="8.4" class="anchored" data-anchor-id="dealing-with-ssl-mismatch-errors"><span class="header-section-number">8.4</span> Dealing with SSL mismatch errors</h2>
270283
<p>When you use reticulate in R with <code>use_condaenv()</code> and call a function that needs to download data, you are liable to get an OpenSSL mismatch error. py-rocket solves this by adding</p>
271284
<pre><code>rsession-ld-library-path=/srv/conda/envs/notebook/lib</code></pre>
272285
<p>to <code>/etc/rstudio/rserver.conf</code>. This lets R know where to look for SSL links and hopefully doesn’t break R packages. Make sure that <code>.Renviron</code> does not set <code>LD_LIBRARY_PATH</code> or this solution will not work. I don’t know why but it breaks.</p>
273286
</section>
274-
<section id="developers" class="level2" data-number="8.4">
275-
<h2 data-number="8.4" class="anchored" data-anchor-id="developers"><span class="header-section-number">8.4</span> Developers</h2>
287+
<section id="developers" class="level2" data-number="8.5">
288+
<h2 data-number="8.5" class="anchored" data-anchor-id="developers"><span class="header-section-number">8.5</span> Developers</h2>
276289
<p>How is the R kernel created so that it shows up in Jupyter Lab? You don’t need to install R into the conda environment since it already is in the image. We just need to use <code>IRkernel</code> R package to register the kernel with jupyter.</p>
277290
<pre><code>Rscript - &lt;&lt;-"EOF"
278291
install.packages('IRkernel', lib = .Library) # install in system library
