2 changes: 1 addition & 1 deletion .github/workflows/R-CMD-check.yaml
@@ -39,7 +39,7 @@ jobs:

steps:
- name: Checkout ACRO
uses: actions/checkout@v4
uses: actions/checkout@v5

- name: Setup Python
uses: actions/setup-python@v5
2 changes: 1 addition & 1 deletion .github/workflows/pkgdown.yaml
@@ -22,7 +22,7 @@ jobs:

steps:
- name: Checkout
uses: actions/checkout@v4
uses: actions/checkout@v5

- name: Setup Python
uses: actions/setup-python@v5
2 changes: 1 addition & 1 deletion .github/workflows/test-coverage.yaml
@@ -18,7 +18,7 @@ jobs:

steps:
- name: Checkout ACRO
uses: actions/checkout@v4
uses: actions/checkout@v5

- name: Setup Python
uses: actions/setup-python@v5
28 changes: 14 additions & 14 deletions Installation Guides.md
@@ -6,36 +6,36 @@ Keeping this comprehensive will require input from the community.

So please email sacro.contact@uwe.ac.uk, or [raise an issue on the GitHub repository](https://github.com/AI-SDC/ACRO-R/issues/new/choose) if:
- you have a setting that is not covered, or
- the steps outlined below do not work for you.

**Please note**: most of the scenarios below assume that
- you have a working version of Python 3 (version 3.9 or higher) on your system
- you are able to access a terminal or command prompt to write and execute some commands.

---

## Step 1 Create a Python virtual environment and install the base Python package *acro*
**In every case** we recommend that you create a Python virtual environment called **r-acro**.
Virtual environments (*venvs*) are recommended best practice.
This is because they isolate the impact of any changes you make in one venv, such as adding or updating a package, from the rest of your system.

There are many tutorials available on the web if you get stuck.
We do not endorse any particular site, but here are some examples:
- [an overview with examples for windows/linux/mac](https://python.land/virtual-environments/virtualenv)
- [another that also contains instructions for VSCode and Pycharm](https://realpython.com/python-virtual-environments-a-primer/)

**For individual users** we suggest that you do this in your home directory where you should have write permission.

**To install site-wide** we assume you have access rights and know where your organisation's preferred locations are (for example, this might be ```/usr/local``` on a linux system).

### Make a dedicated virtual environment
You can make a new virtual environment via:
- the Anaconda GUI interface to the conda system
- command line access - by opening a terminal or command prompt and entering the command:
```sh
conda create --name r-acro
```
if you have a version of conda installed or
```sh
python -m venv ./r-acro
```
@@ -74,16 +74,16 @@ source r-acro/bin/activate
# you should see your command prompt change to show (r-acro)
python -m pip install acro
# assuming this completes successfully you can now exit the virtual environment
deactivate
```
---

## Step 2 Install the R packages *reticulate* and *acro*

The *reticulate* package is the industry-standard method for supporting communications between R and Python.
It provides the 'plumbing' between the R 'front-end' and the Python *acro* package.

These commands should work whether you are
- working on a machine outside the TRE: in which case packages should install from a mirror of the CRAN service
- working on a machine inside a TRE: in which case the administrator should have set up a local mirror of approved packages from CRAN

@@ -128,7 +128,7 @@ If you follow the menu items from ```Tools->Project Options ->Python``` or ```To
library(reticulate)
library("acro")
```

### Option 3 - Editing your personal R preferences
In your home directory create (or edit) the ```.Rprofile``` file, adding the lines

@@ -139,5 +139,5 @@ Sys.setenv(RETICULATE_PYTHON_ENV=file.path(Sys.getenv("USERPROFILE"),"r-acro"))



### Option 4 - Making site-wide changes
You can also edit the [site-wide Rprofile]() file to add these global environment variables, replacing *~/r-acro* with the path to wherever you created the dedicated virtual environment.
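
As a rough illustration of the kind of line that could go in an `.Rprofile`, the sketch below builds the path to the **r-acro** environment in a way that should work on both Windows and Unix-like systems. The fallback to `path.expand("~")` and the variable names `home` and `venv` are assumptions made for this example, not part of the guide; adjust the path if the environment was created elsewhere.

```r
# Assumption: the venv lives in the user's home directory and is called "r-acro".
# USERPROFILE is normally only set on Windows; fall back to "~" elsewhere.
home <- Sys.getenv("USERPROFILE", unset = path.expand("~"))
venv <- file.path(home, "r-acro")

# Point reticulate at the dedicated environment for this R session
Sys.setenv(RETICULATE_PYTHON_ENV = venv)

# Confirm what was set
Sys.getenv("RETICULATE_PYTHON_ENV")
```

For a site-wide setup the same idea applies, with `venv` replaced by the fixed path chosen by the administrator.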
80 changes: 40 additions & 40 deletions example-notebook.Rmd
@@ -25,21 +25,21 @@ acro_init()
```

### Load the data
- The dataset used in this example notebook is the nursery dataset from OpenML.
- The code below reads the data from a folder called "nursery_data" which we assume is at the same level as the folder where you are working.
- The path might need to be changed if the data has been downloaded and stored elsewhere.
```{r}
data = farff::readARFF("nursery_data/nursery.arff")
data = as.data.frame(data)
data <- farff::readARFF("nursery_data/nursery.arff")
data <- as.data.frame(data)

names(data)[names(data) == "class"] <- "recommend"
```

- Convert the children column to integers, replacing 'more' with random int from range 4-10

```{r}
data$children <-as.numeric(as.character(data$children))
data[is.na(data)] <- round(runif(sum(is.na(data)), min = 4, max = 10),0)
data$children <- as.numeric(as.character(data$children))
data[is.na(data)] <- round(runif(sum(is.na(data)), min = 4, max = 10), 0)
unique(data$children)
```
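
One thing to be aware of: `data[is.na(data)]` indexes the whole data frame, so any `NA` in another column would also be overwritten. The sketch below, on a small made-up data frame (the values and the name `toy` are hypothetical), shows a variant that restricts the replacement to the `children` column.

```r
# Toy stand-in for the nursery data (hypothetical values)
toy <- data.frame(
  children = c("1", "2", "3", "more", "more"),
  finance = c("convenient", NA, "inconv", "convenient", "inconv"),
  stringsAsFactors = TRUE
)

# Go via as.character() so the factor labels, not the level codes, are parsed;
# "more" cannot be parsed as a number and becomes NA
toy$children <- suppressWarnings(as.numeric(as.character(toy$children)))

# Replace only the NAs in the children column with random integers from 4 to 10
set.seed(1) # only to make the example repeatable
missing <- is.na(toy$children)
toy$children[missing] <- round(runif(sum(missing), min = 4, max = 10), 0)

toy$children
toy$finance # the NA in this column is left untouched
```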

@@ -48,35 +48,35 @@ unique(data$children)
#### ACRO Crosstab

```{r}
index = data[, c("recommend")]
columns = data[, c("parents")]
values = data[, c("children")]
index <- data[, c("recommend")]
columns <- data[, c("parents")]
values <- data[, c("children")]

# convert the values to an array
values = matrix(values, ncol=1)
values <- matrix(values, ncol = 1)

table = acro_crosstab(index = index, columns= columns, values = values, aggfunc = "sum")
table <- acro_crosstab(index = index, columns = columns, values = values, aggfunc = "sum")
table
```
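
To get a feel for the aggregation being requested, here is a base R sketch of the same kind of sum-by-two-factors table on made-up data. It uses `tapply()` rather than *acro*, so it performs no disclosure checks; it only illustrates the shape of the result, and the name `toy` and its values are hypothetical.

```r
# Toy data (hypothetical values)
toy <- data.frame(
  recommend = c("priority", "priority", "not_recom", "not_recom", "priority"),
  parents = c("usual", "pretentious", "usual", "usual", "usual"),
  children = c(1, 2, 3, 1, 2)
)

# Sum of children for each recommend x parents combination;
# combinations with no rows come out as NA
sums <- tapply(toy$children, list(toy$recommend, toy$parents), sum)
sums
```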

#### ACRO table

```{r}
index = data[, c("parents")]
columns = data[, c("social")]
index <- data[, c("parents")]
columns <- data[, c("social")]

table = acro_table(index=index, columns=columns, deparse.level=1)
table <- acro_table(index = index, columns = columns, deparse.level = 1)
table
```

#### ACRO pivot table

```{r}
index = "parents"
values = "children"
aggfunc = list("mean", "std")
index <- "parents"
values <- "children"
aggfunc <- list("mean", "std")

table = acro_pivot_table(data, values=values, index=index, aggfunc=aggfunc)
table <- acro_pivot_table(data, values = values, index = index, aggfunc = aggfunc)
table
```
#### ACRO histogram
@@ -91,9 +91,9 @@ In this example a different data set will be used. The lung dataset from the sur
```{r}
# Load the lung dataset
data(lung)
#head(lung)
# head(lung)

acro_surv_func(time=lung$time, status=lung$status, output ="plot")
acro_surv_func(time = lung$time, status = lung$status, output = "plot")
```

### Examples of producing regression outputs using acro
@@ -104,27 +104,27 @@ acro_surv_func(time=lung$time, status=lung$status, output ="plot")

```{r}
data$recommend <- as.character(data$recommend)
data$recommend[which(data$recommend=="not_recom")] <- "0"
data$recommend[which(data$recommend=="recommend")] <- "1"
data$recommend[which(data$recommend=="very_recom")] <- "2"
data$recommend[which(data$recommend=="priority")] <- "3"
data$recommend[which(data$recommend=="spec_prior")] <- "4"
data$recommend[which(data$recommend == "not_recom")] <- "0"
data$recommend[which(data$recommend == "recommend")] <- "1"
data$recommend[which(data$recommend == "very_recom")] <- "2"
data$recommend[which(data$recommend == "priority")] <- "3"
data$recommend[which(data$recommend == "spec_prior")] <- "4"
data$recommend <- as.numeric(data$recommend)
```
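
The same recoding can be written more compactly with a named lookup vector. This is only an alternative sketch on a small made-up vector (the names `recommend`, `codes` and `recoded` are chosen for the example); the notebook's line-by-line version gives the same mapping.

```r
# A few example labels (hypothetical sample)
recommend <- c("not_recom", "priority", "spec_prior", "very_recom", "recommend")

# Label -> numeric code, matching the mapping used in the notebook
codes <- c(not_recom = 0, recommend = 1, very_recom = 2, priority = 3, spec_prior = 4)

# Index the lookup vector by label, then drop the names
recoded <- unname(codes[recommend])
recoded
```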

```{r}
# extract relevant columns
df = data[, c("recommend", "children")]
df <- data[, c("recommend", "children")]
# drop rows with missing values
df = df[complete.cases(df), ]
df <- df[complete.cases(df), ]
# formula to fit
formula = "recommend ~ children"
formula <- "recommend ~ children"
```

#### ACRO Linear Model

```{r}
acro_lm(formula=formula, data=df)
acro_lm(formula = formula, data = df)
```

#### ACRO Logit Model
@@ -133,25 +133,25 @@ We use a different combination of variables from the original dataset.

```{r}
# extract relevant columns
df = data[, c("finance", "children")]
df <- data[, c("finance", "children")]
# drop rows with missing values
df = df[complete.cases(df), ]
df <- df[complete.cases(df), ]
# convert finance to numeric
df = transform(df, finance = as.numeric(finance))
df <- transform(df, finance = as.numeric(finance))
# subtract 1 to turn 1s and 2s into 0s and 1s
df$finance <- df$finance -1
df$finance <- df$finance - 1
# formula to fit
formula = "finance ~ children"
formula <- "finance ~ children"
```
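
The conversion above relies on `as.numeric()` returning the internal level codes of a factor (1, 2, ...) rather than parsing its labels. The short sketch below shows this on a made-up factor; it assumes `finance` is read in as a factor with two levels, which sort alphabetically by default.

```r
# Hypothetical two-level factor standing in for the finance column
finance <- factor(c("convenient", "inconv", "inconv", "convenient"))

levels(finance) # levels are sorted alphabetically

# Level codes start at 1, so subtracting 1 gives a 0/1 outcome
finance_codes <- as.numeric(finance)
finance_binary <- finance_codes - 1
finance_binary
```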

```{r}
acro_glm(formula=formula, data=df, family="logit")
acro_glm(formula = formula, data = df, family = "logit")
```

#### ACRO Probit Model

```{r}
acro_glm(formula=formula, data=df, family="probit")
acro_glm(formula = formula, data = df, family = "probit")
```

### Examples of functionality to let users manage their output
@@ -185,12 +185,12 @@ acro_add_comments("output_1", "This is a crosstab on the nursery dataset.")
```

#### Finalise
- Users must call finalise() at the end of each session.
- Each output is saved to a CSV file.
- The SDC analysis for each output is saved to a JSON file or an Excel file
(depending on the extension of the name of the file provided as an input to the function)

```{r}
#acro_finalise("RTEST", "xlsx")
# acro_finalise("RTEST", "xlsx")
acro_finalise("RTEST", "json")
```