|
| 1 | +README |
| 2 | +================ |
| 3 | +William G. Ryan V |
1 | 4 |
|
2 | | -<!-- README.md is generated from README.Rmd. Please edit that file --> |
3 | | - |
4 | | -# 3PodR |
| 5 | +- [Introduction](#introduction) |
| 6 | +- [Installation & Usage](#installation--usage) |
| 7 | + - [Clone the Repository](#clone-the-repository) |
| 8 | + - [Open the Project](#open-the-project) |
| 9 | + - [Restore the Environment](#restore-the-environment) |
| 10 | + - [Prepare Your Input Data](#prepare-your-input-data) |
| 11 | + - [Configure the Analysis](#configure-the-analysis) |
| 12 | + - [Render the Report](#render-the-report) |
| 13 | +- [Troubleshooting](#troubleshooting) |
| 14 | + - [Verify Installation](#verify-installation) |
| 15 | + - [Verify Species](#verify-species) |
| 16 | + - [Validate Input File Formats](#validate-input-file-formats) |
| 17 | + - [Check Gene Annotation](#check-gene-annotation) |
| 18 | + - [Review Configuration](#review-configuration) |
5 | 19 |
|
| 20 | +<!-- README.md is generated from README.Rmd. Please edit that file --> |
6 | 21 | <!-- badges: start --> |
| 22 | + |
| 23 | +[](https://rmarkdown.rstudio.com/) |
7 | 25 | <!-- badges: end --> |
8 | 26 |
|
9 | | -3PodR. |
| 27 | +# Introduction |
| 28 | + |
| 29 | +3PodR is an R bookdown site for performing comprehensive differential |
| 30 | +gene expression analysis. |
| 31 | + |
| 32 | +This template processes differential gene expression data from |
| 33 | +transcriptomic studies and generates several key outputs, including Gene |
| 34 | +Set Enrichment Analysis (GSEA), Over-representation Analysis (ORA) and |
| 35 | +drug prediction analysis (iLINCS). |
| 36 | + |
| 37 | +# Installation & Usage |
| 38 | + |
| 39 | +Follow these step-by-step instructions to install and run 3PodR: |
| 40 | + |
| 41 | +## Clone the Repository |
| 42 | + |
| 43 | +Open a terminal and execute: |
| 44 | + |
| 45 | +``` bash |
| 46 | +git clone https://github.com/willgryan/3PodR_bookdown.git |
| 47 | +``` |
| 48 | + |
| 49 | +## Open the Project |
| 50 | + |
| 51 | +Change into the repository directory: |
| 52 | + |
| 53 | +``` bash |
| 54 | +cd 3PodR_bookdown |
| 55 | +``` |
| 56 | + |
| 57 | +Alternatively, open `3PodR_bookdown.Rproj` in |
| 58 | +[RStudio](https://posit.co/download/rstudio-desktop/). |
| 59 | + |
| 60 | +## Restore the Environment |
| 61 | + |
| 62 | +In R, run: |
| 63 | + |
| 64 | +``` r |
| 65 | +renv::restore() |
| 66 | +``` |
| 67 | + |
| 68 | +This command installs the package versions recorded in `renv.lock`.
| 69 | + |
| 70 | +## Prepare Your Input Data |
| 71 | + |
| 72 | +3PodR takes one or more CSV files containing the results of testing for |
| 73 | +differential gene expression. |
| 74 | + |
| 75 | +The first three columns of each file must appear in the following order.
| 76 | + |
| 77 | +**CSV Format Requirements:** |
| 78 | + |
| 79 | +- **Column 1:** Gene Symbols (character vector) |
| 80 | +- **Column 2:** Log2 Fold Change (numeric vector) |
| 81 | +- **Column 3:** Unadjusted P-values (numeric vector) |
| 82 | + |
| 83 | +> **Important:** Gene symbols must conform to the annotation standard |
| 84 | +> for the species used (HGNC for human, RGD for rat, or MGI for mouse). |
| 85 | +> Non-standard IDs (e.g., Ensembl or Entrez) will cause errors. |
| 86 | +
|
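The column requirements above can be checked before rendering. The following is a minimal sketch, not part of 3PodR: `validate_dge` is a hypothetical helper, and the toy data frame stands in for a file you would normally load with `read.csv()`.

```r
# Hypothetical helper (not provided by 3PodR): checks the first three
# columns of a differential expression table against the format above.
validate_dge <- function(dge) {
  problems <- character(0)
  if (ncol(dge) < 3) {
    return("fewer than three columns")
  }
  if (!is.character(dge[[1]])) {
    problems <- c(problems, "column 1 (gene symbols) is not character")
  }
  if (!is.numeric(dge[[2]])) {
    problems <- c(problems, "column 2 (log2 fold change) is not numeric")
  }
  if (!is.numeric(dge[[3]])) {
    problems <- c(problems, "column 3 (p-values) is not numeric")
  } else if (any(dge[[3]] < 0 | dge[[3]] > 1, na.rm = TRUE)) {
    problems <- c(problems, "column 3 (p-values) has values outside [0, 1]")
  }
  problems
}

# Toy data standing in for read.csv("extdata/YOUR_DGE_FILE.csv")
example <- data.frame(
  Symbol = c("TP53", "BRCA1", "GAPDH"),
  log2FoldChange = c(1.2, -0.8, 0.05),
  pvalue = c(0.001, 0.04, 0.9),
  stringsAsFactors = FALSE
)

# An empty character vector means no problems were found
validate_dge(example)
```

The helper only inspects column order and types; it does not check whether the symbols are valid for the chosen species.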
| 87 | +**Place Your CSV File:** |
| 88 | + |
| 89 | +Copy your differential gene expression CSV file into the `extdata/` |
| 90 | +folder (e.g., `extdata/YOUR_DGE_FILE.csv`). |
| 91 | + |
| 92 | +**Optional: Sample Metadata and Gene Counts** |
| 93 | +If you have sample metadata (e.g., group information) and gene counts, |
| 94 | +place them in the `extdata/` folder. |
| 95 | + |
| 96 | +Sample metadata should be in a CSV file named `design.csv`. The file |
| 97 | +should contain the following columns: |
| 98 | + |
| 99 | +**CSV Format Requirements:** |
| 100 | + |
| 101 | +- **Column 1:** Sample (character vector) |
| 102 | +- **Column 2:** Group (character vector) |
| 103 | + |
| 104 | +Gene counts in log units should be in a CSV file named `counts.csv`. The
| 105 | +file should contain the following columns: |
| 106 | + |
| 107 | +**CSV Format Requirements:** |
| 108 | + |
| 109 | +- **Column 1:** Gene Symbols (character vector) |
| 110 | +- **Columns 2-N:** One column per sample, headed by the sample name (numeric vector)
| 111 | + |
| 112 | +> **Important:** Ensure that the gene symbols in the `counts.csv` file
| 113 | +> match those in the differential gene expression
| 114 | +> file. The sample names in the `design.csv` file must match the column
| 115 | +> names in the `counts.csv` file. The group names in the `design.csv` file must
| 116 | +> match those in the report configuration file detailed below.
| 117 | +
|
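A quick consistency check between the two optional files might look like the sketch below. It assumes only the layout described above (samples in column 1 of `design.csv`, sample columns from column 2 onward in `counts.csv`); `check_samples`, the sample names, and the group labels are made up for illustration.

```r
# Hypothetical helper (not provided by 3PodR): compares the sample names
# in the design table with the sample columns of the counts table.
check_samples <- function(design, counts) {
  design_samples <- as.character(design[[1]])
  count_samples <- colnames(counts)[-1]  # drop the gene symbol column
  list(
    missing_from_counts = setdiff(design_samples, count_samples),
    missing_from_design = setdiff(count_samples, design_samples)
  )
}

# Toy data standing in for extdata/design.csv and extdata/counts.csv
design <- data.frame(
  Sample = c("S1", "S2", "S3", "S4"),
  Group = c("control", "control", "treated", "treated"),
  stringsAsFactors = FALSE
)
counts <- data.frame(
  Symbol = c("TP53", "BRCA1"),
  S1 = c(5.1, 3.2),
  S2 = c(5.0, 3.4),
  S3 = c(6.3, 2.9),
  S4 = c(6.1, 3.0),
  stringsAsFactors = FALSE
)

# Both elements are empty when the sample names agree
check_samples(design, counts)
```

When loading real files, note that `read.csv()` rewrites column names that are not syntactically valid R names (for example, names starting with a digit or containing hyphens) unless `check.names = FALSE` is passed, which can make otherwise matching sample names appear different.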
| 118 | +## Configure the Analysis |
| 119 | + |
| 120 | +Edit the `extdata/configuration.yml` file as follows: |
| 121 | + |
| 122 | +**Species** |
| 123 | + |
| 124 | +- By default, the species is set to human (`species: human`). To use rat
| 125 | + or mouse data, set the `species` variable to `species: rat` or
| 126 | + `species: mouse`. You must also update the species in the gmt file
| 127 | + name by changing `Human` to `Rat` or
| 128 | + `Mouse`.
| 129 | + |
| 130 | +> **Important:** The species in the configuration file must be
| 131 | +> lowercase, whereas the species in the gmt file name must have
| 132 | +> its first letter capitalized (e.g., `Human`, `Rat`, or `Mouse`).
| 133 | +
|
| 134 | +**For Count and Expression Data:** |
| 135 | + |
| 136 | +- If you are not using count data, you **must** comment out the lines
| 137 | + related to `design.csv` and `counts.csv` to avoid errors.
| 138 | + |
| 139 | +**File Name Update:** |
| 140 | + |
| 141 | +- Change the `file` variable (default is `file: DAvsCA.csv` or |
| 142 | + `file: DBvsCB.csv`) to match your CSV file (e.g., |
| 143 | + `file: YOUR_INPUT_FILE.csv`). |
| 144 | + |
| 145 | +## Render the Report |
| 146 | + |
| 147 | +Generate the report by running in R: |
| 148 | + |
| 149 | +``` r |
| 150 | +rmarkdown::render_site(encoding = 'UTF-8') |
| 151 | +``` |
| 152 | + |
| 153 | +Alternatively, in RStudio, click the **Build Book** or **Build Website** |
| 154 | +button (a wrench) in the Build pane, typically in the top right of the |
| 155 | +window. |
| 156 | + |
| 157 | +# Troubleshooting |
| 158 | + |
| 159 | +If you encounter any issues, follow these troubleshooting steps: |
| 160 | + |
| 161 | +## Verify Installation |
| 162 | + |
| 163 | +- Run the example files provided in the `extdata/` folder. |
| 164 | +- Ensure all dependencies specified in the `renv.lock` file are |
| 165 | + installed. |
| 166 | + |
| 167 | +## Verify Species |
| 168 | + |
| 169 | +- Confirm that the species in the configuration file matches the species |
| 170 | + in the gmt file name. |
| 171 | +- Ensure that the species in the configuration file is lowercase and the |
| 172 | + species in the gmt file name is capitalized. |
| 173 | + |
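The two species checks above can be expressed as a small function. This is a sketch under assumptions: `check_species` is a hypothetical helper, the gmt file names are invented, and you would pass in the values you actually have in `extdata/configuration.yml`.

```r
# Hypothetical helper (not provided by 3PodR): checks the species casing
# rules described above for a species value and a gmt file name.
check_species <- function(species, gmt_file) {
  allowed <- c("human", "rat", "mouse")
  # Capitalize the first letter, e.g. "rat" -> "Rat"
  capitalized <- paste0(toupper(substring(species, 1, 1)), substring(species, 2))
  c(
    species_is_lowercase = identical(species, tolower(species)),
    species_is_supported = tolower(species) %in% allowed,
    gmt_name_matches = grepl(capitalized, basename(gmt_file), fixed = TRUE)
  )
}

# Invented file names, for illustration only
check_species("rat", "Rat_example.gmt")    # all three checks pass
check_species("Rat", "Human_example.gmt")  # lowercase and gmt checks fail
```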
| 174 | +## Validate Input File Formats |
| 175 | + |
| 176 | +- Confirm your differential gene expression CSV files are formatted
| 177 | + correctly as shown above.
| 178 | +- If you are using count and expression data, ensure the `design.csv` |
| 179 | + and `counts.csv` files are correctly formatted as shown above. |
| 180 | + |
| 181 | +## Check Gene Annotation |
| 182 | + |
| 183 | +- Make sure gene symbols are in the proper format (HGNC for human, RGD |
| 184 | + for rat, or MGI for mouse). |
| 185 | +- Incorrect gene identifier formats (e.g., Ensembl or Entrez IDs) will |
| 186 | + lead to errors. |
| 187 | + |
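One way to spot non-symbol identifiers before rendering is a pattern check on the first column. The sketch below is an illustration, not part of 3PodR: the function names are hypothetical, and the patterns only catch identifiers shaped like Ensembl IDs (`ENS` prefix followed by 11 digits) or purely numeric Entrez-style IDs. It does not confirm that the remaining symbols are valid HGNC, RGD, or MGI symbols.

```r
# Hypothetical helpers (not provided by 3PodR): flag identifiers that are
# shaped like Ensembl or Entrez IDs rather than gene symbols.
looks_like_ensembl <- function(ids) {
  grepl("^ENS[A-Z]*[GTP][0-9]{11}(\\.[0-9]+)?$", ids)
}
looks_like_entrez <- function(ids) {
  grepl("^[0-9]+$", ids)
}

# Toy identifiers standing in for column 1 of a DGE file
ids <- c("TP53", "ENSG00000141510", "7157", "Brca1")

# Identifiers that should be converted to gene symbols first
suspect <- ids[looks_like_ensembl(ids) | looks_like_entrez(ids)]
suspect
```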
| 188 | +## Review Configuration |
| 189 | + |
| 190 | +- Verify that the `extdata/configuration.yml` file is correctly set up, |
| 191 | + including file names and group names if using count data. |
| 192 | +- You can refer to the example datasets provided in the `extdata/` |
| 193 | + folder for guidance on formatting. |
| 194 | + |
| 195 | +For additional help, open an issue on the GitHub repository.
| 196 | + |
| 197 | +Happy 3PodR’ing! |