---
title: "IMComplete-Image Processing"
format: html
editor: visual
---
# **Part A: Preprocessing of imaging data**
<i>**Latest update**</i> - 1.0.0 (May 2025)
#### **Authors:**
[Thomas O'Neil](https://github.com/DrThomasOneil) (thomas.oneil\@sydney.edu.au) \| [Oscar Dong](https://github.com/Awesomenous) (oscardong4\@gmail.com) \| [Heeva Baharlou](heeva.baharlou@sydney.edu.com)
##### This notebook provides a consolidated approach to IMC analysis and forms the prerequisite steps for the IMComplete R package workflow.
The *Nature Methods* Method of the Year for 2024 was [**spatial proteomics**](https://www.nature.com/articles/s41592-024-02565-3).
> Computational tools for spatial proteomics are the focus of the second Comment, from Yuval Bussi and Leeat Keren. These authors note that current image processing and analysis workflow are **well defined but fragmented**, with various steps happening back to back **rather than in an integrated fashion**. They envision a future for the field where **image processing and analysis steps work in concert** for improved biological discovery.
In alignment with these comments, we have committed to providing a comprehensive and dynamic workflow. In part, we aimed to achieve this by compiling as much as we could into this pre-processing workflow.
In particular, we have emphasised tools that can be run in <strong>*one*</strong> linear workflow. For example, we provide `PyProfiler`, which performs the same cell-feature extraction as CellProfiler, and `RegisterImages`, which registers IMC to IF images in Python. This lets users stay in one linear pipeline without installing additional applications.
<hr>
Some scripts adapted from [BodenmillerGroup/ImcSegmentationPipeline](https://github.com/BodenmillerGroup/ImcSegmentationPipeline)
<i>**Therefore, make sure to also reference these studies:**</i>\
- Windhager, J., Zanotelli, V.R.T., Schulz, D. et al. An end-to-end workflow for multiplexed image processing and analysis. [Nat Protoc](https://doi.org/10.1038/s41596-023-00881-0) (2023).
<br>
<hr>
## Folder structure
``` text
ImagingAnalysis/ (root directory)
├── IMComplete-Workflow
├── ImcSegmentationPipeline
├── Experiment_name_1
│   ├── raw
│   │   ├── Sample1.zip
│   │   ├── Sample2.zip
│   │   └── ...
│   ├── analysis
│   │   ├── 1_image_out
│   │   ├── 2_cleaned
│   │   ├── 3_segmentation
│   │   │   ├── 3a_cellpose_crop
│   │   │   ├── 3b_cellpose_full
│   │   │   ├── 3c_cellpose_mask
│   │   │   └── 3d_compartments
│   │   └── 4_pyprofiler_output
│   └── panel.csv
├── ...
└── Experiment_name_n
```
<br>
<hr>
<hr>
## Workflow
1. Set up (`CheckSetup()`)
2. Create a new project (`NewProject()`)
3. Prep the raw folder and `panel.csv`
4. Extract images from the raw folder (`ExtractImages()`)
    - *Optional 1:* Check filter parameters of IF data (`CheckExtract()`)
    - *Optional 2:* Filter images (`FilterImages()`)
    - *Optional 3:* Select crop regions for segmentation training (`CropSelector()`)
5. Prepare the images for segmentation model training (`PrepCellpose()`)
    - *Optional 4:* Register low-resolution images with high-resolution images to improve cell segmentation (`RegisterImages()`)
6. Train a segmentation model (`cellpose`)
    - *Optional 5:* Skip training and use a generic model instead.
7. Batch segment the images and generate cell masks (`BatchSegment()`)
8. Extract data from your images using the cell segmentation masks (`PyProfiler()`)
<hr>
<hr>
# 1. Set up
<details>
<summary>Set up the environment</summary>
Anaconda is needed to run this workflow. Follow the steps below to set up Anaconda and a `conda` environment:
Install [**Anaconda**](https://www.anaconda.com/download) and navigate to the relevant command line interface: <br>
::: {align="left"}
| Windows | macOS |
|----------------------------------------------------------------------------------------------------|------------------------------------------------------------------------------------------------------------|
| 1\. Search for **'Anaconda Prompt'** in the taskbar search <br> 2. Select **Anaconda Prompt** <br> | 1\. Use `cmd + space` to open Spotlight Search <br> 2. Type **'Terminal'** and press `return` to open <br> |
:::
<br>
<hr>
<hr>
### *Using Anaconda...*
#### **Step 1:** Set your directory to the analysis folder (or the `root directory` for image analysis)
``` bash
cd /Desktop/ImagingAnalysis
```
<hr>
#### **Step 2:** Clone the IMComplete repository.
<strong>*From GitHub*</strong>\
Go to the [GitHub page](https://github.com/CVR-MucosalImmunology/IMComplete-Workflow), click the `Code` button near the top, and download the zip. Unzip the folder into the `root` directory. This folder contains the IMComplete-Workflow documents and gives ready access to the necessary files.
<strong>*Using Git*</strong> in the command line
<details>
<summary>Install Git</summary>
Git needs to be installed on your system. Find the instructions [here](https://git-scm.com/downloads)
<hr>
</details>
``` bash
git clone --recursive https://github.com/CVR-MucosalImmunology/IMComplete-Workflow.git
```
<hr>
#### **Step 3:** Clone the extra repositories:
- [BodenmillerGroup/ImcSegmentationPipeline](https://github.com/BodenmillerGroup/ImcSegmentationPipeline): Windhager, J., Zanotelli, V.R.T., Schulz, D. et al. An end-to-end workflow for multiplexed image processing and analysis. [Nat Protoc](https://doi.org/10.1038/s41596-023-00881-0) (2023).
``` bash
git clone --recursive https://github.com/BodenmillerGroup/ImcSegmentationPipeline.git
```
````{=html}
<!---
- [deMirandaLab/PENGUIN](https://github.com/deMirandaLab/PENGUIN): Sequeira, A. M., Ijsselsteijn, M. E., Rocha, M., & de Miranda, N. F. (2024). PENGUIN: A rapid and efficient image preprocessing tool for multiplexed spatial proteomics. [Computational and Structural Biotechnology Journal](https://doi.org/10.1101/2024.07.01.601513)
```bash
git clone --recursive https://github.com/deMirandaLab/PENGUIN.git
```
<--->
````
<hr>
#### **Step 4:** Create a conda environment and install some packages (in one line)
``` bash
conda env create -f IMComplete-Workflow/environment.yml
```
*This can take some time so be patient!*
<hr>
#### **Step 5:** Activate the newly created conda environment
``` bash
conda activate IMComplete
```
<hr>
#### **Step 6:** Activate and ensure your GPU-acceleration is accessible
Unfortunately, parts of this workflow require GPU acceleration: cell segmentation, Denoise, and PyProfiler (PyProfiler runs faster on a GPU, but one is not strictly necessary).
You will need to install the PyTorch and `pytorch-cuda` versions that suit your machine. Instructions are found [here](https://pytorch.org/get-started/previous-versions/). The command will look something like this:
``` bash
conda install pytorch==2.4.1 torchvision==0.19.1 torchaudio==2.4.1 pytorch-cuda=12.4 -c pytorch -c nvidia
```
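Once PyTorch is installed, a quick check from Python can confirm which accelerator it sees. The snippet below is a minimal sketch rather than part of the workflow: the helper name `describe_accelerator` is hypothetical, and it relies only on `torch.cuda.is_available()` and `torch.backends.mps.is_available()`.

```python
def describe_accelerator() -> str:
    """Return 'cuda', 'mps', 'cpu', or 'torch-missing' for this environment."""
    # Report a missing installation instead of crashing with ImportError.
    try:
        import torch
    except ImportError:
        return "torch-missing"

    if torch.cuda.is_available():
        return "cuda"
    # Apple-silicon Macs expose the Metal (MPS) backend instead of CUDA.
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"
    return "cpu"


print(describe_accelerator())
```

If this prints `cpu` on a machine with an NVIDIA card, the installed PyTorch build probably does not match the CUDA driver, and the install command above may need a different version.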
<hr>
#### **Step 7:** Install cellpose
Cellpose is used for cell segmentation. We install the GUI version because it is more user-friendly. If you experience errors installing cellpose, refer to the [cellpose installation instructions](https://cellpose.readthedocs.io/en/latest/installation.html).
``` bash
python -m pip install "cellpose[gui]"
```
<hr>
#### **Step 8:** Select the IMComplete kernel in your IDE
If you are using VSCode, you'll see this option in the top right of the window.
If you are using a jupyter notebook, you will see this...[\[\[**TO ADD**\]\]]{style="color:yellow; background:red"}
</details>
<hr>
<hr>
## **CheckSetup()**
You can check the installation requirements with the following function:
```{python}
#| eval: false
from PyMComplete import CheckSetup
CheckSetup()
```
<details>
<summary>`Function: CheckSetup()`</summary>
``` python
CheckSetup(
    torch=1
)
```
================================================================
**Arguments:**\
- `torch`: Defaults to `1`, which checks that a GPU is installed and ready. This check can be turned off if you are using a Mac or already know that the GPU is not properly set up.
================================================================
**Expected Outputs:**
``` text
Checking required packages in the current Conda environment...
All required packages are installed and meet the required versions.
-----------------
Checking that CUDA has been installed properly...
GPU acceleration has not been prepared. Consult https://pytorch.org/get-started/previous-versions/
and try again
```
================================================================
**Packages:**\
- pkg_resources
================================================================
</details>
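For readers curious how a package check of this kind can work, here is a minimal sketch. It is not the `CheckSetup()` implementation: the details above list `pkg_resources`, whereas this sketch swaps in the standard-library `importlib.metadata`. The helper names and the example requirements (`numpy`, `cellpose`, and their minimum versions) are hypothetical.

```python
from importlib import metadata


def parse_version(text: str) -> tuple:
    """Turn '2.4.1+cu124' into (2, 4, 1); a non-numeric part ends the parse."""
    parts = []
    for piece in text.split("+")[0].split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def at_least(found: str, minimum: str) -> bool:
    """Compare two version strings after padding them to equal length."""
    a, b = parse_version(found), parse_version(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def check_packages(requirements: dict) -> dict:
    """Map each package name to 'ok', 'missing', or 'too old (found X)'."""
    report = {}
    for name, minimum in requirements.items():
        try:
            found = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        report[name] = "ok" if at_least(found, minimum) else f"too old (found {found})"
    return report


# Hypothetical requirements, for illustration only.
print(check_packages({"numpy": "1.20", "cellpose": "2.0"}))
```

The version parser is deliberately simple and ignores pre-release tags. For stricter comparisons, the third-party `packaging` library is the usual choice.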
<hr>
<hr>
# 2. Set up a new Project for Imaging Analysis
The following function will create the folder structure for this workflow and generate a template `panel.csv` and `image.csv`.
Set `rootdir` as your **ImagingAnalysis** folder directory and `projdir` as your **project** folder name.
**Important**: These need to be set each time you open this workflow, as all subsequent steps rely on these values.
```{python}
rootdir = "/Users/thomasoneil/Desktop/ImagingAnalysis"
projdir = "Project_1"
```
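Because every later step depends on `rootdir`, it can help to confirm that it exists and holds the cloned repositories before going further. The following is a sketch under assumptions: `missing_repos` is a hypothetical helper, not part of PyMComplete, and the repository folder names are taken from the "Folder structure" section.

```python
from pathlib import Path
import tempfile

# Repository folders expected in rootdir, per the "Folder structure" section.
EXPECTED_REPOS = ("IMComplete-Workflow", "ImcSegmentationPipeline")


def missing_repos(rootdir, repos=EXPECTED_REPOS):
    """Return the names of expected repository folders not found in rootdir."""
    root = Path(rootdir)
    if not root.is_dir():
        raise FileNotFoundError(f"rootdir does not exist: {root}")
    return [name for name in repos if not (root / name).is_dir()]


# Demo on a throwaway directory that contains only one of the two repositories.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "IMComplete-Workflow").mkdir()
    print(missing_repos(tmp))
```

In practice you would call `missing_repos(rootdir)` with the value defined above. An empty list means both repositories were found.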
`Function: NewProject()`
The workflow is designed to utilize both a `rootdir` and a `projdir` for better organization and efficiency.
- `rootdir`: This directory is intended to store commonly used GitHub Repositories and other relevant resources.
- `projdir`: This directory is specific to individual projects.
By structuring the directories this way, users with multiple projects can benefit from a consistent workflow. They only need to install the repositories once in the `rootdir`, and all projects can access these resources without duplication. This approach eliminates the need to repeatedly refer to or duplicate distant folders, streamlining the workflow and ensuring all relevant files are easily accessible.
<details>
<summary>Information</summary>
``` python
NewProject(
    rootdir=rootdir,
    projdir=projdir
)
```
================================================================
**Arguments:**\
- `rootdir`: This directory is intended to store commonly used GitHub Repositories and other relevant resources.
- `projdir`: This directory is specific to individual projects.
================================================================
**Expected Outputs:**
``` text
Project '2025_ProjectName' created successfully.
```
================================================================
**Packages:**\
- os
- csv
================================================================
</details>
```{python}
#| eval: false
from PyMComplete import NewProject
NewProject(
rootdir=rootdir,
projdir=projdir)
```
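To make the resulting layout concrete, the sketch below creates a comparable folder skeleton with `pathlib`. It is an illustration only, not the `NewProject()` implementation. The helper `make_skeleton` is hypothetical, the nesting of sub-folders is an assumption based on the "Folder structure" tree, and the template `panel.csv` and `image.csv` files are left out because their columns are not described here.

```python
from pathlib import Path
import tempfile

# Sub-folders as read from the "Folder structure" section (nesting assumed).
SUBFOLDERS = [
    "raw",
    "analysis/1_image_out",
    "analysis/2_cleaned",
    "analysis/3_segmentation/3a_cellpose_crop",
    "analysis/3_segmentation/3b_cellpose_full",
    "analysis/3_segmentation/3c_cellpose_mask",
    "analysis/3_segmentation/3d_compartments",
    "analysis/4_pyprofiler_output",
]


def make_skeleton(rootdir, projdir):
    """Create the project folders if absent and return the project path."""
    project = Path(rootdir) / projdir
    for sub in SUBFOLDERS:
        # parents=True builds intermediate folders; exist_ok makes reruns safe.
        (project / sub).mkdir(parents=True, exist_ok=True)
    return project


# Demo in a temporary directory so nothing is written to a real project.
with tempfile.TemporaryDirectory() as tmp:
    project = make_skeleton(tmp, "Project_1")
    for path in sorted(project.rglob("*")):
        print(path.relative_to(project).as_posix())
```

Using `exist_ok=True` means the function can be rerun on an existing project without raising an error or touching files already in place.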
<hr>
<hr>