README.md: 59 additions & 0 deletions
@@ -209,6 +209,65 @@ This feature is still under development and will be released in a future update.
This feature is still under development and will be released in a future update.
### **Advanced Usage**

**Stopping the Pipeline Early**

The `--early_stop_stage` parameter allows you to stop the pipeline at an intermediate stage and save the outputs for later use. This is useful when you want to inspect intermediate files or when you plan to re-run downstream steps separately.

- `--early_stop_stage bam`: Stops after genome alignment. BAM files are saved to `output/bam/`.
- `--early_stop_stage rds`: Stops after Bambu read class construction. Read class `.rds` files are saved to `output/read_class/`.

```bash
# Stop after read class construction
nextflow run $PWD/bambu-singlecell-spatial \
  --input samplesheet.csv \
  --genome reference.fa \
  --annotation reference.gtf \
  --early_stop_stage rds \
  -profile singularity,hpc
```
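
To confirm that an early-stopped run left the expected files behind, a small shell helper along these lines can be used. It is a minimal sketch, not part of the pipeline: the function name `list_early_stop_outputs` is hypothetical, and it assumes the default `output/` layout described above and that alignment files carry a `.bam` extension. For example, `list_early_stop_outputs rds` would print each `.rds` file under `output/read_class/` and return a non-zero status if none exist.

```bash
# Hypothetical helper: list the files an early-stopped run should have produced.
#   $1  stage passed to --early_stop_stage ("bam" or "rds")
#   $2  pipeline output directory (assumed to default to "output")
list_early_stop_outputs() {
  local stage="$1" outdir="${2:-output}" dir ext f found=1
  case "$stage" in
    bam) dir="$outdir/bam";        ext="bam" ;;  # assumption: files end in .bam
    rds) dir="$outdir/read_class"; ext="rds" ;;
    *)   echo "unknown stage: $stage" >&2; return 2 ;;
  esac
  for f in "$dir"/*."$ext"; do
    [ -e "$f" ] || continue   # skip the unexpanded pattern when nothing matches
    echo "$f"
    found=0
  done
  return "$found"             # 0 if at least one file was listed, 1 otherwise
}
```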

**Restarting from a Specific Stage**

Because the pipeline accepts FASTQ, BAM, or RDS files as input, you can restart from any intermediate stage by providing the corresponding files in your samplesheet. This avoids re-running expensive preprocessing and alignment steps that have already completed.

*Example: Incremental sample addition*

A common use case is to process an initial set of samples through to read class `.rds` files, then re-run the full pipeline once additional samples are available. Transcript discovery and quantification in Bambu are performed jointly across all samples, so when new samples are added, the existing samples only need to be re-run from the `.rds` stage onward.

**Step 1:** Run the first batch of samples from FASTQ to `.rds`:

```bash
nextflow run $PWD/bambu-singlecell-spatial \
  --input samplesheet_batch1.csv \
  --genome reference.fa \
  --annotation reference.gtf \
  --early_stop_stage rds \
  -profile singularity,hpc
```

This produces `output/read_class/sample1_readClassFile.rds`, `output/read_class/sample2_readClassFile.rds`, and so on.

**Step 2:** When new samples are ready, run all samples together for joint transcript discovery and quantification. In the samplesheet, point the `path` column at the existing `.rds` files for the original samples and at the new FASTQ/BAM files for the new samples.

For example, with `sample1` and `sample2` supplied as `.rds` files and a new `sample3` supplied as FASTQ, the pipeline will skip preprocessing and alignment for `sample1` and `sample2`, process `sample3` from FASTQ through to `.rds`, and then perform transcript discovery and quantification jointly across all three samples.
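
A combined samplesheet for Step 2 can be assembled from the existing read class files. The sketch below is illustrative only: `build_samplesheet` is a hypothetical helper, the `sample` column name and the `name=path` argument format are assumptions, and a real samplesheet may need further columns that the pipeline requires. Running `build_samplesheet output/read_class sample3=sample3.fastq.gz > samplesheet_all.csv` (file names are placeholders) would write one row per existing `.rds` file followed by a row for the new sample. The result would then be passed to `--input`, without `--early_stop_stage`, so that the run continues through discovery and quantification.

```bash
# Hypothetical helper: print a combined samplesheet on stdout.
#   $1      directory holding the read class files from Step 1
#   $2...   new samples as "name=path" pairs (FASTQ or BAM)
# Column names "sample" and "path" are assumptions made for illustration.
build_samplesheet() {
  local rds_dir="$1" f name pair
  shift
  echo "sample,path"
  # Existing samples: derive the sample name from <name>_readClassFile.rds
  for f in "$rds_dir"/*_readClassFile.rds; do
    [ -e "$f" ] || continue
    name="$(basename "$f" _readClassFile.rds)"
    echo "$name,$f"
  done
  # New samples: split each "name=path" argument at the first "="
  for pair in "$@"; do
    echo "${pair%%=*},${pair#*=}"
  done
}
```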
### **Additional Information**
UMI correction is done at the barcode level. The longest read for each unique barcode-UMI combination is kept for analysis.
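
The "longest read per barcode-UMI combination" rule can be illustrated with a short shell sketch. This is a conceptual example rather than the pipeline's own implementation: the function name `keep_longest_per_umi` and the tab-separated input format (`barcode`, `umi`, `read_id`, `length`) are assumptions made for illustration. Reads are sorted so that the longest read of each barcode-UMI pair comes first, and only that first read is kept.

```bash
# Hypothetical illustration: keep the longest read per barcode-UMI pair.
# Reads stdin, one read per line: barcode<TAB>umi<TAB>read_id<TAB>length
keep_longest_per_umi() {
  local tab
  tab="$(printf '\t')"
  # Sort by barcode, then UMI, then length (numeric, descending),
  # and print only the first line seen for each barcode-UMI key.
  LC_ALL=C sort -t "$tab" -k1,1 -k2,2 -k4,4nr |
    awk -F '\t' '!seen[$1 FS $2]++'
}

# Usage with made-up reads: r1 and r2 share a barcode and UMI, so only the
# longer one (r2) would be kept, alongside r3 and r4.
# printf 'AAAC\tUMI1\tr1\t500\nAAAC\tUMI1\tr2\t1200\nAAAC\tUMI2\tr3\t800\nTTTG\tUMI1\tr4\t300\n' |
#   keep_longest_per_umi
```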