Open

V2 #25

53 commits
fec349f
major update
ch99l Feb 9, 2026
14e50db
bug fix
ch99l Feb 10, 2026
c8bcd5c
fix reverse_complement_fastq.py script
ch99l Feb 11, 2026
d337468
fix container logic, additional improvements, and bug fixes
ch99l Feb 13, 2026
1fcb0ff
fix comments
ch99l Feb 13, 2026
795ef4b
fix docker system dependencies
ch99l Feb 13, 2026
98a3c91
fix issue with file extensions
ch99l Feb 13, 2026
e3f9463
remove unused local variable
ch99l Feb 13, 2026
039c9c1
refactor main.nf and bug fix
ch99l Feb 19, 2026
b9d61e4
fix comments in nextflow.config
ch99l Feb 19, 2026
3f30490
allow early termination & update pipeline logic when starting from ba…
ch99l Feb 20, 2026
ebffe99
fix 10x 3prime adapter trimming & minimap process output
ch99l Mar 2, 2026
7ee17a0
fix reverse_complement.py, preprocess_fastq scaling, and other minor …
ch99l Mar 8, 2026
6581f42
fix preprocess_fastq comments
ch99l Mar 9, 2026
dd73cdb
fix flexiplex stdin
ch99l Mar 9, 2026
5c508ad
refactor main.nf, update output dir structure, include spatial sample…
ch99l Mar 23, 2026
3deefb5
update bambu_construct_read_class.nf output dir
ch99l Mar 24, 2026
ddcc243
update publishDir
ch99l Mar 24, 2026
a54cdb2
update bambu process output
ch99l Mar 24, 2026
42dc60a
bug fix
ch99l Mar 24, 2026
fe9784a
updated README
ch99l Mar 24, 2026
35753d1
update README
ch99l Mar 24, 2026
2f57156
fix sampleData logic
ch99l Mar 24, 2026
ad28090
bug fix
ch99l Mar 25, 2026
eeef6ce
update README
ch99l Mar 25, 2026
e11574d
update README
ch99l Mar 25, 2026
0f6bd90
update README
ch99l Mar 25, 2026
6b63cab
update README
ch99l Mar 25, 2026
41215ed
update README
ch99l Mar 25, 2026
57f9876
update params
ch99l Mar 25, 2026
d8c30c4
update quantification_mode
ch99l Mar 25, 2026
9b9ff39
update README
ch99l Mar 25, 2026
fc28a35
update README
ch99l Mar 25, 2026
f07db09
update README
ch99l Mar 25, 2026
154b30f
update README
ch99l Mar 26, 2026
b9eb7b4
include dynamic retries
ch99l Mar 26, 2026
dfd875f
add param/input check
ch99l Mar 26, 2026
8c3a757
update samplesheet check
ch99l Mar 26, 2026
ca8f5af
update formatting
ch99l Mar 26, 2026
32d6c9f
refactor: simplify samplesheet validation using loops
ch99l Mar 27, 2026
348bd52
update README
ch99l Mar 27, 2026
8466545
update README
ch99l Mar 27, 2026
271fed2
remove dividers from README
ch99l Mar 27, 2026
e7a7763
refactor: move clustering to separate process
ch99l Mar 27, 2026
b20469e
add bambu_path param to load local bambu via devtools
ch99l Mar 30, 2026
debed83
add devtools
ch99l Mar 30, 2026
8c72147
update README.md
ch99l Mar 30, 2026
4010340
update README
ch99l Mar 30, 2026
801b424
fix collision issue
ch99l Apr 16, 2026
bc41122
remove test_case.md
ch99l Apr 16, 2026
891ddb3
add visium-hd routing and migrate containers to wave community images
ch99l Apr 20, 2026
a443459
create assets/ dir
ch99l Apr 20, 2026
dec6328
replace deprecated when directives
ch99l Apr 20, 2026
2 changes: 2 additions & 0 deletions .gitignore

Clare (human) comments :)
Suggested to have a more comprehensive .gitignore, e.g.

```
.nextflow.log*
work/
results/
*.command.*
*.pyc
.DS_Store
.vscode/
```

Original file line number Diff line number Diff line change
@@ -0,0 +1,2 @@
subworkflows/prepare_input_visium_hd.nf
.claude/
56 changes: 0 additions & 56 deletions Dockerfile

This file was deleted.

391 changes: 214 additions & 177 deletions README.md

Clare (human) comment:
Note: all the samplesheets currently in examples/ have headers that no longer match what the pipeline expects, so they need to be updated.

Another note: the sample sheet that Clare generated for review uses the example data in the repo, which contains only 10x 5' v2 ONT reads. The two files are also identical, which means that Bambu will most likely generate the same read class files for both.

This also makes it hard to validate whether the pipeline still functions as intended with other chemistries/technologies. Could an extra 1 to 2 samples be sourced and included in the sample sheet to cover that functionality?
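
One way to confirm that example inputs are byte-identical before running the pipeline is to compare content hashes. The sketch below is illustrative only: `file_sha256` and `find_duplicate_inputs` are hypothetical helper names, not part of this PR, and the comparison is on raw bytes, so two gzip files holding the same reads but written with different gzip headers would not be flagged.

```python
import hashlib
from pathlib import Path


def file_sha256(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in fixed-size chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicate_inputs(paths):
    """Group paths by content hash and return only the groups with more than one file."""
    groups = {}
    for path in paths:
        groups.setdefault(file_sha256(path), []).append(str(path))
    return [sorted(group) for group in groups.values() if len(group) > 1]
```

Calling `find_duplicate_inputs` on the FASTQ paths listed in a sample sheet returns an empty list when every input is distinct.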

Large diffs are not rendered by default.

11 changes: 11 additions & 0 deletions assets/10x_config/adapter_seq_config.csv
@@ -0,0 +1,11 @@
technology,fwd_primer_f,fwd_primer_r,rev_primer_f,rev_primer_r,TSO_f,TSO_r
10x3v2,CTACACGACGCTCTTCCGATCT,AGATCGGAAGAGCGTCGTGTAG,CCCATGTACTCTGCGTTGATACCACTGCTT,AAGCAGTGGTATCAACGCAGAGTACATGGG,,
10x3v3,CTACACGACGCTCTTCCGATCT,AGATCGGAAGAGCGTCGTGTAG,CCCCTCTGCGTTGATACCACTGCTT,AAGCAGTGGTATCAACGCAGAGGGG,,
10x3v4,CTACACGACGCTCTTCCGATCT,AGATCGGAAGAGCGTCGTGTAG,CCCCTCTGCGTTGATACCACTGCTT,AAGCAGTGGTATCAACGCAGAGGGG,,
10x5v2,CTACACGACGCTCTTCCGATCT,AGATCGGAAGAGCGTCGTGTAG,GTACTCTGCGTTGATACCACTGCTT,AAGCAGTGGTATCAACGCAGAGTAC,TTTCTTATATGGG,CCCATATAAGAAA
10x5v3,CTACACGACGCTCTTCCGATCT,AGATCGGAAGAGCGTCGTGTAG,GTACTCTGCGTTGATACCACTGCTT,AAGCAGTGGTATCAACGCAGAGTAC,TTTCTTATATGGG,CCCATATAAGAAA
visium-v1,CTACACGACGCTCTTCCGATCT,AGATCGGAAGAGCGTCGTGTAG,CCCATGTACTCTGCGTTGATACCACTGCTT,AAGCAGTGGTATCAACGCAGAGTACATGGG,,
visium-v2,CTACACGACGCTCTTCCGATCT,AGATCGGAAGAGCGTCGTGTAG,CCCATGTACTCTGCGTTGATACCACTGCTT,AAGCAGTGGTATCAACGCAGAGTACATGGG,,
visium-v3,CTACACGACGCTCTTCCGATCT,AGATCGGAAGAGCGTCGTGTAG,CCCATGTACTCTGCGTTGATACCACTGCTT,AAGCAGTGGTATCAACGCAGAGTACATGGG,,
visium-v4,CTACACGACGCTCTTCCGATCT,AGATCGGAAGAGCGTCGTGTAG,CCCATGTACTCTGCGTTGATACCACTGCTT,AAGCAGTGGTATCAACGCAGAGTACATGGG,,
visium-v5,CTACACGACGCTCTTCCGATCT,AGATCGGAAGAGCGTCGTGTAG,CCCATGTACTCTGCGTTGATACCACTGCTT,AAGCAGTGGTATCAACGCAGAGTACATGGG,,
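
In the rows above, each `_r` column appears to hold the reverse complement of the matching `_f` column. A small consistency check along these lines could catch a mistyped adapter. This is a sketch under that assumption; `find_adapter_mismatches` is a hypothetical helper, not something the pipeline is known to ship.

```python
import csv
import io

DNA_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")

# Column pairs assumed to be reverse complements of each other
PAIRED_COLUMNS = [
    ("fwd_primer_f", "fwd_primer_r"),
    ("rev_primer_f", "rev_primer_r"),
    ("TSO_f", "TSO_r"),
]


def reverse_complement(seq):
    """Reverse complement a DNA sequence."""
    return seq[::-1].translate(DNA_COMPLEMENT)


def find_adapter_mismatches(csv_text):
    """Return (technology, column) pairs whose _r entry is not the reverse complement of _f."""
    mismatches = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        for fwd_col, rev_col in PAIRED_COLUMNS:
            fwd, rev = row[fwd_col], row[rev_col]
            # Pairs left empty in the config (e.g. no TSO for 3' chemistries) are skipped
            if not fwd and not rev:
                continue
            if reverse_complement(fwd) != rev:
                mismatches.append((row["technology"], rev_col))
    return mismatches


# Two rows copied from adapter_seq_config.csv
EXAMPLE = """technology,fwd_primer_f,fwd_primer_r,rev_primer_f,rev_primer_r,TSO_f,TSO_r
10x3v3,CTACACGACGCTCTTCCGATCT,AGATCGGAAGAGCGTCGTGTAG,CCCCTCTGCGTTGATACCACTGCTT,AAGCAGTGGTATCAACGCAGAGGGG,,
10x5v2,CTACACGACGCTCTTCCGATCT,AGATCGGAAGAGCGTCGTGTAG,GTACTCTGCGTTGATACCACTGCTT,AAGCAGTGGTATCAACGCAGAGTAC,TTTCTTATATGGG,CCCATATAAGAAA
"""

print(find_adapter_mismatches(EXAMPLE))
```

For the two rows shown, the function returns an empty list, meaning every populated pair is consistent.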
11 changes: 11 additions & 0 deletions assets/10x_config/barcode_coordinate_config.csv
@@ -0,0 +1,11 @@
technology,barcode_path,spatial_coordinate_path
10x3v2,737K-august-2016.txt,
10x3v3,3M-february-2018_TRU.txt.gz,
10x3v4,3M-3pgex-may-2023_TRU.txt.gz,
10x5v2,737K-august-2016.txt,
10x5v3,3M-5pgex-jan-2023.txt.gz,
visium-v1,visium-v1.txt,visium-v1_coordinates.txt
visium-v2,visium-v2.txt,visium-v2_coordinates.txt
visium-v3,visium-v3.txt,visium-v3_coordinates.txt
visium-v4,visium-v4.txt,visium-v4_coordinates.txt
visium-v5,visium-v5.txt,visium-v5_coordinates.txt
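
The config above pairs each technology with a barcode whitelist and, for the Visium entries only, a coordinate file. How the pipeline consumes it is not visible in this diff, so the following is only a sketch of one possible lookup. `load_barcode_config` and `is_spatial` are hypothetical names, and treating "has a coordinate file" as "is spatial" is an assumption.

```python
import csv
import io

# Two rows copied from barcode_coordinate_config.csv
EXAMPLE = """technology,barcode_path,spatial_coordinate_path
10x3v3,3M-february-2018_TRU.txt.gz,
visium-v1,visium-v1.txt,visium-v1_coordinates.txt
"""


def load_barcode_config(csv_text):
    """Map each technology to its barcode whitelist and optional coordinate file."""
    config = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        config[row["technology"]] = {
            "barcode_path": row["barcode_path"],
            # An empty cell is stored as None: no coordinate file for this technology
            "spatial_coordinate_path": row["spatial_coordinate_path"] or None,
        }
    return config


def is_spatial(config, technology):
    """Assumption: a technology is spatial when it lists a coordinate file."""
    if technology not in config:
        raise KeyError(f"Unknown technology: {technology}")
    return config[technology]["spatial_coordinate_path"] is not None
```

Raising on an unknown technology, rather than silently returning `False`, keeps a typo in a sample sheet from being routed as a non-spatial sample.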
11 changes: 11 additions & 0 deletions assets/10x_config/flank_seq_config.csv
@@ -0,0 +1,11 @@
technology,left_flank,barcode,umi,right_flank
10x3v2,CTACACGACGCTCTTCCGATCT,????????????????,??????????,TTTTTTTTTTTTTTTTTTTTTTTTTTTTTT
10x3v3,CTACACGACGCTCTTCCGATCT,????????????????,????????????,TTTTTTTTTTTTTTTTTTTTTTTTTTTTTT
10x3v4,CTACACGACGCTCTTCCGATCT,????????????????,????????????,TTTTTTTTTTTTTTTTTTTTTTTTTTTTTT
10x5v2,CTACACGACGCTCTTCCGATCT,????????????????,??????????,TTTCTTATATGGG
10x5v3,CTACACGACGCTCTTCCGATCT,????????????????,????????????,TTTCTTATATGGG
visium-v1,CTACACGACGCTCTTCCGATCT,????????????????,????????????,TTTTTTTTTTTTTTTTTTTTTTTTTTTTTT
visium-v2,CTACACGACGCTCTTCCGATCT,????????????????,????????????,TTTTTTTTTTTTTTTTTTTTTTTTTTTTTT
visium-v3,CTACACGACGCTCTTCCGATCT,????????????????,????????????,TTTTTTTTTTTTTTTTTTTTTTTTTTTTTT
visium-v4,CTACACGACGCTCTTCCGATCT,????????????????,????????????,TTTTTTTTTTTTTTTTTTTTTTTTTTTTTT
visium-v5,CTACACGACGCTCTTCCGATCT,????????????????,????????????,TTTTTTTTTTTTTTTTTTTTTTTTTTTTTT
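
In this config the barcode and UMI columns are runs of `?` wildcards, so their lengths encode the expected barcode and UMI sizes for each chemistry. A sketch for reading those lengths back out is shown here; `pattern_lengths` is a hypothetical helper, and the example rows are rebuilt with string repetition so the lengths are explicit rather than counted by eye.

```python
import csv
import io


def pattern_lengths(csv_text):
    """Return {technology: (barcode_length, umi_length)} by counting '?' wildcards."""
    lengths = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        barcode, umi = row["barcode"], row["umi"]
        # Every position is expected to be a '?' wildcard; anything else is rejected
        if set(barcode + umi) - {"?"}:
            raise ValueError(f"{row['technology']}: non-wildcard character in barcode/UMI")
        lengths[row["technology"]] = (len(barcode), len(umi))
    return lengths


# Two rows in the same shape as flank_seq_config.csv
EXAMPLE = "\n".join([
    "technology,left_flank,barcode,umi,right_flank",
    ",".join(["10x5v2", "CTACACGACGCTCTTCCGATCT", "?" * 16, "?" * 10, "TTTCTTATATGGG"]),
    ",".join(["10x5v3", "CTACACGACGCTCTTCCGATCT", "?" * 16, "?" * 12, "TTTCTTATATGGG"]),
])

print(pattern_lengths(EXAMPLE))
```

A check like this makes a truncated wildcard run (for example a 10-character UMI where 12 is intended) visible before demultiplexing.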
116 changes: 116 additions & 0 deletions bin/reverse_complement_fastq.py
@@ -0,0 +1,116 @@
#!/usr/bin/env python3
import argparse
import sys

DNA_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")

def parse_args():
    parser = argparse.ArgumentParser(description="Reverse complement FASTQ files generated by Flexiplex")
    parser.add_argument("-i", "--input", default='-', help="Input FASTQ file ('-' reads from stdin)")
    parser.add_argument("-o", "--output", default='-', help="Output FASTQ file ('-' writes to stdout)")
    args = parser.parse_args()

    return args

def modify_read_id(read_id):
    """
    Reverses strand direction tag in the Read ID. Example: GGAATCTCAAGCGCAA_TGGTCTTATTAA#9862034a-576a-44ad-bab9-30e8e9927dde_+1of1
    will be modified to GGAATCTCAAGCGCAA_TGGTCTTATTAA#9862034a-576a-44ad-bab9-30e8e9927dde_-1of1

    Args:
        read_id (str): Read ID

    Returns:
        str: Modified Read ID
    """
    rev_dict = {'+': '-', '-': '+'}
    id_list = read_id.split('_')

    # Exception handling in the event of unusual Flexiplex Read ID format
    # (an empty last field raises IndexError and the ID is left unchanged)
    try:
        strand_dir = id_list[-1][0]
        id_list[-1] = rev_dict.get(strand_dir, strand_dir) + id_list[-1][1:]
    except IndexError:
        pass

    return "_".join(id_list)

def modify_read_description(header):
    """
    Reverses strand direction tag in the Description Header. Example: GGAATCTCAAGCGCAA_TGGTCTTATTAA#9862034a-576a-44ad-bab9-30e8e9927dde_+1of1 CB:Z:GGAATCTCAAGCGCAA UB:Z:TGGTCTTATTAA
    will be modified to GGAATCTCAAGCGCAA_TGGTCTTATTAA#9862034a-576a-44ad-bab9-30e8e9927dde_-1of1 CB:Z:GGAATCTCAAGCGCAA UB:Z:TGGTCTTATTAA

    Args:
        header (str): Description header

    Returns:
        str: Modified description header
    """

    header_list = header.split()
    original_read_id = header_list[0]
    modified_read_id = modify_read_id(original_read_id)

    # Preserves trailing tags like CB:Z: / UB:Z:
    return modified_read_id + header[len(original_read_id):]

def reverse_complement_seq(seq):
    """
    Reverse complements a DNA sequence

    Args:
        seq (str): DNA sequence

    Returns:
        str: Sequence of the reverse complement
    """
    return seq[::-1].translate(DNA_COMPLEMENT)

def reverse_phred_scores(phred_scores):
    """
    Reverses Phred Quality Sequence

    Args:
        phred_scores (str): Phred quality score of the forward strand

    Returns:
        str: Phred quality score of the reverse complement
    """
    return phred_scores[::-1]

if __name__ == "__main__":
    # Parse arguments
    args = parse_args()
    f_in = sys.stdin if args.input == '-' else open(args.input, 'r')
    f_out = sys.stdout if args.output == '-' else open(args.output, 'w')

    # Track number of reads processed
    reads_processed = 0

    try:
        while True:
            # Retrieve information for each read (stored in 4 lines)
            header = f_in.readline().rstrip()
            # Stop once header is empty
            if not header:
                break

            dna_seq = f_in.readline().rstrip()
            f_in.readline()  # Read separator line but do not store it
            phred_seq = f_in.readline().rstrip()

            # Get header, DNA sequence and Phred sequence for reverse complement
            rc_header = modify_read_description(header)
            rc_dna_seq = reverse_complement_seq(dna_seq)
            rc_phred_seq = reverse_phred_scores(phred_seq)

            # Write output
            f_out.write(f"{rc_header}\n{rc_dna_seq}\n+\n{rc_phred_seq}\n")

            # Increment read counter; integer division avoids printing "1.0 million"
            reads_processed += 1
            if reads_processed % 1000000 == 0:
                sys.stderr.write(f"\rProcessed {reads_processed // 1000000} million reads")
    finally:
        # Close file handles if not stdin/stdout, even if processing fails
        if args.input != '-':
            f_in.close()
        if args.output != '-':
            f_out.close()
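
To make the script's behaviour concrete, the sketch below applies the same three transformations (strand tag flip, reverse complement, quality reversal) to a single record. It is a condensed re-implementation for illustration, not the code in bin/reverse_complement_fastq.py: `flip_strand_tag` and `reverse_complement_record` are hypothetical names, the sequence and quality strings are made up, and the length check is an addition the script does not have.

```python
DNA_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")
FLIP_STRAND = {"+": "-", "-": "+"}


def flip_strand_tag(header):
    """Flip the +/- that starts the last '_'-separated field of the read ID."""
    read_id = header.split()[0]
    prefix, underscore, last = read_id.rpartition("_")
    flipped = FLIP_STRAND.get(last[:1], last[:1]) + last[1:]
    # Everything after the read ID (e.g. CB:Z: / UB:Z: tags) is kept verbatim
    return prefix + underscore + flipped + header[len(read_id):]


def reverse_complement_record(header, seq, qual):
    """Return the (header, sequence, quality) triple for the opposite strand."""
    if len(seq) != len(qual):
        raise ValueError("sequence and quality lengths differ")
    return flip_strand_tag(header), seq[::-1].translate(DNA_COMPLEMENT), qual[::-1]


record = (
    "@GGAATCTCAAGCGCAA_TGGTCTTATTAA#9862034a-576a-44ad-bab9-30e8e9927dde_+1of1 "
    "CB:Z:GGAATCTCAAGCGCAA UB:Z:TGGTCTTATTAA",
    "AACGTN",
    "IIIH#!",
)

rc_header, rc_seq, rc_qual = reverse_complement_record(*record)
print(rc_header)
print(rc_seq)   # NACGTT
print(rc_qual)  # !#HIII
```

Applying `reverse_complement_record` twice returns the original record, which is a cheap property to assert in a unit test for the script.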
11 changes: 11 additions & 0 deletions containers/alignment/Dockerfile
@@ -0,0 +1,11 @@
FROM mambaorg/micromamba:git-c0f93d2-amazon2023

RUN micromamba install -y -n base \
    -c conda-forge \
    -c bioconda \
    minimap2=2.30 \
    samtools=1.23 \
    procps-ng \
    && micromamba clean -ay

ENV PATH=/opt/conda/bin:$PATH
13 changes: 13 additions & 0 deletions containers/preprocess/Dockerfile
@@ -0,0 +1,13 @@
FROM mambaorg/micromamba:git-c0f93d2-amazon2023

RUN micromamba install -y -n base \
    -c conda-forge \
    -c bioconda \
    chopper=0.12.0 \
    flexiplex=1.02.5 \
    cutadapt=5.2 \
    pigz=2.8 \
    procps-ng \
    && micromamba clean -ay

ENV PATH=/opt/conda/bin:$PATH
10 changes: 10 additions & 0 deletions containers/r/Dockerfile
@@ -0,0 +1,10 @@
FROM rocker/r-base:4.4.1

# install system dependencies
RUN apt-get update && apt-get install -y \
    libcurl4-openssl-dev \
    procps && rm -rf /var/lib/apt/lists/*

# install Seurat Object (v5.3.0), Seurat (v5.4.0), and Bambu
RUN R -e "install.packages(c('pak', 'devtools', 'BiocManager'), repos='https://cloud.r-project.org')"
RUN R -e "pak::pkg_install(c('SeuratObject@5.3.0', 'Seurat@5.4.0', 'GoekeLab/bambu@devel_pre_v4'))"