Skip to content

Commit 99eb7f0

Browse files
committed
Adding new rule to create filtered transcripts FASTA file.
1 parent bf0579c commit 99eb7f0

3 files changed

Lines changed: 61 additions & 4 deletions

File tree

workflow/Snakefile

Lines changed: 12 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -154,6 +154,18 @@ rule all:
154154
case=provided(case_groups, quantify_transcripts != "None"),
155155
control=provided(ctrl_groups, quantify_transcripts != "None"),
156156
),
157+
# Create filtered transcripts FASTA file,
158+
# conditionally runs if the quantify
159+
# transcripts option is provided and
160+
# there are groups/contrasts.
161+
# @imported from rules/isoformswitchanalyzer.smk
162+
# @output of rule isoformswitchanalyzer_isoformfasta
163+
expand(
164+
join(workpath, "differential_switching", batch_id, "{case}_vs_{control}", "{case}-{control}_top_isoform_switches.fa"),
165+
zip,
166+
case=provided(case_groups, quantify_transcripts != "None"),
167+
control=provided(ctrl_groups, quantify_transcripts != "None"),
168+
),
157169

158170

159171
# Import rules

workflow/rules/isoformswitchanalyzer.smk

Lines changed: 42 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -311,7 +311,6 @@ rule isoformswitchanalyzer_diffswitching:
311311
vec = join(workpath, "differential_switching", batch_id, "{case}_vs_{control}", "sample_vector.tsv"),
312312
output:
313313
swt = join(workpath, "differential_switching", batch_id, "{case}_vs_{control}", "{case}-{control}_top_isoform_switches.tsv"),
314-
315314
params:
316315
rname = "diffswitch",
317316
outdir = join(workpath, "differential_switching", batch_id, "{case}_vs_{control}"),
@@ -340,3 +339,45 @@ rule isoformswitchanalyzer_diffswitching:
340339
--control_group {wildcards.control} \\
341340
--method saturn
342341
"""
342+
343+
344+
rule isoformswitchanalyzer_isoformfasta:
345+
"""
346+
Data-processing step to write isoform sequences based on the
347+
IsoformSwitchAnalyzeR results. This rule will create a FASTA
348+
file containing the sequence of each transcript of an gene
349+
with a significant isoform switch. So if a gene contains N
350+
transcripts and it had a signficant switching event (based
351+
on the fdr_filter), then the transcript sequences of all
352+
N transcripts will be written to the FASTA file.
353+
@Input:
354+
Differential isoform switching results (indirect-gather-per-contrast),
355+
Splicing annotation file
356+
@Output:
357+
Differential isoform switching results
358+
"""
359+
input:
360+
swt = join(workpath, "differential_switching", batch_id, "{case}_vs_{control}", "{case}-{control}_top_isoform_switches.tsv"),
361+
spl = join(workpath, "temp", "splicing_annotation.tsv"),
362+
output:
363+
fa = join(workpath, "differential_switching", batch_id, "{case}_vs_{control}", "{case}-{control}_top_isoform_switches.fa"),
364+
params:
365+
rname = "isofasta",
366+
pyscript = join(workpath, "workflow", "scripts", "isoform_sequences.py"),
367+
transcripts = quantify_transcripts,
368+
fdr_filter = 0.1,
369+
resources:
370+
mem = allocated("mem", "isoformswitchanalyzer_isoformfasta", cluster),
371+
time = allocated("time", "isoformswitchanalyzer_isoformfasta", cluster),
372+
threads: int(allocated("threads", "isoformswitchanalyzer_isoformfasta", cluster))
373+
container: config["images"]["isoformswitchanalyzer"]
374+
shell: """
375+
# Create the filtered transcripts FASTA file
376+
{params.pyscript} \\
377+
-i {input.swt} \\
378+
-s {input.spl} \\
379+
-t {params.transcripts} \\
380+
-o {output.fa} \\
381+
-f {params.fdr_filter} \\
382+
-d '|'
383+
"""

workflow/rules/leafcutter.smk

Lines changed: 7 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -18,7 +18,9 @@ rule leafcutter_gtf2exons:
1818
@Input:
1919
Input GTF file (singleton)
2020
@Output:
21-
Exons TSV file
21+
Exons TSV file,
22+
Exon annotation file,
23+
Splicing annotation file
2224
"""
2325
input:
2426
gtf = gtf_file
@@ -403,9 +405,11 @@ rule leafcutter_prepleafviz:
403405
Cluster significant table output by leafcutter_ds.R,
404406
Per-junction effect sizes table output by leafcutter_ds.R,
405407
All introns TSV file,
406-
408+
Splicing annotation file
407409
@Output:
408-
Rdata file to input to leafviz shiny app
410+
Rdata file to input to leafviz shiny app,
411+
Intron annotation file,
412+
Merged and annotated leafcutter results
409413
"""
410414
input:
411415
num = join(workpath, "junctions", "leafcutter_perind_numers.counts.gz"),

0 commit comments

Comments
 (0)