Commit 277d34e
Add dotseq/dotseq (#11742)
* Add dotseq/dotseq module
DOTSeq is a Bioconductor package for detecting differential ORF usage
(DOU) and ORF-level differential translation efficiency (DTE) from
Ribo-seq with matched RNA-seq. Module wraps DOTSeqDataSetsFromFeatureCounts
+ DOTSeq() + getContrasts() and emits per-ORF TSVs for the DOU and DTE
interaction contrasts plus the serialised DOTSeqDataSets object.
Pre-requisites (in flight):
- Bioconda recipe: bioconda/bioconda-recipes#65677
- Test data: nf-core/test-datasets#2072
* Use Wave community container for bioconductor-dotseq 1.0.0
Bioconda recipe (bioconda/bioconda-recipes#65677) merged; biocontainer
image is not yet built so swap the placeholder quay.io/depot URLs for a
Wave community container built from the now-merged bioconda package.
Also widen the singularity guard to include 'apptainer' and add the
versions topic block in meta.yml (via nf-core modules lint --fix).
* Add native plotDOT outputs, simplify template, tidyverse syntax
- Restructure the R template around optparse + readr + dplyr + purrr +
ggplot2; drop the homemade parse_args / read_delim_flexible helpers
in favour of the standard package idioms and native pipe.
- Output set is now what DOTSeq itself emits natively: per-ORF DTE
contrasts (translation.dotseq.results.tsv), DOU contrasts
(dou.dotseq.results.tsv), optional dou_strategy / dte_strategy
per-condition Ribo-vs-RNA contrasts, plus the four plotDOT() PNGs
(volcano / composite / venn / heatmap) and a DTE p-value distribution
histogram drawn directly from DOTSeq's padj column.
- Container picks up r-eulerr + r-ggsignif (required for plotDOT venn)
and explicit r-ggplot2 so the histogram has a stable ggplot version.
- plotDOT() default of force_new_device=TRUE was killing our png()
device on each call; pass FALSE so the PNGs land where Nextflow
expects them.
* Simplify R template helpers, add heatmap sorf_type fallback
- Drop the homemade read_delim_flexible() and write_results_tsv() wrappers
in favour of read_tsv() / read_csv() / write_tsv() directly. The earlier
to_orf_tibble() conditional is also gone now that we know
getContrasts() always returns a frame with orf_id as a column (per the
DOTSeq source in posthoc.R + main.R).
- plotDOT(heatmap) requires gene-paired mORF + sORF entries; try uORF
first (the package default) and fall back to dORF when no significant
gene has both. The tryCatch in safe_plot_dot still turns each attempt
into a no-op if it fails.
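The uORF-then-dORF fallback above can be sketched roughly as follows. This is an illustrative Python sketch of the control flow only: the module's template is written in R, and `safe_plot`, `heatmap_with_fallback`, and `fake_draw_heatmap` are hypothetical names, not the template's actual helpers. Removing a partially written file on error is a defensive choice of this sketch.

```python
import os
import tempfile


def safe_plot(draw, path):
    """Run draw(path); on error remove any partial file and report failure."""
    try:
        draw(path)
        return True
    except Exception:
        # Drop a half-written file so later steps never see a stale plot.
        if os.path.exists(path):
            os.remove(path)
        return False


def heatmap_with_fallback(draw_heatmap, path, sorf_types=("uORF", "dORF")):
    """Try each sORF type in order; return the type that drew, or None."""
    for sorf_type in sorf_types:
        if safe_plot(lambda p: draw_heatmap(p, sorf_type), path):
            return sorf_type
    return None


def fake_draw_heatmap(path, sorf_type):
    """Hypothetical stand-in for the plotting call: only dORF succeeds."""
    with open(path, "w") as handle:
        handle.write("partial")
        if sorf_type != "dORF":
            raise ValueError("no significant gene has both mORF and " + sorf_type)


out_dir = tempfile.mkdtemp()
out_path = os.path.join(out_dir, "heatmap.png")
chosen = heatmap_with_fallback(fake_draw_heatmap, out_path)
print(chosen)
```

Returning the outcome from the wrapper, rather than checking for the file afterwards, keeps the fallback decision independent of whatever a failed attempt left on disk.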
* Address code-review feedback: stub block, validation hardening, plot fallback robustness
- Add stub: block to main.nf matching the proteus/readproteingroups precedent.
- Read sample sheet with read_delim() picking comma/tab from the file
extension so the meta.yml-advertised TSV variant actually works.
- Refuse to clobber an existing canonical column (e.g. an existing
'condition' column when --contrast_variable=treatment is supplied).
- Dedupe multi-lane sample sheets and validate that both Ribo and RNA
strategies are present (DOTSeq's interaction design is not estimable
otherwise).
- Add an is_set() predicate that catches NULL / empty stringent + required
options before the tri-state switch silently returns NULL.
- safe_plot_dot now unlinks the partially-written PNG on plotDOT error and
returns success so the heatmap fallback (uORF then dORF) keys off whether
the first call actually drew, not file.exists() of a stale handle.
- getContrasts(type='interaction') errors propagate (headline outputs);
type='strategy' stays tryCatch'd because absence is legitimate.
- Cache getDOU(d) / getDTE(d) once and share across contrasts + plotDOT.
- Drop redundant file.exists() walk - Nextflow's path staging already
guarantees the inputs exist.
- Expand the test to assert volcano / composite / venn plot emission and
add a -stub test.
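The sample-sheet handling in the list above (delimiter chosen from the file extension, multi-lane rows deduplicated, both strategies required) might look like the sketch below. It is written in Python for illustration, whereas the module's template is R. The column names `sample` and `strategy`, the labels `ribo` and `rna`, and the rule that only `.tsv` means tab-delimited are all assumptions, not taken from the module.

```python
import csv
import io
import os


def delimiter_for(filename):
    """Pick the delimiter from the extension (assumed: .tsv is tab, else comma)."""
    ext = os.path.splitext(filename)[1].lower()
    return "\t" if ext == ".tsv" else ","


def read_samples(handle, filename):
    """Read rows, keep the first row per sample, and require both strategies."""
    reader = csv.DictReader(handle, delimiter=delimiter_for(filename))
    seen = set()
    rows = []
    for row in reader:
        # Multi-lane sheets repeat a sample once per lane; keep one row each.
        if row["sample"] not in seen:
            seen.add(row["sample"])
            rows.append(row)
    missing = {"ribo", "rna"} - {row["strategy"] for row in rows}
    if missing:
        raise ValueError("sample sheet lacks strategy: " + ", ".join(sorted(missing)))
    return rows


sheet = (
    "sample\tstrategy\tcondition\tlane\n"
    "s1\tribo\tctrl\tL1\n"
    "s1\tribo\tctrl\tL2\n"
    "s2\trna\tctrl\tL1\n"
)
rows = read_samples(io.StringIO(sheet), "samples.tsv")
print(len(rows))
```

Failing early on a missing strategy gives a clearer message than letting the model fit fail later on a design that cannot be estimated.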
* TEMPORARY: point test at the pending test-datasets PR fork branch
Lets CI verify the module is actually green; revert this commit once
nf-core/test-datasets#2072 merges and the canonical modules-branch URL
resolves.
* refactor(dotseq/dotseq): take a count-matrix shape for consumer parity
Aligns the module's input contract with deltate / anota2seq so that
consumers can dispatch between the three ORF-DTE methods without
maintaining a separate prep step for dotseq. The four
featureCounts/GTF/BED inputs collapse to a per-ORF count matrix
(orf_id + sample columns) plus a per-ORF annotation TSV (orf_id +
gene_id + optional orf_type/coords). The R template now calls
DOTSeqDataSetsFromSummarizeOverlaps() and builds the required GRanges
in-process from the annotation TSV; the model fit, contrast tables,
and plotDOT outputs are unchanged.
Test fixtures updated alongside in
nf-core/test-datasets#2072 (commit 8c9b27c).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(dotseq/dotseq): synthesize `replicate` column when absent
DOTSeq's parse_condition_table() requires a `replicate` column for
stable ordering of samples within strategy+condition. Pipeline
samplesheets often have a `pair` column (or none at all), so the R
template now treats the column as optional: when present it is
renamed to `replicate` as before; when absent the template assigns a
per-(strategy, condition) row counter so the model fit is unaffected.
This matches how anota2seq/deltate consume the same samplesheets.
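The optional-column behaviour can be sketched as below. This is an illustrative Python sketch, not the R template itself. The function name `add_replicate`, the dictionary-per-row representation, and the 1-based counter are assumptions made for the example.

```python
from collections import defaultdict


def add_replicate(rows, source_column=None):
    """Return copies of rows that each carry a 'replicate' field.

    If source_column exists in every row it is renamed to 'replicate';
    otherwise a counter is assigned within each (strategy, condition) group.
    """
    has_column = source_column is not None and all(source_column in r for r in rows)
    counters = defaultdict(int)
    out = []
    for row in rows:
        new_row = dict(row)
        if has_column:
            new_row["replicate"] = new_row.pop(source_column)
        else:
            group = (row["strategy"], row["condition"])
            counters[group] += 1
            new_row["replicate"] = counters[group]
        out.append(new_row)
    return out


samples = [
    {"sample": "a", "strategy": "ribo", "condition": "ctrl"},
    {"sample": "b", "strategy": "ribo", "condition": "ctrl"},
    {"sample": "c", "strategy": "ribo", "condition": "treat"},
    {"sample": "d", "strategy": "rna", "condition": "ctrl"},
]
numbered = add_replicate(samples)
print([row["replicate"] for row in numbered])
```

Counting within each (strategy, condition) group gives a stable within-group ordering without requiring the sample sheet to supply one.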
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(dotseq/dotseq): support running a single module
Dropping a module from --modules left DOTSeq()'s skipped slot unfitted
(a bare DESeqDataSet for DTE), and getContrasts() has no method for it,
so a DOU-only run crashed when extracting the DTE interaction table.
Gate interaction and strategy contrast extraction on the selected
modules, and write each module's interaction table only when that module
ran. Mark the translation and dou outputs optional to match, and add a
DOU-only regression test.
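The gating logic can be sketched as a mapping from selected modules to the tables that should be written. This is a hedged Python illustration, not the R template: the labels `DTE` and `DOU` as accepted values of --modules and the function name `planned_outputs` are assumptions, while the two file names come from the output description earlier in this message.

```python
# File names as listed in the output description; module labels are assumed.
INTERACTION_TABLES = {
    "DTE": "translation.dotseq.results.tsv",
    "DOU": "dou.dotseq.results.tsv",
}


def planned_outputs(modules):
    """Return the interaction table to write for each selected module."""
    unknown = set(modules) - set(INTERACTION_TABLES)
    if unknown:
        raise ValueError("unknown module(s): " + ", ".join(sorted(unknown)))
    # Only modules that actually ran get a table; skipped ones are omitted.
    return {
        name: table
        for name, table in INTERACTION_TABLES.items()
        if name in modules
    }


print(planned_outputs(["DOU"]))
```

Deciding what to extract from the selected modules, rather than from what the fitted object happens to contain, avoids calling an extractor on a slot that was never fitted.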
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1 parent bbd6423 commit 277d34e
8 files changed
Lines changed: 913 additions & 0 deletions
File tree
- modules/nf-core/dotseq/dotseq
- templates
- tests