Summary
rustar-aligner --runMode genomeGenerate rejects the STAR-compatible --limitGenomeGenerateRAM flag at startup. STAR exposes this flag, and pipelines that wrap STAR typically derive a RAM cap from job resources and pass it through (e.g. the nf-core/rnaseq STAR_GENOMEGENERATE Nextflow module passes a value computed from task.memory).
STAR reference behaviour
--limitGenomeGenerateRAM <int> is a documented STAR flag (see the STAR manual §3) that caps the RAM available during genome generation. Default: 31000000000 bytes (~31 GB). It is independent of --limitBAMsortRAM (the BAM-sort memory cap).
Reproducer
#!/usr/bin/env bash
set -euo pipefail
mkdir -p /tmp/rustar-mre-25 && cd /tmp/rustar-mre-25
BASE=https://raw.githubusercontent.com/nf-core/test-datasets/626c8fab639062eade4b10747e919341cbf9b41a
curl -fsLO $BASE/reference/genome.fasta
curl -fsL $BASE/reference/genes_with_empty_tid.gtf.gz | gunzip -c > genes.gtf
RUSTAR=ghcr.io/scverse/rustar-aligner:dev
STAR=community.wave.seqera.io/library/htslib_samtools_star_gawk:ae438e9a604351a4
echo "=== STAR with --limitGenomeGenerateRAM 31000000000 ==="
docker run --rm -v $PWD:/w -w /w $STAR STAR --runMode genomeGenerate \
--genomeDir /tmp/star-idx --genomeFastaFiles genome.fasta --sjdbGTFfile genes.gtf \
--sjdbOverhang 100 --genomeSAindexNbases 7 \
--limitGenomeGenerateRAM 31000000000 2>&1 | tail -3
echo
echo "=== rustar with the same flag ==="
docker run --rm -v $PWD:/w -w /w $RUSTAR rustar-aligner --runMode genomeGenerate \
--genomeDir /tmp/rustar-idx --genomeFastaFiles genome.fasta --sjdbGTFfile genes.gtf \
--sjdbOverhang 100 --genomeSAindexNbases 7 \
--limitGenomeGenerateRAM 31000000000 2>&1 | tail -3
Observed (verified on commit 5f8ad08 + STAR 2.7.11b)
STAR completes:
May 12 15:24:18 ... writing Suffix Array to disk ...
May 12 15:24:18 ... writing SAindex to disk
May 12 15:24:18 ..... finished successfully
rustar fails at the CLI parser, before any work:
error: unexpected argument '--limitGenomeGenerateRAM' found
tip: a similar argument exists: '--limitBAMsortRAM'
Suggested fix
In src/params.rs (near the existing --limitBAMsortRAM field around line 363), add:
#[arg(long = "limitGenomeGenerateRAM", default_value_t = 31_000_000_000_u64)]
pub limit_genome_generate_ram: u64,
Two acceptable behaviours, in order of preference:
- Accept and honour — cap the resident-set during suffix-array / index construction (or at least the in-memory genome chunks).
- Accept and warn-ignore initially (log::warn!("--limitGenomeGenerateRAM accepted but not enforced yet; rustar uses its own memory management")). This unblocks every STAR-compatible caller without committing to the cap implementation.
Either is far better than failing at the CLI parser, because in practice the flag is rarely typed by a user: it is emitted by the pipelines that wrap STAR.
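For illustration, the warn-ignore option could behave roughly like the std-only sketch below. This is not rustar's parser: the real change would be the clap field shown above, and the helper names limit_genome_generate_ram and unenforced_warning are hypothetical, introduced only to show the intended semantics (absent flag gives the STAR default, a valid value is accepted, and an explicit flag triggers a warning instead of a parse error).

```rust
/// STAR's documented default for --limitGenomeGenerateRAM, in bytes.
const DEFAULT_LIMIT_GENOME_GENERATE_RAM: u64 = 31_000_000_000;

/// Hypothetical helper (not rustar code): find `--limitGenomeGenerateRAM <int>`
/// in the raw arguments. Returns the parsed value, the default when the flag
/// is absent, or an error message when the value is missing or not an integer.
fn limit_genome_generate_ram(args: &[&str]) -> Result<u64, String> {
    let mut iter = args.iter();
    while let Some(arg) = iter.next() {
        if *arg == "--limitGenomeGenerateRAM" {
            let raw = iter
                .next()
                .ok_or_else(|| "--limitGenomeGenerateRAM requires a value".to_string())?;
            return raw
                .parse::<u64>()
                .map_err(|e| format!("invalid --limitGenomeGenerateRAM value '{raw}': {e}"));
        }
    }
    Ok(DEFAULT_LIMIT_GENOME_GENERATE_RAM)
}

/// Hypothetical warn-ignore step: if the caller passed the flag explicitly,
/// return the warning to log rather than rejecting the command line.
fn unenforced_warning(args: &[&str]) -> Option<&'static str> {
    args.contains(&"--limitGenomeGenerateRAM").then_some(
        "--limitGenomeGenerateRAM accepted but not enforced yet; \
         rustar uses its own memory management",
    )
}

fn main() {
    // Arguments in the shape a STAR wrapper might emit.
    let args = [
        "--runMode",
        "genomeGenerate",
        "--limitGenomeGenerateRAM",
        "31000000000",
    ];

    match limit_genome_generate_ram(&args) {
        Ok(limit) => println!("limitGenomeGenerateRAM = {limit} bytes"),
        Err(msg) => eprintln!("error: {msg}"),
    }
    if let Some(warning) = unenforced_warning(&args) {
        eprintln!("warning: {warning}");
    }
}
```

Under this behaviour, a wrapper that always appends the flag would run unchanged, and the warning leaves a trace in the log that the cap is not being honoured yet.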
Why this matters
rustar-aligner positions itself as a STAR drop-in. Rejecting a flag STAR has accepted for years makes the drop-in claim conditional. nf-core/rnaseq's STAR_GENOMEGENERATE Nextflow module is one of many wrappers that will need a workaround otherwise.
Filed during nf-core/rnaseq integration testing (nf-core/rnaseq#1855). To find sibling issues from this exercise, search for author:pinin4fjords or grep for nf-core/rnaseq#1855.