Commit c0d9edb

ewels and claude authored
Add XDG config discovery and deep-merge support (#88)
* feat: add XDG config discovery, deep merging, and env var support (#82)

Add layered configuration loading following the XDG Base Directory
Specification. Config files are now auto-discovered from system
(/etc/xdg/rustqc/rustqc.yml) and user (~/.config/rustqc/rustqc.yml)
locations, plus the RUSTQC_CONFIG env var and explicit -c flag.

Multiple config files are deep-merged at the leaf level so
higher-priority files only override specific fields without clobbering
siblings. Priority order: system < user < RUSTQC_CONFIG < -c < CLI.

Also adds RUSTQC_* environment variables for every CLI argument via
clap's env attribute, enabling workflows like `export RUSTQC_THREADS=8`.

https://claude.ai/code/session_01GSLiquucJVtdt7xg3MeJzL

* refactor: eliminate clones in deep_merge, harden config tests

- Take overlay by value in deep_merge() to avoid unnecessary cloning
  of YAML subtrees during config merging
- Fix fragile test_collect_config_paths tests by clearing RUSTQC_CONFIG
  env var to prevent interference from the test environment
- Add XDG spec rationale to the break comment in collect_config_paths

https://claude.ai/code/session_01GSLiquucJVtdt7xg3MeJzL

* style: apply cargo fmt formatting

https://claude.ai/code/session_01GSLiquucJVtdt7xg3MeJzL

---------

Co-authored-by: Claude <noreply@anthropic.com>
1 parent 75af5d2 commit c0d9edb

6 files changed

Lines changed: 475 additions & 28 deletions

Cargo.lock

Lines changed: 1 addition & 0 deletions

Cargo.toml

Lines changed: 2 additions & 1 deletion
@@ -19,7 +19,7 @@ path = "src/main.rs"
 
 [dependencies]
 # CLI argument parsing
-clap = { version = "4", features = ["derive"] }
+clap = { version = "4", features = ["derive", "env"] }
 
 # BAM file reading
 rust-htslib = { version = "1", features = ["static"] }
@@ -32,6 +32,7 @@ plotters-svg = "0.3"
 # Configuration
 serde = { version = "1", features = ["derive"] }
 serde_yaml_ng = "0.10"
+dirs = "6"
 
 # Interval trees
 coitrees = "0.4"

docs/src/content/docs/usage/configuration.md

Lines changed: 136 additions & 0 deletions
@@ -24,6 +24,142 @@ and tool-specific parameter overrides where applicable.
 CLI flags take precedence over config file values.
 :::
 
+## Config file discovery
+
+RustQC automatically searches for configuration files in standard locations
+following the [XDG Base Directory Specification](https://specifications.freedesktop.org/basedir-spec/latest/).
+Multiple config files are loaded and **deep-merged** so that higher-priority
+files only override the specific fields they set, leaving all other settings
+from lower-priority files intact.
+
+### Search order (lowest to highest priority)
+
+| Priority | Source | Path |
+|----------|--------|------|
+| 1 | System config | First match in `$XDG_CONFIG_DIRS` (default `/etc/xdg/`) at `rustqc/rustqc.yml` |
+| 2 | User config | `$XDG_CONFIG_HOME/rustqc/rustqc.yml` (default `~/.config/rustqc/rustqc.yml`) |
+| 3 | `RUSTQC_CONFIG` env var | Path to any YAML file |
+| 4 | `-c` / `--config` flag | Path to any YAML file |
+| 5 | CLI flags / env vars | Override individual settings from any config source |
+
+Use `-v` (verbose mode) to see which config files were loaded:
+
+```bash
+rustqc rna sample.bam --gtf genes.gtf -v
+# Output includes:
+# Loaded config: /home/user/.config/rustqc/rustqc.yml (user)
+```
+
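The search order in the table above can be sketched as a small path-collection routine. This is a minimal, std-only illustration and not the `collect_config_paths` code from this commit, which is not shown in this excerpt. The names `Source`, `Env`, and `candidate_config_paths` are hypothetical, and the environment is passed in as plain values so the logic is easy to test. The sketch assumes `$XDG_CONFIG_DIRS` is colon-separated, as in the XDG specification, and that paths from `RUSTQC_CONFIG` and `-c` are taken as given without an existence check.

```rust
use std::path::{Path, PathBuf};

/// Where a config path came from, listed from lowest to highest priority.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Source {
    System,
    User,
    EnvVar,
    CliFlag,
}

/// Hypothetical snapshot of the inputs that influence discovery.
pub struct Env<'a> {
    pub xdg_config_dirs: Option<&'a str>, // $XDG_CONFIG_DIRS (colon-separated)
    pub xdg_config_home: Option<&'a str>, // $XDG_CONFIG_HOME
    pub home: Option<&'a str>,            // $HOME
    pub rustqc_config: Option<&'a str>,   // $RUSTQC_CONFIG
    pub cli_config: Option<&'a str>,      // value of -c / --config
}

/// Return config paths in merge order (lowest priority first).
/// `exists` decides whether an auto-discovered file is present.
pub fn candidate_config_paths(
    env: &Env,
    exists: impl Fn(&Path) -> bool,
) -> Vec<(Source, PathBuf)> {
    let rel = Path::new("rustqc").join("rustqc.yml");
    let mut found = Vec::new();

    // 1. System config: first match in $XDG_CONFIG_DIRS (default /etc/xdg).
    let system_dirs = env
        .xdg_config_dirs
        .filter(|s| !s.is_empty())
        .unwrap_or("/etc/xdg");
    for dir in system_dirs.split(':').filter(|d| !d.is_empty()) {
        let candidate = Path::new(dir).join(&rel);
        if exists(candidate.as_path()) {
            found.push((Source::System, candidate));
            break; // only the first matching directory is used
        }
    }

    // 2. User config: $XDG_CONFIG_HOME, falling back to ~/.config.
    let user_base = match env.xdg_config_home.filter(|s| !s.is_empty()) {
        Some(dir) => Some(PathBuf::from(dir)),
        None => env.home.map(|h| Path::new(h).join(".config")),
    };
    if let Some(base) = user_base {
        let candidate = base.join(&rel);
        if exists(candidate.as_path()) {
            found.push((Source::User, candidate));
        }
    }

    // 3. RUSTQC_CONFIG, then 4. the explicit -c / --config flag.
    if let Some(path) = env.rustqc_config {
        found.push((Source::EnvVar, PathBuf::from(path)));
    }
    if let Some(path) = env.cli_config {
        found.push((Source::CliFlag, PathBuf::from(path)));
    }
    found
}
```

A caller would fill `Env` from `std::env::var` and pass `|p| p.is_file()` as the `exists` check, then merge the returned files in order so that later entries win.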
+### Merge behaviour
+
+When multiple config files are found, they are deep-merged at the leaf level.
+Only the fields explicitly set in a higher-priority file override the
+corresponding fields from lower-priority files; sibling fields are preserved.
+
+For example, given a system config:
+
+```yaml
+# /etc/xdg/rustqc/rustqc.yml
+rna:
+  preseq:
+    seed: 1
+    n_bootstraps: 200
+  tin:
+    enabled: false
+```
+
+And a user config:
+
+```yaml
+# ~/.config/rustqc/rustqc.yml
+rna:
+  preseq:
+    seed: 42
+```
+
+The merged result is:
+
+```yaml
+rna:
+  preseq:
+    seed: 42  # overridden by user config
+    n_bootstraps: 200  # preserved from system config
+  tin:
+    enabled: false  # preserved from system config
+```
+
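The leaf-level merge described above can be sketched as a short recursive function. The commit message notes that `deep_merge()` takes the overlay by value, and the sketch follows that shape, but it is only an illustration: it uses a hypothetical `Value` enum instead of the project's YAML value type, and it assumes any non-mapping value simply replaces whatever was there. The `scalar` and `mapping` helpers exist only to build example data.

```rust
use std::collections::btree_map::Entry;
use std::collections::BTreeMap;

/// Hypothetical stand-in for a parsed YAML value.
#[derive(Debug, Clone, PartialEq)]
pub enum Value {
    Scalar(String),
    Map(BTreeMap<String, Value>),
}

/// Merge `overlay` into `base`. Mappings are merged key by key;
/// anything else in the overlay replaces the base value outright.
pub fn deep_merge(base: &mut Value, overlay: Value) {
    match (base, overlay) {
        (Value::Map(base_map), Value::Map(overlay_map)) => {
            for (key, overlay_val) in overlay_map {
                match base_map.entry(key) {
                    // Key exists in both: recurse so siblings survive.
                    Entry::Occupied(mut slot) => deep_merge(slot.get_mut(), overlay_val),
                    // Key only in the overlay: move it in without cloning.
                    Entry::Vacant(slot) => {
                        slot.insert(overlay_val);
                    }
                }
            }
        }
        // Leaf (or type mismatch): the higher-priority value wins.
        (base_slot, overlay_val) => *base_slot = overlay_val,
    }
}

/// Helper: build a scalar leaf.
pub fn scalar(s: &str) -> Value {
    Value::Scalar(s.to_string())
}

/// Helper: build a mapping from key/value pairs.
pub fn mapping(entries: Vec<(&str, Value)>) -> Value {
    Value::Map(entries.into_iter().map(|(k, v)| (k.to_string(), v)).collect())
}
```

Applied to the system and user configs from the example, this yields `seed: 42` while keeping `n_bootstraps` and the `tin` block from the system file.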
+## Environment variables
+
+Every CLI flag can also be set via an environment variable using the `RUSTQC_`
+prefix. This is useful for CI pipelines, container environments, or shell
+profiles where you want persistent defaults without a config file.
+
+CLI flags always take precedence over environment variables.
+
+:::tip
+Run `rustqc rna --help` to see the associated environment variable for each flag.
+:::
+
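The precedence stated here (CLI flag over environment variable, and both over config file values) can be illustrated with a small resolver. In the commit this is handled through clap's `env` feature rather than hand-written code, so the function below is a hypothetical sketch of the ordering only. The name `resolve` and its parameters are assumptions, not part of RustQC.

```rust
use std::str::FromStr;

/// Pick the effective value for one setting.
/// Order: CLI flag, then environment variable, then merged config, then default.
pub fn resolve<T: FromStr>(
    cli: Option<T>,
    env: Option<&str>,
    config: Option<T>,
    default: T,
) -> Result<T, String> {
    // An explicit CLI flag always wins.
    if let Some(value) = cli {
        return Ok(value);
    }
    // Otherwise an environment variable, which must parse as the target type.
    if let Some(raw) = env {
        return raw
            .trim()
            .parse::<T>()
            .map_err(|_| format!("invalid environment value: {:?}", raw));
    }
    // Otherwise fall back to the merged config value, then the built-in default.
    Ok(config.unwrap_or(default))
}
```

For example, with `RUSTQC_THREADS=8` exported and no `--threads` flag, `resolve(None, Some("8"), Some(2usize), 1)` selects 8; adding `--threads 4` on the command line would select 4 instead.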
+### Config file path
+
+| Variable | Description |
+|----------|-------------|
+| `RUSTQC_CONFIG` | Path to a YAML config file. Merged between XDG discovery and the `-c` flag. |
+
+### Input / Output
+
+| Variable | CLI flag | Description |
+|----------|----------|-------------|
+| `RUSTQC_GTF` | `--gtf` | GTF gene annotation file |
+| `RUSTQC_REFERENCE` | `--reference` | Reference FASTA (for CRAM) |
+| `RUSTQC_OUTDIR` | `--outdir` | Output directory |
+| `RUSTQC_SAMPLE_NAME` | `--sample-name` | Override sample name |
+| `RUSTQC_FLAT_OUTPUT` | `--flat-output` | Flat output directory (`true`/`false`) |
+| `RUSTQC_JSON_SUMMARY` | `--json-summary` | JSON summary output path |
+
+### Library
+
+| Variable | CLI flag | Description |
+|----------|----------|-------------|
+| `RUSTQC_STRANDED` | `--stranded` | `unstranded`, `forward`, or `reverse` |
+| `RUSTQC_PAIRED` | `--paired` | Paired-end mode (`true`/`false`) |
+
+### General
+
+| Variable | CLI flag | Description |
+|----------|----------|-------------|
+| `RUSTQC_THREADS` | `--threads` | Number of threads |
+| `RUSTQC_MAPQ` | `--mapq` | MAPQ quality cutoff |
+| `RUSTQC_BIOTYPE_ATTRIBUTE` | `--biotype-attribute` | GTF biotype attribute name |
+| `RUSTQC_SKIP_DUP_CHECK` | `--skip-dup-check` | Skip duplicate-marking check |
+| `RUSTQC_QUIET` | `--quiet` | Suppress output |
+| `RUSTQC_VERBOSE` | `--verbose` | Show additional detail |
+
+### Tool parameters
+
+| Variable | CLI flag |
+|----------|----------|
+| `RUSTQC_INFER_EXPERIMENT_SAMPLE_SIZE` | `--infer-experiment-sample-size` |
+| `RUSTQC_MIN_INTRON` | `--min-intron` |
+| `RUSTQC_JUNCTION_SATURATION_SEED` | `--junction-saturation-seed` |
+| `RUSTQC_JUNCTION_SATURATION_MIN_COVERAGE` | `--junction-saturation-min-coverage` |
+| `RUSTQC_JUNCTION_SATURATION_PERCENTILE_FLOOR` | `--junction-saturation-percentile-floor` |
+| `RUSTQC_JUNCTION_SATURATION_PERCENTILE_CEILING` | `--junction-saturation-percentile-ceiling` |
+| `RUSTQC_JUNCTION_SATURATION_PERCENTILE_STEP` | `--junction-saturation-percentile-step` |
+| `RUSTQC_INNER_DISTANCE_SAMPLE_SIZE` | `--inner-distance-sample-size` |
+| `RUSTQC_INNER_DISTANCE_LOWER_BOUND` | `--inner-distance-lower-bound` |
+| `RUSTQC_INNER_DISTANCE_UPPER_BOUND` | `--inner-distance-upper-bound` |
+| `RUSTQC_INNER_DISTANCE_STEP` | `--inner-distance-step` |
+| `RUSTQC_TIN_SEED` | `--tin-seed` |
+| `RUSTQC_SKIP_TIN` | `--skip-tin` |
+| `RUSTQC_SKIP_READ_DUPLICATION` | `--skip-read-duplication` |
+| `RUSTQC_SKIP_PRESEQ` | `--skip-preseq` |
+| `RUSTQC_PRESEQ_SEED` | `--preseq-seed` |
+| `RUSTQC_PRESEQ_MAX_EXTRAP` | `--preseq-max-extrap` |
+| `RUSTQC_PRESEQ_STEP_SIZE` | `--preseq-step-size` |
+| `RUSTQC_PRESEQ_N_BOOTSTRAPS` | `--preseq-n-bootstraps` |
+| `RUSTQC_PRESEQ_SEG_LEN` | `--preseq-seg-len` |
+
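Every row in the tables above follows the same naming pattern: drop the leading dashes, swap the remaining dashes for underscores, upper-case the result, and prepend `RUSTQC_`. The helper below is a hypothetical illustration of that mapping for scripting or documentation checks. In the tool itself the names come from clap's `env` attribute, so `rustqc rna --help` remains the authoritative source.

```rust
/// Derive the environment variable name that matches a long CLI flag,
/// following the pattern seen in the tables (an observed convention,
/// not an API provided by RustQC).
pub fn env_var_for_flag(flag: &str) -> String {
    let name = flag
        .trim_start_matches('-') // "--min-intron" -> "min-intron"
        .replace('-', "_")       // "min-intron"   -> "min_intron"
        .to_uppercase();         // "min_intron"   -> "MIN_INTRON"
    format!("RUSTQC_{}", name)
}
```

For instance, `env_var_for_flag("--preseq-n-bootstraps")` gives `RUSTQC_PRESEQ_N_BOOTSTRAPS`, matching the last table.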
 ## Full example

 ```yaml
