ADR: Module Parameters and Tool Arguments #7260

Open
pditommaso wants to merge 2 commits into master from adr-module-params-tool-args
@pditommaso

Member

Summary

Consolidates two separate ADR drafts — #6769 (module parameters) and #6770 (module tool arguments) — into a single ADR, on the premise that a tool flag is just a parameter whose string form is -K <value>.

Highlights:

  • One params {} model for both module parameters and tool arguments — no separate tools.*.args namespace.
  • Replaces opaque, scattered task.ext.args strings with first-class, documented, schema-backed parameters accessible across CLI, config, and APIs.
  • Override binding resolved by the included process name (config / CLI / params-file), so the same module included twice configures independently.
  • Presents and compares two candidate grouping designs:
    • Option A: a group attribute + a single ToolArg type (nested blocks in params {}).
    • Option B: record-typed params (ToolOpt / ToolArgs), flat params {} (author's recommendation).
  • Programmatic access (.name / .value), tool-option serialization rules, validation/errors, and migration from ext.args.

Supersedes and replaces #6769 and #6770, which are being closed in favor of this unified ADR.

🤖 Generated with Claude Code

Consolidates the module-parameters (#6769) and module-tool-arguments (#6770) drafts into a single specification, on the premise that a tool flag is just a parameter whose string form is `-K <value>`.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: Paolo Di Tommaso <paolo.ditommaso@gmail.com>
@netlify

netlify Bot commented Jun 25, 2026


Deploy Preview for nextflow-docs ready!

Name Link
🔨 Latest commit 46b381c
🔍 Latest deploy log https://app.netlify.com/projects/nextflow-docs/deploys/6a43cd946a6c9b00075e7e07
😎 Deploy Preview https://deploy-preview-7260--nextflow-docs.netlify.app

This was referenced Jun 25, 2026

Module parameters are declared in a module-level `params {}` block, reusing the workflow-params syntax (`name: Type` / `name: Type = default`). The two options below differ only in how grouped/tool params are declared *inside* this block.

> The original *Module Parameters* draft (PR #6769) weighed a **process-level** `params:` block against this **module-level** `params {}` block, and the module-level block is adopted here for continuity with workflow params. Note that the "Option A / Option B" labels in *this* ADR denote a **different axis** — how grouping is expressed — than the A/B options in PR #6769 (which were about host placement).

Member


This ADR doesn't really explain why the process params: option was abandoned. That question was never resolved, and if anything, it seems like people favored the process params: option

Using the params block creates an ambiguity when there is a process and an entry workflow, or when there are multiple processes

Member Author


That's a fair point — I should spell out the rationale in the ADR.

The idea is to keep a single, consistent way to define parameters rather than spread the concept across pipeline, process and sub-workflow levels. Underlying that, it's also about what a param is: an immutable value resolved once and overridable at launch (CLI + config) — essentially the counterpart to a process/workflow input. A process-level params: block tends to blur that boundary, whereas anything that needs to vary per task is really an input.

On the ambiguity you mention (process + entry workflow, or multiple processes): agreed it needs care. The override path is scoped to the included process name (-process.<NAME>.params.…, withName:), so the same module included twice can be configured independently — I'll make that link clearer in the text.
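As a rough illustration of the scoping described above, the sketch below resolves a module's declared defaults against overrides keyed by the included process name. It is only a model of the idea, not Nextflow code: `resolve_params`, the include names `ALIGN_TUMOR` / `ALIGN_NORMAL`, and the parameter values are all hypothetical.

```python
from typing import Any, Dict


def resolve_params(
    defaults: Dict[str, Any],
    overrides: Dict[str, Dict[str, Any]],
    included_name: str,
) -> Dict[str, Any]:
    """Merge module defaults with the overrides registered for one include.

    `overrides` is keyed by the included process name, so two includes of
    the same module never see each other's settings. (Hypothetical helper.)
    """
    resolved = dict(defaults)                        # copy; defaults stay untouched
    resolved.update(overrides.get(included_name, {}))
    return resolved


# One module, included twice under different (hypothetical) names.
module_defaults = {"K": 100000000, "output_fmt": "bam"}
launch_overrides = {"ALIGN_TUMOR": {"output_fmt": "cram"}}

tumor = resolve_params(module_defaults, launch_overrides, "ALIGN_TUMOR")
normal = resolve_params(module_defaults, launch_overrides, "ALIGN_NORMAL")
```

Here `tumor` picks up the override while `normal` keeps the module defaults, which is the independence the included-name binding is meant to give.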

@bentsherman

Member

I saw module params and tool arguments as two alternative approaches... I think trying to combine them has made the ADR less coherent

Might be better to throw out the references to previous ADRs and just frame this ADR around the problem you are trying to solve, whether that be replacing ext.args or something else

The main challenge with ext.args is that it often depends on the task inputs. So any approach that is not dynamic per-task is not a feasible replacement for ext.args. It may have some other use case, but it doesn't cover ext.args

Some ideas we discussed on the previous ADRs:

  • a tools map namespaced by command/subcommand (e.g. tools.samtools_view.*)

  • for each tool, storing both an args list and kwargs map to support arbitrary CLI options

  • declaring args / prefix / suffix in the input record so that they are both typed and dynamic per-task (likely adds operator boilerplate to workflow logic)

Comment thread adr/20260624-module-parameters.md Outdated
Comment on lines +148 to +155
| Value | Renders as |
|-------|------------|
| `K = 100000000` | `-K 100000000` |
| `Y = true` | `-Y` |
| `output_fmt = "cram"` | `--output_fmt cram` |
| unset / null / false | `` (omitted) |

**Prefix inference** (overridable via `<Type>:prefix`): a single-character name uses `-`; a name of two or more characters uses `--`. **Boolean** `true` emits only `prefix+name`. An `enum` constrains the allowed values; a `default` supplies the value when unset. Components are accessible via `.name` and `.value`.
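The rendering rules in the table and the prefix-inference paragraph can be modelled with a few lines of Python. This is a minimal sketch under stated assumptions, not the ADR's implementation: `infer_prefix` and `render_option` are hypothetical names, the optional `prefix` argument stands in for the `<Type>:prefix` override, and enum and default handling are left out.

```python
from typing import Optional, Union

Value = Union[bool, int, float, str, None]


def infer_prefix(name: str) -> str:
    """Single-character names get '-', longer names get '--'."""
    return "-" if len(name) == 1 else "--"


def render_option(name: str, value: Value, prefix: Optional[str] = None) -> str:
    """Render one tool option as it would appear on the command line.

    Hypothetical helper: `prefix` models an explicit override of the
    inferred prefix. Unset, null, and false values are omitted.
    """
    if value is None or value is False:
        return ""                                    # omitted entirely
    flag = (prefix if prefix is not None else infer_prefix(name)) + name
    if value is True:
        return flag                                  # boolean: prefix+name only
    return f"{flag} {value}"                         # space-separated value


rendered = [
    render_option("K", 100000000),
    render_option("Y", True),
    render_option("output_fmt", "cram"),
    render_option("Y", False),
]
```

The identity checks (`is None`, `is False`, `is True`) are deliberate: a numeric value of `0` is still rendered, whereas a plain truthiness test would drop it.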

Contributor


It looks like the serialization imposes a space between the prefix+name and the value. That rules out options like -O2, -j4 or --name=value. I think we should add a kind of <Type>:separator to cover this case.

Member Author


Good point, thanks — the space-only form is too rigid and can't express -O2, -j4 or --name=value.

I've added a <Type>:separator attribute (ToolArg:separator / ToolOpt:separator) next to the existing :prefix, defaulting to a single space. Set it to '' for glued forms (-O2, -j4) or '=' for --name=value; booleans ignore it. Updated the serialization section to reflect this.
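A small Python model of the separator behaviour described in this reply, assuming the flag (prefix+name) has already been built. `join_option` is a hypothetical name used only for illustration; the single-space default and the rule that booleans ignore the separator follow the description above.

```python
def join_option(flag: str, value: object, separator: str = " ") -> str:
    """Join an already-prefixed flag and its value with `separator`.

    Hypothetical helper: '' gives glued forms, '=' gives --name=value,
    and boolean true ignores the separator and emits the flag alone.
    """
    if value is True:
        return flag
    return f"{flag}{separator}{value}"


examples = [
    join_option("-O", 2, separator=""),              # glued
    join_option("-j", 4, separator=""),              # glued
    join_option("--name", "value", separator="="),   # '='-joined
    join_option("-K", 100000000),                    # default: single space
    join_option("-Y", True, separator="="),          # boolean ignores separator
]
```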

Document a `<Type>:separator` attribute (ToolArg:separator / ToolOpt:separator) so tool options can render glued (-O2, -j4) or =-joined (--name=value) forms, not only space-separated. Addresses review feedback from @jorgee.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: Paolo Di Tommaso <paolo.ditommaso@gmail.com>
@pditommaso

Member Author

@bentsherman I think the main thing driving this is how unwieldy ext.args has become: the values are scattered across many config files, hidden, and neither documented nor validated. The aim is to normalise them as regular params — scoped to the target module/process, documented, typed and schema-backed — and to avoid overloading the syntax with yet another layer dedicated only to tool params.

On the dynamic-per-task point: that's intentionally out of scope. As noted on the other thread, a param is an immutable launch-time value, so anything that has to vary per task belongs in a process input rather than a param. This isn't meant as a drop-in for those per-task uses of ext.args.

It's admittedly a bit of a painful change, but I think it's worth it to bring module settings into the open as well-documented, actionable parameters.

@bentsherman

Member

> On the dynamic-per-task point: that's intentionally out of scope. As noted on the other thread, a param is an immutable launch-time value, so anything that has to vary per task belongs in a process input rather than a param. This isn't meant as a drop-in for those per-task uses of ext.args.

Thanks for the clarification. It is tricky though -- as a module author, I don't know how other users will use my module in their pipeline. They might provide a static setting or a per-sample setting, depending on what they're doing. So if my options as the module author are (1) a process input (which can be dynamic if needed) or (2) a module param (which can't), I will always choose a process input to be safe.

In other words, the per-task use case comes from users of a module, not the module itself

It might be useful to take a mid-sized nf-core pipeline like methylseq -- maybe even isolate just a subworkflow -- and have an agent reason about how to replace the ext settings with a type-safe approach

@bentsherman

Member

From our discussion today, there are two core problems:

  1. Documenting available CLI options for a tool so that an agent knows what can be used
  2. Providing a way to supply tool args via nextflow module run instead of ext.args in a companion config file

Related requirements:

  • Users don't want to be forced to document/maintain every single tool option
  • Ideal interface should support both direct usage via nextflow module run and pipeline composition (dynamic per-task tool args from a channel)
