feat: add --output-dir, --skip-plots, and summary.tsv for pipeline integration #31
Merged
…ine use
Add three features for batch/pipeline integration (e.g., Nextflow):
1. --output-dir: Explicit output directory that overrides the auto-generated path (folder-outputs/run_name/params). When set, all outputs go to the specified directory with deterministic paths. The auto-generated path remains the default for interactive use.
2. --skip-plots: Skip generating heatmap and logo SVG plots in the consensus step. Plotting relies on matplotlib/seaborn/logomaker, which is slow and unnecessary for headless pipeline execution. Available on both `instanexus` (main CLI) and `python -m instanexus.consensus` (module CLI).
3. summary.tsv: A stable summary file written to the experiment output directory at the end of the pipeline. Contains run metadata, paths, and consensus statistics in a single TSV for easy downstream parsing.
Both existing tests pass.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The V2 handler clean_winnow_rescored() already checks multiple column name candidates, but main() only checked 'preds' and 'prediction_untokenised'. Winnow outputs use 'prediction', which was missed. Unified the column detection to match the V2 candidates list.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
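The unified column detection described in this commit could look roughly like the sketch below. It is not the code from the PR: the helper name `find_prediction_column` is hypothetical, and the candidate order is an assumption. Only the three column names come from the commit message.

```python
# Candidate column names mentioned in the commit message.
# The order in which they are tried is an assumption for illustration.
PREDICTION_COLUMNS = ("preds", "prediction_untokenised", "prediction")


def find_prediction_column(columns, candidates=PREDICTION_COLUMNS):
    """Return the first candidate name found in `columns`.

    `columns` is any iterable of column names, e.g. list(df.columns)
    for a pandas DataFrame. Hypothetical helper, not part of InstaNexus.
    """
    available = set(columns)
    for name in candidates:
        if name in available:
            return name
    raise KeyError(
        f"None of the expected prediction columns {candidates} "
        f"found in {sorted(available)}"
    )
```

Sharing one candidates tuple between main() and the V2 handler would keep the two code paths from drifting apart again, which is the failure mode this commit fixes.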
marcoreverenna
approved these changes
Apr 12, 2026
marcoreverenna
left a comment
Collaborator
Everything looks good! Thanks for contributing
Summary
Add three features to make InstaNexus suitable for automated pipeline execution (e.g., Nextflow/INFlow):
1. `--output-dir`: Deterministic output paths
When set, all outputs go to the specified directory instead of the auto-generated `folder-outputs/run_name/assembly_mode_params/` path. The auto-generated path remains the default for interactive use.
2. `--skip-plots`: Headless execution
Skip generating heatmap (seaborn) and logo (logomaker) SVG plots in the consensus step. These are slow and unnecessary for pipeline execution. Available on both the main `instanexus` CLI and the `python -m instanexus.consensus` module CLI.
3. `summary.tsv`: Stable summary file
Written to the experiment output directory at the end of the pipeline. Contains run metadata, output paths, and consensus statistics in a single TSV for easy downstream parsing by pipeline orchestrators.
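A minimal sketch of how the three features fit together, for readers who have not opened the diff. It is an illustration under stated assumptions, not the InstaNexus implementation: the function names (`build_parser`, `resolve_output_dir`, `write_summary`), the `params_tag` argument, and the single-row layout of the TSV are all hypothetical. Only the flag names, the `folder-outputs/run_name/...` default layout, and the `summary.tsv` file name come from the description above.

```python
import argparse
import csv
from pathlib import Path


def build_parser():
    """Parser exposing the two new flags (all other options omitted)."""
    parser = argparse.ArgumentParser(prog="instanexus-sketch")
    parser.add_argument(
        "--output-dir",
        type=Path,
        default=None,
        help="Explicit output directory; overrides the auto-generated path.",
    )
    parser.add_argument(
        "--skip-plots",
        action="store_true",
        help="Skip heatmap and logo SVG plots in the consensus step.",
    )
    return parser


def resolve_output_dir(args, run_name, params_tag):
    """Use --output-dir if given, else the auto-generated default path.

    `params_tag` stands in for the parameter-derived folder name;
    how that name is built is an assumption here.
    """
    if args.output_dir is not None:
        return args.output_dir
    return Path("folder-outputs") / run_name / params_tag


def write_summary(output_dir, fields):
    """Write a one-row summary.tsv: header line plus one value line.

    The real column set is not shown in this PR description, so
    `fields` is whatever mapping the caller supplies.
    """
    output_dir.mkdir(parents=True, exist_ok=True)
    summary_path = output_dir / "summary.tsv"
    with summary_path.open("w", newline="") as handle:
        writer = csv.DictWriter(
            handle, fieldnames=list(fields), delimiter="\t"
        )
        writer.writeheader()
        writer.writerow(fields)
    return summary_path
```

In a pipeline process, the caller would parse the arguments, call `resolve_output_dir`, run the consensus step with plotting guarded by `if not args.skip_plots:`, and finish with `write_summary`, so that the orchestrator always finds `summary.tsv` at a path it chose itself.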
Motivation
The INFlow pipeline (nf-core/denovoproteomics) wraps InstaNexus as separate Nextflow processes. It needs:
Test plan
`instanexus --help` shows new flags
🤖 Generated with Claude Code