Wrapping rhea.py into a python package #9

Open: ppreshant wants to merge 10 commits into main from python_package_i8
ppreshant (Member) commented May 8, 2026

This pull request refactors the rhea.py codebase to improve usability and to support pip git+ based installation from inside other packages. Other changes include:

  • removing reliance on global variables, which are incompatible with wrapping the script inside a main() function, and
  • updating the documentation and environment setup to reflect these improvements.

fixes #8

Testing notes

  • Prashant tested by running rhea on the t0.fasta and t1.fasta examples. (8/May/26)

Details of improvements

Packaging and CLI improvements:

  • Added a pyproject.toml file to define rhea as a Python package, including metadata, dependencies, and a CLI entry point (rhea command now runs the tool).
  • Introduced a rhea/__init__.py with a main() function to serve as the CLI entry point, enabling pip install -e . usage and direct invocation via the rhea command.
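
A console-script entry point of this kind is typically a function that parses arguments and returns an exit status. The following is a minimal sketch under stated assumptions, not the code in this PR: the `--version` flag is suggested by the `rhea --version` check mentioned later in this thread, while `__version__`, the `fastas` positional argument, and the printed message are hypothetical placeholders.

```python
import argparse

__version__ = "0.0.0"  # placeholder; the real version is not stated in this PR


def build_parser():
    """Build the command-line parser (argument names are hypothetical)."""
    parser = argparse.ArgumentParser(prog="rhea")
    parser.add_argument(
        "--version", action="version", version=f"%(prog)s {__version__}"
    )
    parser.add_argument("fastas", nargs="+", help="input FASTA files")
    return parser


def main(argv=None):
    """Entry point; with argv=None argparse reads sys.argv[1:]."""
    args = build_parser().parse_args(argv)
    print(f"would process {len(args.fastas)} file(s)")
    return 0  # exit status handed back to the generated console script
```

With an entry such as `rhea = "rhea:main"` under `[project.scripts]` in pyproject.toml, pip generates a `rhea` executable that calls this function. Accepting `argv` as a parameter also lets tests call `main([...])` directly without touching `sys.argv`.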

Refactoring for maintainability and testability:

  • Refactored rhea.py to remove global variables (e.g., NODE_LENGTH_DICT, SEQ_BP_DICT, COVERAGE_DICTS), passing necessary state explicitly between functions. This improves modularity and makes the code easier to test and extend [1] [2] [3] [4] [5] [6].
  • Updated multiple function signatures and internal logic to accept explicit arguments for coverage and sequence dictionaries, and to return these as needed [1] [2] [3] [4] [5].
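
The shape of this refactor can be sketched as follows. The function names, the tuple-based input, and the normalization are hypothetical and only illustrate the pattern of passing dictionaries in and returning them, rather than reading and writing module-level names such as NODE_LENGTH_DICT.

```python
def build_node_lengths(segments):
    """Return a {node_id: length} dict instead of filling a global dict."""
    return {node_id: len(sequence) for node_id, sequence in segments}


def normalize_coverage(coverage, node_lengths):
    """Return per-base coverage; both dicts are arguments and are not mutated."""
    return {
        node_id: count / node_lengths[node_id]
        for node_id, count in coverage.items()
        if node_lengths.get(node_id)  # skip unknown or zero-length nodes
    }


# The caller owns the state and threads it through explicitly.
segments = [("s1", "ACGT"), ("s2", "ACGTACGT")]
node_lengths = build_node_lengths(segments)
normalized = normalize_coverage({"s1": 8, "s2": 4}, node_lengths)
print(normalized)  # {'s1': 2.0, 's2': 0.5}
```

Because each function depends only on its arguments, it can be called from a main() wrapper or from a unit test without any module-level setup.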

Documentation and environment updates:

  • Updated README.md to clarify installation and usage, including instructions for package installation, running via the CLI, and environment setup [1] [2].
  • Added pip to the environment.yml dependencies to support package installation.

Detailed changes:

1. Packaging and CLI improvements

  • Added pyproject.toml to enable rhea as a pip-installable package with a CLI entry point (rhea).
  • Created rhea/__init__.py with a main() function that loads and runs the main script, supporting both CLI and package usage.
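
One way a wrapper can load and run a standalone script is through `importlib`. This is a sketch of that general technique, assuming the script exposes its own `main()`; the helper name `load_script`, the module name `rhea_script`, and the default path are hypothetical, and the wrapper in this PR may be written differently.

```python
import importlib.util


def load_script(path, module_name="rhea_script"):
    """Import a standalone .py file as a module, even if it is not on sys.path."""
    spec = importlib.util.spec_from_file_location(module_name, str(path))
    if spec is None or spec.loader is None:
        raise ImportError(f"cannot load {path}")
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)  # runs the script's top-level code
    return module


def main(script_path="rhea.py"):
    """Load the script and delegate to its main() function."""
    return load_script(script_path).main()
```

Later commits in this PR move the remaining functions into `rhea/core.py`, after which an ordinary package import would presumably make this kind of dynamic loading unnecessary.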

2. Refactoring for maintainability

  • Replaced global mutable state (coverage and node dictionaries) with explicit function arguments and return values throughout rhea.py, improving code clarity and testability [1] [2] [3] [4] [5] [6].
  • Updated function signatures and logic to propagate these changes, including coverage calculation, normalization, and SV detection [1] [2] [3] [4] [5].

3. Documentation and environment updates

  • Improved README.md with clearer setup and usage instructions, reflecting the new CLI and installation process [1] [2].
  • Added pip to environment.yml to support editable installs (pip install -e .).

ppreshant added 4 commits May 8, 2026 16:39
- Goal: for install with `pip git+`
- using the `pyproject.toml` approach with an entry script (`__init__.py`) inside a `rhea/` dir that counts as the `module`
… the conda dependency list

- Note: `flye` and `minigraph` are the two dependencies not available in pip; hence we still need `conda`
- Pass them as arguments to all functions.
- Variables are now scoped to where they are used.
- Modified variables include:
  - COVERAGE_DICTS
  - COVERAGE_EDGES_DICTS
  - NODE_LENGTH_DICT
  - SEQ_BP_DICT

Copilot AI left a comment

Pull request overview

This pull request refactors the existing rhea.py CLI tool to be installable as a Python package (supporting pip install -e . / pip install git+...) and reduces reliance on global mutable state by passing coverage/state explicitly through functions.

Changes:

  • Added Python packaging metadata (pyproject.toml) and a rhea console-script entry point.
  • Introduced a rhea/__init__.py wrapper main() intended to run the CLI via the rhea command.
  • Refactored rhea.py coverage/SV detection flow to pass coverage dictionaries explicitly instead of using globals; updated README/env setup accordingly.

Reviewed changes

Copilot reviewed 5 out of 6 changed files in this pull request and generated 6 comments.

| File | Description |
| --- | --- |
| `rhea/__init__.py` | Adds CLI entry point wrapper that dynamically loads rhea.py. |
| `pyproject.toml` | Defines the project and installs a rhea script entry point. |
| `rhea.py` | Refactors coverage/state handling to reduce global variables; adds main() function. |
| `README.md` | Updates usage/install instructions to prefer rhea ... CLI invocation. |
| `environment.yml` | Adds pip to support editable installs. |
| `.gitignore` | Adds ignores for build artifacts and standardizes results directory name. |

Comment thread rhea/__init__.py Outdated
Comment thread rhea/__init__.py Outdated
Comment thread pyproject.toml
Comment thread README.md
Comment thread rhea/core.py Outdated
Comment thread rhea/core.py Outdated
ppreshant and others added 3 commits May 8, 2026 18:57
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
- rest of the functions moved to `rhea/core.py`
- all code should be within the internal `rhea/` dir for the python project
- updated pyproject.toml with simple `packages = ["rhea"]` to define package contents explicitly
- expanded the return list in `create_coverage_df`
@ppreshant (Member, Author) commented

@copilot do a final review

Agent-Logs-Url: https://github.com/treangenlab/rhea/sessions/c9b84e03-0627-4380-93be-df25f005d03d

Co-authored-by: ppreshant <14856479+ppreshant@users.noreply.github.com>

Copilot AI commented May 9, 2026

> @copilot do a final review

Final review complete. I fixed one packaging issue so pip installs now pull required Python deps and enforce Python >=3.8 in pyproject.toml (commit a36c01f). I also revalidated with pip install -e ., rhea --version, pip check, and ran parallel validation (Code Review + CodeQL).

ppreshant added 2 commits May 8, 2026 21:20
- fix #10
- Updated the channel order in yml to install latest packages including `pandas=3.0.2` which causes this error
Fix pandas' deprecated `applymap` -> `map`
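
If the code ever needs to run on both older and newer pandas releases, a small shim can pick whichever method exists. This is a hedged sketch rather than the change made in this commit: `apply_elementwise` is a hypothetical helper, and it relies only on `DataFrame.map` having been introduced in pandas 2.1 as the replacement for `DataFrame.applymap`.

```python
def apply_elementwise(frame, func):
    """Apply func to every cell, preferring .map and falling back to .applymap."""
    method = getattr(frame, "map", None)
    if method is None:
        # Older pandas releases only provide DataFrame.applymap.
        method = frame.applymap
    return method(func)


# Usage with pandas installed (not executed here):
#   import pandas as pd
#   df = pd.DataFrame({"a": [1.234, 5.678]})
#   rounded = apply_elementwise(df, lambda x: round(x, 1))
```

Since the environment pins a recent pandas, calling `.map` directly, as the commit does, is the simpler choice; the shim only matters when supporting a range of pandas versions.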
Development

Successfully merging this pull request may close these issues.

Need to wrap rhea.py into a python package for installation with pip git+

3 participants