Parallelization - 144 break uprecombine larger genomes by keshav-gandhi · Pull Request #154 · ncsa/NEAT

keshav-gandhi · 2025-08-29T04:22:56Z

Solved parallelization issues!

You can run this command if additional testing is wanted:
neat parallel -c config_template/keshav_config.yml

More information is in README.md.

joshfactorial

Okay, one general comment: it's coming together. Let's try a slight change to the structure. Instead of putting it under read_simulator/utils, this could be its own submodule, as we have done with model_fragment_lengths and gen_mut_model. It's already working now as a primary submodule, which is what we want. But in terms of structure and maintainability, let's move it out and group the scripts together:

├── cli
├── common
├── gen_mut_model
├── __init__.py
├── __main__.py
├── model_fragment_lengths
├── models
├── model_sequencing_error
├── __pycache__
├── read_simulator
├── parallel_read_simulator
└── variants

I think explicitly calling it parallel_read_simulator would be clearer as well. In the parallel_read_simulator folder, you'd first add __init__.py, then parallelize.py, split_inputs.py, and splice_inputs.py. Then check the other __init__.py commands, but basically it's just an import, which signals to poetry to add that feature to the application. The cli/commands/parallel.py is fine where it is. That way, when we come back to this in six months, it will be clearer which parts are specific to that feature.

joshfactorial

One more general comment. It would be great if we could add some unit tests to these new functions.

There's an existing tests folder that you can add to, or doctests are fine too. A couple of tests for parallelize, split_inputs, and stitch_outputs are the most important.
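As a rough sketch of the kind of round-trip test meant here: the two helpers below, split_fasta_by_contig and stitch_chunks, are hypothetical stand-ins written only for this example, and the real split_inputs and stitch_outputs functions in the PR may well have different names, signatures, and file formats. The idea is just that splitting an input and stitching the pieces back together should reproduce the original.

```python
from pathlib import Path
import tempfile


def split_fasta_by_contig(fasta_path: Path, out_dir: Path) -> list[Path]:
    """Hypothetical stand-in: write each contig of a FASTA file to its own file."""
    out_dir.mkdir(parents=True, exist_ok=True)
    chunks: list[Path] = []
    handle = None
    with fasta_path.open() as src:
        for line in src:
            if line.startswith(">"):
                # A header line starts a new contig, so start a new chunk file.
                if handle is not None:
                    handle.close()
                chunk = out_dir / f"chunk_{len(chunks):04d}.fa"
                chunks.append(chunk)
                handle = chunk.open("w")
            if handle is not None:
                handle.write(line)
    if handle is not None:
        handle.close()
    return chunks


def stitch_chunks(chunks: list[Path], out_path: Path) -> None:
    """Hypothetical stand-in: concatenate chunk files in the order given."""
    with out_path.open("w") as dst:
        for chunk in chunks:
            dst.write(chunk.read_text())


def test_split_then_stitch_round_trip() -> None:
    with tempfile.TemporaryDirectory() as tmp:
        tmp_dir = Path(tmp)
        reference = tmp_dir / "ref.fa"
        reference.write_text(">chr1\nACGT\nTTGA\n>chr2\nGGCC\n")

        # Splitting should yield one file per contig.
        chunks = split_fasta_by_contig(reference, tmp_dir / "chunks")
        assert len(chunks) == 2
        assert chunks[0].read_text() == ">chr1\nACGT\nTTGA\n"

        # Stitching the chunks back together should reproduce the input.
        stitched = tmp_dir / "stitched.fa"
        stitch_chunks(chunks, stitched)
        assert stitched.read_text() == reference.read_text()
```

A function named this way would be collected automatically if the tests folder is run with a runner such as pytest; the same assertions could also be written as doctests on the helpers themselves.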

joshfactorial · 2025-09-04T01:51:38Z

Do a git fetch and then git pull origin main; you may need to add the --rebase flag to finalize changes. There were a couple of edits to the README, and pulling them in first will ensure that your README commits don't get lost in the merge.

joshfactorial

Looking good!

keshav-gandhi added 10 commits July 9, 2025 22:36

Stitching output script implementation.

1b7935b

Splitting functions.

eea1c0f

Parallelization runner.

eebfacb

Removed Bio dependency.

7a69785

Added option to remove temporary files.

f9931f0

Option to not split if already done.

b1fa382

Double-checking exactness between constant-seed runs.

e0e79a6

Command neat parallel implementation before current edits.

3904c88

Partial progress on config errors.

49eca2e

Final touches on parallelization.

d9f77a6

keshav-gandhi linked an issue Aug 29, 2025 that may be closed by this pull request

Break up/recombine larger genomes #144

Closed

keshav-gandhi requested a review from joshfactorial August 29, 2025 04:23

joshfactorial reviewed Sep 4, 2025

keshav-gandhi and others added 2 commits September 9, 2025 01:19

Pull request comments.

834a416

Merge branch 'main' into 144-break-uprecombine-larger-genomes

cd76ec3

joshfactorial approved these changes Sep 9, 2025

joshfactorial merged commit ca2fc5e into main Sep 9, 2025
1 check passed

joshfactorial deleted the 144-break-uprecombine-larger-genomes branch September 9, 2025 17:06

joshfactorial merged 12 commits into main from 144-break-uprecombine-larger-genomes