Split population generation from running the model. #134

Open: plietar wants to merge 1 commit into mrc-ide:main from plietar:refactor-init

plietar (Member) commented Jul 28, 2025

Generating the population data (e.g. assigning individuals to locations) can take a significant amount of time. A small refactor of the interface splits the generation step into its own function, so its result can be reused across runs rather than regenerated each time.

This is especially useful during development or benchmarking of the main model. For producing scientifically useful results it may be preferable to keep generating population data each time.

By default, `run_simulation` generates a new population each time it is called, but it can be passed a `population_data` argument to avoid this. The `population_data` object is obtained by calling `generate_population_data`.

The population data object contains no objects from the `individual` package: it has normal value semantics and can be serialized to a file.
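As a rough illustration of the workflow described above, here is a minimal sketch. The two functions defined in it are simplified stand-ins written only so the example runs on its own; their arguments (`n_individuals`, `n_locations`) and bodies are assumptions and do not reflect the actual signatures or model in this PR. Only the names `generate_population_data`, `run_simulation`, and the `population_data` argument come from the description.

```r
# Hypothetical stand-in: the real function and its arguments may differ.
# Returns a plain list, so it has ordinary value semantics.
generate_population_data <- function(n_individuals, n_locations) {
  list(
    n_individuals = n_individuals,
    # Assign each individual to a location at random.
    location = sample.int(n_locations, n_individuals, replace = TRUE)
  )
}

# Hypothetical stand-in: generates a population only when none is supplied.
run_simulation <- function(n_individuals, n_locations, population_data = NULL) {
  if (is.null(population_data)) {
    population_data <- generate_population_data(n_individuals, n_locations)
  }
  # Placeholder "model": count individuals per location.
  tabulate(population_data$location, nbins = n_locations)
}

# Generate the population once and save it to a file.
population <- generate_population_data(1000, 10)
path <- tempfile(fileext = ".rds")
saveRDS(population, path)

# Later (e.g. in a benchmark), reload it and reuse it across runs.
restored <- readRDS(path)
first <- run_simulation(1000, 10, population_data = restored)
second <- run_simulation(1000, 10, population_data = restored)
```

Because the stand-in population is a plain list, it round-trips through `saveRDS`/`readRDS` unchanged, and both runs above start from the same population.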

I've imported `strict_list` from reside.utils (https://github.com/reside-ic/reside.utils) and used it to define the parameter and variable lists. It made refactoring much easier and I think it is a good addition, though it could of course be removed.

plietar requested a review from cwhittaker1000 on July 28, 2025 17:05

plietar (Member, Author) commented Jul 28, 2025

This should make it possible to implement #133.

cwhittaker1000 (Collaborator) commented

Huge thank you for this @plietar - things are hectic with the move atm, but will get to this over the weekend or early next week. From a quick skim, looks like it will be very helpful, thank you!

cwhittaker1000 (Collaborator) commented

@claude can you review this for me and let me know what you think please?

claude (Bot) commented Feb 17, 2026

Claude Code is working…

I'll analyze this and get back to you.
