feat(run_nlme): add copy_dataset option to keep dataset in place#89
feat(run_nlme): add copy_dataset option to keep dataset in place#89roninsightrx wants to merge 2 commits into
Conversation
Add a `copy_dataset` argument to `run_nlme()` (default `FALSE`) controlling whether the input dataset is copied into the per-run folder. When `FALSE`, the dataset is left in its existing location and the model's `$DATA` record points at its absolute path; when `TRUE`, the prior behaviour (copy into the run folder as `data.csv`) is preserved. Only applies to on-disk datasets; in-memory data frames are always written into the run folder. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Pull request overview
Adds a copy_dataset option to run_nlme() / prepare_run_folder() to control whether an on-disk input CSV is copied into the per-run folder or referenced in-place via an absolute $DATA path.
Changes:
- Add
copy_datasetargument torun_nlme()(defaultFALSE) and plumb through toprepare_run_folder(). - Update
prepare_run_folder()to optionally keep the dataset in its existing location instead of copying todata.csv. - Add a test covering
prepare_run_folder(copy_dataset = TRUE/FALSE)behavior and update generated docs / version.
Reviewed changes
Copilot reviewed 4 out of 6 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
tests/testthat/test-run_nlme.R |
Adds test coverage for copy_dataset behavior in prepare_run_folder(). |
R/run_nlme.R |
Introduces copy_dataset parameter and forwards it to run-folder preparation. |
R/prepare_run_folder.R |
Implements conditional dataset copying vs. in-place absolute-path $DATA. |
man/run_nlme.Rd |
Documents new copy_dataset parameter for run_nlme(). |
man/prepare_run_folder.Rd |
Updates usage signature to include copy_dataset. |
DESCRIPTION |
Bumps package version. |
Files not reviewed (2)
- man/prepare_run_folder.Rd: Language not supported
- man/run_nlme.Rd: Language not supported
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| obj <- prepare_run_folder( | ||
| id = id, | ||
| model = model, | ||
| path = path, | ||
| data = data, | ||
| force = force, | ||
| auto_stack_encounters = auto_stack_encounters, | ||
| copy_dataset = copy_dataset, | ||
| verbose = verbose |
There was a problem hiding this comment.
Good catch — this was a real bug. run_nlme() now records whether the input was an in-memory data.frame (data_in_memory) before it is converted to a tempfile, and forces the run-folder write via copy_dataset = copy_dataset || data_in_memory. So data.frame input is always materialized into the run folder and $DATA never points at the ephemeral tempfile. Fixed in 30fefcd.
| if(!copy_dataset) { | ||
| ## Leave the dataset in its existing location and point $DATA at its | ||
| ## absolute path. The file is not modified (so no quoted-header | ||
| ## rewrite); the user is responsible for the dataset being NONMEM-ready. | ||
| if(verbose) cli::cli_process_start("Using dataset in existing location (not copying into run folder)") | ||
| dataset_path <- normalizePath(data, mustWork = TRUE) | ||
| } else { |
There was a problem hiding this comment.
Agreed. The $DATA-resolved branch now honors copy_dataset: when FALSE it points $DATA at the resolved file’s absolute path instead of unconditionally copying into fit_folder/data.csv. Note this branch only triggers when data = NULL and model$dataset is NULL; the in-memory model$dataset case still always writes into the run folder by design. Fixed in 30fefcd.
| dataset_path <- normalizePath(data, mustWork = TRUE) | ||
| } else { | ||
| if(verbose) cli::cli_process_start("Copying dataset") | ||
| file.copy(from = data, to = dataset_path) |
There was a problem hiding this comment.
Done — both file.copy() calls now check isTRUE(file.copy(...)) and abort with a clear message identifying the source and destination paths on failure. Fixed in 30fefcd.
- Force the run-folder write for in-memory data.frame input. Such input is converted to an ephemeral tempfile before prepare_run_folder(), so copy_dataset = FALSE would otherwise leave $DATA pointing at a tempfile that disappears at session end. - Honor copy_dataset when the dataset is resolved from the model's $DATA record (a genuine on-disk file), instead of always copying. - Check file.copy() return values and abort with a clear message on failure. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Add a
copy_datasetargument torun_nlme()(defaultFALSE) controlling whether the input dataset is copied into the per-run folder. WhenFALSE, the dataset is left in its existing location and the model's$DATArecord points at its absolute path; whenTRUE, the prior behaviour (copy into the run folder asdata.csv) is preserved. Only applies to on-disk datasets; in-memory data frames are always written into the run folder.