incumbent spokes: optionally write solution on each new best (#285) #709
Merged
Previously the first-stage solution was written only at end-of-run via WheelSpinner.write_first_stage_solution. For long runs, users want a snapshot every time an xhat-finder spoke discovers a new best inner bound, e.g., so downstream tooling can pick up a usable solution before the cylinder system finishes.

- Add CLI flag --incumbent-on-improvement-filename-prefix <prefix> (mpisppy.utils.config.Config.popular_args).
- Forward cfg.incumbent_on_improvement_filename_prefix through cfg_vanilla.shared_options so it lands in opt_kwargs.options.
- In _BoundSpoke.update_if_improving, after a new best is broadcast, call self.opt.write_first_stage_solution to write <prefix>_<NNNN>.csv and <prefix>_<NNNN>.npy, where <NNNN> is a per-spoke zero-padded counter starting at 0000. The writer self-gates on cylinder_rank == 0, so the helper is safe to call on every rank.
- Fail soft: if write_first_stage_solution raises (e.g., for xhatter paths like xhatlooper / xhatshuffle that pass update_best_solution_cache=False and don't populate the spbase best-solution cache), warn once on rank 0 and disable further writes from that spoke. xhatxbar / xhatspecific (which use the default update_best_solution_cache=True) are the happy path.

Fixes Pyomo#285

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
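The write-on-improvement behavior described in this commit message can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in the PR: `maybe_write_incumbent`, the attribute names on `spoke`, and the `writer` callable are hypothetical stand-ins for the spoke helper and for `self.opt.write_first_stage_solution`, whose exact signature is not shown here.

```python
import warnings
from types import SimpleNamespace


def maybe_write_incumbent(spoke, writer):
    """Write <prefix>_<NNNN>.csv/.npy for a new best; fail soft on error.

    `spoke` is any object with prefix, counter, disabled and rank attributes
    (hypothetical names). `writer` is a callable taking a file path; it stands
    in for the real first-stage solution writer.
    Returns True if a snapshot pair was written.
    """
    if spoke.prefix is None or spoke.disabled:
        return False  # feature off, or already disabled after a failure

    stem = f"{spoke.prefix}_{spoke.counter:04d}"  # 4-digit zero-padded counter
    try:
        writer(f"{stem}.csv")
        writer(f"{stem}.npy")
    except RuntimeError as exc:
        # Fail soft: disable on every rank, but only rank 0 warns.
        spoke.disabled = True
        if spoke.rank == 0:
            warnings.warn(f"per-improvement incumbent writes disabled: {exc}")
        return False

    spoke.counter += 1  # advance only after a successful write
    return True


# Usage: record the paths instead of touching the filesystem.
written = []
spoke = SimpleNamespace(prefix="incumbent", counter=0, disabled=False, rank=0)
maybe_write_incumbent(spoke, written.append)
maybe_write_incumbent(spoke, written.append)
print(written)
```

Incrementing the counter only after both writes succeed keeps the numbering gap-free, which matches the "counter NOT incremented" behavior on failure that the tests describe.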
Codecov Report

✅ All modified and coverable lines are covered by tests.

Coverage Diff (main vs. #709)

| Metric | main | #709 | +/- |
|---|---|---|---|
| Coverage | 71.28% | 71.32% | +0.03% |
| Files | 154 | 154 | |
| Lines | 19438 | 19458 | +20 |
| Hits | 13857 | 13879 | +22 |
| Misses | 5581 | 5579 | -2 |
Adds mpisppy/tests/test_incumbent_writing.py with 13 unit tests across three classes:

- TestConfigRegistration: Config.popular_args() registers the new option with default None.
- TestSharedOptionsForwarding: cfg_vanilla.shared_options forwards the prefix into the options dict on both hub and spoke surfaces.
- TestMaybeWriteIncumbent: drives InnerBoundSpoke._maybe_write_incumbent_on_improvement against a SimpleNamespace stub to cover the no-op (prefix None / already disabled), happy-path (csv + npy writes, 4-digit zero-padded counter, counter advances across calls), and fail-soft branches (RuntimeError → rank-0 warning, disabled flag set, counter NOT incremented; silent on rank != 0; subsequent calls short-circuit).

Wired into run_coverage.bash and the .github/workflows/test_pr_and_main.yml unit-tests job so codecov/patch picks up the new patch lines.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
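A stub-driven check of the fail-soft branch, in the spirit of the tests listed above, might look like the sketch below. It does not reproduce the PR's test file: `write_or_disable` is a compact, hypothetical stand-in for the method under test, and the attribute names on the `SimpleNamespace` stub are assumptions made for illustration.

```python
import warnings
from types import SimpleNamespace


def write_or_disable(spoke):
    """Hypothetical stand-in for the spoke's per-improvement write helper."""
    if spoke.prefix is None or spoke.disabled:
        return  # no-op: feature off or already disabled
    try:
        spoke.write(f"{spoke.prefix}_{spoke.counter:04d}.csv")
    except RuntimeError as exc:
        spoke.disabled = True
        if spoke.rank == 0:  # only rank 0 reports the failure
            warnings.warn(f"incumbent writes disabled: {exc}")
        return
    spoke.counter += 1


def failing_write(path):
    # Simulates a writer whose best-solution cache was never populated.
    raise RuntimeError("best-solution cache is empty")


def run_fail_soft(rank):
    """Call the helper twice on a failing stub; return the stub and warning count."""
    spoke = SimpleNamespace(prefix="inc", counter=0, disabled=False,
                            rank=rank, write=failing_write)
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # record every warning, no dedup
        write_or_disable(spoke)
        write_or_disable(spoke)  # second call should short-circuit
    return spoke, len(caught)


spoke0, n_warnings0 = run_fail_soft(rank=0)
spoke1, n_warnings1 = run_fail_soft(rank=1)
print(n_warnings0, n_warnings1, spoke0.counter, spoke1.disabled)
```

Using a plain namespace rather than a real spoke keeps the test free of MPI and solver setup, which is presumably why the PR's tests take the same approach.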
Collaborator
Is there an issue with |
Summary
Previously the first-stage solution was written only at end-of-run via WheelSpinner.write_first_stage_solution. This PR adds an opt-in CLI flag so xhat-finder spokes snapshot the first-stage solution on every new best inner bound, which is useful for long runs where downstream tooling wants the current incumbent without waiting for the cylinder system to finish.

- New flag --incumbent-on-improvement-filename-prefix <prefix> (registered in Config.popular_args). Default None (disabled).
- _BoundSpoke.update_if_improving writes <prefix>_<NNNN>.csv and <prefix>_<NNNN>.npy on each new best, where <NNNN> is a per-spoke zero-padded counter starting at 0000 so successive incumbents don't overwrite each other.
- The prefix is forwarded through cfg_vanilla.shared_options so all hub/spoke setups built through the vanilla factories pick it up automatically.
- Fail soft: if write_first_stage_solution raises (typically because the spoke uses update_best_solution_cache=False and the cache isn't populated, as with xhatlooper and xhatshuffle), warn once on rank 0 and disable further per-improvement writes from that spoke. Happy paths: xhatxbar and xhatspecific (which use update_best_solution_cache=True, so best_solution_cache is populated before the writer is called).

Open questions

- [...] (<prefix>.csv overwritten on each improvement): I went with counters so the trajectory is recoverable. Easy to change if rolling is preferred.
- [...] write_tree_solution instead of just first-stage; held off because tree writes are heavier and most users want the first-stage candidate.

Test plan

- ruff check clean on changed files
- mpisppy/tests/test_incumbent_writing.py (13 tests, all passing locally) covering:
  - Config.popular_args() registers incumbent_on_improvement_filename_prefix with default None
  - cfg_vanilla.shared_options forwards the prefix into the options dict on both hub and spoke
  - InnerBoundSpoke._maybe_write_incumbent_on_improvement no-ops when prefix is None or the per-spoke disabled flag is set
  - Happy path writes <prefix>_<NNNN>.csv then <prefix>_<NNNN>.npy (with the npy serializer), increments the counter, and zero-pads to four digits; counter advances across repeated calls
  - RuntimeError from write_first_stage_solution produces exactly one warning on rank 0, sets the disabled flag, does NOT bump the counter, and short-circuits subsequent calls; non-zero ranks stay silent but still set the disabled flag
- Wired into run_coverage.bash and the .github/workflows/test_pr_and_main.yml unit-tests job so codecov/patch picks up the new lines

🤖 Generated with Claude Code
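For the downstream-tooling use case mentioned in the summary, a consumer could pick up the most recent snapshot by scanning for the highest counter. The sketch below is an assumption about how such tooling might work, not part of the PR: `latest_snapshot` is a hypothetical helper, and it relies only on the `<prefix>_<NNNN>.<ext>` naming scheme described above.

```python
import re
import tempfile
from pathlib import Path


def latest_snapshot(directory, prefix, suffix=".csv"):
    """Return the path with the highest 4-digit counter, or None if none exist."""
    pattern = re.compile(
        rf"^{re.escape(prefix)}_(\d{{4}}){re.escape(suffix)}$"
    )
    best = None  # (counter, path) of the newest snapshot seen so far
    for path in Path(directory).iterdir():
        match = pattern.match(path.name)
        if match is None:
            continue  # unrelated file, or a different prefix/suffix
        index = int(match.group(1))
        if best is None or index > best[0]:
            best = (index, path)
    return None if best is None else best[1]


# Usage: fake three snapshots in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    for i in range(3):
        (Path(tmp) / f"incumbent_{i:04d}.csv").touch()
    newest = latest_snapshot(tmp, "incumbent")
    print(newest.name)
```

Because the counter is per-spoke, a run with several xhat-finder spokes sharing one prefix may need a distinct prefix per spoke (or some other disambiguation) for "highest counter" to mean "latest incumbent"; that caveat is an inference from the description rather than something the PR states.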