Skip to content

incumbent spokes: optionally write solution on each new best (#285)#709

Merged
DLWoodruff merged 4 commits into
Pyomo:mainfrom
DLWoodruff:incumbent-on-improvement-285
May 17, 2026
Merged

incumbent spokes: optionally write solution on each new best (#285)#709
DLWoodruff merged 4 commits into
Pyomo:mainfrom
DLWoodruff:incumbent-on-improvement-285

Conversation

@DLWoodruff
Copy link
Copy Markdown
Collaborator

@DLWoodruff DLWoodruff commented May 17, 2026

Summary

  • Fixes Incumbent finders should optionally write a new solution immediately #285 (good-first-issue). The first-stage solution was previously written only at end-of-run via WheelSpinner.write_first_stage_solution. This PR adds an opt-in CLI flag so xhat-finder spokes snapshot the first-stage solution on every new best inner bound — useful for long runs where downstream tooling wants the current incumbent without waiting for the cylinder system to finish.
  • New flag: --incumbent-on-improvement-filename-prefix <prefix> (registered in Config.popular_args). Default None (disabled).
  • When set, _BoundSpoke.update_if_improving writes <prefix>_<NNNN>.csv and <prefix>_<NNNN>.npy on each new best, where <NNNN> is a per-spoke zero-padded counter starting at 0000 so successive incumbents don't overwrite each other.
  • Forwarded via cfg_vanilla.shared_options so all hub/spoke setups built through the vanilla factories pick it up automatically.
  • Fail-soft: if write_first_stage_solution raises (typically because the spoke uses update_best_solution_cache=False and the cache isn't populated — xhatlooper, xhatshuffle), warn once on rank 0 and disable further per-improvement writes from that spoke. Happy paths: xhatxbar and xhatspecific (which use update_best_solution_cache=True, so best_solution_cache is populated before the writer is called).

Open questions

  • Counter-suffixed filenames vs. rolling overwrite (<prefix>.csv overwritten on each improvement): I went with counters so the trajectory is recoverable. Easy to change if rolling is preferred.
  • Could also write the full tree solution via write_tree_solution instead of just first-stage; held off because tree writes are heavier and most users want the first-stage candidate.

Test plan

  • ruff check clean on changed files
  • New unit tests in mpisppy/tests/test_incumbent_writing.py (13 tests, all passing locally) covering:
    • Config.popular_args() registers incumbent_on_improvement_filename_prefix with default None
    • cfg_vanilla.shared_options forwards the prefix into the options dict on both hub and spoke
    • InnerBoundSpoke._maybe_write_incumbent_on_improvement no-ops when prefix is None or the per-spoke disabled flag is set
    • Happy path writes <prefix>_<NNNN>.csv then <prefix>_<NNNN>.npy (with the npy serializer), increments the counter, and zero-pads to four digits; counter advances across repeated calls
    • Fail-soft branch: a RuntimeError from write_first_stage_solution produces exactly one warning on rank 0, sets the disabled flag, does NOT bump the counter, and short-circuits subsequent calls; non-zero ranks stay silent but still set the disabled flag
  • Test file wired into both run_coverage.bash and .github/workflows/test_pr_and_main.yml unit-tests job so codecov/patch picks up the new lines
  • CI on this PR

🤖 Generated with Claude Code

Previously the first-stage solution was written only at end-of-run via
WheelSpinner.write_first_stage_solution. For long runs, users want a
snapshot every time an xhat-finder spoke discovers a new best inner
bound -- e.g., so downstream tooling can pick up a usable solution
before the cylinder system finishes.

- Add CLI flag --incumbent-on-improvement-filename-prefix <prefix>
  (mpisppy.utils.config.Config.popular_args).
- Forward cfg.incumbent_on_improvement_filename_prefix through
  cfg_vanilla.shared_options so it lands in opt_kwargs.options.
- In _BoundSpoke.update_if_improving, after a new best is broadcast,
  call self.opt.write_first_stage_solution to <prefix>_<NNNN>.csv and
  <prefix>_<NNNN>.npy, where <NNNN> is a per-spoke zero-padded counter
  starting at 0000. The writer self-gates on cylinder_rank == 0, so the
  helper is safe to call on every rank.
- Fail soft: if write_first_stage_solution raises (e.g., for xhatter
  paths like xhatlooper / xhatshuffle that pass
  update_best_solution_cache=False and don't populate the spbase
  best-solution cache), warn once on rank 0 and disable further writes
  from that spoke. xhatxbar / xhatspecific (which use the default
  update_best_solution_cache=True) are the happy path.

Fixes Pyomo#285

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@codecov
Copy link
Copy Markdown

codecov Bot commented May 17, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 71.32%. Comparing base (bbe8fba) to head (0c637cc).

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #709      +/-   ##
==========================================
+ Coverage   71.28%   71.32%   +0.03%     
==========================================
  Files         154      154              
  Lines       19438    19458      +20     
==========================================
+ Hits        13857    13879      +22     
+ Misses       5581     5579       -2     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

DLWoodruff and others added 3 commits May 17, 2026 16:22
Adds mpisppy/tests/test_incumbent_writing.py with 13 unit tests across
three classes:

- TestConfigRegistration: Config.popular_args() registers the new option
  with default None.
- TestSharedOptionsForwarding: cfg_vanilla.shared_options forwards the
  prefix into the options dict on both hub and spoke surfaces.
- TestMaybeWriteIncumbent: drives InnerBoundSpoke._maybe_write_incumbent_on_improvement
  against a SimpleNamespace stub to cover the no-op (prefix None /
  already disabled), happy-path (csv + npy writes, 4-digit zero-padded
  counter, counter advances across calls), and fail-soft branches
  (RuntimeError → rank-0 warning, disabled flag set, counter NOT
  incremented; silent on rank != 0; subsequent calls short-circuit).

Wired into run_coverage.bash and .github/workflows/test_pr_and_main.yml
unit-tests job so codecov/patch picks up the new patch lines.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@DLWoodruff DLWoodruff merged commit b5edc38 into Pyomo:main May 17, 2026
31 checks passed
@DLWoodruff DLWoodruff deleted the incumbent-on-improvement-285 branch May 17, 2026 23:51
@bknueven
Copy link
Copy Markdown
Collaborator

Is there an issue with xhatshuffle which this PR brings to light? I'm a bit confused that all paths aren't "happy".

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Incumbent finders should optionally write a new solution immediately

2 participants