Pr/bo only #187

Open
ju-he wants to merge 40 commits into Helmholtz-AI-Energy:main from ju-he:pr/bo-only


ju-he commented Feb 13, 2026

Description

This PR adds a full Bayesian Optimization workflow to Propulate and integrates it into the public API, docs, tutorials, and tests.

Summary of changes:

  • Adds new propagator BayesianOptimizer in propulate/propagators/bayesopt.py.
  • Implements GP-based BO on CPU (SingleCPUFitter, scikit-learn backend) with robust GP fitting (_robust_lbfgs) and diagnostics.
  • Adds acquisition functions and factory support:
    • EI, PI, and UCB (UCB is implemented in minimization form, i.e. as the lower confidence bound mu - kappa * sigma)
    • create_acquisition, rank-stretch scaling across MPI ranks
    • optional annealing and mid-run acquisition switching
  • Adds mixed search-space support (float, integer, ordinal integer, categorical via one-hot + projection).
  • Adds sparse subsampling for GP training (top_m elites + diversity sampling) and safer handling of non-finite losses.
  • Adds BO initial design strategies (sobol default, lhs, random) and hyperparameter optimization scheduling.
  • Exposes BO components in propulate/propagators/__init__.py and adds LCB alias export.
  • Updates propulate/population.py:
    • normalizes numpy scalar types to canonical Python types
    • fixes categorical slice reset in Individual.__setitem__
  • Adds comprehensive BO test suite in tests/test_bayesian_optimizer.py.
  • Adds docs and usage/tutorial integration:
    • docs/bayesian_optimizer.rst
    • docs/tut_bayesopt.rst
    • references from docs/usage.rst, docs/tut_propulator.rst, docs/algos_explained.rst
    • adds docs/autoapi/index.rst
  • Adds runnable example: tutorials/bayesian_optimizer_example.py.
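
To make the acquisition bullet above concrete, here is a minimal sketch of EI, PI, and the confidence bound written for minimization. It is not taken from propulate/propagators/bayesopt.py: the function names, the `xi` and `kappa` defaults, and the convention of negating EI and PI so that lower is better for all three are assumptions made for illustration.

```python
import math


def _norm_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def _norm_cdf(z):
    """Standard normal cumulative distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def neg_expected_improvement(mu, sigma, f_best, xi=0.0):
    """Negated EI for minimization (hypothetical helper): lower is better."""
    improvement = f_best - mu - xi
    if sigma <= 0.0:
        # No predictive uncertainty: the improvement is deterministic.
        return -max(improvement, 0.0)
    z = improvement / sigma
    return -(improvement * _norm_cdf(z) + sigma * _norm_pdf(z))


def neg_probability_of_improvement(mu, sigma, f_best, xi=0.0):
    """Negated PI for minimization (hypothetical helper): lower is better."""
    improvement = f_best - mu - xi
    if sigma <= 0.0:
        # Explicit sigma == 0 branch avoids a division by zero.
        return -1.0 if improvement > 0.0 else 0.0
    return -_norm_cdf(improvement / sigma)


def lower_confidence_bound(mu, sigma, kappa=2.0):
    """UCB in minimization form: mu - kappa * sigma."""
    return mu - kappa * sigma
```

With this convention a single minimizer can be reused for every acquisition type; a larger `kappa` or `xi` shifts the search toward exploration.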

Motivation/context:

  • Enables sample-efficient optimization for expensive black-box objectives.
  • Extends Propulate beyond population-based heuristics with a GP/acquisition-based optimizer.
  • Supports heterogeneous search spaces needed in practical HPO/NAS settings.

Dependencies required for this change:

  • Added runtime deps: scikit-learn, scipy

Validation notes:

  • All tests pass

Type of change

  • New feature (non-breaking change which adds functionality)
  • This change requires a documentation update

Checklist:

  • My code follows the style guidelines of this project
  • I have performed a self-review of my code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing tests pass locally with my changes
  • Any dependent changes have been merged and published in downstream modules

ju-he added 30 commits February 12, 2026 11:04
- Scale inputs to [0,1]^d and evaluate the acquisition in scaled space.
- Use C * RBF (ARD) + WhiteKernel with sensible bounds; set small alpha.
- Honor `optimize_hyperparameters` and disable HPO until n >= 5d points.
- Handle PI’s sigma==0 path explicitly to avoid divide-by-zero warnings.

These changes prevent RBF length-scales from pegging upper bounds on
data-poor, deterministic benchmarks, improving BO stability and removing
the warning spam.
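
The GP setup described in this commit could look roughly like the following scikit-learn sketch. The helper names (`scale_to_unit_cube`, `build_gp`) and all numeric bounds are assumptions chosen for illustration, not the values used in the PR; only the overall recipe (inputs scaled to the unit cube, C * RBF (ARD) + WhiteKernel, small alpha, hyperparameter optimization gated on n >= 5d) comes from the commit message.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel, WhiteKernel


def scale_to_unit_cube(X, lower, upper):
    """Map points from the box [lower, upper] to [0, 1]^d."""
    return (np.asarray(X, dtype=float) - lower) / (upper - lower)


def build_gp(n_points, dim, optimize_hyperparameters=True):
    """Build a GP with an ARD kernel; bounds are illustrative assumptions."""
    kernel = (
        ConstantKernel(1.0, (1e-3, 1e3))
        # One length-scale per dimension (ARD), bounded for unit-cube inputs.
        * RBF(length_scale=np.ones(dim), length_scale_bounds=(1e-2, 1e1))
        + WhiteKernel(noise_level=1e-6, noise_level_bounds=(1e-10, 1e-2))
    )
    # Keep the kernel hyperparameters fixed until enough data is available.
    do_hpo = optimize_hyperparameters and n_points >= 5 * dim
    return GaussianProcessRegressor(
        kernel=kernel,
        alpha=1e-10,
        normalize_y=True,
        optimizer="fmin_l_bfgs_b" if do_hpo else None,
    )


# Usage on a toy 2-D objective.
rng = np.random.default_rng(0)
lower, upper = np.array([-5.0, 0.0]), np.array([5.0, 10.0])
X = rng.uniform(lower, upper, size=(12, 2))
y = np.sum(X**2, axis=1)
gp = build_gp(n_points=len(X), dim=2)
gp.fit(scale_to_unit_cube(X, lower, upper), y)
```

Passing `optimizer=None` makes scikit-learn keep the initial kernel hyperparameters, which is one way to honor the n >= 5d rule without a separate code path.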
Implement mixed-type search space support through continuous relaxation
and projection for the Bayesian optimizer propagator.

Features:
- Integer parameters: continuous optimization with post-optimization rounding
- Categorical parameters: one-hot encoding with softmax projection
- Mixed spaces: support for float, int, and categorical parameters together
- Input validation with clear error messages
- Warning for high-dimensional position spaces (>100 dims)

Implementation:
- Add _project_to_discrete() helper for type-aware projection
- Update __init__ with type detection and dimension handling
- Modify __call__ to apply projection at all generation points
- Use position_dim for kernel and optimizer sizing
- Update documentation with mixed-type examples

Testing:
- 14 new comprehensive tests in test_bayesian_optimizer_mixed_types.py
- Integration test in test_bayesian_optimizer.py
- Verify backward compatibility with existing float-only tests
- All tests passing (25/25)
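
As an illustration of the relaxation-and-projection idea in this commit, the sketch below maps a relaxed continuous position back to typed values. It is a simplified stand-in, not the PR's `_project_to_discrete()`: the function name, the `space` layout, and the use of a plain argmax over each one-hot block are assumptions. Because softmax is monotone, taking the argmax of the raw block selects the same category as the argmax of its softmax.

```python
def project_to_discrete(position, space):
    """Project a relaxed position onto a mixed search space (hypothetical helper).

    `space` maps a parameter name to one of:
      ("float", low, high), ("int", low, high), ("cat", [choices]).
    Categorical parameters occupy one slot per choice (one-hot block).
    """
    expected = sum(len(s[1]) if s[0] == "cat" else 1 for s in space.values())
    if len(position) != expected:
        raise ValueError(f"expected {expected} position entries, got {len(position)}")

    result = {}
    i = 0
    for name, spec in space.items():
        kind = spec[0]
        if kind == "cat":
            choices = spec[1]
            block = list(position[i : i + len(choices)])
            # Pick the category with the largest relaxed score.
            result[name] = choices[block.index(max(block))]
            i += len(choices)
        elif kind in ("float", "int"):
            low, high = spec[1], spec[2]
            value = min(max(position[i], low), high)  # clip to the bounds
            result[name] = int(round(value)) if kind == "int" else float(value)
            i += 1
        else:
            raise ValueError(f"unknown parameter type: {kind!r}")
    return result


# Usage: one float, one integer, and one three-way categorical parameter.
space = {
    "lr": ("float", 0.0, 1.0),
    "layers": ("int", 1, 8),
    "act": ("cat", ["relu", "tanh", "gelu"]),
}
print(project_to_discrete([0.3, 6.7, 0.1, 0.8, 0.1], space))
```

The relaxed position here has five entries for three parameters, which is why the kernel and optimizer are sized by the position dimension rather than by the number of parameters.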
ju-he and others added 10 commits February 13, 2026 19:11
Adds acquisition_type="TS" to BayesianOptimizer. TS draws one joint GP
posterior sample over a candidate set and returns its argmin, bypassing
L-BFGS-B and requiring no xi/kappa tuning. Each MPI rank draws an
independent sample, providing natural parallel diversity without
rank_stretch. Includes unit/integration tests and documentation.
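
A minimal sketch of the Thompson sampling step, assuming a fitted scikit-learn GP and a random candidate set. The function name, the candidate count, the fixed kernel, and the per-rank seeding are illustrative assumptions; `rank` is a placeholder for the MPI rank and no MPI library is used here.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF


def thompson_sample_argmin(gp, candidates, rng):
    """Draw one joint posterior sample over `candidates` and return its minimizer."""
    seed = int(rng.integers(0, 2**31 - 1))
    # sample_y returns an array of shape (n_candidates, n_samples).
    draw = gp.sample_y(candidates, n_samples=1, random_state=seed)
    return candidates[int(np.argmin(draw[:, 0]))]


# Toy 1-D problem in the unit interval.
X = np.linspace(0.0, 1.0, 6).reshape(-1, 1)
y = (X[:, 0] - 0.3) ** 2
# Fixed kernel hyperparameters (optimizer=None) keep the sketch simple.
gp = GaussianProcessRegressor(kernel=RBF(length_scale=0.2), alpha=1e-6, optimizer=None)
gp.fit(X, y)

rank = 0  # placeholder: each rank would seed its own generator
rng = np.random.default_rng(rank)
candidates = rng.uniform(0.0, 1.0, size=(50, 1))
x_next = thompson_sample_argmin(gp, candidates, rng)
```

Since every rank draws a different posterior sample, the proposed points differ across ranks without any per-rank acquisition parameters.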