Beef up `GaussianProcessSurrogate` by AdrianSosic · Pull Request #745 · emdgroup/baybe

AdrianSosic · 2026-02-10T16:25:26Z

Rough agenda:

#743 (Cleanup)

Clean up GP code

#746 (GP Components)

Extract current GP setting into preset
Add EDBO / SMOOTHED_EDBO preset
Generalize kernel machinery into component machinery using generics
Add configurability for mean / likelihood
Turn current mean / noise factories into serializable classes using new generic mechanism
Add low-level support for gpytorch GP components

#747 (Index Kernel)

Add BayBE IndexKernel class
Add BayBE PositiveIndexKernel class

#763 (Kernel Arithmetic)

Improve `+` and `*` operations for kernels

#748 (Active Dimensions)

Enable active dimension control for kernels
Absorb hardcoded IndexKernel logic into kernel factory
Implement deprecation mechanism for breaking change

#776 (Kernel Factory Validation)

Ensure kernel factories check if they support their given parameters

#752 (CHEN Preset)

Add CHEN preset

#757 (BoTorch Preset)

Add BOTORCH preset

#789 (FitCriterion)

Make optimization criterion configurable

Must TODO

Stratification for multitask likelihoods
Add decorator for extending presets to apply mechanisms they don't cover themselves (e.g. transfer learning)
Decide on new BayBE defaults (i.e. do we use BOTORCH, CHEN, etc, or some heuristic on top)
Documentation

Optional TODO

Add HVARFNER preset
Enable serialization for gpytorch GP components
Add preset Setting attribute

To be added but out of scope

Make optimizer configurable
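To make the "Kernel Arithmetic" (#763) and "Active Dimensions" (#748) items more concrete, here is a minimal, library-free sketch of the general idea: kernels that compose via `+` and `*`, each restricted to a subset of input dimensions. All class names, attribute names, and the RBF formula chosen here are illustrative assumptions and are not taken from the BayBE code base.

```python
import math
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


class Kernel:
    """Hypothetical base class (not BayBE's): composes kernels via + and *."""

    def __call__(self, x: Sequence[float], y: Sequence[float]) -> float:
        raise NotImplementedError

    def __add__(self, other: "Kernel") -> "Kernel":
        return SumKernel((self, other))

    def __mul__(self, other: "Kernel") -> "Kernel":
        return ProductKernel((self, other))


@dataclass(frozen=True)
class RBFKernel(Kernel):
    """Squared-exponential kernel acting only on its active dimensions."""

    lengthscale: float = 1.0
    active_dims: Optional[Tuple[int, ...]] = None  # None means "all dimensions"

    def __call__(self, x: Sequence[float], y: Sequence[float]) -> float:
        dims = self.active_dims if self.active_dims is not None else range(len(x))
        sq_dist = sum((x[d] - y[d]) ** 2 for d in dims)
        return math.exp(-0.5 * sq_dist / self.lengthscale**2)


@dataclass(frozen=True)
class SumKernel(Kernel):
    parts: Tuple[Kernel, ...]

    def __call__(self, x: Sequence[float], y: Sequence[float]) -> float:
        return sum(k(x, y) for k in self.parts)


@dataclass(frozen=True)
class ProductKernel(Kernel):
    parts: Tuple[Kernel, ...]

    def __call__(self, x: Sequence[float], y: Sequence[float]) -> float:
        return math.prod(k(x, y) for k in self.parts)


# Operator precedence gives (dim 0 kernel * dim 1 kernel) + full kernel.
combined = (
    RBFKernel(active_dims=(0,)) * RBFKernel(active_dims=(1,))
    + RBFKernel(lengthscale=2.0)
)
```

In this sketch a kernel with `active_dims=(0,)` ignores differences in every other dimension, which is the behavior the active-dimension control is meant to expose; how BayBE validates or serializes such settings is not covered here.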

#789 (FitCriterion)

DevPR, parent is #745.

Makes the optimization criterion of the `GaussianProcessSurrogate` model configurable, in the form of a new `FitCriterion` enum. This might be generalized to a class-based approach in the future if more configuration options are required, but for now the simpler solution serves all existing use cases.

#757 (BoTorch Preset)

DevPR, parent is #745.

Adds the `BOTORCH` preset for GPs.

### Important information

* I think it's critical to actually assert that the preset exactly recovers the BoTorch behavior, in the form of a test, for mainly two reasons:
  1. The construction involves quite a few things to be configured, i.e. handling both singletask/multitask (the latter even requiring a new custom gpytorch module), setting all sorts of priors correctly, etc. Blindly believing that everything is correct and then just claiming `this is the BoTorch behavior` seems like a bad idea. The test ensures this explicitly.
  2. It also acts as an automatic alert mechanism whenever something changes on the BoTorch side, informing us about breaking changes that yield different behavior but are not fully documented (which has already happened several times).
* I've also invested quite some effort to test the new multitask mean logic, i.e. that it not only recovers the BoTorch logic but that it also fills *one* of the missing gaps that will ultimately make our transfer learning model *truly scale-invariant* w.r.t. the different input tasks. In particular, I made sure that the only missing piece is the noise model stratification, which should be added in a follow-up PR and is analogous to the stratification over means shipped by this PR. (This explains the changes to the streamlit script.)
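The stratification over means mentioned above can be pictured with a small, library-free sketch. It assumes that "stratified mean" means one constant mean per task, which is an interpretation made for this example and not a description of the actual gpytorch module added by the PR. The function names are hypothetical.

```python
from collections import defaultdict
from statistics import fmean
from typing import Dict, List, Sequence


def fit_task_means(tasks: Sequence[str], targets: Sequence[float]) -> Dict[str, float]:
    """Estimate one constant mean per task (the assumed 'stratified' mean)."""
    grouped: Dict[str, List[float]] = defaultdict(list)
    for task, y in zip(tasks, targets):
        grouped[task].append(y)
    return {task: fmean(ys) for task, ys in grouped.items()}


def centered(tasks: Sequence[str], targets: Sequence[float]) -> List[float]:
    """Subtract each observation's task-specific mean."""
    means = fit_task_means(tasks, targets)
    return [y - means[task] for task, y in zip(tasks, targets)]


tasks = ["source", "source", "target", "target"]
targets = [1.0, 3.0, 10.0, 14.0]

# Shift only the source task by a large offset.
shifted = [y + 100.0 if t == "source" else y for t, y in zip(tasks, targets)]

residuals_original = centered(tasks, targets)
residuals_shifted = centered(tasks, shifted)
```

With a per-task mean, the residuals are identical before and after shifting the source task, so an offset between tasks no longer leaks into the model. This only illustrates the offset part; differences in per-task spread are what the noise model stratification planned for the follow-up PR would address.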

AdrianSosic and others added 4 commits February 10, 2026 17:18

Add transfer learning test

2c1da48

Deduplicate task parameter logic in search space

cefd5d6

Clean up Gaussian process class

d23945d

GP Cleanup (#743)

9092b90

* Cleans up the `GaussianProcessSurrogate` class
* Adds transfer learning tests asserting that the mechanism works regardless of which tasks are represented in the training data

AdrianSosic self-assigned this Feb 10, 2026

AdrianSosic added the enhancement, new feature, on hold, and refactor labels and removed the on hold label Feb 10, 2026

AdrianSosic added 18 commits February 10, 2026 17:54

Extract current GP defaults into separate EDBO module

5e8c7de

Add EDBO presets

6b48716

Introduce generic component factories

577abd2

Make serialization utilities handle generics

19220bd

Correctly (un)structure generic classes

b0cd172

Add execution path for non-generic classes

50e531e

Add support for GPyTorch kernels

23a71a7

Block serialization of GPyTorch kernels

f669a21

Enable configuration of GP mean

8234082

Enable configuration of GP likelihood

e15e243

Complete the current preset framework

108e082

* Define mean functions for the available presets
* Pass both mean and likelihood when instantiating preset

Update CHANGELOG.md

50d0a2a

Update attribute docstrings

aeb5d8a

Reorganize modules into subpackage

ffb8702

Add missing components to TypeVar

d908fcb

Add missing entries to __init__.py

9df998f

Deduplicate code in default preset

7a509d4

Add execution path for non-generic classes

b1f6215

AdrianSosic added the dev label Feb 11, 2026

Fix typing issues

0056011

This was referenced May 7, 2026

Transfer Learning Decorator #790

Open

Add Preset documentation #791

Closed

AdrianSosic and others added 26 commits May 11, 2026 16:05

Rename preset factory aliases

2c49cc8

Fall back to BayBE fit criterion for unspecified cases

f11e111

Merge branch 'main' into dev/gp

4f89af4

Enable multitask mode for surrogate streamlit

316251d

Add BOTORCH preset

805d680

Extend BoTorch preset test to multitask case

45d4a62

Add custom GPyTorch components to replicate BoTorch logic

6dfa935

Extend BoTorch factories to multitask case

781515b

Add kernel active dimension validation to ICMKernelFactory

fca6ac8

Fix KernelFactory return types

15b10be

Make BotorchKernelFactory support parameter selection

46b96a4

Fix active dimensions validation

994e1a8

Bypass kernel warning for presets

5b312d9

Update CHANGELOG.md

95f2305

Rename on-task/off-task to target/source in streamlit

472d597

Fix missing fit_criterion_factory renamings

5be27f3

Fix dimension handling in BotorchKernelFactory

3050a6f

Now properly handles the regular/task parameter split

Add temporary ignore to pytest.ini

9c54d20

Fix deprecated .evaluate() call in test_kernels.py

873b8f6

Fix lazy imports

fbe2440

Fix dimension validation in ICMKernelFactory

fbe87ec

Fix kernel factory return types

0609e92

Drop duplicated kernel creation

f6f42a7

Fix vlines argument in streamlit script

7996e1e

Hardwire MLL as criterion for BoTorch preset

2deed6c

AdrianSosic mentioned this pull request May 20, 2026

Surrogates User Guide #801

Open

AdrianSosic wants to merge 200 commits into main from dev/gp

AdrianSosic commented Feb 10, 2026 • edited by Scienfitz

5 participants
