Skip to content

fix: replace Manager().list() with plain list to prevent GC crash#266

Closed
saschabuehrle wants to merge 2 commits intopymc-devs:mainfrom
saschabuehrle:fix/issue-265
Closed

fix: replace Manager().list() with plain list to prevent GC crash#266
saschabuehrle wants to merge 2 commits intopymc-devs:mainfrom
saschabuehrle:fix/issue-265

Conversation

@saschabuehrle
Copy link
Copy Markdown

Fixes #265

The Manager() in BART.__new__ is a local variable that gets garbage-collected after the method returns. Once its server process dies, all_trees (a ListProxy) becomes a dangling reference — any access to it (e.g. during sample_posterior_predictive with new data) raises TypeError: only integer scalar arrays can be converted to a scalar index.

Since nothing in the sampling step actually needs cross-process shared state (each chain gets its own copy of the step method), a plain list() works fine here and avoids the lifetime issue entirely.

— saschabuehrle

@aloctavodia
Copy link
Copy Markdown
Member

Please, check locally before pushing. Also, notice that mypy is failing.

@saschabuehrle
Copy link
Copy Markdown
Author

Fixed — added the type annotation mypy wanted. pre-commit should pass now.

@aloctavodia
Copy link
Copy Markdown
Member

Did you run the tests locally?

@saschabuehrle
Copy link
Copy Markdown
Author

Haven't run the full suite locally yet, setting it up now. The change itself is straightforward though - Manager().list() creates a multiprocessing ListProxy backed by a server process, which gets GC'd after predict() returns. A plain list doesn't have that lifecycle issue.

@aloctavodia
Copy link
Copy Markdown
Member

The tests should fail, you need Manager().list() so different chains can add the trees to the same list.

@juanitorduz
Copy link
Copy Markdown
Contributor

Can we maybe try fix1 from #265?

@saschabuehrle
Copy link
Copy Markdown
Author

Good point, agreed on Manager().list() for cross-chain aggregation.

I'll rework this toward the fix1 direction from #265 so we keep shared writes without the GC failure path, then push an updated patch.

Greetings, saschabuehrle

@saschabuehrle
Copy link
Copy Markdown
Author

Closing for now instead of keeping a broken approach open. I can send a fresh PR with the fix1 direction from #265 once I have it validated.

Greetings, saschabuehrle

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Manager lifetime bug causes TypeError in sample_posterior_predictive for out-of-sample predictions

3 participants