Named StateSpaceModel v2 – rebase and updates after #607#654
Named StateSpaceModel v2 – rebase and updates after #607#654opherdonchin wants to merge 7 commits intopymc-devs:mainfrom
Conversation
|
It looks like you checked in several files related to your local dev environment, could you clean it up? |
cbea84d to
1bf3670
Compare
|
Sorry about that. The current PR includes only the changed files and a test: statespace.py |
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #654 +/- ##
===========================================
+ Coverage 42.43% 69.24% +26.81%
===========================================
Files 63 73 +10
Lines 7207 7738 +531
===========================================
+ Hits 3058 5358 +2300
+ Misses 4149 2380 -1769
🚀 New features to boost your workflow:
|
jessegrabowski
left a comment
There was a problem hiding this comment.
Nice, this is looking really clean. Just a few requests.
Also be sure to install pre-commit (pip install pre-commit; pre-commit install; pre-commit run --all) so your PR gets formatted and linted correctly.
| ) | ||
|
|
||
| if name in self._tensor_variable_info: | ||
| gname = self.graph_name(name) |
There was a problem hiding this comment.
As above. gname is too cryptic. Another option would be to change name to base_name and make name the prefixed version, since this will now be the "canonical" name that users see.
There was a problem hiding this comment.
Under the new scheme (base_name for the unprefixed name and name for the name in the graph) it makes most sense to just call this variable name.
| raise ValueError( | ||
| f"{name} is not a model parameter. All placeholder variables should correspond to model " | ||
| f"parameters." | ||
| f"{name} is not a model data-variable. All placeholder variables should correspond to model " |
There was a problem hiding this comment.
| f"{name} is not a model data-variable. All placeholder variables should correspond to model " | |
| f"{name} is not a model data variable. All placeholder variables should correspond to model " |
| f"{name} is not a model parameter. All placeholder variables should correspond to model " | ||
| f"parameters." | ||
| f"{name} is not a model data-variable. All placeholder variables should correspond to model " | ||
| f"data-variables." |
There was a problem hiding this comment.
| f"data-variables." | |
| f"data variables." |
| n_obs=self.ssm.k_endog, | ||
| obs_coords=obs_coords, | ||
| register_data=True, | ||
| data_name=self.graph_name("data"), |
There was a problem hiding this comment.
Let's let users choose the data name in the PyMCStateSpace constructor in this PR as well, instead of hard coding it to "data". The default can still be data, but then this line becomes data_name = self.graph_name(self.data_name)
| from pymc_extras.statespace.core.statespace import PyMCStateSpace | ||
|
|
||
|
|
||
| def test_two_statespace_models_can_coexist_with_names(monkeypatch): |
There was a problem hiding this comment.
put this test in the test_model file, it doesn't need a new file.
|
The new commit includes all changes suggested and the pre-commit processing. |
|
Really like where this landed, it's looking very clean. Could you double-check that we don't need to make any changes to Component or StructuralTimeSeries? Those are in the models/structural module, and are a sort-of parallel StatespaceModel that needs to be cleaned up some day. |
ec6a4bd to
27a28ea
Compare
|
I'm working on the fix for structural models and in working on the Components create placeholders during I see a few options and I need help choosing:
Please let me know which direction you'd like to take it. |
|
We should be able to use You can check |
|
Any help from me needed here @opherdonchin @jessegrabowski ? |
27a28ea to
e950db5
Compare
|
@AlexAndorra: thanks for the nudge! I think what I've done makes sense and follows @jessegrabowski's suggestion. The branch is now rebased onto current main (v0.10.0) with a clean 3-commit history. What changed in the new commit:
I added appropriate tests in Let me know if this is a good fix and if anything else is needed before this is ready to merge? Thanks! |
jessegrabowski
left a comment
There was a problem hiding this comment.
some small quibbles, but this looks pretty much good to go!
|
|
||
| return new_ssm, new_variable_info, new_data_info | ||
|
|
||
| def _validate_symbolic_info(self): |
There was a problem hiding this comment.
This function is never called outside of tests. If you don't want a check in __init__ we should make this a test utility.
|
|
||
|
|
||
| class TestGraphReplacePlaceholderNamespacing: | ||
| """Tests for the graph_replace-based placeholder namespacing in StructuralTimeSeries.""" |
There was a problem hiding this comment.
| """Tests for the graph_replace-based placeholder namespacing in StructuralTimeSeries.""" |
The docstring adds nothing the name doesn't already tell me
There was a problem hiding this comment.
All of the tests need an AI cleanup pass. Tell it to make the tests maximally self-documenting with minimal additional comments
| ) | ||
|
|
||
|
|
||
| def test_two_statespace_models_can_coexist_with_names(monkeypatch): |
There was a problem hiding this comment.
I'd rather see a real end-to-end test here (including sampling, but using the mock_sample pymc.testing fixture)
| Creates new placeholder variables whose names are prefixed with the model name, | ||
| applies graph_replace to swap them into all SSM matrices, and rebuilds the | ||
| symbolic info objects to reference the new placeholders. |
There was a problem hiding this comment.
| Creates new placeholder variables whose names are prefixed with the model name, | |
| applies graph_replace to swap them into all SSM matrices, and rebuilds the | |
| symbolic info objects to reference the new placeholders. |
| matrices = [getattr(ssm, name) for name in LONG_MATRIX_NAMES] | ||
| replaced_matrices = graph_replace(matrices, replace=replacements, strict=False) | ||
|
|
||
| new_ssm = ssm.copy() |
|
Hi Jesse, thanks for the review! I think I've addressed all your suggestions:
In the process of writing the end-to-end test I also found and fixed a bug: Let me know if anything needs further adjustment! |
|
looks like you need to run pre-commit. The P0 special handling is a code smell. The name of variable is just syntactic sugar in pytensor. When we do graph_replace it's done by object, so it doesn't matter what the name is. I think it should be safe to just rename P0 as well. Could you try and see? Give it 5 minutes of effort, if it doesn't work it's fine we can leave it for later. Can you also fire up a notebook and make sure that all post estimation methods (sample_conditional_posterior/prior, sample_unconditional_posterior/prior, impulse_response_function, forecast) work fine with multiple models in one context? You don't have to write tests for all of them, just check once and verify. |
|
Thanks again for the detailed review, @jessegrabowski. I wish I had this kind of comfort with the codebase. Here's what I've addressed since the last round:
Let me know if there's anything else worth doing before this is merge-ready! |
|
Some tests are failing because the test model was being passed a name (like "data"). This was a harmless no-op before, but now results in the variable names changing and the test failing. Solution is to just not name these models. Other failures outside statespace are not your problem. |
Fixes in this update (commits
|
|
Thanks for the changes. I'm not an MCP server, please speak to me like a human. To answer your question, you can do either. It's in scope for the PR, but I also understand if you want to wrap it up and open an issue for later. |
|
Sorry. It's not about the AI assist so much as this being my first PR and I'm really not sure about how things work. If it's ok to put this out there, that would be great. I have colleagues who want to be able to use it from the official version and they don't need the other structures. I'll open an issue and get the other structures working, though. It shouldn't take much time. Opher |
|
AI's great and this PR is great too. Sorry I'm just on edge because I've been dealing with a slop avalanche elsewhere. We can merge after tests pass (except the ones that arent your issue) |
|
I appreciate it. Thanks! This was really fun and educaitonal to work on. If you want to point me at something else, feel free. I'll start on it after I get the other models done. Opher |
|
Sorry. Previous push didn't actually fix the bugs. Try now. |
Did you push? I'll trigger the CI as soon as you do |
|
Yes. I pushed. |
|
Are you sure? There's no new changes since your "try now" comment |
|
I'm pretty sure I'm sure. When I said "try now" that was right after I pushed what I think is the last change. I just check the code in the commit that is in the pull request and the name parmeters that were causing the bugs have been removed, the dead-end self._name is gone from StructuralTimeSeries and the misleading doc strings have been updated. We could run the tests again, but I'm at least 64% sure. Opher |
|
Could you update your local main and rebase onto main? I just merged a PR that clears the test failures that we were seeing here (but that were unrelated to your changes). I can help with it if you're not up for a git adventure. |
…nique graph naming Create a test to ensure multiple state space models can coexist with distinct names.
…d move namespace test
StructuralTimeSeries.__init__ now uses pytensor.graph_replace to create fresh, prefixed placeholder variables in all SSM matrices when a model name is provided. This replaces the previous metadata-only rename approach, which left the actual graph placeholders unchanged and caused silent aliasing when the same Component instance was reused across multiple named StructuralTimeSeries models. Key changes: - Add _prefix_placeholder_variables() method that builds a replacement mapping from old to new (prefixed) placeholder variables, applies graph_replace across all SSM matrices, and rebuilds SymbolicVariableInfo and SymbolicDataInfo with aligned names. - Add _validate_symbolic_info() diagnostic helper. - Pass name=name through to PyMCStateSpace.__init__. - Add 8 focused tests covering the aliasing bug, name alignment, graph correctness, and the validation helper.
- Drop fragile sv.name == sv.symbolic_variable.name assertions: the
tensor .name attribute is overwritten by PytensorRepresentation
__setitem__ ("initial_state_cov"), but graph_replace works by
object identity, not name, so correctness is not affected.
- Replace the P0 if-guards in every test with cleaner prefix-only
assertions; update test_corrupted_metadata_name_raises to match the
revised helper error message.
- Rename test_unnamed_model_preserves_original_placeholders to
test_unnamed_model_has_no_prefix_on_variable_names and give it a
meaningful assertion (registered names == param_names).
- Format test_statespace.py with ruff-format (long line wrap).
… for param lookup
…eries and build()
db0527c to
2b4f02e
Compare
Summary
This PR reintroduces the Named
StateSpaceModelchanges after rebasing ontomainfollowing the merge of #607.The previous version was submitted as #611. This update incorporates the changes introduced in #607 and resolves the resulting conflicts while preserving the intended named state space behavior.
No new conceptual changes are introduced beyond what was discussed in #611.
Changes
mainTests
Environment:
Results:
tests/statespace/corenutpie)tests/statespace/filterstest_kalman_filter_jax[cholesky])The single failing test occurs in the JAX Cholesky filter path and appears unrelated to the named model changes. All standard (non-JAX) filter paths and core state space tests pass.
Notes
main.