Change sample_posterior_predictive API wrt volatility #8209
ricardoV94 merged 2 commits into pymc-devs:v6
232 files changed
Codecov Report

@@ Coverage Diff @@
##             v6    #8209   +/-   ##
=====================================
  Coverage      ?   91.76%
=====================================
  Files         ?      124
  Lines         ?    20131
  Branches      ?        0
=====================================
  Hits          ?    18473
  Misses        ?     1658
  Partials      ?        0
There is one test failing due to pymc-devs/pytensor#2051. I verified that the patch fixes it. Our step samplers don't eval before multiprocess splitting, so numba compiles in each process: wasteful, and it surfaces the bug here. We'll just wait, but we may want to tweak the steps to eval once after compile. The regular NUTS pipeline does eval before the split (in the NUTS jitter init).
jessegrabowski left a comment:
General comment: the process of walking the fgraph and classifying nodes as volatile or not is exactly the function of a Feature. Is there a reason why you didn't use one here?
jessegrabowski left a comment:
Some comments.

With this refactor, I think we should really think about renaming this function to something like conditional_resample, then implement sample_posterior_predictive as a special case that only generates the posterior predictive. My logic is that the functionality we now offer goes well beyond just sampling the posterior predictive distribution, which is a well-defined object in the literature.
# Default copies old `x**2` values from the trace.
pm.sample_posterior_predictive(idata, var_names=["det"], **kwargs)
# Force recomputation using the new `x**3` graph.
pm.sample_posterior_predictive(idata, var_names=["det"], sample_vars=["det"], **kwargs)
use sample_vars without var_names in this example?

I think this is easier to follow; I'll add a note saying var_names wasn't needed, as it would have defaulted to the same.

Sorry, I thought this was on the section on volatility. Agree. Also removed the first example. The nudge is that users shouldn't think about var_names (or at least not first), but sample_vars.
Note: The samples of ``y`` are equivalent to its prior, since it does not depend on any other variables.
In contrast, the samples of ``z`` and ``det`` depend on the new samples of ``y`` and the posterior samples of ``x`` found in the trace.
We might want this to be in a .. warning:: directive

Done, but it's weird with the "In contrast" part after, no?
A Feature is specifically about adding/keeping meta information while a graph is being rewritten; we are not doing any graph rewrite here, just a snapshot analysis. I can't think of any angle where it would help.
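For readers following along, the "snapshot analysis" being discussed can be sketched without any Feature machinery: a single recursive pass over the graph, marking a node volatile if it is explicitly seeded (e.g. forced to resample, or missing from the trace) or if any of its inputs is volatile. All names here (`Node`, `classify_volatile`) are hypothetical illustrations and do not match PyMC internals.

```python
# Hypothetical sketch of a snapshot volatility analysis: one pass over a
# toy graph, no rewrite-tracking Feature involved.
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    name: str
    inputs: tuple = ()  # parent Nodes


def classify_volatile(outputs, seeds):
    """Return the set of node names considered volatile.

    A node is volatile if its name is in `seeds` (e.g. forced via
    sample_vars, or absent from the trace) or if any input is volatile.
    """
    volatile = set()

    def visit(node):
        if node.name in volatile:
            return True
        is_volatile = node.name in seeds or any(visit(i) for i in node.inputs)
        if is_volatile:
            volatile.add(node.name)
        return is_volatile

    for out in outputs:
        visit(out)
    return volatile


# Toy graph: x (in trace) feeds det; y (not in trace) and x feed z.
x = Node("x")
det = Node("det", (x,))
y = Node("y")
z = Node("z", (y, x))

print(classify_volatile([det, z], seeds={"y"}))  # volatile: y and z
```

Because `x` comes from the trace and is not seeded, `det` stays frozen, while `z` inherits volatility from the freshly sampled `y`.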
This was discussed in the original issue (and I think more internally) and rejected. Back then I was advocating for it, although I didn't have as good a name as yours (except for the "re-" in resample, as we are not always "re-sampling"). But now I don't agree so much. What is the strict criterion for distinguishing whether this is a posterior predictive operation or not?

So... I'm inclined to think there is no obvious set of arguments/restrictions that would restrict this function to only do "sample posterior predictive". A bit like "sample" may very well be used to get prior / prior_predictive draws (or compute deterministics), the meaning of the operation is not something you can pin-point from the computation being run. If you agree with this, the argument becomes: why should we have a function with another name that has exactly the same signature/behavior? Since for most users the function does what they want, let the power users make their own alias if they want, and keep the official naming simple.
I don't agree that the ppd is hard to pin down though. It's defined by the fact that we're sampling the observed distribution, conditioned on the posterior samples; it's the base case. The advantage would be to hide complexity from the median user. That said, it is likely sufficient that ... If this was already discussed, let's move on instead of re-litigating it ad nauseam. I accept the judgement of the collective.
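For reference, the well-defined object both comments refer to is the posterior predictive distribution of replicated data $\tilde{y}$ given the observed data $y$:

```latex
p(\tilde{y} \mid y) = \int p(\tilde{y} \mid \theta)\, p(\theta \mid y)\, d\theta
```

In practice this is approximated by drawing $\theta^{(s)}$ from the posterior trace and then sampling $\tilde{y}^{(s)} \sim p(\tilde{y} \mid \theta^{(s)})$; the disagreement above is about how far a function can deviate from this base case and still deserve the name.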
As long as you use even one sample from the posterior, it fits the mold. I don't think it must use all samples from the posterior (trivial case: a dead path), or that it can't introduce new variables (e.g., marginalized variables that are being recovered, a stochastic approximation to the posterior, a conjugate posterior with a closed-form solution). Also, maybe you didn't mark your variable as observed from PyMC's standpoint (but you know it is your observation). I don't think the meaning can be decided in general.
Keep in mind this is like our second most important function in PyMC, and it was easily a footgun whenever var_names was involved or there were unobserved RVs that depended on modified data/coordinates (see #7069).

After this PR, RV variables in the trace are never resampled by default, even if that may mean the graph is invalid (in shape or semantically). A warning is issued, directing users to explicitly add variables "deemed volatile" to `sample_vars` (to resample them) or `freeze_vars` (to confirm this was intended). Regular uses that don't touch `var_names` should stay the same as before. After these changes I think `var_names` becomes very intuitive (and less important as well).

Main knobs:
- `sample_vars`: variables to force sampling. RVs present in the trace will never be resampled unless they show up here. Deterministics are still recomputed automatically based on volatility, but can be included here in case the volatility analysis wouldn't trigger them (there's a `pm.do` use case in the tests/docstrings). Variables that don't exist in the trace (including observeds) are always sampled, but can be added here for completeness.
- `freeze_vars`: variables to explicitly reuse from the trace. Should also be used for RVs in the trace that seem volatile but are intended not to be resampled. Also works for deterministics, e.g. the HSGP-style case of a training-time mean-centering that shouldn't be rederived from prediction data.
- `ImplicitFreezeWarning` in `pymc.exceptions`: emitted when a trace RV is deemed volatile but is not explicitly in `sample_vars` or `freeze_vars`. This is the knob that alerts users they need to make a decision, reducing silent "footguns".
- `var_names` is strictly output-gating now. It just determines which variables appear in the returned trace.
Other changes:
- `var_names`/`sample_vars`/`freeze_vars`, and everything after `model`, are now keyword-only.
- A `pm.Data` replaced by a `TensorConstant` with a different value (e.g. via `pm.do(…, make_interventions_shared=False)` or `freeze_dims_and_data`) now participates in volatility detection (closes "sample_posterior_predictive does not consider changes in ConstantData and Constant coords" #6876).
- Docstring restructured around the common workflows (`set_data` predictions, freezing/forcing a deterministic, out-of-model predictions), with the detailed volatility rules only after `sample_vars` is specified.
- Removes the `givens_dict` argument in the sample posterior predictive internal function (#6614).

Closes #7069
Closes #6876
Closes #6614
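The keyword-only change above can be illustrated with a plain-Python signature sketch. The argument list is abbreviated and partly hypothetical; only the keyword-only split after `model` is the point here.

```python
# Sketch of the new calling convention: everything after `model` is
# keyword-only, enforced by the bare `*` in the signature.
def sample_posterior_predictive(trace, model=None, *, var_names=None,
                                sample_vars=None, freeze_vars=None):
    return {"var_names": var_names, "sample_vars": sample_vars,
            "freeze_vars": freeze_vars}


# Keywords are accepted as before...
sample_posterior_predictive("idata", var_names=["det"])

# ...but passing the same argument positionally now raises TypeError.
try:
    sample_posterior_predictive("idata", None, ["det"])
except TypeError:
    print("var_names must be passed by keyword")
```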
TODO: