Commit e438352

Rerun survival (pymc-devs#865)
* rerun weibull
* rerun censored
* rerun param
1 parent 709c906 commit e438352

6 files changed

Lines changed: 437 additions & 287 deletions

examples/survival_analysis/bayes_param_survival.ipynb

Lines changed: 269 additions & 81 deletions
Large diffs are not rendered by default.

examples/survival_analysis/bayes_param_survival.myst.md

Lines changed: 5 additions & 4 deletions
@@ -5,7 +5,7 @@ jupytext:
 format_name: myst
 format_version: 0.13
 kernelspec:
-display_name: pymc
+display_name: arviz_1
 language: python
 name: python3
 ---
@@ -17,7 +17,7 @@ kernelspec:
 ```{code-cell} ipython3
 import warnings

-import arviz.preview as az
+import arviz as az
 import numpy as np
 import pymc as pm
 import pytensor.tensor as pt
@@ -231,7 +231,7 @@ az.plot_energy(weibull_trace);
 The $\hat{R}$ statistics also indicate convergence.

 ```{code-cell} ipython3
-az.rhat(weibull_trace).to_array().max()
+az.rhat(weibull_trace).ds.to_array().max()
 ```

 Below we plot posterior distributions of the parameters.
@@ -341,7 +341,7 @@ az.plot_energy(log_logistic_trace);
 ```

 ```{code-cell} ipython3
-az.rhat(log_logistic_trace).to_array().max()
+az.rhat(log_logistic_trace).ds.to_array().max()
 ```

 Again, we calculate the posterior expected survival functions for this model.
@@ -396,6 +396,7 @@ This post has been a short introduction to implementing parametric survival regr
 - Updated by [George Ho](https://eigenfoo.xyz/) on July 18, 2018.
 - Updated by @fonnesbeck on September 11, 2024.
 - Updated by Osvaldo Martin on December 2025.
+- Updated by Osvaldo Martin on April 2026.

 ```{code-cell} ipython3
 %load_ext watermark
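
Note on the `az.rhat(...)` cells changed in this file: they reduce the per-parameter $\hat{R}$ values to one maximum, so a single number near 1 suggests every parameter has mixed. As a rough illustration of what that number measures, the sketch below computes the classic Gelman-Rubin statistic in plain Python. It is not ArviZ's implementation (ArviZ's default estimator is a more robust variant), and `basic_rhat`, `draws`, and the sample chains are hypothetical names and values invented for this note.

```python
from statistics import mean, variance


def basic_rhat(chains):
    """Classic (non-split) Gelman-Rubin R-hat for one scalar parameter.

    `chains` is a list of equal-length lists of draws, one list per chain.
    Assumes the within-chain variance is non-zero.
    """
    m = len(chains)
    n = len(chains[0])
    if m < 2 or n < 2 or any(len(c) != n for c in chains):
        raise ValueError("need at least two chains of equal length >= 2")
    within = mean(variance(c) for c in chains)       # W: average within-chain variance
    between = n * variance([mean(c) for c in chains])  # B: between-chain variance
    var_hat = (n - 1) / n * within + between / n     # pooled variance estimate
    return (var_hat / within) ** 0.5


# Hypothetical draws: "alpha" chains agree, "beta" chains sit far apart.
draws = {
    "alpha": [[0.9, 1.1, 1.0, 1.2], [1.0, 0.8, 1.1, 1.0]],
    "beta": [[1.0, 2.0, 3.0, 4.0], [11.0, 12.0, 13.0, 14.0]],
}

# Same reduction as the notebook cell: the worst R-hat over all parameters.
worst = max(basic_rhat(chains) for chains in draws.values())
print(f"max R-hat: {worst:.2f}")
```

Taking the maximum is a deliberately pessimistic summary: one poorly mixed parameter is enough to flag the whole fit.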

examples/survival_analysis/censored_data.ipynb

Lines changed: 43 additions & 59 deletions
Large diffs are not rendered by default.

examples/survival_analysis/censored_data.myst.md

Lines changed: 10 additions & 29 deletions
@@ -5,9 +5,9 @@ jupytext:
 format_name: myst
 format_version: 0.13
 kernelspec:
-display_name: Python [conda env:base] *
+display_name: arviz_1
 language: python
-name: conda-base-py
+name: python3
 ---

 (censored_data)=
@@ -22,7 +22,7 @@ kernelspec:
 ```{code-cell} ipython3
 from copy import copy

-import arviz.preview as az
+import arviz as az
 import matplotlib.pyplot as plt
 import numpy as np
 import pymc as pm
@@ -36,42 +36,22 @@ rng = default_rng(1234)
 az.style.use("arviz-variat")
 ```

-[This example notebook on Bayesian survival
-analysis](https://www.pymc.io/projects/examples/en/latest/survival_analysis/survival_analysis.html) touches on the
-point of censored data. _Censoring_ is a form of missing-data problem, in which
-observations greater than a certain threshold are clipped down to that
-threshold, or observations less than a certain threshold are clipped up to that
-threshold, or both. These are called right, left and interval censoring,
-respectively. In this example notebook we consider interval censoring.
+[This example notebook on Bayesian survival analysis](https://www.pymc.io/projects/examples/en/latest/survival_analysis/survival_analysis.html) touches on the point of censored data. _Censoring_ is a form of missing-data problem, in which observations greater than a certain threshold are clipped down to that threshold, or observations less than a certain threshold are clipped up to that threshold, or both. These are called right, left and interval censoring,respectively. In this example notebook we consider interval censoring.

 Censored data arises in many modelling problems. Two common examples are:

-1. _Survival analysis:_ when studying the effect of a certain medical treatment
-on survival times, it is impossible to prolong the study until all subjects
-have died. At the end of the study, the only data collected for many patients
-is that they were still alive for a time period $T$ after the treatment was
-administered: in reality, their true survival times are greater than $T$.
+1. _Survival analysis:_ when studying the effect of a certain medical treatment on survival times, it is impossible to prolong the study until all subjects have died. At the end of the study, the only data collected for many patients is that they were still alive for a time period $T$ after the treatment was administered: in reality, their true survival times are greater than $T$.

-2. _Sensor saturation:_ a sensor might have a limited range and the upper and
-lower limits would simply be the highest and lowest values a sensor can
-report. For instance, many mercury thermometers only report a very narrow
-range of temperatures.
+2. _Sensor saturation:_ a sensor might have a limited range and the upper and lower limits would simply be the highest and lowest values a sensor can report. For instance, many mercury thermometers only report a very narrow range of temperatures.

 This example notebook presents two different ways of dealing with censored data
 in PyMC:

-1. An imputed censored model, which represents censored data as parameters and
-makes up plausible values for all censored values. As a result of this
-imputation, this model is capable of generating plausible sets of made-up
-values that would have been censored. Each censored element introduces a
-random variable.
+1. An imputed censored model, which represents censored data as parameters and makes up plausible values for all censored values. As a result of this imputation, this model is capable of generating plausible sets of made-up values that would have been censored. Each censored element introduces a random variable.

-2. An unimputed censored model, where the censored data are integrated out and
-accounted for only through the log-likelihood. This method deals more
-adequately with large amounts of censored data and converges more quickly.
+2. An unimputed censored model, where the censored data are integrated out and accounted for only through the log-likelihood. This method deals more adequately with large amounts of censored data and converges more quickly.

-To establish a baseline we compare to an uncensored model of the uncensored
-data.
+To establish a baseline we compare to an uncensored model of the uncensored data.

 ```{code-cell} ipython3
 # Produce normally distributed samples
@@ -238,6 +218,7 @@ As we can see, both censored models appear to capture the mean and variance of t
 - Updated by [Benjamin Vincent](https://github.com/drbenvincent) in May 2021.
 - Updated by [Benjamin Vincent](https://github.com/drbenvincent) in May 2022.
 - Updated by [Osvaldo Martin](https://github.com/aloctavodia) in Dec 2025.
+- Updated by [Osvaldo Martin](https://github.com/aloctavodia) in Apr 2026.

 +++
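
Note on the notebook text reflowed in this file: it describes an unimputed censored model, in which censored observations are integrated out and enter only through the log-likelihood. The plain-Python sketch below shows one way to read that for a normal likelihood clipped at a lower and an upper threshold. It is an illustration under stated assumptions, not the notebook's PyMC code; `censored_normal_loglik`, the thresholds, and the sample data are hypothetical.

```python
import math
from statistics import NormalDist


def censored_normal_loglik(data, mu, sigma, low, high):
    """Log-likelihood of data clipped to [low, high] under Normal(mu, sigma).

    Values at a threshold only tell us the true value lies beyond it, so they
    contribute a tail probability; interior values contribute the density.
    Assumes both tail probabilities are strictly positive.
    """
    dist = NormalDist(mu, sigma)
    total = 0.0
    for y in data:
        if y <= low:
            total += math.log(dist.cdf(low))          # P(true value <= low)
        elif y >= high:
            total += math.log(1.0 - dist.cdf(high))   # P(true value >= high)
        else:
            total += math.log(dist.pdf(y))            # fully observed value
    return total


# Hypothetical sample: one value clipped at the lower limit, two at the upper.
low, high = -1.0, 1.0
observed = [-1.0, -0.3, 0.2, 1.0, 1.0]
ll = censored_normal_loglik(observed, mu=0.0, sigma=1.0, low=low, high=high)
print(f"censored log-likelihood: {ll:.4f}")
```

Because no extra random variable is introduced per censored point, the cost of this formulation does not grow with the number of latent values to sample, which is consistent with the notebook's remark that the unimputed model copes better with large amounts of censored data.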

examples/survival_analysis/weibull_aft.ipynb

Lines changed: 104 additions & 109 deletions
Large diffs are not rendered by default.

examples/survival_analysis/weibull_aft.myst.md

Lines changed: 6 additions & 5 deletions
@@ -5,7 +5,7 @@ jupytext:
 format_name: myst
 format_version: 0.13
 kernelspec:
-display_name: pymc_env
+display_name: arviz_1
 language: python
 name: python3
 myst:
@@ -24,7 +24,7 @@ myst:
 :::

 ```{code-cell} ipython3
-import arviz.preview as az
+import arviz as az
 import numpy as np
 import pymc as pm
 import pytensor.tensor as pt
@@ -125,7 +125,7 @@ with model_1:
 ```

 ```{code-cell} ipython3
-az.plot_trace_dist(idata_param1, var_names=["alpha", "beta"])
+az.plot_trace_dist(idata_param1, var_names=["alpha", "beta"]);
 ```

 ```{code-cell} ipython3
@@ -155,7 +155,7 @@ with model_2:
 ```

 ```{code-cell} ipython3
-az.plot_trace_dist(idata_param2, var_names=["r", "beta"])
+az.plot_trace_dist(idata_param2, var_names=["r", "beta"]);
 ```

 ```{code-cell} ipython3
@@ -190,7 +190,7 @@ with model_3:
 ```

 ```{code-cell} ipython3
-az.plot_trace_dist(idata_param3)
+az.plot_trace_dist(idata_param3);
 ```

 ```{code-cell} ipython3
@@ -203,6 +203,7 @@ az.summary(idata_param3, round_to=2)
 - Authored and ported to Jupyter notebook by [George Ho](https://eigenfoo.xyz/) on Jul 15, 2018.
 - Updated for compatibility with PyMC v5 by Chris Fonnesbeck on Jan 16, 2023.
 - Updated to replace `pm.Potential` with `pm.Censored` by Jonathan Dekermanjian on Nov 25, 2024.
+- Updated by Osvaldo Martin in April 2026.

 ```{code-cell} ipython3
 %load_ext watermark