docs/source/tutorials/oregon.rst
Lines changed: 84 additions & 32 deletions
@@ -203,9 +203,11 @@ The analysis produces the following local distribution treatment effects visuali
203
203
- **ML-Adjusted Local Estimator**: Shows a smaller effect of LDTE ≈ -0.15 at zero costs, with similar convergence patterns.
204
204
- **Key Finding**: Both estimators reveal insurance primarily affects the lower tail (zero to ~$10,000), shifting the distribution rightward. This indicates insurance increases ED access among those who would otherwise not seek care, while having minimal impact on high-cost users.
205
205
206
-
**2. Covariate Adjustment Effects and Confidence Intervals**
207
206
208
-
The confidence intervals are not substantially narrower with ML adjustment. Both methods show comparably wide confidence bands, indicating limited efficiency gains. This suggests: (1) covariates have limited predictive power for ED costs, (2) the linear regression model may be too simple, or (3) the simple estimator is already reasonably efficient.
207
+
The confidence intervals are not substantially narrower with ML adjustment. Both methods show comparably wide confidence bands, indicating limited efficiency gains. This result reflects the **limited predictive power of available covariates** (R² ≈ 0.21 when predicting ED costs from pre-treatment ED history and demographics).
208
+
209
+
ML adjustment provides efficiency gains proportional to covariate predictive power. When covariates weakly predict outcomes (R² < 0.3), as in this case, ML adjustment yields minimal improvements over simple estimation. This is a characteristic of the data—pre-treatment healthcare utilization and basic demographics cannot strongly predict future emergency department costs—not a failure of the ML methodology.
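As a rough illustration of why R² ≈ 0.21 leaves little room for improvement, the sketch below uses a standard back-of-envelope approximation for linearly adjusted estimators in randomized experiments: the adjusted variance is about (1 - R²) times the unadjusted variance, so confidence-interval width scales by sqrt(1 - R²). This approximation is an assumption brought in for illustration and is not taken from the tutorial. The function name ``ci_width_ratio`` is hypothetical, and the R² relevant to a distributional estimator (which works with indicators such as 1{Y ≤ y}) may differ from the R² for raw costs. Under that assumption, R² ≈ 0.21 narrows intervals by only about 11 percent at best.

```python
import math


def ci_width_ratio(r_squared: float) -> float:
    """Approximate ratio of adjusted to unadjusted CI width.

    Assumption (not from the tutorial): a linearly adjusted estimator whose
    variance is about (1 - R^2) times the unadjusted variance.
    """
    if not 0.0 <= r_squared < 1.0:
        raise ValueError("r_squared must be in [0, 1)")
    return math.sqrt(1.0 - r_squared)


# Compare a few illustrative R^2 values, including the tutorial's 0.21.
for r2 in (0.05, 0.21, 0.30, 0.60):
    shrink = 1.0 - ci_width_ratio(r2)
    print(f"R^2 = {r2:.2f}: CI about {shrink:.1%} narrower")
```

Because the width ratio depends on the square root of (1 - R²), even doubling a low R² produces only a modest additional narrowing.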
210
+
209
211
210
212
Cost Analysis with Local PTE
211
213
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -391,9 +393,8 @@ Visits Analysis with Local PTE
391
393
- **ML-Adjusted Local Estimator**: Shows a larger negative effect at zero visits (LPTE ≈ -0.14) and positive effects in the 1-5 visit range (LPTE ≈ 0.03-0.04). Effects converge to zero at higher visit frequencies.
392
394
- **Key Finding**: Insurance reduces the probability mass at zero visits while increasing it in the low-to-moderate visit range (1-5 visits). This represents a redistribution of probability mass from non-users to low-frequency ED users, with minimal effect on frequent visitors.
393
395
394
-
**2. Covariate Adjustment Effects and Confidence Intervals**
395
396
396
-
The confidence intervals remain wide for both estimators, particularly at zero and low visit counts. The limited precision suggests: (1) substantial heterogeneity in treatment effects within visit frequency bins, (2) limited predictive power of covariates for specific visit levels, or (3) relatively small sample sizes within individual bins.
397
+
The confidence intervals remain wide for both estimators, with minimal differences between simple and ML-adjusted approaches. This limited precision reflects the same fundamental constraint as in the cost analysis: covariates have limited predictive power for ED visit frequency (R² ≈ 0.21). The substantial heterogeneity in treatment effects, combined with weak covariate prediction, means ML adjustment provides minimal efficiency gains over the simpler approach.
397
398
398
399
399
400
Stratified Analysis by Household Registration
@@ -487,53 +488,101 @@ Visualization: Comparing Overall Population vs Stratified Results
487
488
.. code-block:: python
488
489
489
490
# Comparison: Overall vs Individual Strata (Local Estimators)
490
-
fig, axes = plt.subplots(2, 3, figsize=(24, 12))
491
+
fig, axes = plt.subplots(2, 2, figsize=(24, 12))
492
+
493
+
# Calculate global y-axis limits across all plots (to align y-axis)
494
+
all_ydatas = []
495
+
all_yerr_lowers = []
496
+
all_yerr_uppers = []
497
+
498
+
# Collect all y values (means and error bounds) for ALL subplots
499
+
# Overall population: Simple and ML-adjusted
500
+
all_ydatas.append(ldte_simple)
501
+
all_yerr_lowers.append(lower_simple)
502
+
all_yerr_uppers.append(upper_simple)
503
+
all_ydatas.append(ldte_ml)
504
+
all_yerr_lowers.append(lower_ml)
505
+
all_yerr_uppers.append(upper_ml)
506
+
507
+
# Each stratum: Simple and ML-adjusted
508
+
for stratum, results in individual_results.items():
plt.suptitle("Comparison: Overall Population vs Individual Household Registration Strata (Local Estimators)", fontsize=16)
582
+
plt.suptitle(
583
+
"Comparison: Overall Population vs Individual Household Registration Strata (Local Estimators)",
584
+
fontsize=16
585
+
)
537
586
plt.tight_layout()
538
587
plt.show()
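The diff view collapses the middle of this block, so the exact y-axis alignment logic is not visible here. A minimal sketch of one way to derive shared limits from collected bounds might look like the following. The helper ``global_ylim`` and its ``pad_frac`` parameter are hypothetical, and the sketch assumes the collected lower and upper series hold absolute confidence bounds rather than error-bar lengths.

```python
from typing import Sequence, Tuple


def global_ylim(
    lowers: Sequence[Sequence[float]],
    uppers: Sequence[Sequence[float]],
    pad_frac: float = 0.05,
) -> Tuple[float, float]:
    """Return (ymin, ymax) covering every series, with proportional padding.

    Assumes each inner sequence holds absolute confidence bounds
    (hypothetical helper, not part of the tutorial's code).
    """
    lo = min(min(series) for series in lowers)
    hi = max(max(series) for series in uppers)
    pad = (hi - lo) * pad_frac  # small margin so bands do not touch the frame
    return lo - pad, hi + pad


# Illustrative values only, not results from the tutorial.
ymin, ymax = global_ylim(
    lowers=[[-0.20, -0.10], [-0.60, 0.00]],
    uppers=[[0.00, 0.10], [0.20, 0.40]],
)
print(ymin, ymax)
```

The resulting limits could then be applied to each subplot with ``ax.set_ylim(ymin, ymax)``. Passing ``sharey=True`` to ``plt.subplots`` is a simpler alternative when all panels should share one axis.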
539
588
@@ -608,9 +657,10 @@ The LPTE analysis reveals insurance does not uniformly increase ED utilization.
608
657
609
658
Stratified analysis uncovers dramatic treatment effect heterogeneity: single-person households ("signed self up") show moderate effects (LDTE ≈ -0.18 to -0.20), while multi-person households ("signed self up + others") exhibit 3-4x larger effects (LDTE ≈ -0.55). This suggests household structure is a critical moderator—insurance enables care-seeking for multiple family members when households include dependents.
610
659
611
-
**4. Limited Efficiency Gains from ML Adjustment**
660
+
**4. ML Adjustment Effectiveness Depends on Covariate Predictive Power**
661
+
662
+
With baseline covariates (pre-randomization ED utilization + demographics, R² ≈ 0.21), ML-adjusted estimators show minimal efficiency gains: confidence intervals remain comparably wide, or even slightly wider, than those of the simple estimators. However, enhanced feature engineering could improve predictive power, enabling ML adjustment to narrow confidence intervals.
612
663
613
-
Despite using pre-randomization ED utilization history and demographic covariates, ML-adjusted estimators show minimal efficiency gains over simple estimators. Confidence intervals remain comparably wide for both methods, suggesting: (1) the covariates have limited predictive power for ED outcomes, (2) the linear regression model may be too simple, or (3) substantial residual heterogeneity exists even after covariate adjustment. Notably, ML adjustment becomes unstable in small strata (n=4,068), producing implausible estimates (LDTE reaching +20), highlighting that model complexity must match sample informativeness.
614
664
615
665
**5. Policy Implications for Targeted Interventions**
616
666
@@ -619,6 +669,8 @@ The distributional analysis reveals that Medicaid's primary benefit is enabling
619
669
Next Steps
620
670
~~~~~~~~~~
621
671
672
+
**For Your Own Data**:
673
+
622
674
- Try with your own randomized experiment data
623
675
- Experiment with different ML models (XGBoost, Neural Networks) for adjustment
624
676
- Explore stratified estimators for covariate-adaptive randomization designs