Commit 6e6dd7f

change order of hillstorm analysis
1 parent e4fb68d commit 6e6dd7f

1 file changed

Lines changed: 121 additions & 89 deletions

docs/source/tutorials/hillstrom.rst

@@ -67,53 +67,149 @@ Data Setup and Loading
6767
print(f"Men's Email: {df[df['segment']=='Mens E-Mail']['conversion'].mean():.3f}")
6868
print(f"Women's Email: {df[df['segment']=='Women E-Mail']['conversion'].mean():.3f}")
6969
70-
Comparing Men's vs Women's Email Campaigns
71-
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
70+
Email Campaign Effectiveness Analysis
71+
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
7272

7373
.. code-block:: python
7474
75-
print(f"Email campaign comparison sample: {len(D_email):,} customers")
76-
print(f"Men's Email: {(D_email==0).sum():,}")
77-
print(f"Women's Email: {(D_email==1).sum():,}")
78-
79-
# Initialize estimators for email comparison
80-
simple_email = dte_adj.SimpleDistributionEstimator()
81-
ml_email = dte_adj.AdjustedDistributionEstimator(
75+
# Initialize estimators
76+
simple_estimator = dte_adj.SimpleDistributionEstimator()
77+
ml_estimator = dte_adj.AdjustedDistributionEstimator(
8278
LinearRegression(),
8379
folds=5
8480
)
8581
86-
# Fit estimators
87-
simple_email.fit(X, D, revenue)
88-
ml_email.fit(X, D, revenue)
82+
# Fit estimators on the full dataset
83+
simple_estimator.fit(X, D, revenue)
84+
ml_estimator.fit(X, D, revenue)
8985
9086
# Define revenue evaluation points
9187
revenue_locations = np.linspace(0, 500, 51)
9288
89+
Control vs Women's Email Campaign
90+
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
91+
92+
First, let's examine how the Women's email campaign performs compared to no email (control):
93+
94+
.. code-block:: python
95+
96+
# Compute DTE: Women's email vs Control
97+
dte_women_ctrl, lower_women_ctrl, upper_women_ctrl = simple_estimator.predict_dte(
98+
target_treatment_arm=2, # Women's email
99+
control_treatment_arm=0, # No email control
100+
locations=revenue_locations,
101+
variance_type="moment"
102+
)
103+
104+
# Visualize Women's vs Control using dte_adj's plot function
105+
plot(revenue_locations, dte_women_ctrl, lower_women_ctrl, upper_women_ctrl,
106+
title="Women's Email Campaign vs Control",
107+
xlabel="Spending ($)", ylabel="Distribution Treatment Effect")
108+
109+
# Statistical summary
110+
positive_dte_women = (dte_women_ctrl > 0).mean()
111+
significant_dte_women = ((lower_women_ctrl > 0) | (upper_women_ctrl < 0)).mean()
112+
113+
print("Women's Email vs Control Results:")
114+
print(f"Locations where Women's > Control: {positive_dte_women:.1%}")
115+
print(f"Statistically significant differences: {significant_dte_women:.1%}")
116+
print(f"Average DTE: {dte_women_ctrl.mean():.3f}")
117+
118+
Control vs Men's Email Campaign
119+
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
120+
121+
Next, let's examine how the Men's email campaign performs compared to no email (control):
122+
123+
.. code-block:: python
124+
125+
# Compute DTE: Men's email vs Control
126+
dte_men_ctrl, lower_men_ctrl, upper_men_ctrl = simple_estimator.predict_dte(
127+
target_treatment_arm=1, # Men's email
128+
control_treatment_arm=0, # No email control
129+
locations=revenue_locations,
130+
variance_type="moment"
131+
)
132+
133+
# Visualize Men's vs Control using dte_adj's plot function
134+
plot(revenue_locations, dte_men_ctrl, lower_men_ctrl, upper_men_ctrl,
135+
title="Men's Email Campaign vs Control",
136+
xlabel="Spending ($)", ylabel="Distribution Treatment Effect", color="purple")
137+
138+
# Statistical summary
139+
positive_dte_men = (dte_men_ctrl > 0).mean()
140+
significant_dte_men = ((lower_men_ctrl > 0) | (upper_men_ctrl < 0)).mean()
141+
142+
print("Men's Email vs Control Results:")
143+
print(f"Locations where Men's > Control: {positive_dte_men:.1%}")
144+
print(f"Statistically significant differences: {significant_dte_men:.1%}")
145+
print(f"Average DTE: {dte_men_ctrl.mean():.3f}")
146+
147+
Both Campaigns vs Control Comparison
148+
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
149+
150+
The control vs email campaigns analysis produces the following comparison:
151+
152+
.. image:: ../_static/hillstorm_dte_control.png
153+
:alt: Hillstrom Email Campaigns vs Control Analysis
154+
:width: 800px
155+
:align: center
156+
157+
**Interpreting the Control Comparison Results**: These plots show how each email campaign performs against the no-email control group across different spending levels:
158+
159+
**Women's Email vs Control**:
160+
- **Positive DTE values** indicate that Women's email campaign increases the probability of spending at those levels compared to no email
161+
- **Distribution pattern** shows where Women's email is most effective in driving customer spending
162+
- **Confidence intervals** reveal statistical significance of the treatment effects
163+
164+
**Men's Email vs Control**:
165+
- **Comparative effectiveness** can be assessed by comparing the magnitude and patterns of effects
166+
- **Different spending ranges** may show varying campaign effectiveness
167+
- **Statistical significance** indicated by confidence intervals not crossing zero
168+
169+
**Key Control Analysis Findings**:
170+
171+
1. **Campaign Effectiveness**: Both campaigns show positive effects compared to no email, confirming that email marketing drives incremental spending
172+
173+
2. **Differential Patterns**: The shape and magnitude of effects differ between campaigns, revealing:
174+
- Which campaign has stronger overall effects
175+
- Different spending ranges where each campaign excels
176+
- Varying confidence in treatment effects across spending levels
177+
178+
3. **Business Implications**:
179+
- **ROI Assessment**: Compare effect sizes to determine which campaign provides better return on investment
180+
- **Customer Segmentation**: Identify spending ranges where each campaign is most/least effective
181+
- **Resource Allocation**: Data-driven decisions on campaign budget allocation
182+
183+
4. **Statistical Rigor**: Confidence intervals provide guidance on where observed differences are statistically reliable vs. potentially due to sampling variation
184+
185+
This analysis answers the fundamental question: "Do email campaigns work?" and establishes the baseline effectiveness of each campaign against no email.
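
To make the quantities in the control comparison concrete, here is a minimal sketch on synthetic data. It assumes the DTE at a location ``y`` is the difference between the two arms' empirical CDFs at ``y``, and it uses a simple pointwise normal approximation for the interval. Neither is claimed to match what ``dte_adj`` computes internally with ``variance_type="moment"``. The helper name ``empirical_dte`` and the simulated spending values are illustrative only, not Hillstrom results.

.. code-block:: python

    import numpy as np

    def empirical_dte(y_treat, y_ctrl, locations):
        """Difference of empirical CDFs, F_treat(y) - F_ctrl(y), at each location."""
        y_treat = np.asarray(y_treat, dtype=float)
        y_ctrl = np.asarray(y_ctrl, dtype=float)
        locs = np.asarray(locations, dtype=float)
        # Share of observations at or below each location, per arm
        cdf_treat = (y_treat[:, None] <= locs[None, :]).mean(axis=0)
        cdf_ctrl = (y_ctrl[:, None] <= locs[None, :]).mean(axis=0)
        return cdf_treat - cdf_ctrl, cdf_treat, cdf_ctrl

    # Synthetic spending for two arms (assumption: purely illustrative data)
    rng = np.random.default_rng(0)
    y_ctrl = rng.exponential(scale=50.0, size=2000)
    y_treat = rng.exponential(scale=60.0, size=2000)
    locations = np.linspace(0, 500, 51)

    dte, cdf_t, cdf_c = empirical_dte(y_treat, y_ctrl, locations)

    # Pointwise 95% interval from the binomial variance of each empirical CDF
    # (a textbook approximation, not the library's "moment" variance)
    se = np.sqrt(cdf_t * (1 - cdf_t) / len(y_treat)
                 + cdf_c * (1 - cdf_c) / len(y_ctrl))
    lower = dte - 1.96 * se
    upper = dte + 1.96 * se

    # Same summary statistics as in the tutorial code
    share_positive = (dte > 0).mean()
    share_significant = ((lower > 0) | (upper < 0)).mean()
    print(f"Locations with DTE > 0: {share_positive:.1%}")
    print(f"Locations with interval excluding zero: {share_significant:.1%}")

Under this CDF-based reading, the value at a location ``y`` describes the share of customers spending at or below ``y``, so the sign of the effect should be interpreted with that in mind.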
186+
187+
Direct Campaign Comparison: Men's vs Women's Email
188+
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
189+
190+
Finally, let's directly compare the two email campaigns to answer the key research question:
191+
192+
.. code-block:: python
193+
93194
# Compute DTE: Women's vs Men's email campaigns
94-
dte_simple, lower_simple, upper_simple = simple_email.predict_dte(
195+
dte_women_men, lower_women_men, upper_women_men = simple_estimator.predict_dte(
95196
target_treatment_arm=2, # Women's email
96197
control_treatment_arm=1, # Men's email (as "control")
97198
locations=revenue_locations,
98199
variance_type="moment"
99200
)
100201
101-
dte_ml, lower_ml, upper_ml = ml_email.predict_dte(
202+
dte_ml, lower_ml, upper_ml = ml_estimator.predict_dte(
102203
target_treatment_arm=2, # Women's email
103204
control_treatment_arm=1, # Men's email
104205
locations=revenue_locations,
105206
variance_type="moment"
106207
)
107208
108-
Distribution Treatment Effects Analysis
109-
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
110-
111-
.. code-block:: python
112-
113209
# Visualize the distribution treatment effects using dte_adj's built-in plot function
114210
115211
# Simple estimator
116-
plot(revenue_locations, dte_simple, lower_simple, upper_simple,
212+
plot(revenue_locations, dte_women_men, lower_women_men, upper_women_men,
117213
title="Email Campaign Comparison: Women's vs Men's (Simple Estimator)",
118214
xlabel="Spending ($)", ylabel="Distribution Treatment Effect")
119215
@@ -126,7 +222,7 @@ Distribution Treatment Effects Analysis
126222
positive_dte = (dte_ml > 0).mean()
127223
significant_dte = ((lower_ml > 0) | (upper_ml < 0)).mean()
128224
129-
print(f"\nDistributional Analysis Results:")
225+
print(f"\nDirect Campaign Comparison Results:")
130226
print(f"Locations where Women's > Men's: {positive_dte:.1%}")
131227
print(f"Statistically significant differences: {significant_dte:.1%}")
132228
print(f"Average DTE: {dte_ml.mean():.3f}")
@@ -138,7 +234,7 @@ The analysis produces the following distribution treatment effects visualization
138234
:width: 800px
139235
:align: center
140236

141-
**Interpreting the Results**: The plot shows the distribution treatment effects (DTE) comparing Women's vs Men's email campaigns across different spending levels. Key observations:
237+
**Interpreting the Campaign Comparison Results**: The plot shows the distribution treatment effects (DTE) comparing Women's vs Men's email campaigns across different spending levels. Key observations:
142238

143239
- **Positive DTE values** (above zero line) indicate that Women's email campaign increases the probability of spending at that level compared to Men's campaign
144240
- **Confidence intervals** (shaded areas) show statistical uncertainty - where intervals don't cross zero, effects are statistically significant
@@ -152,15 +248,15 @@ Revenue Category Analysis with PTE
152248

153249
.. code-block:: python
154250
155-
# Compute Probability Treatment Effects
156-
pte_simple, pte_lower_simple, pte_upper_simple = simple_email.predict_pte(
251+
# Compute Probability Treatment Effects for Women's vs Men's comparison
252+
pte_simple, pte_lower_simple, pte_upper_simple = simple_estimator.predict_pte(
157253
target_treatment_arm=2, # Women's email
158254
control_treatment_arm=1, # Men's email
159255
locations=revenue_locations,
160256
variance_type="moment"
161257
)
162258
163-
pte_ml, pte_lower_ml, pte_upper_ml = ml_email.predict_pte(
259+
pte_ml, pte_lower_ml, pte_upper_ml = ml_estimator.predict_pte(
164260
target_treatment_arm=2, # Women's email
165261
control_treatment_arm=1, # Men's email
166262
locations=revenue_locations,
@@ -204,70 +300,6 @@ The Probability Treatment Effects analysis produces the following visualization:
204300

205301
This granular analysis helps marketers understand not just which campaign generates more revenue overall, but specifically which spending behaviors each campaign drives.
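
As a rough illustration of the category-level view, the sketch below computes a probability treatment effect by hand. It assumes the PTE is the difference between two arms in the share of customers whose spending falls into each interval between consecutive locations. The helper ``empirical_pte`` and the toy values are hypothetical and are not taken from ``dte_adj`` or the Hillstrom data.

.. code-block:: python

    import numpy as np

    def empirical_pte(y_treat, y_ctrl, edges):
        """Difference in the share of outcomes per interval between two arms.

        Intervals are [edges[j], edges[j+1]); np.histogram closes the last
        interval on the right, so the final edge is included.
        """
        counts_t, _ = np.histogram(y_treat, bins=edges)
        counts_c, _ = np.histogram(y_ctrl, bins=edges)
        return counts_t / len(y_treat) - counts_c / len(y_ctrl)

    # Toy spending values (assumption: illustrative only)
    y_treat = np.array([0.0, 10.0, 20.0, 30.0])
    y_ctrl = np.array([0.0, 0.0, 10.0, 30.0])
    edges = np.array([0.0, 10.0, 20.0, 30.0])

    pte = empirical_pte(y_treat, y_ctrl, edges)
    # Treated arm has 25% fewer customers in [0, 10), the same share in
    # [10, 20), and 25% more in [20, 30]: pte is [-0.25, 0.0, 0.25]
    print(pte)

When the intervals cover every observation in both arms, the per-interval effects sum to zero, because each arm's shares sum to one. A gain in one spending category is therefore always offset by a loss in another.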
206302

207-
Control vs Email Campaigns Analysis
208-
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
209-
210-
.. code-block:: python
211-
212-
dte_mens_ctrl, lower_mens_ctrl, upper_mens_ctrl = simple_email.predict_dte(
213-
target_treatment_arm=1, control_treatment_arm=0,
214-
locations=revenue_locations, variance_type="moment"
215-
)
216-
217-
dte_women_ctrl, lower_women_ctrl, upper_women_ctrl = simple_email.predict_dte(
218-
target_treatment_arm=2, control_treatment_arm=0,
219-
locations=revenue_locations, variance_type="moment"
220-
)
221-
222-
# Visualize both campaigns vs control using dte_adj's plot function
223-
224-
# Men's vs Control
225-
plot(revenue_locations, dte_mens_ctrl, lower_mens_ctrl, upper_mens_ctrl,
226-
title="Men's Email Campaign vs Control",
227-
xlabel="Spending ($)", ylabel="Distribution Treatment Effect", color="purple")
228-
229-
# Women's vs Control
230-
plot(revenue_locations, dte_women_ctrl, lower_women_ctrl, upper_women_ctrl,
231-
title="Women's Email Campaign vs Control",
232-
xlabel="Spending ($)", ylabel="Distribution Treatment Effect")
233-
234-
The control vs email campaigns analysis produces the following comparison:
235-
236-
.. image:: ../_static/hillstorm_dte_control.png
237-
:alt: Hillstrom Email Campaigns vs Control Analysis
238-
:width: 800px
239-
:align: center
240-
241-
**Interpreting the Control Comparison Results**: These side-by-side plots show how each email campaign performs against the no-email control group across different spending levels:
242-
243-
**Men's Email vs Control (Top Panel)**:
244-
- **Positive DTE values** indicate that Men's email campaign increases the probability of spending at those levels compared to no email
245-
- **Distribution pattern** shows where Men's email is most effective in driving customer spending
246-
- **Confidence intervals** reveal statistical significance of the treatment effects
247-
248-
**Women's Email vs Control (Bottom Panel)**:
249-
- **Comparative effectiveness** can be assessed by comparing the magnitude and patterns of effects
250-
- **Different spending ranges** may show varying campaign effectiveness
251-
- **Statistical significance** indicated by confidence intervals not crossing zero
252-
253-
**Key Control Analysis Findings**:
254-
255-
1. **Campaign Effectiveness**: Both campaigns show positive effects compared to no email, confirming that email marketing drives incremental spending
256-
257-
2. **Differential Patterns**: The shape and magnitude of effects differ between campaigns, revealing:
258-
- Which campaign has stronger overall effects
259-
- Different spending ranges where each campaign excels
260-
- Varying confidence in treatment effects across spending levels
261-
262-
3. **Business Implications**:
263-
- **ROI Assessment**: Compare effect sizes to determine which campaign provides better return on investment
264-
- **Customer Segmentation**: Identify spending ranges where each campaign is most/least effective
265-
- **Resource Allocation**: Data-driven decisions on campaign budget allocation
266-
267-
4. **Statistical Rigor**: Confidence intervals provide guidance on where observed differences are statistically reliable vs. potentially due to sampling variation
268-
269-
This analysis answers the fundamental question: "Do email campaigns work?" and more importantly, "Which one works better and for which customer segments?"
270-
271303
**Key Findings**: Using the real Hillstrom dataset with 64,000 customers, the distributional analysis reveals nuanced patterns in how email campaigns affect customer spending. The analysis goes beyond simple average comparisons to show how treatment effects vary across the entire spending distribution, providing insights into which customer segments respond best to different campaign types. This demonstrates the power of distribution treatment effect analysis for understanding heterogeneous responses in digital marketing experiments.
272304

273305
Next Steps
