
fix(user-interviews): guard null emails in behavioral targeting templates #67286

Merged
pauldambra merged 1 commit into master from posthog-code/guard-null-emails-interview-templates on Jul 1, 2026

@posthog posthog Bot commented Jul 1, 2026

Contributor

Claude-written:

Problem

The `planning-voice-agent-user-interviews` skill's preferred "find users by behavior" path handed agents three SQL templates that group events by a single `<id>` placeholder, which the prose tells them to fill with `person.properties.email`. With no null guard, every emailless (anonymous) event collapses into one `None` residual row. Under `ORDER BY count() DESC` that junk row, which aggregates all anonymous traffic, sorts to the very top of the interview-candidate sample, taking one slot of the 20-row result and risking a literal `None` being passed downstream as an interviewee email.
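
The collapse can be reproduced outside PostHog. The sketch below is only an illustration: it uses Python's built-in `sqlite3` rather than HogQL, and the `events` table, its two columns, and the sample rows are hypothetical stand-ins, not the skill's actual template or PostHog's schema.

```python
import sqlite3

# Hypothetical stand-in for event data; not PostHog's real schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (distinct_id TEXT, email TEXT)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?)",
    [
        ("d1", "ann@example.com"),
        ("d1", "ann@example.com"),
        ("d2", "bob@example.com"),
        # Three different anonymous visitors, none with an email.
        ("anon-a", None),
        ("anon-b", None),
        ("anon-c", None),
    ],
)

# Group by the nullable email alone, with no null guard.
unguarded_rows = conn.execute(
    "SELECT email AS id, count(*) AS n FROM events "
    "GROUP BY email ORDER BY n DESC, id"
).fetchall()

for row in unguarded_rows:
    print(row)
```

In this toy data the three unrelated anonymous visitors merge into a single `None` group with a count of 3, which outranks every real email in the descending sort.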

The same section's prose also said to "keep both kinds of rows" (email and distinct_id), but a single `<id>` column can't produce both in one run. The file already gets this right one section up: the cohort recipe filters with `AND properties.email IS NOT NULL`, so this was an internal inconsistency.

This came from PostHog inbox report 019edc91-a574-70d2-9cce-82288f48cf3f.

Changes

Rewrite the three behavioral templates to group by `coalesce(person.properties.email, distinct_id) AS id` and select an explicit `email` column. This keeps both kinds of rows in a single query: emailed people group under their email, emailless people fall back to their own distinct_id as separate rows instead of one giant junk bucket. The `email` column then gives an unambiguous routing rule (non-null goes to `interviewee_emails`, null goes to `interviewee_distinct_ids`), and the prose is updated to match.
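
As a rough illustration of the coalesce grouping described above, the sketch below again uses Python's built-in `sqlite3` instead of HogQL. The `events` table and sample rows are hypothetical, and adding `email` to the `GROUP BY` is an assumption made for this SQLite example; the skill's real templates may select the column differently.

```python
import sqlite3

# Hypothetical stand-in for event data; not PostHog's real schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (distinct_id TEXT, email TEXT)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?)",
    [
        ("d1", "ann@example.com"),
        ("d1", "ann@example.com"),
        ("d2", "bob@example.com"),
        ("anon-a", None),
        ("anon-b", None),
        ("anon-c", None),
    ],
)

# Fall back to distinct_id when email is NULL, and keep email as its own
# column so each row can be routed later.
coalesced_rows = conn.execute(
    "SELECT coalesce(email, distinct_id) AS id, email, count(*) AS n "
    "FROM events "
    "GROUP BY coalesce(email, distinct_id), email "
    "ORDER BY n DESC, id"
).fetchall()

for row in coalesced_rows:
    print(row)
```

Every row now has a usable `id`: the emailed user leads with a count of 2, and each anonymous visitor appears as its own row with a null `email` instead of being merged into one bucket.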

Also updated Step 5's CSV guidance to prefer the existing `user-interview-topics-interviewees-bulk-create` tool over one create call per row.

How did you test this code?

Documentation-only change to a skill file, so no automated tests. I ran the rewritten heavy-users template verbatim against project 2 via the PostHog MCP `execute-sql` tool (`event = '$pageview'`, 60-day window). It is valid HogQL and the top 20 rows are now all real interviewable emails with no `None` residual row, confirming the fix.

🤖 Agent context

Autonomy: Human-driven (agent-assisted)

  • Authored by Claude (Claude Code) acting on a PostHog Signals inbox report. No repo skills needed code changes, but I read the report's contributing findings and suggested-reviewer artefacts via the inbox MCP tools before editing.
  • The report offered two fixes: append `AND person.properties.email IS NOT NULL` (mirroring the cohort recipe), or switch to `coalesce(person.properties.email, distinct_id)`. I chose `coalesce` because it resolves both flagged issues at once: it drops the `None` residual and genuinely "keeps both kinds of rows" in one query, whereas the plain null guard would only apply when `<id>` was an email and would leave the prose contradiction unresolved. I added an explicit `email` column so the email/distinct_id routing is unambiguous rather than heuristic.
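
The routing rule that the explicit `email` column enables can be sketched in a few lines of Python. This is a hypothetical helper, not code from the skill: the function name `route_interviewees`, the `(id, email, count)` row shape, and the returned dictionary are assumptions for illustration; only the `interviewee_emails` and `interviewee_distinct_ids` names come from the description above.

```python
from typing import Optional

# Assumed row shape: (id, email, event_count), matching a query that selects
# coalesce(email, distinct_id) AS id plus an explicit email column.
Row = tuple[str, Optional[str], int]


def route_interviewees(rows: list[Row]) -> dict[str, list[str]]:
    """Split query rows by whether the explicit email column is null."""
    interviewee_emails: list[str] = []
    interviewee_distinct_ids: list[str] = []
    for row_id, email, _count in rows:
        if email is not None:
            # Non-null email: contact this person by email.
            interviewee_emails.append(email)
        else:
            # Null email: id already holds the distinct_id fallback.
            interviewee_distinct_ids.append(row_id)
    return {
        "interviewee_emails": interviewee_emails,
        "interviewee_distinct_ids": interviewee_distinct_ids,
    }


routed = route_interviewees(
    [
        ("ann@example.com", "ann@example.com", 2),
        ("anon-a", None, 1),
    ]
)
print(routed)
```

Because the decision depends only on whether `email` is null, no guessing about whether an `id` string "looks like" an email is needed.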

…ates

The three "Finding users by behavior" SQL templates in the
planning-voice-agent-user-interviews skill grouped events by a single
`<id>` placeholder that agents were told to fill with
`person.properties.email`. With no null guard, all emailless (anonymous)
traffic collapsed into one `None` residual row that sorts to the top
under `ORDER BY count() DESC`, eating a slot of the 20-row interview
sample and risking a literal `None` being passed downstream as an
interviewee email. The prose also said to "keep both kinds of rows"
(email and distinct_id), but a single `<id>` column can't do that in one
run.

Rewrite the templates to group by
`coalesce(person.properties.email, distinct_id) AS id` and select an
explicit `email` column. This keeps both kinds of rows in a single query
— emailed people group under their email, emailless people fall back to
their own distinct_id as separate rows instead of one junk bucket — and
the `email` column gives an unambiguous routing rule (non-null →
`interviewee_emails`, null → `interviewee_distinct_ids`), mirroring the
null guard the same file already uses in its cohort recipe.

Also update Step 5's CSV guidance to prefer the existing
`user-interview-topics-interviewees-bulk-create` tool over one create
call per row.

Generated-By: PostHog Code
Task-Id: 8e4de988-d9c9-4628-9d1d-7cb1d7e350a3
@posthog posthog Bot added the skip-agent-review label (Save $$$, skip auto agent reviews (Greptile); use for trivial or chore PRs) Jul 1, 2026
@pauldambra pauldambra marked this pull request as ready for review July 1, 2026 08:40
@pauldambra pauldambra enabled auto-merge (squash) July 1, 2026 08:40
@pauldambra pauldambra merged commit e7ade79 into master Jul 1, 2026
243 of 285 checks passed
@pauldambra pauldambra deleted the posthog-code/guard-null-emails-interview-templates branch July 1, 2026 08:52
deployment-status-posthog Bot commented Jul 1, 2026


Deploy status

| Environment | Status | Deployed At | Workflow |
| --- | --- | --- | --- |
| dev | ✅ Deployed | 2026-07-01 09:25 UTC | Run |
| prod-us | ✅ Deployed | 2026-07-01 09:36 UTC | Run |
| prod-eu | ✅ Deployed | 2026-07-01 09:38 UTC | Run |
