Add disposable email domain heuristics for sign-up risk scoring #1234
mantrakp04 wants to merge 6 commits into fraud-protection-country-code
…. Introduced new risk score calculations based on disposable email patterns, enhancing fraud detection capabilities. Updated tests to validate the new heuristics and their integration into the sign-up process.
STACK_RISK_BOT_DISPOSABLE_EMAIL_WEIGHT and STACK_RISK_FTA_DISPOSABLE_EMAIL_WEIGHT (both default to 100). Clamp final score to 0-100 instead of requiring sum=100. Made-with: Cursor
…istics. Added new fields to the ProjectUser model for tracking sign-up IP and email normalization. Updated environment configurations for Emailable API keys and adjusted risk score calculations to incorporate new heuristics. Enhanced tests for email validation and sign-up processes to ensure accurate handling of new attributes.
Greptile Summary
This PR introduces a weighted sign-up risk scoring pipeline backed by the Emailable API, replacing a hardcoded stub. It adds new database columns for storing heuristic facts (IP, normalised email) at sign-up time, a CEL-based rule evaluator, and an internal test endpoint, all of which are well-architected additions. However, several blocking issues need to be addressed before merging.
Confidence Score: 1/5
Important Files Changed
Sequence Diagram
sequenceDiagram
participant Client
participant SignUpRoute as auth/password/sign-up
participant Users as users.tsx
participant RiskScores as risk-scores.tsx
participant Emailable as emailable.tsx
participant Heuristics as sign-up-heuristics.tsx
participant DB as Prisma / DB
participant CEL as cel-evaluator.ts
participant Rules as sign-up-rules.tsx
Client->>SignUpRoute: POST /auth/password/sign-up
SignUpRoute->>Users: createOrUpgradeAnonymousUserWithRules(...)
Users->>RiskScores: calculateSignUpRiskAssessment(tenancy, context)
RiskScores->>Heuristics: deriveSignUpHeuristicFacts(email, ip)
Heuristics-->>RiskScores: normalised IP + email facts
RiskScores->>DB: query recent same-IP / similar-email sign-ups
DB-->>RiskScores: counts
RiskScores->>Emailable: checkEmailWithEmailable(email)
Emailable-->>RiskScores: status: "ok" | "not-deliverable" | "error"
Note over RiskScores: Combines disposable + IP + email scores (clamped 0-100)
RiskScores-->>Users: { scores, heuristicFacts }
Users->>DB: persist risk scores + heuristic facts on ProjectUser
Users->>CEL: createSignUpRuleContext(email, scores, ...)
CEL->>Rules: evaluateSignUpRulesWithTrace(tenancy, context)
Rules-->>Users: outcome (allow / reject / restrict)
Users-->>SignUpRoute: created user or error
SignUpRoute-->>Client: 200 tokens | 403 SIGN_UP_REJECTED
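The combination step noted in the diagram (disposable + IP + email scores, clamped to 0-100) could be sketched roughly as follows. This is a minimal illustration, not the PR's code: the `HeuristicMatches` and `HeuristicWeights` shapes, the `combineScore` helper, and the IP and similar-email weights are all hypothetical. Only the disposable-email weight (default 100) is named in the commits.

```typescript
// Hypothetical shape: which heuristics fired for this sign-up.
type HeuristicMatches = {
  disposableEmail: boolean;
  sameIp: boolean;
  similarEmail: boolean;
};

// Hypothetical per-heuristic weights. The weights no longer need to sum to 100,
// because the final score is clamped instead.
type HeuristicWeights = Record<keyof HeuristicMatches, number>;

// Clamp any numeric value into the 0-100 score range.
function clampScore(value: number): number {
  return Math.min(100, Math.max(0, Math.round(value)));
}

// Sum the weights of every heuristic that matched, then clamp.
function combineScore(matches: HeuristicMatches, weights: HeuristicWeights): number {
  let total = 0;
  for (const key of Object.keys(matches) as (keyof HeuristicMatches)[]) {
    if (matches[key]) total += weights[key];
  }
  return clampScore(total);
}
```

Clamping lets each weight be tuned independently: a single strong signal can saturate the score on its own, and several weaker signals can add up without exceeding 100.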
    riskScores: {
      bot: params.riskScores.bot,
      free_trial_abuse: params.riskScores.free_trial_abuse,
    },
riskScores.freeTrialAbuse naming mismatch breaks CEL evaluation
createSignUpRuleContext stores the score under the key free_trial_abuse (snake_case):
    riskScores: {
      bot: params.riskScores.bot,
      free_trial_abuse: params.riskScores.free_trial_abuse,
    },
But the unit tests at lines 218, 239, 263, 285, and 308 all assert the camelCase key freeTrialAbuse. Similarly, the E2E test in risk-scores.test.ts (line 838) uses the CEL condition riskScores.bot >= 80 && riskScores.freeTrialAbuse >= 80. Because the context object only contains free_trial_abuse, the identifier freeTrialAbuse resolves to undefined inside the CEL evaluator, so undefined >= 80 evaluates to false; the free_trial_abuse score is effectively invisible to every CEL rule that references it. The unit tests would also fail outright.
The return value should use camelCase to match the convention used in CEL expressions and tests:
Suggested change:
    -riskScores: {
    -  bot: params.riskScores.bot,
    -  free_trial_abuse: params.riskScores.free_trial_abuse,
    -},
    +riskScores: {
    +  bot: params.riskScores.bot,
    +  freeTrialAbuse: params.riskScores.free_trial_abuse,
    +},
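To make the mismatch concrete, here is a small sketch of the mapping from the stored snake_case scores to the camelCase shape that the rule expressions reference. The type names and the `toRuleContextRiskScores` and `isHighRisk` helpers are hypothetical. `isHighRisk` is a plain TypeScript stand-in for the CEL condition quoted above, not a real CEL evaluation.

```typescript
// Scores as stored internally (snake_case, per the snippet under review).
type StoredRiskScores = { bot: number; free_trial_abuse: number };

// Shape exposed to rule expressions (camelCase, as the tests expect).
type RuleContextRiskScores = { bot: number; freeTrialAbuse: number };

// Hypothetical helper: rename the key at the boundary so rules see freeTrialAbuse.
function toRuleContextRiskScores(stored: StoredRiskScores): RuleContextRiskScores {
  return { bot: stored.bot, freeTrialAbuse: stored.free_trial_abuse };
}

// Stand-in for the rule `riskScores.bot >= 80 && riskScores.freeTrialAbuse >= 80`.
function isHighRisk(riskScores: RuleContextRiskScores): boolean {
  return riskScores.bot >= 80 && riskScores.freeTrialAbuse >= 80;
}
```

Typing the rule context explicitly, as assumed here, would also let the compiler flag a snake_case key in the returned object instead of leaving the mismatch to surface at rule-evaluation time.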
    ? { status: "ok" } as const
    : await checkEmailWithEmailable(context.primaryEmail);

    const disposableEmailMatched = emailableResult.status === "not-deliverable" || emailableResult.status === "error";
Treating API errors as disposable emails causes false positives
When the Emailable API call throws (network timeout, rate-limit, server error, etc.), checkEmailWithEmailable returns { status: "error" } (the default onError is "return-error"). Line 164 then treats this as evidence that the email is disposable:
    const disposableEmailMatched = emailableResult.status === "not-deliverable" || emailableResult.status === "error";
This means any transient Emailable outage will silently assign bot: 100, free_trial_abuse: 100 to every sign-up during the outage window. Legitimate users would be wrongly flagged, potentially blocked by downstream sign-up rules, or permanently carry high-risk scores.
Consider treating API errors as "unknown" rather than "disposable":
Suggested change:
    -const disposableEmailMatched = emailableResult.status === "not-deliverable" || emailableResult.status === "error";
    +const disposableEmailMatched = emailableResult.status === "not-deliverable";
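One way to keep "unknown" distinct from "disposable" is a three-state signal, sketched below. The status values come from the sequence diagram. The `DisposableSignal` type, the `classifyEmailableStatus` and `disposableEmailScore` helpers, and the default weight parameter are hypothetical names for illustration.

```typescript
// Status values as shown in the sequence diagram.
type EmailableStatus = "ok" | "not-deliverable" | "error";

// Hypothetical three-state signal: an API failure is "unknown", not "disposable".
type DisposableSignal = "disposable" | "not-disposable" | "unknown";

function classifyEmailableStatus(status: EmailableStatus): DisposableSignal {
  switch (status) {
    case "not-deliverable":
      return "disposable";
    case "ok":
      return "not-disposable";
    case "error":
      return "unknown";
  }
}

// Only a confirmed "disposable" signal contributes to the score.
function disposableEmailScore(status: EmailableStatus, weight = 100): number {
  return classifyEmailableStatus(status) === "disposable" ? weight : 0;
}
```

With this shape, an outage yields a score of 0 for the disposable-email heuristic, and the "unknown" state remains available for logging or a later re-check if that is wanted.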
    method: "POST",
    accessType: "admin",
    body: {
      email: "user@best-tempmail-service.com",
E2E test relies on live Emailable API to flag best-tempmail-service.com
The test at line 159 sends user@best-tempmail-service.com and expects bot: 100 / free_trial_abuse: 100. However, the no-API-key shortcut in emailable.tsx only returns not-deliverable for the single sentinel domain emailable-not-deliverable.example.com. Any other domain — including best-tempmail-service.com — falls through to { status: "ok" }, producing zero risk scores. Without a real Emailable API key in the test environment, this test will fail.
Use the dedicated test-mode sentinel domain instead:
Suggested change:
    -email: "user@best-tempmail-service.com",
    +email: `user@${EMAILABLE_NOT_DELIVERABLE_TEST_DOMAIN}`,
(Import EMAILABLE_NOT_DELIVERABLE_TEST_DOMAIN from @/lib/emailable, or export it from a shared test-helpers module.)
The same issue exists for every assertion in risk-scores.test.ts that uses user@tempmail.com and expects bot: 100 / free_trial_abuse: 100 (lines 762, 812, 843).
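The no-API-key behaviour described in this comment can be approximated as below, to show why only the sentinel domain produces a non-zero score in a keyless test environment. `checkEmailWithoutApiKey` and `EmailCheckResult` are hypothetical stand-ins. The actual logic lives in emailable.tsx and may differ.

```typescript
// Sentinel domain named in the review comment above.
const EMAILABLE_NOT_DELIVERABLE_TEST_DOMAIN = "emailable-not-deliverable.example.com";

type EmailCheckResult = { status: "ok" } | { status: "not-deliverable" };

// Hypothetical stand-in for the branch taken when no Emailable API key is set:
// only the sentinel domain is reported as not deliverable.
function checkEmailWithoutApiKey(email: string): EmailCheckResult {
  const domain = email.slice(email.lastIndexOf("@") + 1).toLowerCase();
  return domain === EMAILABLE_NOT_DELIVERABLE_TEST_DOMAIN
    ? { status: "not-deliverable" }
    : { status: "ok" };
}
```

Under this assumption, `user@best-tempmail-service.com` and `user@tempmail.com` both come back as "ok", which is why tests expecting 100/100 for those addresses need either the sentinel domain or a real API key.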
Suggested change:
    -const disposableEmailMatched = emailableResult.status === "not-deliverable" || emailableResult.status === "error";
    +const disposableEmailMatched = emailableResult.status === "not-deliverable";
API errors from Emailable service incorrectly penalize users with +100 bot and free_trial_abuse risk scores
    accessType: "client",
    body: {
      email: "test@example.com",
      email: "user@tempmail.com",
Summary
- … risk-scores.tsx that detects disposable/temporary email domains via regex patterns and outputs bot + free-trial-abuse scores
- … test@example.com stub with real pattern-based detection for domains like tempmail, guerrillamail, mailinator, yopmail, etc.

Stacked on #1232
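The regex-based detection described in this summary might look something like the sketch below. The pattern list and the `isDisposableEmailDomain` helper are illustrative assumptions built from the domains named above. They are not the PR's actual patterns, and later commits in this PR appear to move the check to the Emailable API.

```typescript
// Illustrative patterns only, derived from the domain names listed in the summary.
const DISPOSABLE_DOMAIN_PATTERNS: RegExp[] = [
  /(^|\.)tempmail\./i,
  /(^|\.)guerrillamail\./i,
  /(^|\.)mailinator\./i,
  /(^|\.)yopmail\./i,
];

// Returns true when the email's domain (or a parent label) matches a known pattern.
function isDisposableEmailDomain(email: string): boolean {
  const at = email.lastIndexOf("@");
  if (at < 0) return false; // not an email address
  const domain = email.slice(at + 1).trim().toLowerCase();
  return DISPOSABLE_DOMAIN_PATTERNS.some((pattern) => pattern.test(domain));
}
```

Anchoring each pattern at a label boundary, as assumed here, avoids flagging unrelated domains that merely contain one of the names as a substring.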
Test plan
- … risk-scores.tsx via import.meta.vitest for disposable email matching
- … test@example.com

Made with Cursor