
feat(logs): redact PII from workflow logs via configurable rules#5136

Open
TheodoreSpeaks wants to merge 6 commits into staging from feat/redact-pii-workflow-log

Conversation

@TheodoreSpeaks

Collaborator

Summary

  • Adds enterprise PII redaction for workflow execution logs, configured under Data Retention as org-scoped rules — each rule selects which PII entity types to mask and which workspaces it applies to (empty workspaces = all; empty entity types = redact nothing).
  • Reuses the guardrails Microsoft Presidio engine in mask mode, applied at the single log-persist choke point (completeWorkflowExecution), covering both the inline and externalized write paths.
  • Redaction is byte-budgeted/chunked (one batched subprocess per persist, not per block), structure-preserving (masks string leaves, rebuilds via structuredClone), and fail-safe (scrubs rather than leaks on error).
  • Adds a check-digit-validated VIN recognizer so it won't false-match arbitrary 17-char codes.
  • Adds per-workspace data-retention-hours overrides (resolved workspace ?? org ?? plan default) via a new nullable workspace.data_retention_settings column.
  • UI mirrors Access Control: a rules list + ChipModal (grouped checkbox grid for entity types, workspace multiselect), persisting on save with an unsaved-changes guard.
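The structure-preserving, fail-safe behavior described in the summary can be sketched roughly as follows. This is a minimal illustration, not the PR's code: `collectStrings`, `substituteStrings`, `redactSketch`, and the `MaskBatch` type are hypothetical names, the masker is injected instead of calling Presidio, and the sketch rebuilds the payload by plain recursion where the PR reportedly uses structuredClone. Only the `[REDACTION_FAILED]` marker text comes from the discussion.

```typescript
// Hypothetical sketch of a two-pass "collect, mask, substitute" redaction.
type Json = null | boolean | number | string | Json[] | { [key: string]: Json }
type MaskBatch = (texts: string[]) => Promise<string[]>

const REDACTION_FAILED = '[REDACTION_FAILED]'

/** Pass 1: gather every string leaf in a deterministic depth-first order. */
function collectStrings(value: Json, out: string[] = []): string[] {
  if (typeof value === 'string') out.push(value)
  else if (Array.isArray(value)) for (const item of value) collectStrings(item, out)
  else if (value !== null && typeof value === 'object')
    for (const key of Object.keys(value)) collectStrings(value[key], out)
  return out
}

/** Pass 2: rebuild the same shape, pulling replacements in the same order. */
function substituteStrings(value: Json, next: () => string): Json {
  if (typeof value === 'string') return next()
  if (Array.isArray(value)) return value.map((item) => substituteStrings(item, next))
  if (value !== null && typeof value === 'object') {
    const copy: { [key: string]: Json } = {}
    for (const key of Object.keys(value)) copy[key] = substituteStrings(value[key], next)
    return copy
  }
  return value // numbers, booleans, and null pass through untouched
}

async function redactSketch(payload: Json, maskBatch: MaskBatch): Promise<Json> {
  const texts = collectStrings(payload)
  if (texts.length === 0) return payload // nothing to mask
  let masked: string[]
  try {
    masked = await maskBatch(texts)
    if (masked.length !== texts.length) throw new Error('masker returned wrong length')
  } catch {
    // Fail-safe: scrub every string rather than persist unmasked text.
    masked = texts.map(() => REDACTION_FAILED)
  }
  let index = 0
  return substituteStrings(payload, () => masked[index++])
}
```

Because both passes walk the payload in the same order, the masked strings can be substituted back by position alone, and the input object is never mutated.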

Type of Change

  • New feature

Testing

  • Unit tests for the redaction transform (recursion, batched substitution, structure preservation, fail-safe scrub, oversized-string skip) — passing.
  • Verified Presidio masking end-to-end locally (email/phone/credit-card/VIN), incl. VIN check-digit precision (valid VINs masked, random 17-char tokens left alone).
  • bun run lint, bun run check:api-validation:strict, and bun run check:migrations origin/staging all pass. Migration is a single additive nullable column (expand-phase, backward-compatible).
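For readers unfamiliar with the VIN check digit mentioned above, the arithmetic can be sketched as below. The PR implements its recognizer in Python inside validate_pii.py; this TypeScript version is only an illustration of the standard transliteration-and-weights scheme, and `hasValidVinCheckDigit` is a hypothetical name.

```typescript
// Letter values used by the VIN check-digit scheme (I, O, and Q never appear in a VIN).
const VIN_VALUES: Record<string, number> = {
  A: 1, B: 2, C: 3, D: 4, E: 5, F: 6, G: 7, H: 8,
  J: 1, K: 2, L: 3, M: 4, N: 5, P: 7, R: 9,
  S: 2, T: 3, U: 4, V: 5, W: 6, X: 7, Y: 8, Z: 9,
}
// Positional weights; position 9 (index 8) is the check digit itself, so its weight is 0.
const VIN_WEIGHTS = [8, 7, 6, 5, 4, 3, 2, 10, 0, 9, 8, 7, 6, 5, 4, 3, 2]

function hasValidVinCheckDigit(candidate: string): boolean {
  const vin = candidate.toUpperCase()
  // Exactly 17 characters from the allowed alphabet.
  if (!/^[A-HJ-NPR-Z0-9]{17}$/.test(vin)) return false
  let sum = 0
  for (let i = 0; i < 17; i++) {
    const ch = vin[i]
    const value = ch >= '0' && ch <= '9' ? Number(ch) : VIN_VALUES[ch]
    sum += value * VIN_WEIGHTS[i]
  }
  const remainder = sum % 11
  const expected = remainder === 10 ? 'X' : String(remainder)
  return vin[8] === expected
}
```

A random 17-character token passes this check only about one time in eleven, which is why validating the check digit reduces false matches on arbitrary codes.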

Checklist

  • Code follows project style guidelines
  • Self-reviewed my changes
  • Tests added/updated and passing
  • No new warnings introduced
  • I confirm that I have read and agree to the terms outlined in the Contributor License Agreement (CLA)

Enterprise PII redaction for workflow execution logs, configured under
Data Retention as org-scoped rules (each rule picks entity types + which
workspaces it applies to). Reuses the guardrails Presidio engine in mask
mode at the log-persist choke point, with a check-digit-validated VIN
recognizer. Also adds per-workspace data-retention-hours overrides.
@vercel

vercel Bot commented Jun 19, 2026


The latest updates on your projects. Learn more about Vercel for GitHub.

1 Skipped Deployment
Project | Deployment | Actions | Updated (UTC)
docs | Skipped | Skipped | Jun 19, 2026 5:44pm


@cursor

cursor Bot commented Jun 19, 2026


PR Summary

High Risk
Changes the workflow log persist hot path and subprocess-based PII handling—misconfiguration or Presidio outages affect stored log content and could over-scrub or add latency on every completion when rules are enabled.

Overview
Introduces org-level PII redaction rules stored in dataRetentionSettings.piiRedaction, exposed through the data-retention API/contracts and a new PII Redaction section in Data Retention settings (rule list, modal with workspace scope and Presidio entity pickers; rules save immediately, retention hours still use the top Save).

Persistence path: completeWorkflowExecution resolves applicable rules per workspace (resolveEffectivePiiRedaction), then runs structure-preserving masking on trace spans, outputs, inputs, and related execution fields via redactPIIFromExecution before inline/externalized storage. Masking uses batched/chunked Presidio (maskPIIBatch, Python mask_batch); failures or oversized payloads replace text with [REDACTION_FAILED] instead of storing raw PII. Enterprise plan re-check is intentionally skipped on persist to avoid fail-open leaks.

Guardrails gain a shared client-safe PII entity catalog, a VIN recognizer with check-digit validation, and refactored Python subprocess helpers. Schema/types formalize DataRetentionSettings and redaction rules; unit tests cover the redaction transform.

Reviewed by Cursor Bugbot for commit a5ca0f4. Bugbot is set up for automated code reviews on this repo. Configure here.

@TheodoreSpeaks

Collaborator Author

@greptile review

Comment thread apps/sim/lib/logs/execution/logger.ts
Comment thread apps/sim/lib/logs/execution/pii-redaction.ts
…rkflow-log

# Conflicts:
#	packages/db/migrations/meta/0241_snapshot.json
#	packages/db/migrations/meta/_journal.json
#	scripts/check-api-validation-contracts.ts
Comment thread apps/sim/lib/billing/retention.ts Outdated
Comment thread apps/sim/lib/logs/execution/logger.ts Outdated
@greptile-apps

greptile-apps Bot commented Jun 19, 2026

Contributor

Greptile Summary

Adds enterprise PII redaction for workflow execution logs using Microsoft Presidio, configured via org-scoped rules in the Data Retention settings UI. Redaction is applied at the single completeWorkflowExecution choke point using a collect-then-substitute two-pass design over string leaves, with byte-chunked subprocess batching, a check-digit-validated VIN recognizer, and a fail-safe scrub on error.

  • pii-redaction.ts + validate_pii.ts/py: Core redaction pipeline — deterministic two-pass traversal collects eligible strings, sends them in 256 KB chunks to Presidio via subprocess, then substitutes masked results back by position.
  • logger.ts applyPiiRedaction: Resolves org PII rules via a workspace → organization JOIN at log-persist time; prior review threads flagged the missing error boundary and unconditional JOIN cost.
  • data-retention-settings.tsx: UI mirrors the Access Control pattern; the workspace onChange handler correctly coerces empty selection back to appliesToAllWorkspaces: true.
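The byte-chunked batching mentioned in the first bullet might look something like the sketch below. The 256 KB figure comes from the review summary; `chunkByBytes` is a hypothetical helper, and letting an oversized single string occupy its own chunk is an assumption of this sketch, not a confirmed detail of maskPIIBatch.

```typescript
const CHUNK_BYTES = 256 * 1024 // budget quoted in the review summary

/** Group texts, in order, into chunks whose UTF-8 size stays within the budget. */
function chunkByBytes(texts: string[], budget: number = CHUNK_BYTES): string[][] {
  const encoder = new TextEncoder()
  const chunks: string[][] = []
  let current: string[] = []
  let used = 0
  for (const text of texts) {
    const size = encoder.encode(text).length // bytes, not UTF-16 code units
    if (current.length > 0 && used + size > budget) {
      chunks.push(current)
      current = []
      used = 0
    }
    // Assumption: a string larger than the budget still gets a chunk of its own.
    current.push(text)
    used += size
  }
  if (current.length > 0) chunks.push(current)
  return chunks
}
```

Keeping the original order inside and across chunks is what allows the masked results to be concatenated and substituted back by position.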

Confidence Score: 3/5

The log-persist hot path now has an unguarded async call that, on any transient DB error, will throw and drop the entire workflow execution log rather than degrading gracefully.

Several issues raised in prior review threads remain unaddressed: the applyPiiRedaction call in completeWorkflowExecution has no try/catch so a connection hiccup while looking up PII settings causes the whole log-persist to fail silently; oversized strings >128 KB are excluded from the fail-safe scrub path and pass through unmasked even when the payload ceiling is hit; and the unconditional JOIN fires on every workflow completion. Together these make the change risky to merge without addressing at minimum the error-boundary gap.

apps/sim/lib/logs/execution/logger.ts — the applyPiiRedaction call site needs a try/catch so that settings-lookup failures degrade to no redaction rather than lost logs.

Important Files Changed

Filename Overview
apps/sim/lib/logs/execution/logger.ts Adds applyPiiRedaction on the hot completeWorkflowExecution path; the method has no error boundary — a DB failure here will cause the entire log-persist to fail (flagged in prior thread); also performs an unconditional JOIN on every completion.
apps/sim/lib/logs/execution/pii-redaction.ts New redaction engine: collect-then-substitute two-pass design is sound and structurally correct; oversized-string scrub gap and REDACTION_FAILED_MARKER semantics overloading were flagged in prior review threads.
apps/sim/lib/guardrails/validate_pii.ts New maskPIIBatch and runPythonScript helpers are well-structured; sequential chunking reloads the spaCy model per chunk for large payloads (see comment).
apps/sim/lib/guardrails/validate_pii.py Adds VinRecognizer (ISO 3779 check-digit validated), build_analyzer(), and mask_batch; VIN transliteration table and weight array match the NHTSA standard; batch mode correctly detected by presence of texts key.
apps/sim/lib/billing/retention.ts Rule-resolution logic is internally consistent; JSDoc empty=redact-all contradicts the actual behaviour — already flagged in prior thread.
apps/sim/ee/data-retention/components/data-retention-settings.tsx Well-structured UI with ChipModal rule editor; the workspaceIds onChange handler correctly coerces empty selection back to appliesToAllWorkspaces: true, preventing orphaned rules from the UI path.
packages/db/schema.ts Adds PiiRedactionRule and DataRetentionSettings interfaces and updates the organization.dataRetentionSettings column type; JSDoc contradiction flagged in prior thread.
apps/sim/lib/api/contracts/primitives.ts New piiRedactionRuleSchema and piiRedactionSettingsSchema; no cross-field validation enforcing appliesToAllWorkspaces:false requires non-empty workspaceIds.
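The missing cross-field check noted for primitives.ts could be expressed as a small predicate like the one below. This is a hedged sketch in plain TypeScript so it runs without dependencies; `validateRuleScope` and its error message are hypothetical, and in the real contract the same condition would presumably live in a schema-level refinement.

```typescript
interface RuleScopeSketch {
  appliesToAllWorkspaces: boolean
  workspaceIds: string[]
}

/** Returns an error message for an orphaned rule, or null when the scope is valid. */
function validateRuleScope(rule: RuleScopeSketch): string | null {
  // A rule that targets neither all workspaces nor any specific one would never apply.
  if (!rule.appliesToAllWorkspaces && rule.workspaceIds.length === 0) {
    return 'Select at least one workspace or apply the rule to all workspaces'
  }
  return null
}
```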

Sequence Diagram

%%{init: {'theme': 'neutral'}}%%
sequenceDiagram
    participant EL as ExecutionLogger
    participant APR as applyPiiRedaction
    participant DB as Database
    participant RER as redactPIIFromExecution
    participant MPIB as maskPIIBatch
    participant PY as validate_pii.py

    EL->>APR: workspaceId, payload
    APR->>DB: SELECT orgSettings FROM workspace LEFT JOIN organization
    DB-->>APR: row
    APR->>APR: resolveEffectivePiiRedaction
    alt enabled false
        APR-->>EL: payload unchanged
    else enabled true
        APR->>RER: payload, entityTypes
        RER->>RER: collect eligible strings pass 1
        RER->>MPIB: collected strings
        loop per 256 KB chunk
            MPIB->>PY: texts via stdin
            PY-->>MPIB: masked results
        end
        MPIB-->>RER: masked array
        RER->>RER: substitute masked values pass 2
        RER-->>APR: redacted payload
        APR-->>EL: redacted payload
    end
    EL->>DB: persist cleanExecutionData

Reviews (4): Last reviewed commit: "refactor(logs): drop per-workspace reten..." | Re-trigger Greptile

Comment thread apps/sim/lib/billing/retention.ts
Comment thread apps/sim/lib/logs/execution/pii-redaction.ts Outdated
Comment on lines 592 to 626
    }
  }

  /**
   * Mask PII from log content before persistence when the execution's workspace
   * (via workspace override or org default) has enterprise PII redaction enabled.
   * Resolved at persist time so both the inline and externalized write paths are
   * covered. Returns the payload unchanged when disabled or non-enterprise.
   */
  private async applyPiiRedaction(
    workspaceId: string | null,
    payload: RedactablePayload
  ): Promise<RedactablePayload> {
    if (!workspaceId) return payload

    const [row] = await db
      .select({ orgSettings: organization.dataRetentionSettings })
      .from(workspace)
      .leftJoin(organization, eq(organization.id, workspace.organizationId))
      .where(eq(workspace.id, workspaceId))
      .limit(1)
    if (!row) return payload

    const config = resolveEffectivePiiRedaction({ orgSettings: row.orgSettings, workspaceId })
    if (!config.enabled) return payload

    // Settings are only writable by enterprise orgs, but re-verify at read time
    // (e.g. a plan downgrade) before doing the work.
    if (!(await isWorkspaceOnEnterprisePlan(workspaceId))) return payload

    return redactPIIFromExecution(payload, { entityTypes: config.entityTypes })
  }

  async completeWorkflowExecution(params: {
    executionId: string

Contributor


P2 Unconditional DB join on every workflow completion

applyPiiRedaction always fires a workspace → LEFT JOIN organization query, plus a second isWorkspaceOnEnterprisePlan call when redaction is configured, regardless of whether the org has any PII rules or is even on an enterprise plan. For deployments where the majority of executions are personal/non-enterprise workspaces this adds two round-trips to the hot path of every workflow. A lightweight guard (e.g. caching the org's enterprise status or checking a feature-flag ahead of the query) would let the common case skip both calls entirely.

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
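One way to picture the "lightweight guard" suggested in this comment is a small in-process cache with a time-to-live in front of the settings lookup. The sketch below is an assumption-laden illustration: `TtlCacheSketch` is a hypothetical class, the clock is injected for testability, and nothing in the PR confirms that a cache was adopted.

```typescript
// Hypothetical time-to-live cache keyed by workspace or organization ID.
class TtlCacheSketch<V> {
  private entries = new Map<string, { value: V; expiresAt: number }>()
  private ttlMs: number
  private now: () => number

  constructor(ttlMs: number, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs
    this.now = now
  }

  /** Return the cached value while fresh; otherwise load it and remember it. */
  async getOrLoad(key: string, load: () => Promise<V>): Promise<V> {
    const hit = this.entries.get(key)
    if (hit && hit.expiresAt > this.now()) return hit.value
    const value = await load() // a rejected load is not cached
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs })
    return value
  }
}
```

The trade-off is staleness: with a cache like this, a newly saved redaction rule would take up to one TTL to reach the persist path, which matters for a privacy feature.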

@greptile-apps

greptile-apps Bot commented Jun 19, 2026

Contributor

Greptile Summary

This PR adds enterprise-grade PII redaction for workflow execution logs, a new per-workspace data-retention-hours override, and a check-digit-validated VIN recognizer. The core redaction pipeline is well-engineered: a deterministic two-pass collect/substitute approach batches all strings from a single execution into one chunked Presidio call, and a fail-safe ensures PII is never persisted when masking fails.

  • PII redaction: hooked at completeWorkflowExecution after filterForDisplay and redactApiKeys; org-scoped rules are resolved at persist time covering both inline and externalized write paths; oversized payloads (>16 MB) are scrubbed rather than leaked.
  • Workspace retention overrides: new nullable workspace.data_retention_settings column with a clean workspace ?? org ?? plan-default resolution chain; a new /api/workspaces/[id]/data-retention endpoint mirrors the org-level API with proper admin + enterprise gating.
  • Three copies of the comment "Empty = redact all detected PII" on entityTypes (in schema.ts, primitives.ts, and pii-redaction.ts) directly contradict retention.ts, which treats an empty entity-type list as "contribute nothing, rule is inactive." The PR description confirms the correct semantic is "empty = redact nothing", so those three comments need updating to avoid operators mistakenly believing empty rules provide full coverage.

Confidence Score: 3/5

The redaction pipeline and workspace-override logic are structurally sound, but three copies of a wrong comment on a security-sensitive field create a genuine risk of operators deploying empty-entity-type rules under the false impression that they cover all PII.

The core machinery — two-pass collect/substitute, byte-chunked Presidio calls, fail-safe scrubbing, and the workspace ?? org ?? plan-default retention resolution — is implemented correctly. Three conflicting entityTypes comments in schema.ts, primitives.ts, and pii-redaction.ts say the opposite of what retention.ts implements. An enterprise admin who follows those comments and creates a rule with no entity types selected will have no PII masked at all — a silent failure in a privacy-protection feature.

packages/db/schema.ts, apps/sim/lib/api/contracts/primitives.ts, and apps/sim/lib/logs/execution/pii-redaction.ts all carry the wrong entityTypes comment that contradicts retention.ts.

Important Files Changed

Filename Overview
apps/sim/lib/logs/execution/pii-redaction.ts New PII redaction transform: deterministic two-pass collect/substitute approach is sound; fail-safe scrubbing on error is correct; misleading entityTypes comment and reuse of REDACTION_FAILED_MARKER for the byte-ceiling case need fixing.
apps/sim/lib/logs/execution/logger.ts PII redaction correctly hooked at the single persist choke point after filterForDisplay + redactApiKeys; adds 2 extra DB queries per execution on the hot path that could be consolidated into the existing log fetch.
apps/sim/lib/billing/retention.ts New file with clean resolution logic for both retention hours (workspace ?? org ?? fallback) and PII redaction (union of applicable rule entity types). Correctly returns disabled when union is empty, though this contradicts comments elsewhere.
packages/db/schema.ts Adds DataRetentionSettings interface and workspace.dataRetentionSettings column; PiiRedactionRule.entityTypes comment says Empty = redact all detected PII which contradicts the actual retention.ts behavior (empty = redact nothing).
apps/sim/lib/api/contracts/primitives.ts Adds well-validated piiRedactionRuleSchema and piiRedactionSettingsSchema; same misleading entityTypes comment as schema.ts.
apps/sim/lib/guardrails/validate_pii.ts New maskPIIBatch function correctly chunks texts by byte budget (256KB), rejects on subprocess failure so callers can apply fail-safe; existing validatePII single-text path unchanged.
apps/sim/lib/guardrails/validate_pii.py Adds mask_batch function reusing a single AnalyzerEngine + AnonymizerEngine across all texts in a chunk; VinRecognizer with ISO 3779 check-digit validation is well-implemented.
apps/sim/lib/billing/cleanup-dispatcher.ts Adds workspace-level settings to the cleanup scope query; fallback is null for all enterprise jobs so behavior is identical to the previous code for unconfigured orgs.
apps/sim/app/api/workspaces/[id]/data-retention/route.ts New workspace-level data retention GET/PUT endpoints; auth gating (session, admin permission, enterprise plan) is correct.
apps/sim/lib/logs/execution/pii-redaction.test.ts Good coverage: structure preservation, immutability, fail-safe scrub on error, oversized-string skip, and no-op when no strings collected.

Sequence Diagram

%%{init: {'theme': 'neutral'}}%%
sequenceDiagram
    participant E as Executor
    participant L as ExecutionLogger
    participant R as retention.ts
    participant P as pii-redaction.ts
    participant V as validate_pii.ts
    participant Py as validate_pii.py
    participant DB as Postgres

    E->>L: completeWorkflowExecution
    L->>L: filterForDisplay + redactApiKeys
    L->>DB: SELECT org.dataRetentionSettings
    DB-->>L: orgSettings
    L->>R: resolveEffectivePiiRedaction
    R-->>L: enabled + entityTypes
    alt PII enabled
        L->>DB: isWorkspaceOnEnterprisePlan
        DB-->>L: true
        L->>P: redactPIIFromExecution
        P->>P: collect string leaves
        loop Per 256KB chunk
            P->>V: maskPIIBatch
            V->>Py: spawn subprocess
            Py-->>V: masked strings
            V-->>P: masked strings
        end
        P-->>L: redacted payload
    end
    L->>DB: UPDATE workflowExecutionLogs
    L-->>E: WorkflowExecutionLog

Reviews (2): Last reviewed commit: "Merge remote-tracking branch 'origin/sta..." | Re-trigger Greptile

Comment thread apps/sim/lib/logs/execution/pii-redaction.ts
Comment thread apps/sim/lib/logs/execution/pii-redaction.ts
Comment on lines +601 to +623
  private async applyPiiRedaction(
    workspaceId: string | null,
    payload: RedactablePayload
  ): Promise<RedactablePayload> {
    if (!workspaceId) return payload

    const [row] = await db
      .select({ orgSettings: organization.dataRetentionSettings })
      .from(workspace)
      .leftJoin(organization, eq(organization.id, workspace.organizationId))
      .where(eq(workspace.id, workspaceId))
      .limit(1)
    if (!row) return payload

    const config = resolveEffectivePiiRedaction({ orgSettings: row.orgSettings, workspaceId })
    if (!config.enabled) return payload

    // Settings are only writable by enterprise orgs, but re-verify at read time
    // (e.g. a plan downgrade) before doing the work.
    if (!(await isWorkspaceOnEnterprisePlan(workspaceId))) return payload

    return redactPIIFromExecution(payload, { entityTypes: config.entityTypes })
  }

Contributor


P2 Two extra DB round-trips per execution on the hot completion path

applyPiiRedaction always issues a JOIN query for org settings, then — if rules are active — calls isWorkspaceOnEnterprisePlan (another query). For the vast majority of executions (no PII rules configured, or non-enterprise workspace), this is wasted work on the critical log-persist path. The existing existingLog query at the top of completeWorkflowExecution already loads the workspace row; adding organization.dataRetentionSettings to that initial SELECT would remove the need for this second query entirely.

Comment thread packages/db/schema.ts
Comment on lines +92 to +93
/** Presidio entity types to mask. Empty = redact all detected PII. */
entityTypes: z.array(z.string().min(1, 'Entity type cannot be empty')).max(100),

Contributor


P1 Same misleading comment as on PiiRedactionRule in schema.ts. resolveEffectivePiiRedaction treats an empty entityTypes array on a rule as contributing nothing to the union, resulting in enabled: false (no redaction). The comment should reflect that.

Suggested change
- /** Presidio entity types to mask. Empty = redact all detected PII. */
- entityTypes: z.array(z.string().min(1, 'Entity type cannot be empty')).max(100),
+ /** Presidio entity types to mask. Empty array = rule contributes no types (redact nothing). */
+ entityTypes: z.array(z.string().min(1, 'Entity type cannot be empty')).max(100),
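The semantics this comment describes (each applicable rule contributes its entity types to a union, and an empty union means redaction is disabled) can be sketched as follows. The field names entityTypes, appliesToAllWorkspaces, and workspaceIds appear in the review; `PiiRuleSketch` and `resolveSketch` are hypothetical, and treating a rule with appliesToAllWorkspaces: false and no workspace IDs as applying nowhere is an assumption of this sketch.

```typescript
interface PiiRuleSketch {
  entityTypes: string[]
  appliesToAllWorkspaces: boolean
  workspaceIds: string[]
}

interface EffectiveRedactionSketch {
  enabled: boolean
  entityTypes: string[]
}

/** Union the entity types of every rule that applies to the given workspace. */
function resolveSketch(rules: PiiRuleSketch[], workspaceId: string): EffectiveRedactionSketch {
  const union = new Set<string>()
  for (const rule of rules) {
    const applies = rule.appliesToAllWorkspaces || rule.workspaceIds.includes(workspaceId)
    if (!applies) continue
    // A rule with no entity types adds nothing, so it cannot enable redaction by itself.
    for (const entityType of rule.entityTypes) union.add(entityType)
  }
  return { enabled: union.size > 0, entityTypes: Array.from(union) }
}
```

Under this reading, a rule saved with no entity types selected masks nothing, which is exactly why the "Empty = redact all detected PII" comment is misleading.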

@TheodoreSpeaks

Collaborator Author

@greptile review

Comment on lines +601 to +623
  private async applyPiiRedaction(
    workspaceId: string | null,
    payload: RedactablePayload
  ): Promise<RedactablePayload> {
    if (!workspaceId) return payload

    const [row] = await db
      .select({ orgSettings: organization.dataRetentionSettings })
      .from(workspace)
      .leftJoin(organization, eq(organization.id, workspace.organizationId))
      .where(eq(workspace.id, workspaceId))
      .limit(1)
    if (!row) return payload

    const config = resolveEffectivePiiRedaction({ orgSettings: row.orgSettings, workspaceId })
    if (!config.enabled) return payload

    // Settings are only writable by enterprise orgs, but re-verify at read time
    // (e.g. a plan downgrade) before doing the work.
    if (!(await isWorkspaceOnEnterprisePlan(workspaceId))) return payload

    return redactPIIFromExecution(payload, { entityTypes: config.entityTypes })
  }

Contributor


P1 applyPiiRedaction DB failure propagates to log persistence

The db.select() inside applyPiiRedaction has no error handling. A transient DB error (connection timeout, pool exhaustion) while looking up org PII settings will throw from this function, propagate through completeWorkflowExecution — which has no local catch around the applyPiiRedaction call — and cause the entire workflow execution log to fail to persist. The PR's own fail-safe principle ("scrubs rather than leaks on error") in redactPIIFromExecution is not applied here to the settings-lookup path. A try/catch around the entire method body returning payload on any error would prevent PII configuration lookup failures from destroying execution history.
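The error boundary this comment asks for could be sketched along these lines. Everything here is illustrative: `applyRedactionGuarded`, `RedactionDepsSketch`, and the injected `lookupEntityTypes`, `redact`, and `warn` functions are hypothetical stand-ins for the database query, redactPIIFromExecution, and the logger.

```typescript
type PayloadSketch = Record<string, unknown>

interface RedactionDepsSketch {
  lookupEntityTypes: (workspaceId: string) => Promise<string[]> // stands in for the settings query
  redact: (payload: PayloadSketch, entityTypes: string[]) => Promise<PayloadSketch>
  warn: (message: string, error: unknown) => void
}

async function applyRedactionGuarded(
  workspaceId: string | null,
  payload: PayloadSketch,
  deps: RedactionDepsSketch
): Promise<PayloadSketch> {
  if (!workspaceId) return payload
  let entityTypes: string[]
  try {
    entityTypes = await deps.lookupEntityTypes(workspaceId)
  } catch (error) {
    // The reviewer's suggestion: keep the execution log even if settings cannot be read.
    deps.warn('PII settings lookup failed; persisting without redaction', error)
    return payload
  }
  if (entityTypes.length === 0) return payload
  // Masking failures are assumed to be handled inside redact (scrub, never leak).
  return deps.redact(payload, entityTypes)
}
```

Note the tension with the fail-safe principle: returning the payload unchanged preserves execution history but stores unredacted text when the lookup fails, so a stricter variant might scrub the payload in the catch branch instead.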

…t lazy

- Extend PII redaction to span error/errorMessage/toolCalls and top-level
  error/completionFailure/trigger/executionState (Bugbot: PII in execution
  metadata). executionState is safe to redact — resume reads from the separate
  pausedExecutions table, not the log copy.
- Lazy-import validate_pii in pii-redaction so the Python/child_process
  guardrails module stays out of the static middleware/RSC graph.
- Type the org retention mutation to the contract body (optional, non-null).
Comment thread apps/sim/lib/logs/execution/pii-redaction.ts
…stays org-scoped

- Remove the unused per-workspace data-retention-hours override (no UI; superseded
  by workspace-scoped PII rules). Reverts cleanup-dispatcher to org-only retention,
  drops resolveEffectiveRetentionHours, the workspace.dataRetentionSettings column +
  migration, and the workspace data-retention route/contract/hooks. Fixes Bugbot's
  null-as-unset finding by removing the buggy path entirely; org retention behavior
  is unchanged.
- Stop re-checking isWorkspaceOnEnterprisePlan at persist time (it returns false on
  transient errors, which would fail-open and leak PII). Enabled rules already imply
  entitlement; redact whenever rules apply (fail-safe).
@TheodoreSpeaks

Collaborator Author

@greptile review

@cursor cursor Bot left a comment



Cursor Bugbot has reviewed your changes and found 1 potential issue.

There are 2 total unresolved issues (including 1 from previous review).


Reviewed by Cursor Bugbot for commit e5dfe05. Configure here.

Comment thread apps/sim/lib/logs/execution/pii-redaction.ts
- Drop the per-string size cap in PII redaction: oversized strings were left
  unmasked (leak). Nothing is skipped now; large payloads still fail-safe via the
  total-bytes ceiling + per-chunk timeout (scrub, never leak).
- Add executionData.environment (incl. variables) to the redaction set.
@TheodoreSpeaks

Collaborator Author

@greptile review
