Skip to content

feat: add canonical ClawScan verdict fields#2354

Open
Patrick-Erichsen wants to merge 1 commit into
mainfrom
pe/clawscan-canonical-verdict
Open

feat: add canonical ClawScan verdict fields#2354
Patrick-Erichsen wants to merge 1 commit into
mainfrom
pe/clawscan-canonical-verdict

Conversation

@Patrick-Erichsen
Copy link
Copy Markdown
Contributor

@Patrick-Erichsen Patrick-Erichsen commented May 21, 2026

Summary

  • add canonical ClawScan verdict/state fields on skill versions and package/plugin releases
  • update ClawScan prompt parsing, worker writes, moderation, API, UI, Skill Cards, and CLI surfaces around clean / review / warn / malicious
  • add cursor-based backfills and legacy suspicious cleanup helpers while preserving manual moderation and malicious locks
  • keep VirusTotal, static scan, dependency registry, and SkillSpector as scanner evidence/telemetry instead of top-level verdict rollups

Rollout

  1. Merge and deploy this schema-widening PR.
  2. Run cursor backfills:
    bunx convex run maintenance:backfillSkillVersionClawScanFields '{"batchSize":100}' --prod
    bunx convex run maintenance:backfillPackageReleaseClawScanFields '{"batchSize":100}' --prod
  3. Verify canonical fields are populated on relevant skillVersions and packageReleases.
  4. Verify new ClawScan writes set clawScanVerdict / clawScanState and do not recreate isSuspicious, flagged.review, or flagged.suspicious for scanner-managed review/warn outcomes.
  5. Run legacy cleanup actions before the schema/index narrowing PR:
    bunx convex run maintenance:cleanupLegacySuspiciousSkillFields '{"batchSize":100}' --prod
    bunx convex run maintenance:cleanupLegacySuspiciousDigestFields '{"batchSize":100}' --prod
    bunx convex run maintenance:cleanupLegacySuspiciousPackageScanStatuses '{"batchSize":100}' --prod
    bunx convex run maintenance:cleanupLegacySuspiciousPackageReleaseScanStatuses '{"batchSize":100}' --prod
  6. Verify manual hidden/quarantined/revoked/removed rows and legacy malicious rows remain blocked.
  7. Merge/deploy the legacy-removal PR after verification passes.

Rollback:

  • Keep the widened schema and fix forward or revert behavior without deleting canonical fields.
  • If the follow-up cleanup/removal PR needs rollback, revert only that PR and keep canonical verdict data.

Verification

Local:

  • bun run test:pw:local-auth
  • bun run test convex/lib/securityPrompt.test.ts scripts/security/run-codex-scan-worker.test.ts convex/securityScan.test.ts src/components/DetailSecuritySummary.test.tsx src/components/SkillSecurityScanResults.test.tsx convex/httpApiV1.handlers.test.ts convex/lib/moderationEngine.test.ts convex/lib/packageSecurity.test.ts convex/packages.public.test.ts convex/skills.rateLimit.test.ts
  • bun run ci:static
  • bun run ci:types-build
  • bun run ci:unit

Remote:

  • GitHub CI is green for 0c0cffac.

@Patrick-Erichsen Patrick-Erichsen requested review from a team and BunsDev as code owners May 21, 2026 09:32
@vercel
Copy link
Copy Markdown
Contributor

vercel Bot commented May 21, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Actions Updated (UTC)
clawhub Ready Ready Preview, Comment May 26, 2026 2:19am

@clawsweeper
Copy link
Copy Markdown

clawsweeper Bot commented May 21, 2026

Codex review: needs real behavior proof before merge. Reviewed June 2, 2026, 12:59 AM ET / 04:59 UTC.

Summary
The PR adds canonical clawScanVerdict and clawScanState fields across skill/package scan storage and updates scanner parsing, moderation, API, CLI, UI, docs, and migration helpers around clean, review, warn, and malicious outcomes.

Reproducibility: not applicable. This PR implements a schema/API/security-migration feature rather than reporting a current-main bug. Source review establishes the changed behavior and the remaining review blockers.

Review metrics: 3 noteworthy metrics.

  • PR surface: 72 files, +2350/-767. The migration spans schema, Convex functions, packages, CLI, UI, docs, and generated/public API artifacts.
  • Rollout commands: 6 production backfill/cleanup commands. Maintainers need proof that the data migration steps were run safely before the schema-narrowing follow-up lands.
  • Dependent stack: 1 open draft follow-up PR. The related cleanup PR is explicitly stacked on this widening PR and depends on its deployed data state.

Merge readiness
Overall: 🧂 unranked krab
Proof: 🧂 unranked krab
Patch quality: 🦐 gold shrimp
Result: blocked until real behavior proof is added.

Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch.

Rank-up moves:

  • [P1] Add redacted real behavior proof for the migration/backfills, cleanup helpers, and representative API/CLI/UI outputs.
  • [P1] Fix the hash-tab regression and add the missing package OpenAPI fields.
  • Rebase or otherwise resolve the dirty merge state before maintainer security/compatibility approval.

Proof guidance:

  • [P1] Needs real behavior proof before merge: No after-fix real behavior proof is included beyond tests/CI and a preview link; add redacted terminal/log/screenshot proof for backfills, cleanup, API/CLI, and UI paths, then update the PR body or ask for @clawsweeper re-review.

Risk before merge

  • [P1] GitHub context reports the PR as dirty/unmergeable against the current base, so maintainers need a rebase or conflict-resolution review of the actual merge result.
  • [P1] The rollout changes the scanner trust boundary: static scan, VirusTotal, and legacy suspicious signals become telemetry/input while only canonical ClawScan malicious blocks installs or hides rows.
  • [P1] The PR depends on production backfill and legacy cleanup commands; without redacted runtime proof, maintainers cannot verify that canonical fields populate correctly or that manual/malicious locks remain blocked.
  • [P1] The dependent draft PR at Remove legacy suspicious filtering and scanner rollups #2355 makes merge order and deployed verification part of the compatibility risk.

Maintainer options:

  1. Fix and prove the widening rollout (recommended)
    Require a rebase, the two review fixes, and redacted terminal/log or UI proof showing backfills, cleanup helpers, and API/CLI/UI outputs before landing.
  2. Accept the canonical-only blocking boundary
    Maintainers may intentionally accept that static scan and VirusTotal become telemetry only, but that decision should be explicit before merge.
  3. Pause the stack until rollout proof exists
    If production backfill and cleanup proof is not available, keep this PR and the dependent cleanup PR paused rather than landing a partially verified security migration.

Next step before merge

  • [P1] Contributor proof plus maintainer security/compatibility judgment are required before merge; automation cannot supply the missing real-environment migration proof.

Security
Needs attention: The diff intentionally changes the scanner blocking boundary, so it needs maintainer security approval and rollout proof before merge.

Review findings

  • [P2] Restore hash-selected tabs after late tab availability — src/components/SkillDetailPage.tsx:445
  • [P3] Document package ClawScan fields in OpenAPI — convex/httpApiV1/packagesV1.ts:2951-2952
Review details

Best possible solution:

Rebase the branch, fix the hash-tab and OpenAPI gaps, add redacted runtime proof for the migration/backfill/API/CLI/UI paths, and get explicit maintainer approval for the scanner trust-boundary rollout before merge.

Do we have a high-confidence way to reproduce the issue?

Not applicable: this PR implements a schema/API/security-migration feature rather than reporting a current-main bug. Source review establishes the changed behavior and the remaining review blockers.

Is this the best way to solve the issue?

Unclear: a widen/backfill/narrow migration is the right shape for this kind of data change, but this branch still regresses hash-driven tab restoration, leaves the public package OpenAPI schema incomplete, and needs explicit security-boundary approval.

Full review comments:

  • [P2] Restore hash-selected tabs after late tab availability — src/components/SkillDetailPage.tsx:445
    When a URL opens with #skill-card before the Skill Card tab is available, the first hash sync stores that tab but this validation effect now falls back to readme and never re-reads window.location.hash after hasSkillCard turns true. That breaks direct links/bookmarks to late-rendered tabs; keep the current-main behavior of re-applying a valid non-readme hash when validTabIds changes.
    Confidence: 0.91
  • [P3] Document package ClawScan fields in OpenAPI — convex/httpApiV1/packagesV1.ts:2951-2952
    The package version handler now returns version.clawScanVerdict and version.clawScanState, and the package schema package includes them, but public/api/v1/openapi.json still omits both fields from PackageVersionResponse. Public clients generated from the OpenAPI spec will not see the new package API contract until those fields are added there too.
    Confidence: 0.88

Overall correctness: patch is incorrect
Overall confidence: 0.88

AGENTS.md: found and applied where relevant.

Codex review notes: model gpt-5.5, reasoning high; reviewed against dcbc38999f1a.

Label changes

Label justifications:

  • P1: The PR changes scanner verdict storage and moderation/install visibility across skills and packages, which is an urgent security-boundary workflow for maintainers to review carefully.
  • merge-risk: 🚨 compatibility: The diff changes public API fields, deprecated suspicious filters, rollout expectations, and package/skill scan semantics that existing clients and operators may rely on.
  • merge-risk: 🚨 security-boundary: The diff changes which scanner signals can hide or block artifacts, making only canonical ClawScan malicious automatically blocking.
  • rating: 🧂 unranked krab: Overall readiness is 🧂 unranked krab; proof is 🧂 unranked krab and patch quality is 🦐 gold shrimp.
  • status: 📣 needs proof: The PR needs real behavior proof before ClawSweeper can clear the contributor ask. Needs real behavior proof before merge: No after-fix real behavior proof is included beyond tests/CI and a preview link; add redacted terminal/log/screenshot proof for backfills, cleanup, API/CLI, and UI paths, then update the PR body or ask for @clawsweeper re-review.
Evidence reviewed

Security concerns:

  • [medium] Approve the scanner blocking boundary explicitly — convex/skills.ts:413
    The patch makes static scan, VirusTotal, and legacy suspicious signals telemetry/input only while canonical ClawScan malicious becomes the automatic blocking signal; without verified backfill/cleanup proof, a rollout mistake could expose rows that should remain blocked.
    Confidence: 0.82

What I checked:

  • Repository policy read: Read the full 142-line AGENTS.md; its Convex, specs, security-sensitive migration, and PR validation guidance is relevant to this review. (AGENTS.md:1, dcbc38999f1a)
  • Convex guidance read: Read the full Convex AI guidelines because this PR changes Convex schema, functions, queries, and migrations. (convex/_generated/ai/guidelines.md:1, dcbc38999f1a)
  • Current main does not already implement the PR: Current main still has legacy suspicious/search surfaces and no canonical ClawScan verdict fields in the searched implementation paths, so this is not implemented on main. (convex/schema.ts:522, dcbc38999f1a)
  • PR adds package response fields: The PR head returns clawScanVerdict and clawScanState from the package version API response. (convex/httpApiV1/packagesV1.ts:2951, 0c0cffac4157)
  • OpenAPI package schema omission: The PR head public OpenAPI PackageVersionResponse.version schema still goes from sha256hash to scanner objects without documenting the new package clawScanVerdict and clawScanState fields. (public/api/v1/openapi.json:880, 0c0cffac4157)
  • Hash-tab regression evidence: Current main re-reads the URL hash when valid tabs change; the PR head replaces that with only previous-tab validation, so late-available tabs such as Skill Card are no longer restored from the hash. (src/components/SkillDetailPage.tsx:445, 0c0cffac4157)

Likely related people:

  • Patrick-Erichsen: Recent current-main history introduced and maintained the scanner/moderation/API paths that this PR changes, including commits reshaping skill security signals, bulk verdict APIs, and pending VT repair. (role: recent security-scan and API area contributor; confidence: high; commits: 07fed45f425e, 667bc55299c4, 4c965f495711; files: convex/lib/moderationEngine.ts, convex/securityScan.ts, convex/skills.ts)
  • Vyctor H. Brzezowski: Recent current-main commits touched moderated skill access, ownership/restore safety, and adjacent API/security behavior in the same moderation boundary. (role: recent moderation and API adjacent contributor; confidence: medium; commits: 97023d3123f4, 6f893b54f431, 18acbc120915; files: convex/skills.ts, convex/httpApiV1/skillsV1.ts)
  • Jesse Merhi: Recent current-main package and ranking work touches package publishing/API surfaces adjacent to the package release fields changed here. (role: recent package/API adjacent contributor; confidence: low; commits: dcbc38999f1a, 01aa28ccdaf9; files: convex/httpApiV1/packagesV1.ts, packages/schema/src/packages.ts, public/api/v1/openapi.json)
What the crustacean ranks mean
  • 🦀 challenger crab: rare, exceptional readiness with strong proof, clean implementation, and convincing validation.
  • 🦞 diamond lobster: very strong readiness with only minor maintainer review expected.
  • 🐚 platinum hermit: good normal PR, likely mergeable with ordinary maintainer review.
  • 🦐 gold shrimp: useful signal, but proof or patch confidence is still limited.
  • 🦪 silver shellfish: thin signal; proof, validation, or implementation needs work.
  • 🧂 unranked krab: not merge-ready because proof is missing/unusable or there are serious correctness or safety concerns.
  • 🌊 off-meta tidepool: rating does not apply to this item.

Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics.

How this review workflow works
  • ClawSweeper keeps one durable marker-backed review comment per issue or PR.
  • Re-runs edit this comment so the latest verdict, findings, and automation markers stay together instead of adding duplicate bot comments.
  • A fresh review can be triggered by eligible @clawsweeper re-review comments, exact-item GitHub events, scheduled/background review runs, or manual workflow dispatch.
  • PR/issue authors and users with repository write access can comment @clawsweeper re-review or @clawsweeper re-run on an open PR or issue to request a fresh review only.
  • Maintainers can also comment @clawsweeper review to request a fresh review only.
  • Fresh-review commands do not start repair, autofix, rebase, CI repair, or automerge.
  • Maintainer-only repair and merge flows require explicit commands such as @clawsweeper autofix, @clawsweeper automerge, @clawsweeper fix ci, or @clawsweeper address review.
  • Maintainers can comment @clawsweeper explain to ask for more context, or @clawsweeper stop to stop active automation.

@clawsweeper clawsweeper Bot added rating: 🧂 unranked krab Not merge-ready due to missing proof or serious correctness/safety concerns. status: 📣 needs proof The PR needs real behavior proof before ClawSweeper can clear the contributor ask. P2 Normal backlog priority with limited blast radius. merge-risk: 🚨 compatibility 🚨 Merging this PR could break existing users, config, migrations, defaults, or upgrades. merge-risk: 🚨 security-boundary 🚨 Merging this PR could weaken sandboxing, authorization, credentials, or sensitive data. labels May 21, 2026
@clawsweeper
Copy link
Copy Markdown

clawsweeper Bot commented May 21, 2026

ClawSweeper PR egg

🎁 Pass real behavior proof to wake the egg and unlock a hatchable treat.

Where did the egg go?
  • The egg game starts only after the PR passes the real-behavior proof check.
  • Before that, no creature or rarity is rolled. The treat waits for real proof.
  • This is still just collectible flavor: proof affects review readiness, not creature quality.

@github-actions
Copy link
Copy Markdown

github-actions Bot commented Jun 2, 2026

This pull request has been automatically marked as stale due to inactivity.
Please update it or it will be closed.

@github-actions github-actions Bot added the stale label Jun 2, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

merge-risk: 🚨 compatibility 🚨 Merging this PR could break existing users, config, migrations, defaults, or upgrades. merge-risk: 🚨 security-boundary 🚨 Merging this PR could weaken sandboxing, authorization, credentials, or sensitive data. P1 High-priority user-facing bug, regression, or broken workflow. rating: 🧂 unranked krab Not merge-ready due to missing proof or serious correctness/safety concerns. stale status: 📣 needs proof The PR needs real behavior proof before ClawSweeper can clear the contributor ask.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant