Skip to content

fix(project): fix multiple issues identified#324

Merged
krokoko merged 2 commits into
mainfrom
fix/audit-high-severity
Jun 12, 2026
Merged

fix(project): fix multiple issues identified#324
krokoko merged 2 commits into
mainfrom
fix/audit-high-severity

Conversation

@krokoko

@krokoko krokoko commented Jun 12, 2026

Copy link
Copy Markdown
Contributor

Fixes multiple findings

  1. Auto-approve / Mergify governance gap — .github/workflows/auto-approve.yml now triggers on labeled only (was: every push/synchronize), checks the event's label is literally auto-approve, and skips fork-head PRs. .mergify.yml gained
    a dismiss_reviews rule so any new push invalidates stale approvals; re-approval requires removing and re-adding the label. zizmor reports no findings on the updated workflow.

  2. UTC timezone bug in approval-gate lifetime math — agent/src/hooks.py now parses TASK_STARTED_AT with tzinfo=UTC before calling .timestamp(). Added a TestRemainingMaxlifetime suite (4 tests) in agent/tests/test_hooks.py, including
    one that sets TZ=America/New_York to prove the regression would be caught.

  3. PAT no longer persisted in .git/config — agent/src/repo.py sets the origin remote to the plain https URL and configures credential.helper = !gh auth git-credential, which resolves the token at push time from the GH_TOKEN env var
    the pipeline already exports before setup_repo(). The token never touches disk. Also widened agent/src/output_scanner.py: GitHub token regex now matches 36+ chars and the ghr_ prefix, and a new GITHUB_URL_TOKEN pattern catches
    x-access-token:…@ URLs echoed in tool output (2 new tests).

  4. Linear webhook empty-secret forgery — cdk/src/handlers/shared/linear-verify.ts: verifyLinearSignature rejects empty/whitespace secrets (critical for the per-workspace path, whose secrets come from OAuth bundles, not
    getLinearSecret), and getLinearSecret treats whitespace-only SecretString as null — mirroring the hardened GitHub verifier. Created the previously missing cdk/test/handlers/shared/linear-verify.test.ts with 21 tests covering
    signatures, caching, rotation re-fetch, replay window, and the forgery scenario end-to-end.

  5. Screenshot frame-attribution race — cdk/src/handlers/shared/agentcore-browser.ts now records Document response statuses per frameId in the single message listener (the duplicate listener and its double JSON.parse are gone), and
    resolves the main frame's status after load using the frameId from Page.navigate. Events arriving before navigate resolves are no longer lost or misattributed; an unambiguous single status is used when navigate returns no frameId, and
    ambiguous multi-frame cases fall back to optimistic capture. Added 5 race-scenario tests (sub-frame 404 after main 200, the inverse, redirect chains, missing frameId).

  6. CLI --verbose secret leak — new redactSensitive() in cli/src/debug.ts deep-redacts secret, access_token, refresh_token, client_secret, authorization, etc. before request/response bodies are debug-logged in cli/src/api-client.ts.
    New cli/test/debug.test.ts (8 tests) covers the webhook-secret case, nesting, case-insensitivity, and non-mutation.

  7. Stale task_type contract in design docs — docs/design/ORCHESTRATOR.md Tasks table now documents workflow_ref/resolved_workflow (the task_type and never-implemented error_code rows are gone), the mid-table blockquote that broke
    rendering was moved below the table, and the hydration prose is keyed by workflow. docs/design/ARCHITECTURE.md model table and the AGENTS.md types.ts description now use workflow IDs. Starlight mirrors regenerated via
    sync-starlight.mjs.

Verification: agent — 1016 tests pass (mise run quality, 76.3% coverage); cdk — compile clean, 2011 tests pass; cli — full build (compile + 324 tests + lint) passes; zizmor clean.

Area

  • cdk — infrastructure, handlers, constructs
  • agent — Python runtime / Docker image
  • clibgagent client
  • docs — guides or design sources (docs/guides/, docs/design/)
  • tooling — root mise.toml, scripts, CI workflows

Tip: AGENTS.md lists where to edit and which tests to extend.

Related

Changes

Acknowledgment

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of the project license.

@krokoko krokoko changed the title Fix/audit high severity Fix(project): fix multiple issues identified Jun 12, 2026
@krokoko krokoko changed the title Fix(project): fix multiple issues identified fix(project): fix multiple issues identified Jun 12, 2026

@theagenticguy theagenticguy left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approve — 7 fixes, all verified correct, CI green

Reviewed each fix against full source context (not just the diff), with two independent deep-dives on the highest-risk changes. build (agentcore) is green; CodeQL, both CodeQL Analyze jobs, and title-validate all pass.

Per-fix verdict

  1. Auto-approve / Mergify governance — Correct. Trigger narrowed to labeled only; the gate now checks github.event.label.name == 'auto-approve' (the label on this event, not mere presence in the set) and excludes fork-head PRs. Paired with Mergify dismiss_reviews so any push invalidates the bot's prior approval. This closes the path where a synchronize re-approved arbitrary future commits under auto-merge.

  2. UTC timezone bug (hooks.py) — Correct. strptime(...).replace(tzinfo=UTC).timestamp() fixes the naive-datetime skew. The TZ=America/New_York test genuinely catches the regression.

  3. PAT out of .git/config (repo.py) — Verified the runtime concern that the credential-helper approach raises: does GH_TOKEN/GITHUB_TOKEN survive to push time? Yes. The token is set process-global in pipeline.py (and runner.py for the LLM path) before setup_repo(), is never deleted, and run_cmd's _clean_env() strips only OTEL_* — so both the deterministic push (post_hooks.py) and the CLI-subprocess push inherit it. gh is built in the Dockerfile and authed purely via those env vars, same as the pre-existing gh repo clone. No push regression. The output_scanner.py widening (36+ chars, ghr_, x-access-token:…@ URL pattern) is a clean complement.

  4. Linear empty-secret forgery (linear-verify.ts) — Correct and the in-function guard is load-bearing, not redundant: verifyLinearRequestForWorkspace feeds the per-workspace OAuth secret straight to verifyLinearSignature without passing through getLinearSecret's empty-check, and a whitespace-only webhook_signing_secret is truthy past the !stored.webhook_signing_secret check — so the .trim() guard is the only thing stopping HMAC(' ') forgery on that path. The "mirrors verifyGitHubSignature / getGitHubWebhookSecret" claim is accurate byte-for-byte. 21 tests.

  5. Screenshot frame-attribution race (agentcore-browser.ts) — Correct. The per-frame Map keyed by the Page.navigate frameId fixes two real bugs in the old code: the global "latest Document status wins" let a sub-frame 404 overwrite the main 200, and the pre-navigate mainDocumentFrameId === null window misattributed early responses. Removing the duplicate ws.on('message') listener (and its second JSON.parse) is a correctness + efficiency win. The ambiguous-multi-frame → optimistic-capture fallback is a reasonable choice. 5 race-scenario tests.

  6. CLI --verbose secret leak (debug.ts) — Correct. redactSensitive deep-redacts case-insensitively, non-mutating, applied to both request and response bodies. 8 tests, including the non-redaction of next_token.

  7. Stale task_type docs — Accurate; also relocates the mid-table blockquote that was breaking table rendering, and the Starlight mirrors match.

One follow-up (out of scope here, not a blocker)

Fix #4's own rationale — "an empty secret must never be used for HMAC, or HMAC('', body) is forgeable" — applies verbatim to two webhook verifiers this PR doesn't touch:

  • cdk/src/handlers/shared/slack-verify.tsgetSlackSecret checks only !result.SecretString, and verifySlackSignature has no empty/whitespace guard before createHmac.
  • cdk/src/handlers/webhook-create-task.tsgetSecret and verifySignature likewise.

Both reject a truly empty "" secret but let a whitespace-only " " secret through to HMAC. Pre-existing condition in different files; worth a follow-up PR to mirror the same two-line guard across all four verifiers.

Approving. Clean, well-tested, well-commented work.

@krokoko krokoko added this pull request to the merge queue Jun 12, 2026
Merged via the queue into main with commit d6b00df Jun 12, 2026
7 of 16 checks passed
@krokoko krokoko deleted the fix/audit-high-severity branch June 12, 2026 15:39
krokoko added a commit to mayakost/sample-autonomous-cloud-coding-agents that referenced this pull request Jun 13, 2026
…ing alerts (aws-samples#332)

* fix(project): resolve medium/low audit findings across cdk, cli, agent, docs

Follow-up to aws-samples#324 (high-severity batch) closing out the remaining
findings from the 2026-06-11 codebase audit, plus review feedback
from that PR.

Security hardening:
- Mirror the empty/whitespace-secret HMAC guard from the GitHub and
  Linear webhook verifiers into slack-verify.ts and
  webhook-create-task.ts so all four verifiers fail closed on a
  misconfigured empty signing secret (PR aws-samples#324 review follow-up)
- Sanitize GitHub issue/comment content and wrap it in untrusted
  delimiters on the agent's local/dry-run prompt path (context.py)
- Validate repoFullName/sha shape before URL and S3-key interpolation
  in github-deployment-status.ts

Correctness and resilience:
- Post-once marker for Linear final-status comments so partial-batch
  stream retries cannot post duplicates (Linear has no comment edit)
- CLI: memoize in-flight Cognito refresh (concurrent-refresh race),
  guard corrupt credentials.json with a friendly CliError, bound
  waitForTask with transient-error retry and a wall-clock ceiling,
  events --all pagination, validated --limit
- Agent: explicit fail-closed deny for non-dict tool_input, default
  branch detection falls back to main on OSError, push_resolve push
  failures surface as a PR comment instead of silent success
- validateMaxBudgetUsd rejects NaN/Infinity

Observability:
- CloudWatch alarm on the fan-out DLQ depth (silent notification
  outages were invisible) and an async-invoke DLQ for the screenshot
  webhook processor
- bgagent watch renders structured approval-milestone metadata

Contracts and tests:
- constants-parity test pins CLI literals to contracts/constants.json
- New hermetic tests for repo.py/post_hooks.py git/gh argv, wait.ts,
  debug redaction, and the four hardened webhook verifiers
- Drop two unused @aws-sdk agentcore deps from the CLI (yarn.lock
  pruned); add files/engines to cli and docs package manifests

Docs: workflow-model vocabulary in agent/README and AGENTS.md, ADR-014
marked accepted, curly-quote and mise-command fixes in guides,
.DS_Store gitignore case fix; Starlight mirrors regenerated.

* fix(project): address code-review findings and open code-scanning alerts

Follow-up to the medium/low audit batch (ea4c94a), resolving the
confirmed findings from the high-effort plugin review of that commit
plus the three open CodeQL clear-text-logging alerts.

Review findings:
- Restore the original transient-backoff curve (2**attempt): the
  extraction to cli/src/retry.ts had silently halved every retry
  delay, doubling pressure on a degraded backend; a new test pins the
  exact per-attempt jitter window so the curve can't drift again
- events --all --limit N now caps TOTAL events client-side instead of
  forwarding limit as the server page size (which returned the whole
  stream in N-event pages)
- Only Cognito auth-rejection errors map to "Session expired"; a
  transient network blip during the (now shared) refresh tells the
  user to retry instead of re-login
- waitForTask timeout/transient-exhaustion exits with code 2 (CliError
  now carries exitCode) so scripts can tell "CLI gave up waiting"
  from a genuinely FAILED task (exit 1); ceiling check moved to loop
  top to cover the transient branch
- Gate verbose-log redaction behind isVerbose() — no more deep-copy
  of every request/response body on the non-verbose hot path
- One isUsableHmacSecret() chokepoint (shared/hmac-secret.ts) replaces
  the eight hand-copied empty-secret guards across the four webhook
  verifiers; one saveDispatchMarker() helper owns the never-throw
  post-once marker semantics in the fan-out plane
- Agent: GitHub issue content is sanitized at fetch_github_issue (the
  source) so the GitHubIssue model never carries raw untrusted
  strings; shared FakeRunCmd/make_task_config test helpers moved to
  conftest.py; vestigial watch.ts re-export removed

Code-scanning alerts (py/clear-text-logging, js/clear-text-logging):
- shell.log() and server._warn_cw() emit redacted lines via a shared
  os.write sink (same pattern as _debug_cw); warn messages were
  previously printed unredacted — tests switched capsys → capfd
- bgagent admin invite-user writes the credential share-block to a
  0600 file under ~/.bgagent/invites/ instead of printing the
  password to stdout (scrollback/CI capture outlive "share once")

* fix(fanout): retry transient Linear post failures; add construct tests; enforce model-level sanitization

Addresses the three PR aws-samples#332 review findings:

- Linear final-status comments are no longer dropped on transient
  failures: postIssueComment/addIssueReaction now return a classified
  LinearPostResult ({ ok } | { ok: false, retryable }) — network
  errors, timeouts, 5xx and 429 are retryable; auth, GraphQL errors
  and token-resolution failures stay terminal. dispatchToLinear throws
  on retryable failures so routeEvent records an infra rejection and
  the record lands in batchItemFailures for a Lambda retry. Safe by
  construction: the post-once marker is only persisted after a
  successful post, so the retry posts the missing comment or
  short-circuits on the marker.

- Construct-test gaps closed: fanout-consumer.test.ts pins the
  FanOutDlqDepthAlarm (metric binding, threshold 1, notBreaching) so a
  refactor can't silently drop the only persistent signal of a fan-out
  outage; new github-screenshot-integration.test.ts asserts the
  WebhookProcessorDlq exists (14-day retention, enforceSSL) and is
  wired as the processor Lambda's async-invoke DeadLetterConfig.

- GitHubIssue/IssueComment sanitization is now structural: field
  validators run sanitize_external_content at construction, so every
  construction path (fetch_github_issue, model_validate from cache,
  tests, future fetchers) yields a sanitized instance — the docstring
  promise "consumers must not re-sanitize" is enforced by the type
  rather than one caller's discipline. fetch_github_issue drops its
  now-redundant call sites; idempotency pinned by tests.

* chore(project): address findings

* chore(cli): hoist 0o600 into SECRET_FILE_MODE constant

The chmod calls added for the durable-permissions fix tripped
@typescript-eslint/no-magic-numbers (the rule exempts object-literal
properties like { mode: 0o600 } but not bare call arguments). One named
constant in config.ts now owns the secret-file mode for credentials and
invite files.

---------

Co-authored-by: bgagent <bgagent@noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants