Commit 8204b65

Author: bgagent
fix(2.0b-O2): review-2 batch — error specificity, half-creates, runbook
Addresses four PR review items focused on operator UX when things go sideways:

- isWebhookSecretConfigured (review high). The bare `catch { return false }` swallowed AccessDeniedException and DecryptionFailureException, making setup re-prompt for a webhook secret when the real problem was IAM. Now: only ResourceNotFoundException returns false; everything else throws a CliError pointing the operator at the IAM gap. Test updated to assert both paths.

- admin invite-user half-create (review medium). If AdminCreateUser succeeds but AdminSetUserPassword fails (stricter password policy than generator, partial IAM grant on the Set verb), the user was left in FORCE_CHANGE_PASSWORD with no diagnostic. Wrap the second call in try/catch and throw a CliError that names the user, explains the broken state, and gives both a delete-user CLI and a manual-fix path.

- PAK migration runbook (review non-blocking #1). Expanded the "Migration from 2.0a (PAK) to 2.0b (OAuth)" section in LINEAR_SETUP_GUIDE.md with: a pre-deploy checklist, what survives the migration vs what doesn't, an explicit rollback note (fix forward; the original PAK secret is gone with the CFN resource), and the per-step difference between 2.0a-with-Identity (skipped) vs 2.0a-with-PAK (migrate) deploys.

- Vestigial AgentCore Identity dep (review non-blocking #2). bedrock-agentcore==1.9.1 is kept in agent/pyproject.toml because the workload-token bridge in server.py still calls it (now wrapped in try/except per review batch 1). Add an inline comment explaining why it's pinned even though Phase 2.0b-O2 reads Secrets Manager directly: it's the seam for resuming the AgentCore Identity path in 2.0c.

CLI tests: 13/13 pass.
1 parent c419c75 commit 8204b65

6 files changed

Lines changed: 144 additions & 26 deletions

agent/pyproject.toml

Lines changed: 10 additions & 0 deletions
@@ -5,6 +5,16 @@ description = "Background coding agent — runs tasks in isolated cloud environm
 requires-python = ">=3.13"
 dependencies = [
     "boto3==1.43.9", #https://pypi.org/project/boto3/
+    # Vestigial from the parked AgentCore Identity flow (Phase 2.0a).
+    # Phase 2.0b reads per-workspace Linear OAuth tokens directly from
+    # Secrets Manager because AgentCore Identity's USER_FEDERATION
+    # flow has an open service-side bug (see memory/project_oauth_2_0b.md).
+    # Kept here so the workload-token bridge in `server.py` still
+    # imports cleanly when Phase 2.0c eventually resumes the
+    # AgentCore Identity path. The bridge is now wrapped in
+    # try/except (ImportError, AttributeError), so removing this dep
+    # would degrade gracefully — but for now we keep the dep to
+    # preserve the clean code path.
     "bedrock-agentcore==1.9.1", #https://pypi.org/project/bedrock-agentcore/
     "claude-agent-sdk==0.2.82", #https://github.com/anthropics/claude-agent-sdk-python/releases/tag/v0.2.82
     "requests==2.34.2", #https://pypi.org/project/requests/

cli/src/commands/admin.ts

Lines changed: 25 additions & 6 deletions
@@ -169,12 +169,31 @@ export function makeAdminCommand(): Command {
         throw err;
       }
 
-      await cognito.send(new AdminSetUserPasswordCommand({
-        UserPoolId: config.user_pool_id,
-        Username: email,
-        Password: tempPassword,
-        Permanent: true,
-      }));
+      // The user has been created at this point. If `AdminSetUserPassword`
+      // fails (stricter password policy than the generator, partial IAM
+      // grant on the Set verb, throttling, etc.) the user is left in
+      // `FORCE_CHANGE_PASSWORD` state — they exist in the pool but
+      // can't actually log in. Surface a clear diagnostic so the
+      // admin knows to either retry the password set manually or
+      // delete the half-created user before re-running.
+      try {
+        await cognito.send(new AdminSetUserPasswordCommand({
+          UserPoolId: config.user_pool_id,
+          Username: email,
+          Password: tempPassword,
+          Permanent: true,
+        }));
+      } catch (err) {
+        const message = err instanceof Error ? err.message : String(err);
+        const errorName = err instanceof Error ? err.name : 'Error';
+        throw new CliError(
+          `User ${email} was created but the password could not be set `
+          + `(${errorName}: ${message}). The user is now stuck in FORCE_CHANGE_PASSWORD `
+          + 'state and cannot log in. Either:\n'
+          + ` 1. Delete the user and re-run: aws cognito-idp admin-delete-user --user-pool-id ${config.user_pool_id} --username ${email}\n`
+          + ' 2. Or set the password manually via the AWS console once the underlying issue is fixed.',
+        );
+      }
 
       const bundle = encodeBundle(config);
       printInviteSummary(email, tempPassword, bundle);
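The half-create handling in this hunk follows a general pattern: let the first step fail loudly on its own, and wrap only the second step so its failure is reported together with the state the first step left behind. A minimal sketch of that pattern, independent of the AWS SDK, might look like the following. The names `createThenSetPassword`, `Step`, and the local `CliError` class are hypothetical stand-ins for illustration and are not taken from the repository.

```typescript
// Hypothetical stand-in for the CLI's CliError; the real class lives elsewhere in the repo.
class CliError extends Error {
  constructor(message: string) {
    super(message);
    this.name = 'CliError';
  }
}

// A step is any async side effect, e.g. a call that creates a user or sets a password.
type Step = () => Promise<void>;

async function createThenSetPassword(
  username: string,
  createUser: Step,
  setPassword: Step,
): Promise<void> {
  // If creation fails, nothing was left behind, so the error propagates unchanged.
  await createUser();

  try {
    await setPassword();
  } catch (err) {
    // The user now exists without a usable password: say so explicitly.
    const errorName = err instanceof Error ? err.name : 'Error';
    const message = err instanceof Error ? err.message : String(err);
    throw new CliError(
      `User ${username} was created but the password could not be set `
      + `(${errorName}: ${message}). Delete the user or set the password manually before retrying.`,
    );
  }
}
```

Passing the two steps in as functions keeps the sketch testable without a Cognito client; in the actual command the steps are the `AdminCreateUser` and `AdminSetUserPassword` calls shown in the diff.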

cli/src/commands/linear.ts

Lines changed: 16 additions & 2 deletions
@@ -164,8 +164,22 @@ export async function isWebhookSecretConfigured(
     const result = await client.send(new GetSecretValueCommand({ SecretId: secretArn }));
     const value = result.SecretString;
     return typeof value === 'string' && value.startsWith('lin_wh_');
-  } catch {
-    return false;
+  } catch (err) {
+    // Only treat "secret doesn't exist yet" as a clean false — any
+    // other error (AccessDenied, KMS decrypt failure, throttling) is
+    // actionable and we should surface it. A bare `catch { return
+    // false }` here makes setup re-prompt for a webhook secret when
+    // the real problem is IAM, which is a confusing UX for operators.
+    const errorName = (err as { name?: string }).name;
+    if (errorName === 'ResourceNotFoundException') {
+      return false;
+    }
+    const message = err instanceof Error ? err.message : String(err);
+    throw new CliError(
+      `Failed to read Linear webhook secret '${secretArn}': ${errorName ?? 'Error'}: ${message}. `
+      + 'Likely IAM permission gap — confirm your CLI principal has '
+      + '`secretsmanager:GetSecretValue` on this ARN.',
+    );
   }
 }
 
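The change above narrows a catch-all into an allowlist: exactly one error name maps to "not configured yet", and everything else is surfaced. As a rough sketch of that decision in isolation, assuming errors carry a `name` the way AWS SDK v3 service exceptions do, the logic could be factored as below. `handleSecretReadError` and the local `CliError` class are hypothetical names used only for this illustration.

```typescript
// Hypothetical stand-in for the CLI's CliError; the real class lives elsewhere in the repo.
class CliError extends Error {
  constructor(message: string) {
    super(message);
    this.name = 'CliError';
  }
}

// The only error name treated as "the secret has not been created yet".
const NOT_FOUND = 'ResourceNotFoundException';

/**
 * Returns false when the error means the secret does not exist yet.
 * Throws a CliError for every other error so the operator sees the real cause.
 */
function handleSecretReadError(err: unknown, secretArn: string): boolean {
  const errorName = err instanceof Error ? err.name : undefined;
  if (errorName === NOT_FOUND) {
    return false;
  }
  const message = err instanceof Error ? err.message : String(err);
  throw new CliError(
    `Failed to read secret '${secretArn}': ${errorName ?? 'Error'}: ${message}`,
  );
}
```

An allowlist is the safer default here: an error type nobody anticipated ends up in the thrown branch, where it is visible, instead of being silently read as "not configured".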

cli/test/commands/linear.test.ts

Lines changed: 11 additions & 2 deletions
@@ -215,11 +215,20 @@ describe('isWebhookSecretConfigured', () => {
     expect(await isWebhookSecretConfigured(mockClient, 'arn:secret')).toBe(false);
   });
 
-  test('returns false on Secrets Manager error (best-effort: re-prompt is harmless)', async () => {
-    mockSend.mockRejectedValueOnce(new Error('AccessDenied'));
+  test('returns false on ResourceNotFoundException (secret has not been created yet)', async () => {
+    const err = new Error('Secrets Manager cannot find the specified secret.');
+    err.name = 'ResourceNotFoundException';
+    mockSend.mockRejectedValueOnce(err);
     expect(await isWebhookSecretConfigured(mockClient, 'arn:secret')).toBe(false);
   });
 
+  test('throws on AccessDenied so operators see the IAM gap instead of a confusing re-prompt', async () => {
+    const err = new Error('User is not authorized to perform: secretsmanager:GetSecretValue');
+    err.name = 'AccessDeniedException';
+    mockSend.mockRejectedValueOnce(err);
+    await expect(isWebhookSecretConfigured(mockClient, 'arn:secret')).rejects.toThrow(/IAM permission gap/);
+  });
+
   test('returns false when SecretString is missing', async () => {
     mockSend.mockResolvedValueOnce({});
     expect(await isWebhookSecretConfigured(mockClient, 'arn:secret')).toBe(false);

docs/guides/LINEAR_SETUP_GUIDE.md

Lines changed: 41 additions & 8 deletions
@@ -226,16 +226,49 @@ The signing secret in Secrets Manager doesn't match the webhook. Re-run `bgagent
 
 ## Migration from 2.0a (PAK) to 2.0b (OAuth)
 
-If your deployment is on Phase 2.0a (personal API key), 2.0b is a **hard cutover** — there is no `--use-pak` fallback flag. Plan for a maintenance window:
+If your deployment is on Phase 2.0a (personal API key), 2.0b is a **hard cutover** — there is no `--use-pak` fallback flag. Plan for a short maintenance window (typically <30 min for a single workspace).
 
-1. **Drain the queue.** Wait for in-flight tasks to finish. In-flight tasks at upgrade time will fail their final Linear comment because the OAuth token isn't yet authorized when the agent looks for it.
-2. **Deploy 2.0b.** `mise //cdk:deploy`. This adds `LinearWorkspaceRegistryTable`, removes `LinearApiTokenSecret` IAM grants from the agent runtime + Lambdas, and removes the `linear-api-key` AgentCore credential provider's role in the runtime.
-3. **For each Linear workspace, run Steps 1–4 above.** Each workspace needs a new Linear OAuth app, a new AgentCore credential provider (`linear-oauth-<slug>`), and a fresh OAuth authorize via `bgagent linear setup`.
-4. **Verify with a test issue.** Apply the `bgagent` label in each onboarded workspace and confirm the agent posts as `bgagent[bot]` (not as the previous PAK owner's Linear identity).
-5. **Decommission the PAK.** Once 2.0b is verified working, revoke the personal API key in Linear settings ([Linear Settings → Security](https://linear.app/settings/account/security) → Personal API keys → revoke). The PAK is no longer used by any code path; revoking it is a clean break.
-6. **Clean up the old api-key credential provider:** `aws bedrock-agentcore-control delete-api-key-credential-provider --name linear-api-key`.
+> **What changes under the hood.** 2.0a stored a single `LinearApiTokenSecret` (one PAK shared by all teammates) and granted the agent runtime `secretsmanager:GetSecretValue` on that one ARN. 2.0b stores a per-workspace `bgagent-linear-oauth-<slug>` secret containing `{access_token, refresh_token, expires_at, client_id, client_secret, …}`, and replaces the single-ARN grant with a `bgagent-linear-oauth-*` prefix grant. The CDK stack drops the `LinearApiTokenSecret` resource entirely, so there's no automated rollback once 2.0b is deployed.
 
-User mappings in `LinearUserMappingTable` survive the migration — they're keyed on Linear identity, which is unchanged. Project mappings in `LinearProjectMappingTable` likewise survive.
+### Pre-deploy checklist
+
+Run these BEFORE deploying 2.0b so you have everything ready when the maintenance window starts:
+
+1. **List your in-flight tasks.** `bgagent list --status RUNNING --status PENDING` — the migration will not corrupt these, but their final Linear comment may fail because the OAuth token isn't yet authorized when the agent runs.
+2. **Pick one Linear workspace to migrate first.** Multi-workspace orgs should rehearse on the lowest-traffic workspace before doing the rest.
+3. **Note the workspace's `urlKey`** (the `<slug>` in `linear.app/<slug>/...`). You'll need it for `bgagent linear setup <slug>`.
+4. **Confirm CLI admin access.** You need an AWS principal with `secretsmanager:CreateSecret` on `bgagent-linear-oauth-*` AND `dynamodb:PutItem` on `LinearWorkspaceRegistryTable`. Without these, `bgagent linear setup` aborts mid-way (the OAuth dance succeeds, the secret write fails — your Linear OAuth app gets stuck with no usable token).
+
+### Migration steps
+
+1. **Drain the queue.** Wait for in-flight tasks to finish. In-flight tasks at deploy time will fail their final Linear comment because their token resolver short-circuits when neither `LinearApiTokenSecret` (gone) nor `bgagent-linear-oauth-<slug>` (not yet created) is present.
+2. **Deploy 2.0b.** `mise //cdk:deploy`. This adds `LinearWorkspaceRegistryTable`, removes the `LinearApiTokenSecret` resource and IAM grants, and adds the `bgagent-linear-oauth-*` prefix grant on the agent runtime + webhook processor + orchestrator.
+3. **For each Linear workspace, run [Steps 1–4 above](#step-by-step-setup).** Each workspace needs:
+   - A new Linear OAuth app (Settings → API → Applications → Create new app, scopes `read,write,app:assignable,app:mentionable`)
+   - `bgagent linear setup <slug>` to run the OAuth dance and write the per-workspace secret
+   - The webhook signing secret pasted into the Secrets Manager `LinearWebhookSecret` resource
+4. **Re-onboard projects.** If 2.0a had `LinearProjectMappingTable` rows, they survive — but verify with `bgagent linear list-projects` that the listed projects still match what's mapped. The mapping rows are keyed on `linear_project_id` UUID which is stable across the migration.
+5. **Verify with a test issue.** Apply the trigger label in each onboarded workspace and confirm the agent posts as `bgagent[bot]` (not as the previous PAK owner's Linear identity). The author byline change is the cleanest signal that OAuth — not the PAK — is on the wire.
+6. **Decommission the PAK.** Once 2.0b is verified working, revoke the personal API key in Linear settings ([Linear Settings → Security](https://linear.app/settings/account/security) → Personal API keys → revoke). The PAK is no longer used by any code path; revoking is a clean break with no rollback.
+
+### Rollback
+
+If 2.0b fails verification and you need to revert before doing the OAuth setup:
+
+- The `LinearApiTokenSecret` CFN resource has been deleted, so a `cdk deploy` of the previous commit will recreate it but **the secret value will be empty**. You'd need to re-paste the PAK value manually.
+- Recommend instead: **fix-forward**. The 2.0b OAuth dance is a 5-minute step per workspace; rolling back is rarely worth the time.
+
+### What survives the migration
+
+- **`LinearUserMappingTable`** — keyed on Linear identity (organization + user UUID), which is unchanged across PAK→OAuth.
+- **`LinearProjectMappingTable`** — keyed on `linear_project_id` UUID, also stable.
+- **`LinearWebhookDedupTable`** — TTL-bounded; rows from the maintenance window will TTL out within 8h.
+- **GitHub PR comments and Linear-issue mappings** in any in-flight task records.
+
+### What does NOT survive
+
+- `LinearApiTokenSecret` Secrets Manager value — gone with the CDK resource.
+- The 2.0a `linear-api-key` AgentCore credential provider (if 2.0a-with-Identity was deployed mid-Phase) — clean it up after with: `aws bedrock-agentcore-control delete-api-key-credential-provider --name linear-api-key`. Phase 2.0b-O2 does not use AgentCore Identity at all, so there's nothing to clean up if you skipped the parked 2.0a-Identity branch.
 
 ## Limits and budgets
 
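The runbook's naming scheme (`bgagent-linear-oauth-<slug>`, covered by a `bgagent-linear-oauth-*` prefix grant) and the window in which in-flight tasks find no token can be illustrated with a small sketch. This is not the agent's actual resolver: `oauthSecretName`, `isCoveredByPrefixGrant`, and `resolveTokenSecret` are hypothetical helpers, and the set of existing secret names stands in for a real Secrets Manager lookup.

```typescript
// Prefix taken from the runbook; all helpers below are illustrative assumptions.
const OAUTH_SECRET_PREFIX = 'bgagent-linear-oauth-';

// Builds the per-workspace secret name from the workspace slug (Linear's urlKey).
function oauthSecretName(slug: string): string {
  return `${OAUTH_SECRET_PREFIX}${slug}`;
}

// Mimics what a `bgagent-linear-oauth-*` prefix grant would match by name.
function isCoveredByPrefixGrant(secretName: string): boolean {
  return secretName.startsWith(OAUTH_SECRET_PREFIX);
}

// Returns the secret name to read, or undefined when the workspace has not
// been onboarded yet (the gap between deploying 2.0b and running setup).
function resolveTokenSecret(
  slug: string,
  existingSecrets: ReadonlySet<string>,
): string | undefined {
  const name = oauthSecretName(slug);
  return existingSecrets.has(name) ? name : undefined;
}
```

Under these assumptions, a task for workspace `acme` that runs after the deploy but before `bgagent linear setup acme` resolves to `undefined`, which corresponds to the failed final Linear comment the drain step is meant to avoid.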

docs/src/content/docs/using/Linear-setup-guide.md

Lines changed: 41 additions & 8 deletions
@@ -230,16 +230,49 @@ The signing secret in Secrets Manager doesn't match the webhook. Re-run `bgagent
 
 ## Migration from 2.0a (PAK) to 2.0b (OAuth)
 
-If your deployment is on Phase 2.0a (personal API key), 2.0b is a **hard cutover** — there is no `--use-pak` fallback flag. Plan for a maintenance window:
+If your deployment is on Phase 2.0a (personal API key), 2.0b is a **hard cutover** — there is no `--use-pak` fallback flag. Plan for a short maintenance window (typically <30 min for a single workspace).
 
-1. **Drain the queue.** Wait for in-flight tasks to finish. In-flight tasks at upgrade time will fail their final Linear comment because the OAuth token isn't yet authorized when the agent looks for it.
-2. **Deploy 2.0b.** `mise //cdk:deploy`. This adds `LinearWorkspaceRegistryTable`, removes `LinearApiTokenSecret` IAM grants from the agent runtime + Lambdas, and removes the `linear-api-key` AgentCore credential provider's role in the runtime.
-3. **For each Linear workspace, run Steps 1–4 above.** Each workspace needs a new Linear OAuth app, a new AgentCore credential provider (`linear-oauth-<slug>`), and a fresh OAuth authorize via `bgagent linear setup`.
-4. **Verify with a test issue.** Apply the `bgagent` label in each onboarded workspace and confirm the agent posts as `bgagent[bot]` (not as the previous PAK owner's Linear identity).
-5. **Decommission the PAK.** Once 2.0b is verified working, revoke the personal API key in Linear settings ([Linear Settings → Security](https://linear.app/settings/account/security) → Personal API keys → revoke). The PAK is no longer used by any code path; revoking it is a clean break.
-6. **Clean up the old api-key credential provider:** `aws bedrock-agentcore-control delete-api-key-credential-provider --name linear-api-key`.
+> **What changes under the hood.** 2.0a stored a single `LinearApiTokenSecret` (one PAK shared by all teammates) and granted the agent runtime `secretsmanager:GetSecretValue` on that one ARN. 2.0b stores a per-workspace `bgagent-linear-oauth-<slug>` secret containing `{access_token, refresh_token, expires_at, client_id, client_secret, …}`, and replaces the single-ARN grant with a `bgagent-linear-oauth-*` prefix grant. The CDK stack drops the `LinearApiTokenSecret` resource entirely, so there's no automated rollback once 2.0b is deployed.
 
-User mappings in `LinearUserMappingTable` survive the migration — they're keyed on Linear identity, which is unchanged. Project mappings in `LinearProjectMappingTable` likewise survive.
+### Pre-deploy checklist
+
+Run these BEFORE deploying 2.0b so you have everything ready when the maintenance window starts:
+
+1. **List your in-flight tasks.** `bgagent list --status RUNNING --status PENDING` — the migration will not corrupt these, but their final Linear comment may fail because the OAuth token isn't yet authorized when the agent runs.
+2. **Pick one Linear workspace to migrate first.** Multi-workspace orgs should rehearse on the lowest-traffic workspace before doing the rest.
+3. **Note the workspace's `urlKey`** (the `<slug>` in `linear.app/<slug>/...`). You'll need it for `bgagent linear setup <slug>`.
+4. **Confirm CLI admin access.** You need an AWS principal with `secretsmanager:CreateSecret` on `bgagent-linear-oauth-*` AND `dynamodb:PutItem` on `LinearWorkspaceRegistryTable`. Without these, `bgagent linear setup` aborts mid-way (the OAuth dance succeeds, the secret write fails — your Linear OAuth app gets stuck with no usable token).
+
+### Migration steps
+
+1. **Drain the queue.** Wait for in-flight tasks to finish. In-flight tasks at deploy time will fail their final Linear comment because their token resolver short-circuits when neither `LinearApiTokenSecret` (gone) nor `bgagent-linear-oauth-<slug>` (not yet created) is present.
+2. **Deploy 2.0b.** `mise //cdk:deploy`. This adds `LinearWorkspaceRegistryTable`, removes the `LinearApiTokenSecret` resource and IAM grants, and adds the `bgagent-linear-oauth-*` prefix grant on the agent runtime + webhook processor + orchestrator.
+3. **For each Linear workspace, run [Steps 1–4 above](#step-by-step-setup).** Each workspace needs:
+   - A new Linear OAuth app (Settings → API → Applications → Create new app, scopes `read,write,app:assignable,app:mentionable`)
+   - `bgagent linear setup <slug>` to run the OAuth dance and write the per-workspace secret
+   - The webhook signing secret pasted into the Secrets Manager `LinearWebhookSecret` resource
+4. **Re-onboard projects.** If 2.0a had `LinearProjectMappingTable` rows, they survive — but verify with `bgagent linear list-projects` that the listed projects still match what's mapped. The mapping rows are keyed on `linear_project_id` UUID which is stable across the migration.
+5. **Verify with a test issue.** Apply the trigger label in each onboarded workspace and confirm the agent posts as `bgagent[bot]` (not as the previous PAK owner's Linear identity). The author byline change is the cleanest signal that OAuth — not the PAK — is on the wire.
+6. **Decommission the PAK.** Once 2.0b is verified working, revoke the personal API key in Linear settings ([Linear Settings → Security](https://linear.app/settings/account/security) → Personal API keys → revoke). The PAK is no longer used by any code path; revoking is a clean break with no rollback.
+
+### Rollback
+
+If 2.0b fails verification and you need to revert before doing the OAuth setup:
+
+- The `LinearApiTokenSecret` CFN resource has been deleted, so a `cdk deploy` of the previous commit will recreate it but **the secret value will be empty**. You'd need to re-paste the PAK value manually.
+- Recommend instead: **fix-forward**. The 2.0b OAuth dance is a 5-minute step per workspace; rolling back is rarely worth the time.
+
+### What survives the migration
+
+- **`LinearUserMappingTable`** — keyed on Linear identity (organization + user UUID), which is unchanged across PAK→OAuth.
+- **`LinearProjectMappingTable`** — keyed on `linear_project_id` UUID, also stable.
+- **`LinearWebhookDedupTable`** — TTL-bounded; rows from the maintenance window will TTL out within 8h.
+- **GitHub PR comments and Linear-issue mappings** in any in-flight task records.
+
+### What does NOT survive
+
+- `LinearApiTokenSecret` Secrets Manager value — gone with the CDK resource.
+- The 2.0a `linear-api-key` AgentCore credential provider (if 2.0a-with-Identity was deployed mid-Phase) — clean it up after with: `aws bedrock-agentcore-control delete-api-key-credential-provider --name linear-api-key`. Phase 2.0b-O2 does not use AgentCore Identity at all, so there's nothing to clean up if you skipped the parked 2.0a-Identity branch.
 
 ## Limits and budgets
 
