Strands Version
^1.1.0
Node.js Version
nodejs:20.v101
Operating System
Linux
Installation Method
npm
Steps to Reproduce
Run this on AWS Lambda (Node.js 20 runtime). The agent uses a single Zod-schema'd tool, structured output, and has prior messages pre-loaded into agent.messages before invocation.
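import { Agent, BedrockModel, Message, TextBlock, tool } from '@strands-agents/sdk'
import { z } from 'zod'
// systemPrompt, priorMessages, userMessage, and myKbUtil are application-defined and omitted here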
const ResponseSchema = z.object({
content: z.string(),
sources: z.array(z.object({ id: z.string() })),
})
const myTool = tool({
name: 'retrieve_from_kb',
description: 'Retrieve relevant chunks from a knowledge base.',
inputSchema: z.object({ query: z.string() }),
callback: async ({ query }) => {
// returns a string of formatted chunks
return await myKbUtil.retrieve(query)
},
})
const agent = new Agent({
model: new BedrockModel({
modelId: 'us.anthropic.claude-3-5-haiku-20241022-v1:0',
region: 'us-east-1',
temperature: 0.3,
maxTokens: 2000,
}),
systemPrompt,
tools: [myTool],
structuredOutputSchema: ResponseSchema,
printer: false,
})
// Pre-load conversation history (e.g. from external memory store)
for (const m of priorMessages) {
agent.messages.push(
new Message({ role: m.role, content: [new TextBlock(m.text)] }),
)
}
const result = await agent.invoke(userMessage) // never resolves; Lambda exits
The model decides to call the tool, the tool callback returns successfully, and then the Lambda exits before the second model call (which would feed the tool result back to the model) completes.
Expected Behavior
agent.invoke(...) resolves with the final AgentResult after the agent loop completes. Any errors during the loop should propagate as rejected promises that user code can catch with try/catch around await agent.invoke(...).
Actual Behavior
Two distinct symptoms — both pointing at detached async work in the agent loop:
Symptom 1: Lambda exits with Runtime.NodeJsExit (exit code 0) while await agent.invoke(...) is still pending.
ERROR Invoke Error
{
"errorType": "Runtime.NodeJsExit",
"errorMessage": "The Lambda runtime client detected an unexpected Node.js exit code. This is most commonly caused by a Promise that was never settled.",
"stack": [
"Runtime.NodeJsExit: ...",
" at file:///var/runtime/index.mjs:1383:37",
" at process.invoke (file:///var/runtime/index.mjs:869:44)",
" at process.emit (node:events:536:35)"
]
}
ERROR BEFORE_EXIT { "code": 0 }
Symptom 2: On invocations where the handler does return a successful response, CloudWatch sometimes logs:
ERROR Invocation has already been reported as done. Cannot call complete more than once per invocation.
This implies async work from the agent loop is still resolving after the handler has returned — the runtime gets a second "complete" signal it wasn't expecting.
Diagnostics that did NOT surface anything
beforeExit with code 0 means Node's event loop drained cleanly with no pending I/O, but the handler's await is still suspended. This is the signature of a producer-side async iterator that has stopped registering work on the event loop.
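One way to make this failure observable from user code is to race the pending call against a timer. The following is a minimal sketch, not part of @strands-agents/sdk: `withTimeout` and `HangTimeoutError` are hypothetical names. It relies on the fact that an armed `setTimeout` is registered work, so the event loop cannot drain while the timer is pending. A hang then surfaces as a rejection that try/catch can see instead of a Runtime.NodeJsExit.

```typescript
// Hypothetical helper for diagnosis only; not part of @strands-agents/sdk.
class HangTimeoutError extends Error {
  constructor(label: string, ms: number) {
    super(`${label} did not settle within ${ms} ms`)
    this.name = 'HangTimeoutError'
  }
}

// Settles with the outcome of `work`, or rejects with HangTimeoutError
// if `work` has not settled after `ms` milliseconds.
function withTimeout<T>(work: Promise<T>, ms: number, label = 'operation'): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    // The armed timer keeps the event loop alive while `work` is pending.
    const timer = setTimeout(() => reject(new HangTimeoutError(label, ms)), ms)
    work.then(
      (value) => {
        clearTimeout(timer) // release the timer so it cannot delay exit
        resolve(value)
      },
      (error) => {
        clearTimeout(timer)
        reject(error)
      },
    )
  })
}
```

In the handler this would wrap the call from the reproduction, for example `await withTimeout(agent.invoke(userMessage), 25_000, 'agent.invoke')`, with the limit chosen below the Lambda timeout. It does not fix the underlying hang; it only converts it into an error the handler can log and report.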
Event sequence (from agent.stream() for visibility)
beforeInvocationEvent
messageAddedEvent
beforeModelCallEvent // first model call — model decides to use tool
modelStreamUpdateEvent (×N)
[tool callback fires and returns chunks]
beforeModelCallEvent // second model call — about to feed tool result back
ERROR Runtime.NodeJsExit
ERROR BEFORE_EXIT { "code": 0 }
Failure happens reliably between the second beforeModelCallEvent and what should be the next modelStreamUpdateEvent.
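To check this pattern mechanically across many invocations, the recorded event names can be scanned for a model call that was announced but never produced a stream update. This is a small sketch under stated assumptions: `findStalledModelCall` is a hypothetical helper, it takes plain strings (how the name is read off each streamed event is left to the caller), and it treats any `modelStreamUpdateEvent` as evidence that the preceding model call started streaming.

```typescript
// Hypothetical diagnostic helper; not part of @strands-agents/sdk.
// Returns the index of the last 'beforeModelCallEvent' that was not followed
// by any 'modelStreamUpdateEvent', or null if every announced call streamed.
function findStalledModelCall(eventNames: readonly string[]): number | null {
  let pendingIndex: number | null = null
  for (let i = 0; i < eventNames.length; i++) {
    if (eventNames[i] === 'beforeModelCallEvent') {
      pendingIndex = i // a model call was announced
    } else if (eventNames[i] === 'modelStreamUpdateEvent') {
      pendingIndex = null // the announced call produced output
    }
  }
  return pendingIndex
}
```

Applied to the sequence in this report, the helper points at the second `beforeModelCallEvent`, which matches the point where the Lambda exits.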
Additional Context
Environment:
@strands-agents/sdk: ^1.1.0
AWS Lambda runtime: nodejs:20.v101
Region: us-east-1
Model: us.anthropic.claude-3-5-haiku-20241022-v1:0 (cross-region inference profile, IAM scoped to us-east-1/us-east-2/us-west-2)
Memory: 1024 MB (Max Used: ~150 MB — not a memory issue)
Architecture: x86_64
Conditions that affect reproduction:
Reliably fails on tool-using turns when prior messages are pre-loaded into agent.messages.
A tool-using turn with no prior history sometimes succeeds — but produces the "complete more than once" warning afterward, suggesting the same underlying async-leak bug just doesn't always crash the runtime.
Works fine for single-model-call turns where the model doesn't invoke a tool.
What this rules out:
Not an unhandled promise rejection (the unhandledRejection handler doesn't fire).
Not a synchronous throw (the uncaughtException handler doesn't fire).
Not Bedrock IAM (cross-region foundation model + inference profile permissions are present and the first model call succeeds).
Not a tool callback issue (callback completes and logs its return value before the failure).
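For anyone reproducing the checks above, process-level listeners along these lines would show whether a rejection, an exception, or a clean drain ends the process. This is a sketch, not necessarily the instrumentation used for this report: `installExitDiagnostics` is a hypothetical name. It uses `uncaughtExceptionMonitor` so that exceptions are observed without changing crash behavior. Note that registering an `unhandledRejection` listener replaces Node's default handling of unhandled rejections, so the function returns an uninstaller.

```typescript
// Hypothetical diagnostic helper; the labels mirror the BEFORE_EXIT log line above.
type LogFn = (label: string, detail: unknown) => void

function installExitDiagnostics(
  log: LogFn = (label, detail) => console.error(label, detail),
): () => void {
  const onRejection = (reason: unknown) => log('UNHANDLED_REJECTION', reason)
  const onException = (error: Error) => log('UNCAUGHT_EXCEPTION', error)
  const onBeforeExit = (code: number) => log('BEFORE_EXIT', { code })

  process.on('unhandledRejection', onRejection)
  // The monitor event observes the exception; the process still crashes as usual.
  process.on('uncaughtExceptionMonitor', onException)
  // Emitted when the event loop has no more work scheduled.
  process.on('beforeExit', onBeforeExit)

  // Remove exactly the listeners registered above.
  return () => {
    process.off('unhandledRejection', onRejection)
    process.off('uncaughtExceptionMonitor', onException)
    process.off('beforeExit', onBeforeExit)
  }
}
```

If only `BEFORE_EXIT` with code 0 is logged while the handler's await is still pending, that is consistent with the items listed above: nothing threw or rejected, and the event loop simply ran out of scheduled work.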
Possible Solution
The clean exit with BEFORE_EXIT code 0 while the handler is still awaiting strongly suggests the agent loop launches the second model call (post-tool) on a code path where the resulting promise is not awaited as part of the iterator's pending work. Likely either:
The Bedrock model client call between the tool result and the second model invocation is fired without registering with the iterator's pending-work tracker.
The agent loop's internal promise chain has a path where a non-awaited .then(...) allows control to return, draining the event loop while the iterator's next() is still suspended.
Possible direction: audit the agent loop's transition from "tool result received" → "second model call initiated" for any non-awaited promises, and ensure the iterator's pending work is tracked through that transition.
The "complete more than once" symptom on successful runs reinforces this: if the second model call's continuation runs after agent.invoke(...) has already resolved on some paths, that's the same leak surfacing differently.
Related Issues
aws/bedrock-agentcore-sdk-typescript#160 (request for AgentCoreMemorySessionManager in TypeScript — touches the same agent lifecycle area)