Retry Handler Support when call activity or suborchestration#96
Retry Handler Support when call activity or suborchestration#96
Conversation
There was a problem hiding this comment.
Pull request overview
This PR adds selective retry support to the Durable Task JS SDK by introducing a handleFailure predicate to the RetryPolicy class, along with RetryHandler and AsyncRetryHandler types for future custom retry logic. It also fixes a critical bug in error type serialization that prevented error types from being accurately represented in TaskFailureDetails.
Changes:
- Added
handleFailurepredicate toRetryPolicyfor filtering which errors should be retried (aligned with .NET SDK'sRetryPolicy.Handledelegate) - Fixed error type serialization to use
e.name || e.constructor.nameand addedthis.nameto all custom Error classes to ensure accurate error type identification - Extended type system to accept
RetryHandlerandAsyncRetryHandlerinTaskOptions(types defined but not yet fully implemented in orchestration executor)
Reviewed changes
Copilot reviewed 21 out of 21 changed files in this pull request and generated 6 comments.
Show a summary per file
| File | Description |
|---|---|
packages/durabletask-js/src/task/retry/retry-policy.ts |
Added handleFailure predicate, FailureHandlerPredicate type, and shouldRetry() method to RetryPolicy |
packages/durabletask-js/src/task/retry/retry-context.ts |
New file: RetryContext interface and factory function for retry handler context |
packages/durabletask-js/src/task/retry/retry-handler.ts |
New file: RetryHandler and AsyncRetryHandler type definitions with wrapper function |
packages/durabletask-js/src/task/retryable-task.ts |
Integrated handleFailure predicate check into computeNextDelayInMilliseconds() |
packages/durabletask-js/src/utils/pb-helper.util.ts |
Fixed error type serialization to use e.name || e.constructor.name |
packages/durabletask-js/src/task/exception/*.ts |
Added this.name assignments to TaskFailedError, OrchestrationStateError |
packages/durabletask-js/src/orchestration/exception/*.ts |
Added this.name assignment to OrchestrationFailedError |
packages/durabletask-js/src/exception/timeout-error.ts |
Added this.name assignment to TimeoutError |
packages/durabletask-js/src/task/options/task-options.ts |
Added TaskRetryOptions union type, helper functions, and type guards |
packages/durabletask-js/src/task/failure-details.ts |
Added TaskFailureDetails interface implemented by FailureDetails class |
packages/durabletask-js/src/worker/runtime-orchestration-context.ts |
Added isRetryPolicy type guard to filter RetryPolicy from RetryHandler options |
packages/durabletask-js/src/task/retry/index.ts |
Exported new retry-related types and functions |
packages/durabletask-js/src/task/options/index.ts |
Exported new task option helper functions and type guards |
packages/durabletask-js/src/index.ts |
Exported all new public API types and functions |
test/e2e-azuremanaged/retry-advanced.spec.ts |
New E2E tests for handleFailure filtering, maxRetryInterval capping, and retryTimeout |
packages/durabletask-js/test/*.spec.ts |
New unit tests for RetryContext, RetryHandler, handleFailure predicate, and task options |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
- Introduced RetryHandlerTask to support imperative retry control using AsyncRetryHandler. - Refactored RuntimeOrchestrationContext to utilize createRetryTaskOrDefault for task creation. - Enhanced retry logic to handle both RetryableTask and RetryHandlerTask. - Added comprehensive tests for RetryHandlerTask, covering various scenarios including success, failure, and retry logic. - Updated orchestration executor tests to validate new retry handler functionality.
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
…OrchestrationExecutor
…durabletask-js into wangbill/retryhandler
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
…and support for synchronous handlers
…durabletask-js into wangbill/retryhandler
torosent
left a comment
There was a problem hiding this comment.
Comprehensive Code Review: PR #96 — Retry Handler Support
Overall Assessment
Well-structured PR with good test coverage (~1,400 lines of tests) and excellent JSDoc documentation. The RetryTaskBase extraction is a clean refactoring, and the inheritance hierarchy is logical. However, there are two high-severity issues that should be addressed before merge.
Summary of Findings
| Severity | Category | Issue |
|---|---|---|
| High | Correctness | e.name change is a breaking change for custom Error classes without this.name — other Error subclasses in the codebase aren't updated |
| High | Performance/Design | RetryHandlerTask retries are always immediate (no backoff) — could spam the backend |
| Medium | Correctness | tryHandleRetry uses sequential if instead of else if for mutually exclusive types |
| Medium | Correctness | isCancelled is always false — dead code path in the public API |
| Medium | Correctness | Stale action reference after rescheduling |
| Medium | Correctness | Sub-orchestration retry reuses the same instance ID |
| Low | Performance | Duplicated protobuf-to-plain-object extraction in two classes |
| Low | Performance | toAsyncRetryHandler double-wrapping from taskOptionsFromSyncRetryHandler |
| Low | Best Practice | No input validation on handler in RetryHandlerTask constructor |
| Info | Memory | Circular ref between RetryHandlerTask and context — acceptable for V8 |
| Info | Simplicity | Helper function proliferation in task-options.ts |
What's Good
- Clean class hierarchy:
Task → CompletableTask → RetryTaskBase → RetryableTask|RetryHandlerTaskavoids duplication - Comprehensive tests: Unit tests cover edge cases (non-retriable failures, exhausted retries, replay-safe context), plus E2E tests
- Excellent JSDoc: Every public type has
@remarks,@example, and@paramdocumentation, including determinism requirements - Proper export organization: New exports well-organized through barrel files, implementation details kept internal
- Correct cleanup:
_pendingTaskscleanup during retry is handled correctly,complete()properly clears failure state
… retry handling logic
…nd add tests for new behavior
… in RuntimeOrchestrationContext
torosent
left a comment
There was a problem hiding this comment.
Second Review — Post-Update Analysis
All 12 issues from the first review have been addressed — excellent responsiveness! The retry handler delay support (RetryHandlerResult = boolean | number) is well-designed and the code quality has improved significantly.
4 new findings in this follow-up:
| # | Severity | Finding |
|---|---|---|
| 1 | Medium | isNonRetriable flag not checked in RetryableTask (only checked in RetryHandlerTask) |
| 2 | Medium | Version field dropped when rescheduling retry tasks |
| 3 | Low | Handler returning 0 crashes orchestration instead of graceful handling |
| 4 | Low | Missing integration test for delay-based retry handler path |
…ehensive tests for various scenarios
Summary
What changed?
New Features: Retry Handler Support
This PR adds support for imperative retry control through
AsyncRetryHandlerandRetryHandlerfunctions, allowing users to make custom retry decisions based on failure context. This mirrors the .NET SDK'sInvokeWithCustomRetryHandlerpattern.New Types & APIs:
AsyncRetryHandler- Async function delegate for imperative retry decisionsRetryHandler- Sync function delegate for imperative retry decisionsRetryContext- Context object passed to handlers with failure details, attempt count, elapsed timetoAsyncRetryHandler()- Utility to normalize sync handlers to asynctaskOptionsFromRetryHandler()- Creates TaskOptions from an AsyncRetryHandlertaskOptionsFromSyncRetryHandler()- Creates TaskOptions from a sync RetryHandlerTaskRetryOptions- Union type supportingRetryPolicy | AsyncRetryHandler | RetryHandlerEnhanced RetryPolicy:
handleFailurepredicate option for filtering retryable failures by error type/messageretryTimeoutInMillisecondsoption for overall retry timeoutshouldRetry()method to evaluate failures against the predicateNew Internal Classes:
RetryTaskBase<T>- Abstract base class for retry tasks, shared between policy and handler approachesRetryHandlerTask<T>- Task implementation for handler-based retry (immediate reschedule, no timer delay)Orchestration Context Changes:
callActivity()andcallSubOrchestrator()now acceptTaskOptions.retryasRetryPolicy,AsyncRetryHandler, orRetryHandlerrescheduleRetryTask()for immediate retry rescheduling (used by handler-based retry)Bug Fixes
e.nameinstead ofe.constructor.name, preserving custom error names set viaerror.name = "CustomType"Why is this change needed?
handleFailurepredicate enables fail-fast behavior for non-retryable errors (e.g., validation errors, authorization failures)InvokeWithCustomRetryHandlerpatternerror.name = "PermanentError"and filter on that in retry handlersIssues / work items
Project checklist
CHANGELOG.mdRetryPolicyusage continues to work unchanged.AI-assisted code disclosure (required)
Was an AI tool used? (select one)
If AI was used:
pb-helper.util.ts(error name detection)orchestration-executor.ts(retry logging order)AI verification (required if AI was used):
Testing
Automated tests
New Unit Tests (7 test files, 100+ test cases):
retry-context.spec.ts- RetryContext creation and propertiesretry-handler.spec.ts- RetryHandler/AsyncRetryHandler types and toAsyncRetryHandler()retry-handler-task.spec.ts- RetryHandlerTask shouldRetry() behaviorretry-policy-handle-failure.spec.ts- handleFailure predicate functionalityretryable-task.spec.ts- RetryableTask with enhanced policy featurestask-options-retry-handler.spec.ts- TaskOptions creation helpersorchestration_executor.spec.ts- Updated for new retry scenariosNew E2E Tests (2 test files, 15 test cases):
retry-handler.spec.ts- AsyncRetryHandler with activities and sub-orchestrationsretry-advanced.spec.ts- handleFailure predicate, timeout, and combined featuresManual validation (only if runtime/behavior changed)
npm run build- Successfulnpm run lint- Passednpm run test- All unit tests passednpx jest test/e2e-azuremanaged/retry-*.spec.ts --runInBand --forceExit- All 15 E2E tests passedTransientError,PermanentError,ValidationErrorcorrectly captured and used in retry decisionsNotes for reviewers
Key Design Decisions
RetryTaskBase abstraction: Created a shared base class to avoid code duplication between
RetryableTask(policy-based) andRetryHandlerTask(handler-based)Immediate vs delayed retry:
RetryableTask(policy) uses timers for backoff delaysRetryHandlerTask(handler) reschedules immediately - delay logic is the handler's responsibilityHandler runs as orchestrator code: Like .NET SDK, the retry handler is subject to replay determinism requirements.
RetryContext.orchestrationContext.isReplayingis exposed for guarding side-effects.Sync/Async handler support: Both sync
RetryHandlerand asyncAsyncRetryHandlerare supported. Sync handlers are automatically wrapped viatoAsyncRetryHandler().Files Changed Summary
retry-handler.ts,retry-context.ts,task-options.tsretry-policy.ts,retryable-task.tsretry-handler-task.ts,retry-task-base.tsorchestration-executor.ts,runtime-orchestration-context.tspb-helper.util.tsindex.ts,retry/index.ts,options/index.ts