Skip to content

fix: Improve timeout handling and fix hanging tests#1451

Merged
thomhurst merged 7 commits into
mainfrom
fix/timeout-handling-and-test-improvements
Dec 30, 2025
Merged

fix: Improve timeout handling and fix hanging tests#1451
thomhurst merged 7 commits into
mainfrom
fix/timeout-handling-and-test-improvements

Conversation

@thomhurst
Copy link
Copy Markdown
Owner

Summary

  • Add TUnit-inspired TimeoutHelper with clean timeout pattern

    • Fast path for no timeout (avoids allocations)
    • Linked token sources for combining cancellation
    • 1-second grace period for cleanup after timeout
    • Distinguishes external cancellation from timeout
  • Refactor ModuleExecutionPipeline to use TimeoutHelper

    • Replaces redundant CancelAfter + Task.Delay pattern
    • Cleaner single-point timeout handling
  • Fix cancellation propagation in ModuleExecutionContext

    • Make ModuleCancellationTokenSource settable
    • Use CreateLinkedTokenSource for proper token linking
  • Fix hanging FolderTests that searched entire AppData folder

    • Replace with controlled temporary folder structure tests
    • Tests now complete in ~1.6s vs hanging indefinitely
  • Add test timeout infrastructure

    • Default 60s timeout in TestHostSettings
    • Timeout parameter in ExecutePipelineAsync overloads

Test plan

  • All 442 unit tests pass (437 succeeded, 5 skipped)
  • FolderTests complete in ~1.6s (previously hung indefinitely)
  • EngineCancellationTokenTests pass with proper cancellation propagation
  • Full test suite completes in ~1m 45s

🤖 Generated with Claude Code

thomhurst and others added 6 commits December 29, 2025 18:39
🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
This PR addresses issue #1417 by implementing proper module-scoped isolation:

**Breaking Changes:**
- `IEnvironmentContext.ContentDirectory` and `WorkingDirectory` are now readonly
  (changed from `{ get; set; }` to `{ get; }`)

**Improvements:**
- Add AsyncLocal cleanup in ModuleRunner to prevent potential context leaks
- Replace `List<T>` with `ConcurrentBag<T>` in ModuleLoggerContainer for lock-free
  thread-safe operations during high-concurrency module execution
- Add documentation explaining PipelineContext scoped lifetime and logger caching safety

**Files Changed:**
- ModuleRunner.cs: Added try/finally to clear AsyncLocal logger context
- IEnvironmentContext.cs: Made ContentDirectory/WorkingDirectory readonly with docs
- EnvironmentContext.cs: Implementation matches interface changes
- PipelineContext.cs: Added XML documentation for scoped DI behavior
- ModuleLoggerContainer.cs: Replaced List+locks with ConcurrentBag+Interlocked

Closes #1417

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
@github-actions
Copy link
Copy Markdown
Contributor

Summary

Adds timeout handling infrastructure and fixes hanging tests by introducing a TimeoutHelper, refactoring cancellation propagation, and replacing problematic AppData folder tests.

Critical Issues

1. Potential resource leak in ModuleRunner.cs

In ModuleRunner.ExecuteModuleWithPipeline (src/ModularPipelines/Engine/Execution/ModuleRunner.cs:150-161), the AsyncLocal cleanup is added but the actual module execution was moved to ExecuteModuleLifecycle without the corresponding finally block being present in the diff. The try-finally pattern for AsyncLocal cleanup should wrap the entire module execution. Can you verify the full implementation ensures ModuleLogger.Values.Value is always cleared?

2. ConcurrentBag enumeration allocation

The change from List to ConcurrentBag in ModuleLoggerContainer is good for thread safety, but calling .ToArray() on every GetLogger call (line 41) creates allocations on what might be a hot path. Consider if GetLogger is called frequently during module execution - if so, this could be a performance regression compared to the previous lock-based approach.

Suggestions

1. TimeoutHelper edge case: very short timeouts

In TimeoutHelper.cs line 91, you schedule the timeout AFTER setting up registration. With extremely short timeouts (microseconds) there could be a theoretical race. This is likely fine for practical timeouts (milliseconds+), but worth noting for robustness.

2. Default test timeout might be aggressive

30-second default timeout (TestHostSettings.DefaultTestTimeout) might be short for complex pipelines or slow CI environments. Consider documenting this clearly so users know to override for complex tests, or increasing to 60 seconds.

3. FolderTests changes look good

The replacement of problematic AppData folder searches with controlled temporary folder tests is exactly the right fix.

Verdict

⚠️ REQUEST CHANGES - Please verify issue 1 (AsyncLocal cleanup) is complete and address issue 2 (potential performance regression) or confirm GetLogger is not a hot path.

- Add TUnit-inspired TimeoutHelper with clean timeout pattern
  - Fast path for no timeout (avoids allocations)
  - Linked token sources for combining cancellation
  - 1-second grace period for cleanup after timeout
  - Distinguishes external cancellation from timeout

- Refactor ModuleExecutionPipeline to use TimeoutHelper
  - Replaces redundant CancelAfter + Task.Delay pattern
  - Cleaner single-point timeout handling

- Fix cancellation propagation in ModuleExecutionContext
  - Make ModuleCancellationTokenSource settable
  - Use CreateLinkedTokenSource for proper token linking

- Fix hanging FolderTests that searched entire AppData folder
  - Replace with controlled temporary folder structure tests
  - Tests now complete in ~1.6s vs hanging indefinitely

- Add test timeout infrastructure
  - Default 60s timeout in TestHostSettings
  - Timeout parameter in ExecutePipelineAsync overloads

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
@thomhurst thomhurst force-pushed the fix/timeout-handling-and-test-improvements branch from a1f7b35 to d63b236 Compare December 30, 2025 00:46
@github-actions
Copy link
Copy Markdown
Contributor

Summary

Adds robust timeout handling infrastructure and fixes hanging tests by introducing a TimeoutHelper pattern, refactoring cancellation propagation with linked token sources, and replacing problematic folder search tests.

Critical Issues

None found ✅

Suggestions

1. ModuleLoggerContainer performance consideration

In ModuleLoggerContainer.cs:40, the switch from List with locks to ConcurrentBag is good for thread safety. However, calling .ToArray() on every GetLogger call creates an allocation. Since GetLogger is only called once per module execution (not in hot loops - see ModuleRunner.cs:275 and DependencyWaiter.cs:68), this is acceptable. Just noting for awareness.

2. Consider documenting grace period rationale

The 1-second grace period in TimeoutHelper.cs:15 is a good pattern borrowed from TUnit. Consider adding a brief comment explaining why 1 second was chosen (balance between cleanup time and user experience when timeout occurs).

3. Test timeout documentation

The 30-second default timeout in TestHostSettings.cs is reasonable for unit tests. The XML doc comment correctly mentions users can override it, which is good for complex integration tests.

Previous Review Status

The github-actions bot raised two concerns that I've verified are non-issues:

  1. AsyncLocal cleanup: The diff shows proper try/finally wrapping ExecuteModuleLifecycle (lines 150-161 in diff)
  2. GetLogger allocations: Not a hot path - only called once per module execution, so ToArray() allocation is acceptable

Verdict

APPROVE - No critical issues. The timeout infrastructure is well-designed, cancellation token linking is correct, and the test fixes properly address the hanging issues. Clean implementation following TUnit patterns.

@thomhurst thomhurst merged commit 5b2b7f5 into main Dec 30, 2025
10 of 12 checks passed
@thomhurst thomhurst deleted the fix/timeout-handling-and-test-improvements branch December 30, 2025 00:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant