
fix: AdvancedProfiler no longer crashes when stop() is called after teardown #21790

Open
pooyanazad wants to merge 3 commits into Lightning-AI:master from pooyanazad:fix-advanced-profiler-crash


Conversation

@pooyanazad


What does this PR do?

This PR fixes a bug where AdvancedProfiler crashes with a ValueError if a profiling context manager is active during trainer teardown.

teardown() clears the profiled_actions dictionary. When the context manager then exits, it calls stop(), which raised the ValueError because the action was no longer in the dictionary.
I fixed this by:

  1. Making stop() return silently if the action doesn't exist.
  2. Disabling all active cProfile instances in teardown() before clearing the dictionary to prevent global profiler leaks.
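The two changes above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the actual diff: `SketchProfiler` is a hypothetical stand-in for AdvancedProfiler, and its method bodies are assumptions based only on the description in this PR.

```python
import cProfile
from contextlib import contextmanager


class SketchProfiler:
    """Hypothetical stand-in for AdvancedProfiler (not the Lightning code)."""

    def __init__(self):
        # Maps an action name to the cProfile.Profile recording it.
        self.profiled_actions = {}

    def start(self, action_name):
        if action_name not in self.profiled_actions:
            self.profiled_actions[action_name] = cProfile.Profile()
        self.profiled_actions[action_name].enable()

    def stop(self, action_name):
        pr = self.profiled_actions.get(action_name)
        if pr is None:
            # Change 1: the action is unknown (for example because
            # teardown() already cleared it), so return silently
            # instead of raising ValueError.
            return
        pr.disable()

    @contextmanager
    def profile(self, action_name):
        self.start(action_name)
        try:
            yield action_name
        finally:
            self.stop(action_name)

    def teardown(self):
        # Change 2: disable every profiler before dropping the references,
        # so none of them stays enabled after the dictionary is cleared.
        for pr in self.profiled_actions.values():
            pr.disable()
        self.profiled_actions.clear()


# Scenario from the linked issue: teardown runs while a profiling
# context manager is still open.
profiler = SketchProfiler()
with profiler.profile("run_test_evaluation"):
    profiler.teardown()
# Leaving the with-block calls stop(), which now returns without error.
```

Returning silently from stop() trades strictness for robustness: a stop() on an action that was genuinely never started is no longer reported, which seems acceptable given that teardown() is what invalidates the action here.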

Fixes #9136

Before submitting

PR review

Anyone in the community is welcome to review the PR.
Before you start reviewing, make sure you have read the review guidelines. In short, see the following bullet-list:

Reviewer checklist
  • Is this pull request ready for review? (if not, please submit in draft mode)
  • Check that all items from Before submitting are resolved
  • Make sure the title is self-explanatory and the description concisely explains the PR
  • Add labels and milestones (and optionally projects) to the PR so it can be classified

@pooyanazad

Author

Hi, we need a reviewer here


Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

AdvancedProfiler: ValueError: Attempting to stop recording an action (run_test_evaluation) which was never started.

1 participant