Nvtx tracer #1087
Conversation
Summary of Changes

This pull request enhances the profiling capabilities of the codebase by integrating NVTX range markers into key computational graphs and the training loop, allowing developers to pinpoint performance bottlenecks with NVIDIA's profiling tools. It also updates a training test script to use the new profiling features and includes minor cleanups within the training pipeline for improved stability and data logging.
Code Review
This pull request introduces NVTX range markers throughout the codebase to improve profiling capabilities with NVIDIA Nsight Systems. The changes are well-implemented, adding a new nvtx_range utility and applying it to key performance-critical sections in various models and layers. This will provide valuable, fine-grained insights into GPU execution.
My review includes a few suggestions to improve the portability of example and test scripts by removing hardcoded GPU device IDs. I also noted some leftover debug code and a significant out-of-scope change (removal of audio muxing logic) that would be better handled in a separate pull request to maintain clarity and focus.
examples/training/finetune/wan_t2v_1.3B/crush_smol/preprocess_wan_data_t2v.sh (4)
Hardcoding CUDA_VISIBLE_DEVICES makes this example script difficult to run on different machines where GPU 2 might not be available or suitable. It would be more robust to either expect the user to set this environment variable before running the script or to parameterize it.
```shell
# Set the CUDA_VISIBLE_DEVICES environment variable to select a specific GPU, e.g.:
# export CUDA_VISIBLE_DEVICES=0
```
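One portable way to express this in the example script, as a sketch (the default of GPU 0 is an assumption, not from the PR):

```shell
# Respect the caller's CUDA_VISIBLE_DEVICES if already set; otherwise fall
# back to GPU 0. Callers can override per-run, e.g.:
#   CUDA_VISIBLE_DEVICES=2 bash preprocess_wan_data_t2v.sh
export CUDA_VISIBLE_DEVICES="${CUDA_VISIBLE_DEVICES:-0}"
echo "Visible GPUs: ${CUDA_VISIBLE_DEVICES}"
```

This keeps the script runnable on any machine while still letting a user pin it to a specific GPU.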
fastvideo/tests/training/Vanilla/mfu_calculation.py (36)
Hardcoding CUDA_VISIBLE_DEVICES in a test script reduces its portability and can cause failures on systems where the specified GPUs (5, 6) are not available. It's better to let the user or the test environment configure this. Please remove this line and let the environment (e.g., a CI runner or the user's shell) control which GPUs are visible to the script.
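If the script must have some default, one hedged alternative (the default value "0" here is illustrative, not from the PR) is to only set the variable when the environment has not already chosen:

```python
import os

# Only supply a default when the caller has not already selected GPUs,
# so a CI runner or the user's shell stays in control of device visibility.
os.environ.setdefault("CUDA_VISIBLE_DEVICES", "0")
print(os.environ["CUDA_VISIBLE_DEVICES"])
```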
fastvideo/training/training_pipeline.py (835)
This commented-out return statement appears to be a leftover from a debugging session. It should be removed to keep the code clean.
fastvideo/training/training_pipeline.py (951-1042)
The removal of the _mux_audio static method and its related logic is a significant change. While this might be a valid cleanup, it seems unrelated to the main purpose of this pull request, which is to add NVTX tracers. Including unrelated changes makes the PR harder to review and understand. It would be better to submit this change in a separate PR with a descriptive title and explanation.
Can you resolve the conflict?
I've resolved the merge conflicts. Let me know if you need any other changes.
This PR has merge conflicts with the base branch. Please rebase:

```shell
git fetch origin main
git rebase origin/main
# Resolve any conflicts, then:
git push --force-with-lease
```
Pre-commit checks failed

Hi @Ohm-Rishabh, the pre-commit checks have failed. To fix them locally:

```shell
# Install pre-commit if you haven't already
uv pip install pre-commit
pre-commit install

# Run all checks and auto-fix what's possible
pre-commit run --all-files
```

After fixing, commit and push the changes. The checks will re-run automatically.
Buildkite CI tests failed

Hi @Ohm-Rishabh, some Buildkite CI tests have failed. Check the build for details. If the failure is unrelated to your changes, leave a comment explaining why.
Merge Protections

Your pull request matches the following merge protections and will not be merged until they are valid.

🔴 PR merge requirements: this rule is failing.
❌ CI tests failed

@Ohm-Rishabh: view all builds for this branch on Buildkite to see what failed. If the failure looks unrelated to your changes, comment why and a maintainer will review.
This is the PR for adding NVTX range markers in the codebase.

To get a trace: `python fastvideo/tests/training/Vanilla/mfu_calculation.py --profile`

To add a custom range: `with nvtx_range("range name"): ...`
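The PR defines the `nvtx_range` utility itself; its exact implementation is not shown in this conversation, but a minimal sketch (assuming it wraps `torch.cuda.nvtx` and degrades to a no-op on CPU-only machines) might look like:

```python
from contextlib import contextmanager

try:
    import torch
    _NVTX_ENABLED = torch.cuda.is_available()
except ImportError:  # allow import on machines without PyTorch
    _NVTX_ENABLED = False

@contextmanager
def nvtx_range(name: str):
    """Mark the enclosed block as an NVTX range, visible in Nsight Systems."""
    if _NVTX_ENABLED:
        torch.cuda.nvtx.range_push(name)
    try:
        yield
    finally:
        if _NVTX_ENABLED:
            torch.cuda.nvtx.range_pop()

# Usage: the named range appears on the Nsight Systems timeline when profiling.
with nvtx_range("denoise_step"):
    pass  # ...code block...
```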