HLO Deviation Unit Tests #3713
Open
darisoy wants to merge 1 commit into main from hlo-identical-tests
+6,405
−8
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,49 @@ | ||
| name: "Update HLO References (for hlo_diff_test.py)" | ||
|
|
||
| on: | ||
| workflow_dispatch: | ||
| permissions: | ||
| contents: read | ||
|
|
||
| jobs: | ||
| build-wheel: | ||
| uses: ./.github/workflows/build_package.yml | ||
| with: | ||
| device_type: tpu | ||
| device_name: v6e-4 | ||
| cloud_runner: linux-x86-n2-16-buildkit | ||
|
|
||
| run-tests: | ||
|
github-advanced-security[bot] marked this conversation as resolved.
Fixed
|
||
| needs: build-wheel | ||
| uses: ./.github/workflows/run_tests_coordinator.yml | ||
| with: | ||
| flavor: tpu-integration | ||
| base_image: maxtext-unit-test-tpu:py312 | ||
| is_scheduled_run: false | ||
| maxtext_sha: ${{ github.sha }} | ||
| is_update_hlo: true | ||
|
|
||
| commit-changes: | ||
|
github-advanced-security[bot] marked this conversation as resolved.
Fixed
|
||
| needs: run-tests # Wait for tests to finish | ||
| runs-on: ubuntu-latest | ||
| permissions: | ||
| contents: write | ||
| steps: | ||
| - name: Checkout code | ||
| uses: actions/checkout@v4 | ||
| with: | ||
| ref: ${{ github.ref }} | ||
|
|
||
| - name: Download Reference HLO | ||
| uses: actions/download-artifact@v4 | ||
| with: | ||
| name: reference-hlo | ||
| path: tests/utils/ | ||
|
|
||
| - name: Commit and Push changes | ||
| run: | | ||
| git config --global user.name "github-actions[bot]" | ||
| git config --global user.email "github-actions[bot]@users.noreply.github.com" | ||
| git add tests/utils/reference_hlo_*.txt | ||
| git commit -m "Update reference HLO from CI artifact" | ||
| git push | ||
|
github-advanced-security[bot] marked this conversation as resolved.
Fixed
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,90 @@ | ||
| # HLO Graph Diff Verification Testing | ||
|
|
||
| This document explains the HLO Graph Diff tests: what HLO is, why the tests exist, and how to manage reference baselines. | ||
|
|
||
| ## Related Files | ||
|
|
||
| - **Test Logic**: `tests/integration/hlo_diff_test.py` | ||
| - **Reference baselines**: `tests/utils/reference_hlo_*.txt` | ||
| - **Update helper script**: `tests/utils/update_hlo_references.py` | ||
| - **GitHub Actions workflow (manual trigger)**: `.github/workflows/update_reference_hlo.yml` | ||
|
|
||
| ## What is HLO? | ||
|
|
||
| **HLO (High Level Operations)** is the intermediate representation that XLA (Accelerated Linear Algebra) uses to describe a computation graph as it is lowered and compiled. | ||
|
|
||
| An HLO module records: | ||
|
|
||
| - The sequence of low-level math operations (dot products, convolutions, additions). | ||
| - Tensor shapes and numerical precisions. | ||
| - How arrays are partitioned and sharded across devices, including multi-pod TPU clusters. | ||
|
|
||
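To make the list above concrete, here is a minimal sketch that pulls the opcode, element type, and shape out of HLO text. The `SAMPLE_HLO` snippet is hand-written for illustration (real dumps are far larger and carry more attributes), and `summarize_hlo` is a hypothetical helper, not part of this PR.

```python
import re

# Hand-written, illustrative HLO text; not taken from a real MaxText dump.
SAMPLE_HLO = """\
HloModule jit_f

ENTRY main.4 {
  Arg_0.1 = f32[2,3]{1,0} parameter(0)
  Arg_1.2 = f32[3,4]{1,0} parameter(1)
  ROOT dot.3 = f32[2,4]{1,0} dot(Arg_0.1, Arg_1.2), lhs_contracting_dims={1}, rhs_contracting_dims={0}
}
"""

# Matches lines of the form: [ROOT] name = dtype[dims]{layout} opcode(...
INSTRUCTION = re.compile(
    r"^\s*(?:ROOT\s+)?(?P<name>[\w.\-]+)\s*=\s*"
    r"(?P<dtype>\w+)\[(?P<dims>[\d,]*)\](?:\{[\d,]*\})?\s+"
    r"(?P<opcode>[\w\-]+)\("
)


def summarize_hlo(text):
    """Return (opcode, dtype, shape) for every instruction line that matches."""
    rows = []
    for line in text.splitlines():
        match = INSTRUCTION.match(line)
        if match:
            dims = tuple(int(d) for d in match["dims"].split(",") if d)
            rows.append((match["opcode"], match["dtype"], dims))
    return rows


for opcode, dtype, shape in summarize_hlo(SAMPLE_HLO):
    print(f"{opcode:<10} {dtype} {shape}")
```

The pattern only handles simple array-shaped instructions; tuple-shaped results and sharding attributes would need a fuller parser.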
| ## Purpose of HloDiffTest | ||
|
|
||
| The primary purpose of the `TestHloDiff` checks is to ensure that **refactoring PRs only refactor code** and do not unintentionally change graph lowering or performance. | ||
|
|
||
| - **For pure refactors:** The HLO graph should remain *strictly identical*. Any deviation signals that execution boundaries or operation pipelines may have changed under the hood. | ||
| - **For dependency updates:** Changes to framework dependencies (such as updating JAX or XLA versions) *are expected* to alter the compiled HLO slightly, so updating the baselines is appropriate in those cases. | ||
|
|
||
| ______________________________________________________________________ | ||
|
|
||
| ## How the Test Works | ||
|
|
||
| This test runs automatically as part of the [`tpu-integration`](https://github.com/AI-Hypercomputer/maxtext/actions/workflows/build_and_test_maxtext.yml) CI test suite on every Pull Request. | ||
|
|
||
| When the test method executes, it performs the following sequence of actions: | ||
|
|
||
| 1. **Triggers compilation**: It runs only the compilation phase of the training lifecycle (invoking `train_compile.main()`), without actually allocating hardware compute nodes or running optimization passes. | ||
| 2. **Dumps HLO modules**: It instructs the XLA compiler backend to dump the lowered HLO graphs to text files. | ||
| 3. **Compares strictly**: It compares the structural lines of the generated graph directly against the baseline `.txt` files stored under `tests/utils/`. | ||
|
|
||
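The comparison in step 3 can be sketched as follows. This is an illustration only, not the code in `hlo_diff_test.py`: the helper names are hypothetical, and treating "structural lines" as the non-blank lines with surrounding whitespace stripped is an assumption.

```python
import difflib


def structural_lines(hlo_text):
    """Assumed normalization: strip whitespace and drop blank lines."""
    return [line.strip() for line in hlo_text.splitlines() if line.strip()]


def diff_against_baseline(generated, baseline):
    """Return unified-diff lines; an empty list means the graphs match."""
    return list(
        difflib.unified_diff(
            structural_lines(baseline),
            structural_lines(generated),
            fromfile="baseline",
            tofile="generated",
            lineterm="",
        )
    )


baseline_text = "ENTRY main {\n  ROOT add.1 = f32[4] add(a, b)\n}\n"
generated_text = "ENTRY main {\n  ROOT mul.1 = f32[4] multiply(a, b)\n}\n"

for line in diff_against_baseline(generated_text, baseline_text):
    print(line)
```

Returning the diff rather than a boolean lets a failing test show exactly which instructions changed.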
| ______________________________________________________________________ | ||
|
|
||
| ## Updating HLO reference files | ||
|
|
||
| When an intentional architecture change alters graph lowering, the reference baselines must be updated. | ||
|
|
||
| > [!IMPORTANT]\ | ||
| > Running the update script locally is possible, but **relying on local execution can cause remote CI tests to fail.** | ||
| > The PR verification pipelines run the tests in a strictly pinned GitHub Actions environment. Even small discrepancies in locally installed libraries can produce slightly different lowered graphs. If baselines generated locally cause a remote CI check to fail, use the GitHub Actions workflow described below to generate environment-matching baselines. | ||
|
|
||
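When a locally generated baseline disagrees with CI, it can help to record which library versions are installed locally. The sketch below uses only the standard library; the package list (`jax`, `jaxlib`, `libtpu`) is an assumption about which dependencies matter, and `installed_versions` is a hypothetical helper.

```python
from importlib import metadata


def installed_versions(packages=("jax", "jaxlib", "libtpu")):
    """Map each package name to its installed version, or None if missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


for name, version in installed_versions().items():
    print(f"{name}: {version or 'not installed'}")
```

Comparing this output with the versions in the CI logs is a quick first check before regenerating baselines through the workflow.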
| ### Method 1: Run the manual GitHub Action Workflow (Highly Recommended) | ||
|
|
||
| Triggering the CI workflow ensures the baselines are generated in the same environment that the PR checks use. | ||
|
|
||
| #### Option A: Using the GitHub UI | ||
|
|
||
| 1. Go to the **Actions** tab of the repository. | ||
| 2. Find the manual workflow: `Update HLO References (for hlo_diff_test.py)`. | ||
| 3. Run it against your PR branch. It compiles the graph and commits the updated baseline files back to the branch automatically. | ||
|
|
||
| #### Option B: Using the GitHub CLI (`gh`) | ||
|
|
||
| Alternatively, you can trigger the workflow from a terminal: | ||
|
|
||
| ```bash | ||
| gh workflow run update_reference_hlo.yml --ref <branch> | ||
| ``` | ||
|
|
||
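If you trigger the workflow often, the `gh` call above can be wrapped so that `<branch>` defaults to the branch currently checked out. This is a convenience sketch, not part of the PR: the function names are hypothetical, and it assumes `git` and an authenticated `gh` are on your `PATH`.

```python
import subprocess

WORKFLOW_FILE = "update_reference_hlo.yml"


def current_branch():
    """Return the name of the branch checked out in the working directory."""
    result = subprocess.run(
        ["git", "rev-parse", "--abbrev-ref", "HEAD"],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout.strip()


def build_dispatch_command(branch):
    """Build the same `gh workflow run` invocation shown above."""
    return ["gh", "workflow", "run", WORKFLOW_FILE, "--ref", branch]


def trigger_update(branch=None):
    """Dispatch the workflow for `branch`, defaulting to the current branch."""
    command = build_dispatch_command(branch or current_branch())
    subprocess.run(command, check=True)
```

Calling `trigger_update()` from the repository root dispatches the workflow for the current branch; pass a branch name to target a different one.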
| > [!NOTE] | ||
| > A successful run of the manual update workflow will add a new commit to your Pull Request branch. Once complete, you must: | ||
| > | ||
| > 1. Pull the new commit from remote. | ||
| > 2. Squash the commits in your branch once again to keep your PR history clean. | ||
| > 3. Push the squashed commit to remote. | ||
| > 4. Retry the `tpu-integration` workflow to verify tests pass on your PR. | ||
|
|
||
| ### Method 2: Local Execution | ||
|
|
||
| If you need to run the test manually during development: | ||
|
|
||
| ```bash | ||
| source .venv/bin/activate | ||
| pytest tests/integration/hlo_diff_test.py -v | ||
| ``` | ||
|
|
||
| Or, to force-update the local baselines: | ||
|
|
||
| ```bash | ||
| python3 tests/utils/update_hlo_references.py | ||
| ``` |