Skip to content

Qualcomm AI Engine Direct - Update CLI tool to dump inference performace#18085

Merged
abhinaykukkadapu merged 3 commits intopytorch:mainfrom
CodeLinaro:dev1/chenweng/cli_etdump
Apr 29, 2026
Merged

Qualcomm AI Engine Direct - Update CLI tool to dump inference performace#18085
abhinaykukkadapu merged 3 commits intopytorch:mainfrom
CodeLinaro:dev1/chenweng/cli_etdump

Conversation

@chenweng-quic
Copy link
Copy Markdown
Collaborator

@chenweng-quic chenweng-quic commented Mar 11, 2026

Summary

Update CLI tool to dump inference performance.
This will be used by internal QA to monitor inference speed.
image

Test plan

python ${EXECUTORCH_ROOT}/backends/qualcomm/tests/test_qnn_delegate.py -k TestUtilsScript.test_cli --model SM8850 --build_folder /local/mnt/workspace/chenweng/executorch/executorch/build-android --artifact et_artifact --device f519f0

cc @cccclai @cbilgin

@pytorch-bot
Copy link
Copy Markdown

pytorch-bot Bot commented Mar 11, 2026

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/18085

Note: Links to docs will display an error until the docs builds have been completed.

❗ 1 Active SEVs

There are 1 currently active SEVs. If your PR is affected, please view them below:

❌ 1 New Failure, 5 Pending

As of commit 4d8d929 with merge base 831b7d4 (image):

NEW FAILURE - The following job has failed:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@meta-cla meta-cla Bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Mar 11, 2026
@github-actions
Copy link
Copy Markdown

This PR needs a release notes: label

If your change should be included in the release notes (i.e. would users of this library care about this change?), please use a label starting with release notes:. This helps us keep track and include your important work in the next release notes.

To add a label, you can comment to pytorchbot, for example
@pytorchbot label "release notes: none"

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.

@digantdesai digantdesai added the module: qnn Issues related to Qualcomm's QNN delegate and code under backends/qualcomm/ label Mar 11, 2026
@digantdesai
Copy link
Copy Markdown
Contributor

@claude Review the PR

@chenweng-quic chenweng-quic force-pushed the dev1/chenweng/cli_etdump branch 3 times, most recently from 0bce505 to 70c53b7 Compare March 14, 2026 16:18
Comment thread examples/qualcomm/utils.py Outdated
@chenweng-quic chenweng-quic marked this pull request as draft April 7, 2026 07:36
@chenweng-quic chenweng-quic force-pushed the dev1/chenweng/cli_etdump branch from 70c53b7 to 1dadd62 Compare April 23, 2026 07:36
@chenweng-quic chenweng-quic marked this pull request as ready for review April 23, 2026 07:36
@chenweng-quic chenweng-quic force-pushed the dev1/chenweng/cli_etdump branch from 1dadd62 to 23cc73e Compare April 24, 2026 01:30
@chenweng-quic chenweng-quic force-pushed the dev1/chenweng/cli_etdump branch from 23cc73e to 7838f11 Compare April 28, 2026 06:44
@chenweng-quic
Copy link
Copy Markdown
Collaborator Author

Hi @abhinaykukkadapu,
Could you tag claude to review this PR again?
Thanks

@abhinaykukkadapu
Copy link
Copy Markdown
Contributor

@claude review this PR

@claude
Copy link
Copy Markdown

claude Bot commented Apr 29, 2026

Claude finished @abhinaykukkadapu's task in 2m 26s —— View job


Code Review: PR #18085

  • Gather context and read changed files
  • Review backends/qualcomm/export_utils.py
  • Review examples/qualcomm/util_scripts/cli.py
  • Review backends/qualcomm/tests/test_qnn_delegate.py
  • Post review feedback

Summary

This PR adds a --profile flag to the CLI execute subcommand that runs 100 iterations of inference, pulls the ETDump, and converts it to a JSON performance report. The changes are generally clean and well-scoped. A few issues below:


Issues

1. --host default change is a breaking change (export_utils.py:694)

Changing the default of --host from None to "localhost" affects all scripts that use setup_common_args_and_variables(), not just the CLI tool. In SimpleADB._adb (line 343), if not self.host_id: controls whether -H <host> is passed to adb. With the old default (None), adb runs without -H. With "localhost", every adb command now includes -H localhost -s <device>, which changes behavior for all existing users/scripts that don't pass --host.

If the CLI specifically needs localhost as default, consider setting it only in the CLI's argparser rather than in setup_common_args_and_variables(). Fix this →

2. Bare except: clause (cli.py:308)

try:
    input_order_func = program.load_method(INPUT_ORDER)
except:
    logger.error(...)
    exit(1)

A bare except: catches everything including KeyboardInterrupt and SystemExit. Use except Exception: at minimum. Fix this →

3. Verbose comments in post_process_etdump (cli.py:419-435)

The function has several comments that just restate what the code does (# Create a list to hold the data, # Open the CSV file and read its contents, etc.). Per project guidelines, code should be self-documenting. The function can also be simplified:

def post_process_etdump():
    etdump_path = f"{args.output_folder}/etdump.etdp"
    csv_path = f"{args.output_folder}/etdump.csv"
    json_path = f"{args.output_folder}/performance.json"
    inspector = Inspector(etdump_path=etdump_path)
    inspector.save_data_to_tsv(csv_path)
    with open(csv_path, encoding="utf-8") as csv_file:
        data = list(csv.DictReader(csv_file, delimiter="\t"))
    with open(json_path, "w", encoding="utf-8") as json_file:
        json.dump(data, json_file, indent=4)

Fix this →

Minor Notes

  • Hardcoded iteration count: The profile mode hardcodes 100 iterations (cli.py:372). Consider making this configurable via a CLI argument (e.g., --iteration) so QA can adjust without code changes.
  • Test doesn't check subprocess return codes: In test_cli_execute_with_profile, the subprocess.run calls for quantize/compile/execute don't check check=True or assert returncode == 0. If any step fails silently, the test only catches it at the final assertTrue for the output file, making failures harder to diagnose. The existing test_cli has the same pattern though, so this is pre-existing.
  • shutil.rmtree(tmp_dir) at line 442: If args.profile is true, post_process is never called and tmp_dir may be empty, but it's still created (line 375-376) and removed. Not a bug, just slightly wasteful — you could skip creating tmp_dir when profiling.

Overall

The core feature (profile mode with ETDump → JSON conversion) is well-designed and useful. The --host default change (issue #1) is the most significant concern as it's a behavioral change affecting the shared setup_common_args_and_variables() function used by multiple scripts.


@abhinaykukkadapu abhinaykukkadapu merged commit 22e9347 into pytorch:main Apr 29, 2026
170 of 171 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. module: qnn Issues related to Qualcomm's QNN delegate and code under backends/qualcomm/

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants