Skip to content

Implement Name Based partitioning and update documents#28903

Merged
yuslepukhin merged 12 commits into
mainfrom
yuslepukhin/future_directions
Jun 12, 2026
Merged

Implement Name Based partitioning and update documents#28903
yuslepukhin merged 12 commits into
mainfrom
yuslepukhin/future_directions

Conversation

@yuslepukhin

@yuslepukhin yuslepukhin commented Jun 8, 2026

Copy link
Copy Markdown
Member

This pull request introduces a new "name-based layer assignment" feature for ONNX Runtime, allowing device assignment of model nodes based on substring matching of node names, as an alternative to annotation-based matching. The implementation ensures that name-based and annotation-based assignment modes are mutually exclusive, and updates documentation, configuration keys, and core logic to support this new capability.

Key changes:

Name-based Layer Assignment Feature

  • Added support for a new session option, session.name_based_layer_assignment, which enables device assignment using substring matching against node names (rather than metadata annotations). The longest matching pattern wins, and all patterns are treated as substrings (the = exact-match qualifier is disallowed in this mode). [1] [2]

  • Implemented the SubstringMatcher class and integrated it into the LayeringIndex logic, enabling efficient substring-based rule matching for node assignment. The matcher sorts patterns by length and returns the first (longest) match. [1] [2]

Mutual Exclusivity and API Changes

  • Enforced mutual exclusivity between annotation-based (session.layer_assignment_settings) and name-based (session.name_based_layer_assignment) assignment options. Attempting to set both now results in an error, with clear messaging. [1] [2]

  • Updated the LayeringIndex::Create API and related logic to accept both configuration strings, select the active mode, and construct the appropriate matcher. [1] [2]

Documentation Updates

  • Expanded the documentation to describe the new name-based assignment feature, provide usage examples, highlight best practices for pattern writing, and explain the mutual exclusivity and lack of subgraph inheritance in name-based mode. [1] [2]

Core Logic and Maintenance

  • Refactored the LayeringIndex and related methods to support both matching modes, updating node assignment and update logic to branch appropriately based on the selected mode. [1] [2]

  • Added and documented the new configuration key kOrtSessionOptionsNameBasedLayerAssignment in the public API headers.

These changes provide a more flexible and user-friendly way to partition models across devices, especially for models with structured node names but lacking explicit annotations.

- Correct misleading claim about reusing trie-based LayeringRuleMatcher
- Add SubstringMatcher class design (flat vector + std::string::find)
- Document longest-match-first priority semantics
- Update session option examples to use trailing-slash patterns
- Add integration sketch with LayeringIndex::ProcessGraph
Add SubstringMatcher class that matches patterns anywhere in node names,
unlike the existing trie-based LayeringRuleMatcher which only matches prefixes.

- New session option: session.name_based_layer_assignment
- SubstringMatcher with longest-match-first priority
- Integrated into LayeringIndex::ProcessGraph as fallback after annotation matching
- Annotation-based matching takes priority when both are configured
- Updated PartitioningWithAnnotationsAndMemoryConstraints.md with usage docs
- Tests: SubstringMatcher unit tests + LayeringIndex integration tests
  including annotation-priority-over-name-based test

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR adds an alternative to annotation-based layered partitioning by allowing device assignment rules to match substrings of Node::Name(), enabling partitioning of models with structured node names without modifying model metadata. It updates session configuration parsing to support both annotation-based and name-based rules, and expands documentation accordingly.

Changes:

  • Add session.name_based_layer_assignment session option and integrate it into InferenceSession::TransformGraph.
  • Implement SubstringMatcher and refactor LayeringIndex creation/indexing to combine annotation-based (higher priority) and name-based (fallback) rules.
  • Add unit/integration tests plus documentation updates for the new behavior.

Reviewed changes

Copilot reviewed 9 out of 9 changed files in this pull request and generated 3 comments.

Show a summary per file
File Description
onnxruntime/core/framework/layering_annotations.h Adds SubstringMatcher and extends LayeringIndex to optionally use it.
onnxruntime/core/framework/layering_annotations.cc Parses/merges name-based + annotation-based rules; performs substring matching fallback by node name.
onnxruntime/core/session/inference_session.cc Wires new session option into graph transform/partitioning pipeline.
include/onnxruntime/core/session/onnxruntime_session_options_config_keys.h Adds the new config key constant and its API comment.
onnxruntime/test/framework/layering_annotations_test.cc Adds SubstringMatcher tests and integration tests for name-based assignment + priority behavior.
onnxruntime/test/framework/session_state_test.cc Updates LayeringIndex::Create callsite for new signature.
docs/annotated_partitioning/PartitioningWithAnnotationsAndMemoryConstraints.md Documents session.name_based_layer_assignment and updates “combining features” guidance.
docs/annotated_partitioning/future_directions_constrained_env.md Adds long-form design/future directions write-up (new).
docs/annotated_partitioning/cuda_kernel_workspace_inventory.md Adds CUDA workspace inventory document (new).

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread onnxruntime/core/framework/layering_annotations.h
Comment thread include/onnxruntime/core/session/onnxruntime_session_options_config_keys.h Outdated
Comment thread docs/annotated_partitioning/future_directions_constrained_env.md Outdated

@tianleiwu tianleiwu left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review: Name-based partitioning

The change is well-structured overall: SubstringMatcher cleanly encapsulates longest-pattern-wins substring matching, the new session.name_based_layer_assignment option is documented, and the annotation-vs-name priority is covered by focused unit tests.

One functional gap worth addressing before merge (see inline comment): the incremental LayeringIndex::Update() path does not apply substring_matcher_, so name-based assignment is inconsistent between the initial ProcessGraph pass and the per-partition Update() calls from graph_partitioner.cc.

I also agree with the existing automated comment on layering_annotations.h: building matcher_ from the merged rules_ lets a node's annotation string accidentally prefix/exact-match a name-based pattern. Scoping the annotation matcher to the annotation-only rule range (or discarding matcher_.Match() results in the name-based index range) would keep the two matching domains independent.

Verdict: COMMENT.

Comment thread onnxruntime/core/framework/layering_annotations.cc Outdated
1. Scope annotation matcher to annotation-only rules (Comment 1):
   - LayeringRuleMatcher now accepts a rule_count parameter to limit
     which rules from the merged vector it indexes. Prevents a node's
     annotation string from accidentally matching a name-based pattern.

2. Reject '=' in name-based config (Comment 2):
   - Validate after parsing: if any name-based rule has prefix_match=false
     (meaning '=' was used), return an error with a clear message.
   - Updated doc comment from 'NOT supported' to 'rejected with an error'.

3. Update roadmap text (Comment 3):
   - Replace stale 'name: prefix qualifier' roadmap item with actual
     implementation description (separate session option + SubstringMatcher).

4. Add substring_matcher_ fallback to Update() (Comment 4):
   - The incremental Update() path (called after layout transforms) now
     tries substring_matcher_->Match(node->Name()) when annotation matching
     finds no match, consistent with ProcessGraph().
   - Added UpdateAppliesSubstringMatcherToNewNodes test covering this path.

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 9 out of 9 changed files in this pull request and generated 4 comments.

Comment thread onnxruntime/core/framework/layering_annotations.h
Comment thread onnxruntime/core/framework/layering_annotations.cc
Comment thread onnxruntime/core/framework/layering_annotations.cc Outdated
Comment thread onnxruntime/core/framework/layering_annotations.cc Outdated
…tcher

Address additional review comment: the LayeringIndex::Create overload that
accepts a SubstringMatcher now also takes annotation_rule_count so callers
can properly scope the annotation matcher to annotation-only rules.

Updated AnnotationTakesPriorityOverNameBased test to pass the correct count.

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 9 out of 9 changed files in this pull request and generated 4 comments.

Comment thread onnxruntime/core/framework/layering_annotations.h Outdated
Comment thread onnxruntime/core/framework/layering_annotations.h Outdated
Comment thread onnxruntime/core/framework/layering_annotations.h Outdated
Comment thread onnxruntime/core/framework/layering_annotations.cc Outdated
…ove naming

- LayeringRuleMatcher: rule_count → annotation_rule_count (std::optional<size_t>)
  nullopt = index all rules, 0 = index none (name-only configs)
- SubstringMatcher: rule_index_offset → name_rules_start_index
  Clearer: it's the starting index of name-based rules in the merged vector
- Updated all doc comments to match new semantics and naming
- Fixed: name-only config (annotation_rule_count=0) no longer leaks
  name-based patterns into the annotation trie

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 9 out of 9 changed files in this pull request and generated 2 comments.

Comment thread onnxruntime/core/framework/layering_annotations.cc Outdated
Comment thread onnxruntime/core/framework/layering_annotations.cc Outdated
…lly exclusive

Annotation-based matching uses sparse metadata with subgraph inheritance.
Name-based matching uses dense node names without inheritance. The two
modes have incompatible semantics, so combining them is now rejected with
INVALID_ARGUMENT.

- Remove merged-rules/offset machinery (annotation_rule_count, name_rules_start_index)
- ProcessGraph branches cleanly: name-based uses SubstringMatcher only,
  annotation-based uses LayeringRuleMatcher with inheritance
- Update config key docs, design docs, and user-facing docs
- Replace AnnotationTakesPriorityOverNameBased test with
  MutualExclusivityRejectsBothConfigs test

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 9 out of 9 changed files in this pull request and generated 2 comments.

Comment thread docs/annotated_partitioning/future_directions_constrained_env.md Outdated
Comment thread docs/annotated_partitioning/future_directions_constrained_env.md
@yuslepukhin yuslepukhin requested a review from Copilot June 11, 2026 17:01

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 9 out of 9 changed files in this pull request and generated 2 comments.

Comment thread docs/annotated_partitioning/future_directions_constrained_env.md Outdated
Comment thread docs/annotated_partitioning/future_directions_constrained_env.md Outdated

@tianleiwu tianleiwu left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re-reviewed the latest head. The previous round's concern about LayeringIndex::Update() not applying substring_matcher_ (causing inconsistent name-based assignment for nodes created during layout transforms) is now addressed and covered by UpdateAppliesSubstringMatcherToNewNodes — that thread is resolved.

The name-based partitioning implementation is clean and well-documented. A few minor items remain:

  • Test gap: No test exercises the new name-based = (exact-match) rejection error path. A small test asserting INVALID_ARGUMENT for a name-based config containing = (e.g. gpu(=Attention)) would lock in this validation, mirroring MutualExclusivityRejectsBothConfigs. (See inline comment.)
  • Nitpick: matcher_ (the LayeringRuleMatcher trie) is built unconditionally in the LayeringIndex constructor even in name-based mode, where it is never consulted. It could be made conditional if the construction cost ever matters for large rule sets.
  • Nitpick: The production factory inlines the LayeringIndex(...) constructor + ProcessGraph instead of delegating to the new static Create(graph, ep_map, rule_map, rules, substring_matcher) overload that currently exists only for tests. Reusing the overload would keep the two construction sites consistent.

None of these are blocking.

Comment thread onnxruntime/core/framework/layering_annotations.cc
@yuslepukhin yuslepukhin merged commit 29616be into main Jun 12, 2026
89 of 90 checks passed
@yuslepukhin yuslepukhin deleted the yuslepukhin/future_directions branch June 12, 2026 20:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants