Skip to content

XPath optimize preparation for positional predicates that uses last()#339

Open
tompng wants to merge 1 commit into
ruby:masterfrom
tompng:more_positional_predicates
Open

XPath optimize preparation for positional predicates that uses last()#339
tompng wants to merge 1 commit into
ruby:masterfrom
tompng:more_positional_predicates

Conversation

@tompng

@tompng tompng commented Jun 20, 2026

Copy link
Copy Markdown
Member

Expand optimizable simple positional queries to support [last()] [last()-N] and [position() <=> last()-N].
This will open the door for future optimization.

Simple positional predicates that may be optimized are represented as:

Before: [:==, 1], [:<, 2], [:>, 3]
↓
After: [:index_eq, 0], [:index_lt, 1], [:index_gt, 2]
Added: [:reverse_index_eq, 0], [:reverse_index_lt, 2], [:reverse_index_gt, 4]

Example

Example optimization of preceding-sibling/following-sibling (not included in this PR)

def preceding_following_sibling(nodeset, tester, selector, reverse:)
  ...
  when :reverse_index_eq, :reverse_index_lt, :reverse_index_gt
    nodeset.group_by(&:parent).flat_map do |parent, sibling_nodes|
      anchors = Set.new.compare_by_identity
      sibling_nodes.each {|sibling| anchors << sibling }
      children = parent.children
      children = children.reverse if reverse

      # Different anchor node gives the same reverse-index. We only need to check with the first anchor
      followings = children.drop_while {|child| !anchors.include?(child) }.drop(1)
      candidates = followings.select(&tester).reverse
      case operator
      when :reverse_index_eq
        matched = candidates[value] if value >= 0
        matched ? [matched] : []
      when :reverse_index_lt
        value >= 0 ? candidates[0...value] : []
      when :reverse_index_gt
        value >= 0 ? candidates.drop(value + 1) : candidates
      end
    end
  when :nodesets
    ...
  end
end
doc = REXML::Document.new('<root>'+('<a>'+'<b/><c/>'*500+'</a>')*2+'</root>')
REXML::XPath.match(doc, '//c/preceding-sibling::*[position()<last()-500]').size
# processing time:  9.740037s → 0.059448s
# => 996

Copilot AI review requested due to automatic review settings June 20, 2026 06:25

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR expands the internal representation of “simple positional predicates” in the XPath parser to support last()-based forms (e.g., [last()], [last()-N], and comparisons against last()-N), enabling future axis-scan optimizations while preserving correct XPath behavior.

Changes:

  • Extends position_operation to recognize last()/last()-INTEGER and to normalize positional predicates into 0-based :index_* and :reverse_index_* selector forms.
  • Updates sibling-axis scanning and the generic slow-path selector application to operate on the new selector operators/0-based indexing.
  • Adds test coverage for inverted position() comparisons and last()-based predicates on preceding-/following-sibling axes.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.

File Description
lib/rexml/xpath_parser.rb Adds last()-aware predicate detection and migrates selector operators to 0-based :index_* / :reverse_index_*, updating scan/slow-path handling accordingly.
test/xpath/test_axis_preceding_sibling.rb Adds assertions covering last()/last()-N and inverted position() comparisons for sibling axes.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Expand optimizable simple positional queries to support `[last()]` `[last()-N]` and `[position() <=> last()-N]`.
This will open the door for future optimization
@tompng tompng force-pushed the more_positional_predicates branch from 3a226e6 to e789f1c Compare June 23, 2026 15:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants