Skip to content

hn/loop-order-cost#521

Draft
HoangNguyen0309 wants to merge 4 commits into
mainfrom
hn/loop-order-cost
Draft

hn/loop-order-cost#521
HoangNguyen0309 wants to merge 4 commits into
mainfrom
hn/loop-order-cost

Conversation

@HoangNguyen0309

Copy link
Copy Markdown
Member

loop order cost port

match node:
case Table(Alias() as tns, idxs):
base = stats_bindings.get(tns)
st = stats_factory.relabel(base, tuple(idxs))

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You probably don't need this relabel call.

return penalty


def loop_order_cost(

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function is unfortunately not quite right. I had a version that worked like this a while back in previous iterations of the julia code. However, it suffers from a specific problem:

Suppose you have the query:
E_ijklm = A_ij * B_jk * C_kl * D_lm
Further, suppose that D is entirely zeros (i.e. it's empty). Then, the output will always be empty. This means that your full_stats object will represent an empty tensor, so when you project it down, it will still be empty. Because of this, the i->j->k->l->m and m->l->k->j->i order will look identical. However, in execution, the i->j->k->l->m order will be asymptotically worse because it will basically do the matmul between A and B in the first 3 loops, O(n^3), before realizing that the l,m loops are empty. The m->l->k->j->i order would simply exit immediately in O(1) time.

To fix this, you need to consider the sub-query that a prefix induces. When considering i-j-k, you should estimate the nnz of A_ijB_jkC_k. In the julia version, it does this in the function get_loop_lookups (https://github.com/finch-tensor/Finch.jl/blob/8b6554b2c73668670037bfb8413425035589d915/src/Galley/TensorStats/cost-estimates.jl#L15). However, it also relies on the function get_conjuncts_and_disjuncts (https://github.com/finch-tensor/Finch.jl/blob/8b6554b2c73668670037bfb8413425035589d915/src/Galley/PlanAST/plan.jl#L514) in order to figure out which input tensors in the expression are acting as 'conjuncts' (i.e. join-like arguments) and which ones are acting as 'disjuncts' (i.e. union-like arguments).

So, to carry over the key logic, you'll have to port both of those functions over.

@codspeed-hq

codspeed-hq Bot commented Jul 2, 2026

Copy link
Copy Markdown

Merging this PR will not alter performance

✅ 32 untouched benchmarks


Comparing hn/loop-order-cost (171b5b0) with main (314ba6a)

Open in CodSpeed

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants