Skip to content

[mypyc] feat: specialize any and all using for loop helpers if possible [1/2]#19948

Open
BobTheBuidler wants to merge 23 commits intopython:masterfrom
BobTheBuidler:any-all-for-helper
Open

[mypyc] feat: specialize any and all using for loop helpers if possible [1/2]#19948
BobTheBuidler wants to merge 23 commits intopython:masterfrom
BobTheBuidler:any-all-for-helper

Conversation

@BobTheBuidler
Copy link
Copy Markdown
Contributor

@BobTheBuidler BobTheBuidler commented Sep 28, 2025

This PR modifies the any and all call specializers to use our for loop helpers in cases where they can be used.

This is only marginally helpful now but will become more impactful with the addition of ForMap and ForFilter helpers, as well as any future helpers.

While the private helpers might seem unnecessary at this point, I intend to use them for future specialization use cases such as list(iterable) and tuple(iterable) as discussed in #19649 , and sum(iterable) which will come shortly after this PR.

This PR is ready for review

Update: The sum(iterable) is now ready in #19950. It's a bit more involved than this one so I'm going to leave it separate.

Update: The helpers are no longer private

Comment thread mypyc/test-data/irbuild-basic.test Outdated
Comment thread mypyc/test-data/irbuild-basic.test Outdated
r12 = list_get_item_unsafe r2, r9
r13 = cast(str, r12)
__mypyc_all_item__ = r13
r14 = CPyStr_IsTrue(__mypyc_all_item__)
Copy link
Copy Markdown
Contributor Author

@BobTheBuidler BobTheBuidler Sep 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The boolean check for the all call is now specialized per the dtype of the iterable, and we no longer have to use python's iterator protocol for the input

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On master this seemed to just call the stdlib all using a vector call? So the change is that we now generate a for loop for this, right?

builder.goto(loop_exit)
builder.activate_block(true_block)

index_type = builder._analyze_iterable_item_type(arg)
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should I deduplicate this block or is this fine?

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be useful to deduplicate, since the only difference seems to be true/false switch, so a helper would just need a bool flag that indicates whether it's any or all. Not a big deal though.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lets just keep it as-is, the intent of the code is explicit and clear

Comment thread mypyc/irbuild/for_helpers.py Outdated

def create_synthetic_nameexpr(index_name: str, index_type: Type) -> NameExpr:
"""This helper spoofs a NameExpr to use as the lvalue in one of the for loop helpers."""
index = NameExpr(index_name)
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I probably need something here to ensure no name collisions globally, but not sure what

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In my other comment I suggested adding _<line>_column suffix. Another thing we use is to use the temp_counter defined in IRBuilder to get an increasing suffix.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should create_synthetic_nameexpr be an IRBuilder method instead of going here then?

I think it could be used to refactor a few other places within the codebase where NameExpr are artificially created

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've implemented the temp_counter but left this function where it is for now

@BobTheBuidler BobTheBuidler changed the title [mypyc] feat: specialize any and all using for loop helpers if possible [mypyc] feat: specialize any and all using for loop helpers if possible [1/2] Oct 1, 2025
@BobTheBuidler BobTheBuidler changed the title [mypyc] feat: specialize any and all using for loop helpers if possible [1/2] [mypyc] feat: specialize any and all using for loop helpers if possible [1/2] Oct 1, 2025
Comment thread mypyc/irbuild/specialize.py Outdated
Comment thread mypyc/irbuild/specialize.py
Comment thread mypyc/irbuild/for_helpers.py Outdated

def create_synthetic_nameexpr(index_name: str, index_type: Type) -> NameExpr:
"""This helper spoofs a NameExpr to use as the lvalue in one of the for loop helpers."""
index = NameExpr(index_name)
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In my other comment I suggested adding _<line>_column suffix. Another thing we use is to use the temp_counter defined in IRBuilder to get an increasing suffix.

arg = expr.args[0]
if isinstance(arg, GeneratorExpr):
return any_all_helper(builder, arg, builder.false, lambda x: x, builder.true)
elif expr_has_specialized_for_helper(builder, arg):
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It took me a while to understand why this check is useful. The idea is that if we'd have a generic for loop (PyObject_GetIter etc.) we can as well call the stdlib function?

What would happen if we'd always created the for loop here, since it would likely avoid the overhead of calling any using generic call op, and the namespace dict lookup? If that sounds feasible, can you run some microbenchmarks to check if it's useful? I think it might be worth it especially if the iterable is small (0 or 1 items), which is quite common in practice.

Copy link
Copy Markdown
Contributor Author

@BobTheBuidler BobTheBuidler Oct 13, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"It took me a while to understand why this check is useful. The idea is that if we'd have a generic for loop (PyObject_GetIter etc.) we can as well call the stdlib function?"

Uh. No. While I appreciate your assessment, it wasn't exactly so thought-out. My only intent was to fall back to the existing logic, whatever that may be, if there is no supported ForHelper.

I need to think about how to implement this

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wait I want to confirm, you're saying if we just remove the if check, we're good to go without any other changes? I suppose that makes sense since the for_loop_helper does have the generic ForGenerator class but I wonder if we're missing any edge cases where that might cause failure

builder.goto(loop_exit)
builder.activate_block(true_block)

index_type = builder._analyze_iterable_item_type(arg)
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be useful to deduplicate, since the only difference seems to be true/false switch, so a helper would just need a bool flag that indicates whether it's any or all. Not a big deal though.

Comment thread mypyc/test-data/irbuild-basic.test Outdated
Comment thread mypyc/test-data/irbuild-basic.test Outdated
r12 = list_get_item_unsafe r2, r9
r13 = cast(str, r12)
__mypyc_all_item__ = r13
r14 = CPyStr_IsTrue(__mypyc_all_item__)
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On master this seemed to just call the stdlib all using a vector call? So the change is that we now generate a for loop for this, right?

if desc:
obj = builder.accept(obj_expr)
return builder.primitive_op(desc, [obj], expr.line)
if node := type_expr.node:
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was definitely not necessary but it bothered me how ugly that snippet was

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants