Currently, to check whether a value is True or False, we need to compare it to either True or False.
In both the JIT and the interpreter (although this is worse in the JIT), we need to load a full 8-byte value from the instruction stream in order to perform the comparison.
Here's the x86 stencil for _GUARD_IS_TRUE_POP (with TOS caching):
// 0: 48 b8 00 00 00 00 00 00 00 00 movabsq $0x0, %rax
// 0000000000000002: R_X86_64_64 _Py_TrueStruct+0x1
// a: 4c 39 f8 cmpq %r15, %rax
// d: 0f 85 00 00 00 00 jne exit
By putting True and False in an aligned array:
struct _booleans {
    PyLongObject False;
    PyLongObject True;
};
alignas(sizeof(struct _booleans)) struct _booleans _PyBooleans = {
    /* Data for False */
    /* Data for True */
};
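we can use the alignment to distinguish True from False by testing a single bit, ref.bits & sizeof(PyLongObject): because the struct is aligned to its own size, that bit is clear in the address of False and set in the address of True. This results in a much shorter stencil for _GUARD_IS_TRUE_POP: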
// 0: 41 f6 c7 20 testb $0x20, %r15b
// 4: 0f 84 00 00 00 00 je exit
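To make the bit test concrete, here is a minimal standalone sketch of the same idea in plain C11. It does not use CPython's headers: demo_object is a hypothetical 32-byte stand-in for PyLongObject, and demo_booleans, demo_bools and demo_is_true are illustrative names, not part of any existing API. The check is only meaningful for a pointer already known to be one of the two booleans, which is the situation the guard is in.

```c
#include <assert.h>
#include <stdalign.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for PyLongObject: exactly 32 bytes. */
typedef struct {
    uint64_t words[4];
} demo_object;

/* The single-bit test needs the object size to be a power of two. */
static_assert((sizeof(demo_object) & (sizeof(demo_object) - 1)) == 0,
              "object size must be a power of two");

struct demo_booleans {
    demo_object false_obj;  /* offset 0 */
    demo_object true_obj;   /* offset sizeof(demo_object) */
};

/* Aligning the pair to its own size means the sizeof(demo_object) bit of
 * the address is 0 for false_obj and 1 for true_obj. */
static alignas(sizeof(struct demo_booleans)) struct demo_booleans demo_bools;

/* Valid only when known_bool points at one of the two demo booleans. */
static bool demo_is_true(const void *known_bool)
{
    return ((uintptr_t)known_bool & sizeof(demo_object)) != 0;
}
```

With this layout, demo_is_true(&demo_bools.true_obj) holds and demo_is_true(&demo_bools.false_obj) does not, without loading either address as a constant.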
Linked PRs