feat: add xsimd::get<I>() for compile-time element extraction
Introduces get<I>(batch) as a top-level API for extracting a single lane
at a compile-time index. Falls back to the runtime get() when per-arch
overloads aren't present.
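
The fallback described above can be illustrated with a small, self-contained sketch. It does not use xsimd itself: `toy_batch`, `generic_arch`, `fast_arch`, and the `kernel::get` overloads are hypothetical stand-ins, and the assumption is only that per-arch kernels are picked by overload resolution on an architecture tag, with the generic kernel forwarding to the runtime-indexed get().

```cpp
#include <array>
#include <cstddef>
#include <type_traits>

namespace toy {

// Hypothetical stand-in for a SIMD batch; not xsimd's real batch type.
template <class T, std::size_t N>
struct toy_batch {
    std::array<T, N> lanes;
    // Runtime-indexed extraction: the generic fallback path.
    T get(std::size_t i) const { return lanes[i]; }
};

// Hypothetical architecture tags. fast_arch derives from generic_arch so a
// call tagged fast_arch can still bind to a generic_arch overload.
struct generic_arch {};
struct fast_arch : generic_arch {};

namespace kernel {

// Generic kernel: no per-arch overload exists, so forward to runtime get().
template <std::size_t I, class T, std::size_t N>
T get(const toy_batch<T, N>& b,
      std::integral_constant<std::size_t, I>, generic_arch) {
    return b.get(I);
}

// "Per-arch" kernel, provided only for 4 x int on fast_arch. The index is a
// template parameter, so it can feed an API that needs a constant index.
template <std::size_t I>
int get(const toy_batch<int, 4>& b,
        std::integral_constant<std::size_t, I>, fast_arch) {
    return std::get<I>(b.lanes);
}

} // namespace kernel

// Top-level entry point: checks the index at compile time, then dispatches.
// The exact-match fast_arch overload wins when it exists; otherwise the
// derived-to-base conversion selects the generic kernel.
template <std::size_t I, class Arch = fast_arch, class T, std::size_t N>
T get(const toy_batch<T, N>& b) {
    static_assert(I < N, "lane index out of range");
    return kernel::get(b, std::integral_constant<std::size_t, I>{}, Arch{});
}

} // namespace toy
```

Under these assumptions, `toy::get<2>(b)` on a `toy_batch<int, 4>` holding {10, 20, 30, 40} resolves to the specialized overload and yields 30, while a `toy_batch<float, 4>` has no such overload and resolves to the generic kernel.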
Per-arch optimal lowerings:
- SSE2: pextrw / byte-shift+movd / swizzle+first by lane width.
- SSE4.1: pextrb/w/d/q; I==0 short-circuits to first().
- AVX: I==0 short-circuits to first(); else halve + SSE4.1 path.
- AVX-512F: I==0 short-circuits to first(); 32/64-bit lanes use
  valignd/valignq + first() (2 ops); 8/16-bit halve through AVX.
- NEON / NEON64 / RVV: native single-lane extract intrinsics.
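
As a rough illustration of the SSE2 entry, the sketch below shows why a compile-time index helps: `_mm_extract_epi16` (pextrw) and `_mm_srli_si128` (byte shift) both take an immediate operand. This is not the code from this commit. The helper names (`extract_i16`, `extract_i32`, `lane_i16`, `lane_i32`) are hypothetical, and pairing pextrw with 16-bit lanes and byte-shift + movd with 32-bit lanes is one plausible reading of the list above. A scalar path is included so the sketch also builds on non-x86 targets.

```cpp
#include <cstdint>
#if defined(__SSE2__) || defined(_M_X64)
#include <emmintrin.h>
#define SKETCH_HAS_SSE2 1
#endif

#ifdef SKETCH_HAS_SSE2
// 16-bit lanes: pextrw takes the lane index as an immediate.
template <int I>
inline std::int16_t extract_i16(__m128i v) {
    static_assert(I >= 0 && I < 8, "lane index out of range");
    return static_cast<std::int16_t>(_mm_extract_epi16(v, I));
}

// 32-bit lanes: shift the wanted lane down to position 0, then movd.
// I == 0 skips the shift, mirroring the first() short-circuit.
template <int I>
inline std::int32_t extract_i32(__m128i v) {
    static_assert(I >= 0 && I < 4, "lane index out of range");
    if (I == 0) return _mm_cvtsi128_si32(v);
    return _mm_cvtsi128_si32(_mm_srli_si128(v, I * 4));
}
#endif

// Portable wrappers (hypothetical): load from memory and extract lane I,
// using the SSE2 helpers when available and plain indexing otherwise.
template <int I>
inline std::int16_t lane_i16(const std::int16_t (&data)[8]) {
#ifdef SKETCH_HAS_SSE2
    return extract_i16<I>(
        _mm_loadu_si128(reinterpret_cast<const __m128i*>(data)));
#else
    return data[I];
#endif
}

template <int I>
inline std::int32_t lane_i32(const std::int32_t (&data)[4]) {
#ifdef SKETCH_HAS_SSE2
    return extract_i32<I>(
        _mm_loadu_si128(reinterpret_cast<const __m128i*>(data)));
#else
    return data[I];
#endif
}
```

Because `I` is a template parameter, an out-of-range index is rejected by `static_assert` at compile time instead of being checked at run time.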