
Conversation

@degasus (Contributor) commented Jul 21, 2025

  • Use `_maskz_permutexvar` instead of permute + `and` on AVX512BW, too
  • Use `xsimd::make_batch_constant`, and thus `_mm512_set`, instead of `_mm512_load(constexpr std::array)`

The first one is just a tiny optimization already done in the AVX512VBMI implementation.
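A minimal sketch of that difference, not the actual xsimd code (the function and parameter names here are illustrative only): with AVX512BW, the zero-masking variant of the permute can clear the unwanted lanes itself, so the separate AND disappears.

```cpp
#include <immintrin.h>

// Before: permute, then clear the unwanted 16-bit lanes with an extra AND.
__m512i shuffle_then_and(__m512i data, __m512i idx, __m512i lane_mask)
{
    __m512i permuted = _mm512_permutexvar_epi16(idx, data);
    return _mm512_and_si512(permuted, lane_mask);
}

// After: let the permute zero the masked-out lanes directly (zero-masking),
// saving the separate AND instruction. Available with AVX512BW.
__m512i shuffle_maskz(__m512i data, __m512i idx, __mmask32 keep)
{
    return _mm512_maskz_permutexvar_epi16(keep, idx, data);
}
```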

The second patch makes it easier for the compiler to move these constants into the .text section. GCC is not affected, but MSVC sadly is, a lot: it used to generate eight `mov [rbp + N], const` stores followed by a single `vmovdqu32 zmm, [rbp]` load - poor MSVC... And even worse for the store-forwarding unit...
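A rough before/after sketch of that codegen issue, again illustrative only (the index values and function names are placeholders, not the xsimd source):

```cpp
#include <immintrin.h>
#include <array>
#include <cstdint>

// Before: a constexpr array, loaded through a memory operand. MSVC
// materialized it as eight `mov [rbp + N], const` stores plus one
// vmovdqu32 load, which also hits the store-forwarding unit.
__m512i constant_via_load()
{
    alignas(64) static constexpr std::array<uint32_t, 16> k = {
        0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15
    };
    return _mm512_load_si512(k.data());
}

// After: express the constant with a `set` intrinsic (per the PR, this is
// what xsimd::make_batch_constant ends up emitting here), so the compiler
// can constant fold it and keep the value out of the stack frame.
// Note that _mm512_set_epi32 takes its arguments highest element first.
__m512i constant_via_set()
{
    return _mm512_set_epi32(15, 14, 13, 12, 11, 10, 9, 8,
                            7, 6, 5, 4, 3, 2, 1, 0);
}
```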

degasus added 2 commits July 21, 2025 20:18
This patch picks the instructions from avx512vbmi for the fast path.
Masking is faster than an additional AND instruction.
Instead of loading from an aligned array on the stack, emit the `set` intrinsic rather than the `load` intrinsic, which makes it easier for the compiler to constant fold these parts.

Sadly, MSVC needs this....
@serge-sans-paille serge-sans-paille merged commit 45cbf4b into xtensor-stack:master Jul 22, 2025
63 checks passed
2 participants