@junparser
Contributor

No description provided.

@junparser
Contributor Author

@serge-sans-paille [Linux build / gcc 11 - avx512vnni (pull_request)](https://github.com/xtensor-stack/xsimd/actions/runs/14120454206/job/39559599836?pr=1102) looks weird. I can't reproduce it with gcc & gcc12; what are the cflags used here?

@junparser
Contributor Author

> @serge-sans-paille [Linux build / gcc 11 - avx512vnni (pull_request)](https://github.com/xtensor-stack/xsimd/actions/runs/14120454206/job/39559599836?pr=1102) looks weird. I can't reproduce it with gcc & gcc12; what are the cflags used here?

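This is caused by using `_mm512_shuffle_epi8` for batch swizzle in avx512bw, which is incorrect: that intrinsic only shuffles bytes within each 128-bit lane, so it cannot express a swizzle that moves bytes across lanes. Use `_mm512_permutexvar_epi8` from avx512vbmi instead.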
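To make the difference concrete, here is a minimal scalar sketch of the two intrinsics' documented semantics. It is not the xsimd implementation; the `model_*` functions and the helper names are hypothetical and exist only for illustration, so the comparison can be run on a machine without AVX-512.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

using bytes64 = std::array<std::uint8_t, 64>;

// Scalar model of _mm512_shuffle_epi8(a, idx) (hypothetical helper):
// every 16-byte lane is shuffled independently, only the low 4 bits of
// each index are used, and a set high bit zeroes the output byte.
inline bytes64 model_shuffle_epi8(const bytes64& a, const bytes64& idx)
{
    bytes64 out{};
    for (std::size_t i = 0; i < 64; ++i)
    {
        const std::size_t lane_base = i & ~static_cast<std::size_t>(15);
        out[i] = (idx[i] & 0x80) ? static_cast<std::uint8_t>(0)
                                 : a[lane_base + (idx[i] & 0x0F)];
    }
    return out;
}

// Scalar model of _mm512_permutexvar_epi8(idx, a) (hypothetical helper):
// the low 6 bits of each index select any of the 64 source bytes.
inline bytes64 model_permutexvar_epi8(const bytes64& idx, const bytes64& a)
{
    bytes64 out{};
    for (std::size_t i = 0; i < 64; ++i)
    {
        out[i] = a[idx[i] & 0x3F];
    }
    return out;
}

// Source vector holding 0, 1, ..., 63.
inline bytes64 iota_bytes()
{
    bytes64 v{};
    for (std::size_t i = 0; i < 64; ++i)
    {
        v[i] = static_cast<std::uint8_t>(i);
    }
    return v;
}

// Index vector for a full 64-byte reversal: 63, 62, ..., 0.
inline bytes64 reversed_indices()
{
    bytes64 v{};
    for (std::size_t i = 0; i < 64; ++i)
    {
        v[i] = static_cast<std::uint8_t>(63 - i);
    }
    return v;
}
```

With a full reversal as the swizzle mask, the permute model moves byte 63 to position 0, while the shuffle model can only reverse each 16-byte lane in place, so position 0 receives byte 15.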

@junparser
Contributor Author

@serge-sans-paille, we now get `CMake Error at benchmark/CMakeLists.txt:12 (cmake_minimum_required): Compatibility with CMake < 3.5 has been removed from CMake.` It seems we hit a CMake issue. Would you please take a look?

@serge-sans-paille
Contributor

@junparser #1104 should do the trick

@junparser
Contributor Author

> @junparser #1104 should do the trick

Also, after this commit, would you like to add one more job, gcc 11 - avx512vbmi2, to CI?

@serge-sans-paille
Contributor

> gcc 11 - avx512vbmi2

#1105

@serge-sans-paille
Contributor

@junparser
Contributor Author

junparser commented Apr 3, 2025

Thanks, this PR now passes avx512vbmi2. Also, is it possible to add something like Linux build / clang18/20 - avx512vbmi2 to CI?


  using all_x86_architectures = arch_list<
-     avx512vnni<avx512vbmi>, avx512vbmi, avx512ifma, avx512pf, avx512vnni<avx512bw>, avx512bw, avx512er, avx512dq, avx512cd, avx512f,
+     avx512vnni<avx512vbmi2>, avx512vbmi2, avx512vbmi, avx512ifma, avx512pf, avx512vnni<avx512bw>, avx512bw, avx512er, avx512dq, avx512cd, avx512f,
Contributor

Can you motivate this change? Why would composing AVX512 VNNI with VBMI2 be more legitimate than composing it with VBMI?

Contributor Author

@junparser, Apr 3, 2025

I made this change based on https://en.wikichip.org/wiki/x86/avx512_vnni. The table there shows that all of the architectures that have VNNI also have VBMI2.
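The reason the position of an entry in the list matters can be sketched with a toy model. This assumes the list is ordered most capable first and that dispatch picks the first entry the CPU supports; the `toy_*` types, the `feature` bits and `first_supported` are all hypothetical and are not xsimd's API.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical feature bits standing in for runtime CPU detection.
enum feature : std::uint32_t
{
    F = 1u, BW = 2u, VBMI = 4u, VBMI2 = 8u, VNNI = 16u
};

// Hypothetical tags; each names an architecture and the features it needs.
struct toy_avx512bw
{
    static const char* name() { return "avx512bw"; }
    static std::uint32_t required() { return F | BW; }
};
struct toy_avx512vbmi
{
    static const char* name() { return "avx512vbmi"; }
    static std::uint32_t required() { return F | BW | VBMI; }
};
struct toy_avx512vbmi2
{
    static const char* name() { return "avx512vbmi2"; }
    static std::uint32_t required() { return F | BW | VBMI | VBMI2; }
};
struct toy_avx512vnni_vbmi2
{
    static const char* name() { return "avx512vnni<avx512vbmi2>"; }
    static std::uint32_t required() { return F | BW | VBMI | VBMI2 | VNNI; }
};

// Toy arch list: walks the entries in order and returns the first one
// whose required features are all available.
template <class... Archs>
struct toy_arch_list;

template <>
struct toy_arch_list<>
{
    static std::string first_supported(std::uint32_t) { return "generic"; }
};

template <class Head, class... Tail>
struct toy_arch_list<Head, Tail...>
{
    static std::string first_supported(std::uint32_t available)
    {
        if ((available & Head::required()) == Head::required())
        {
            return Head::name();
        }
        return toy_arch_list<Tail...>::first_supported(available);
    }
};

// Ordered most capable first, mirroring the shape of the new list above.
using toy_x86_architectures = toy_arch_list<
    toy_avx512vnni_vbmi2, toy_avx512vbmi2, toy_avx512vbmi, toy_avx512bw>;
```

Under these assumptions, a CPU reporting VNNI together with VBMI2 resolves to the composed entry, while a CPU with only VBMI falls through to the plain `avx512vbmi` entry.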

@serge-sans-paille merged commit 4b182f4 into xtensor-stack:master on Apr 3, 2025
60 checks passed