@mattip
Member

@mattip mattip commented Jun 26, 2025

OpenBLAS 0.3.30 was recently released, let's use it. Since we now have scipy-openblas64 wheels for win_arm64, use those in the wheel build.

@github-actions github-actions bot added the "36 - Build" (Build related PR) label Jun 26, 2025

@mattip
Member Author

mattip commented Jun 26, 2025

On cirrusCI, in the first cibuildwheel iteration (python3.11) of the macos_arm64 build, these tests are failing, which points to a problem with OpenBLAS. The dtype always seems to be float32, but the tests marked with a * test many dtypes, and float32 appears first in the list of dtypes to try:

test_ufunc_noncontiguous[matmul] with dtype float32*
TestEinsum.test_einsum_sums_float32
TestMethods.test_arr_mult[dot] with float32*
TestMethods.test_arr_mult[matmul] with float32*
TestMatmul.test_matrix_matrix_values with float32*
TestMatmul.test_dot_equivalent_matrix_matrix_blastypes[float32]
TestMatmulOperator.test_matrix_matrix_values with float32*
TestInv.test_generalized_sq_cases with float32*

and more; in total, 44 tests failed.

@mattip
Member Author

mattip commented Jun 27, 2025

Locally, the cirrusCI failure does not reproduce on a MacBook Pro M2 with clang 17.0.0. However, I can locally reproduce the 2 failures from the "transition to github CI" PR #29069. CirrusCI uses clang 16.0.0 and the PR uses clang 15.0.0, so I don't think it is a compiler problem.

@ngoldbaum
Member

I can also reproduce those two failures on my mac.

@mattip
Member Author

mattip commented Jun 28, 2025

It reproduces on scipy-openblas32 as well, and it reproduces all the way back to the first scipy-openblas macos_arm64 packages, 0.3.24.95.0. The warnings are emitted in zgetrf. Building OpenBLAS 0.3.30 with a Homebrew-installed gcc-14.2.0 makes the warnings disappear, and all the tests pass.

@mattip
Member Author

mattip commented Jun 29, 2025

I am stuck.

  • I don't see why the floating point error states are set only in zgetrf, and only when using clang but not when using gcc. Maybe there is a flag missing?
  • I tried to replace clang with gcc in the openblas-libs build in "use gcc from homebrew" (MacPython/openblas-libs#207), but somehow I am setting up the cross-compilation improperly, so the macos-arm64 builds fail.

@martin-frbg any hints how to move past the failures? Do you know off-hand what kernel is used by zgetrf on macos-arm64?

Maybe we can wrap the call and manually reset the flags on macos-arm64?

return LAPACK(zgetrf)(m, n, a, lda, ipiv, info);

@martin-frbg

getrf is reimplemented via ztrsm & zgemm (or as getf2 using zgemv & zscal when the problem size is small enough). If you are building for either Apple M or generic ARMV8, that should mean trsm as generic C code (unlikely to be an issue), with zgemv, zgemm and zscal all using the same assembly that any other non-SVE target uses.
There is no obvious candidate among them that got major changes in 0.3.30, IIRC.

@mattip
Member Author

mattip commented Jun 29, 2025

Thanks. If I go back to 0.3.24, I still see the FPE registers set in zgetrf on that platform in a very small problem (2x2 matrix). I think the OpenBLAS build uses the native clang compilers. I tried using the same homebrew clang compiler as you do in the OpenBLAS CI, but still got the FPE register errors.

@mattip
Member Author

mattip commented Jun 29, 2025

I worked around the problem in a different PR, the one that uses github to build the NumPy wheels, by resetting the registers after a call to zgetrf.

@mattip
Member Author

mattip commented Jul 3, 2025

Replaced by #29069

@mattip mattip closed this Jul 3, 2025