Add BNB_ROCM_VERSION and ROCM_VERSION for ROCm/PyTorch version mismatch #1878

Merged
matthewdouglas merged 2 commits into bitsandbytes-foundation:main from lucbruni-amd:rocm-version-override
Feb 20, 2026

@lucbruni-amd (Contributor) commented Feb 18, 2026

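When PyTorch is built against a different ROCm version than the one installed on the system (e.g. torch+rocm7.0 on ROCm 7.2), bitsandbytes fails to find its native library: the build takes the version in the library name from hipconfig (the system ROCm), while the runtime looks the library up using torch.version.hip (the ROCm version PyTorch was built with).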
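The mismatch can be illustrated with a minimal sketch. It assumes the library name embeds a major+minor shortcode, as the `libbitsandbytes_rocm70.so` example in this PR suggests. The helper names `rocm_shortcode` and `library_name` and the version strings are hypothetical, chosen for illustration; they are not bitsandbytes APIs.

```python
def rocm_shortcode(version: str) -> str:
    """Turn a ROCm/HIP version string such as '7.2.1' into a shortcode such as '72'."""
    major, minor = version.split(".")[:2]
    return f"{major}{minor}"


def library_name(version: str) -> str:
    # Hypothetical naming helper that mirrors the libbitsandbytes_rocm70.so pattern.
    return f"libbitsandbytes_rocm{rocm_shortcode(version)}.so"


# Assumed values: the system reports ROCm 7.2, PyTorch was built against ROCm 7.0.
built = library_name("7.2.0")   # name produced at build time (from hipconfig)
wanted = library_name("7.0.0")  # name searched for at runtime (from torch.version.hip)

# The two names differ, so the lookup for the native library fails.
print(built, wanted, built == wanted)
```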

  • Add BNB_ROCM_VERSION env var (runtime): override which ROCm library is loaded, analogous to BNB_CUDA_VERSION. Takes priority when both BNB_ROCM_VERSION and BNB_CUDA_VERSION are set on ROCm.

  • Add ROCM_VERSION CMake cache variable (build): override the version shortcode in the output library name (e.g. -DROCM_VERSION=70 produces libbitsandbytes_rocm70.so on a 7.2 system).

  • Update diagnostics and error messages to mention BNB_ROCM_VERSION; align _print_hip_runtime_diagnostics with _print_cuda_runtime_diagnostics.

  • Reject BNB_CUDA_VERSION on ROCm with a clear error pointing to BNB_ROCM_VERSION if the latter is not set.

  • Add ROCm tests: default path, override, rejection of BNB_CUDA_VERSION, and both vars set (BNB_ROCM_VERSION wins).

Fixes "Bug when there is a mismatch between Torch and ROCm version" (ROCm/bitsandbytes#82).
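The runtime precedence described in the list above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code merged in this PR: the function name `resolve_rocm_shortcode`, the `RuntimeError` exception type, the error wording, and the shortcode format of the override value (e.g. `"72"`) are all hypothetical.

```python
from typing import Mapping, Optional


def resolve_rocm_shortcode(torch_hip_version: str, env: Mapping[str, str]) -> str:
    """Pick the ROCm shortcode used to locate the native library (illustrative only)."""
    rocm_override: Optional[str] = env.get("BNB_ROCM_VERSION")
    cuda_override: Optional[str] = env.get("BNB_CUDA_VERSION")

    if rocm_override:
        # BNB_ROCM_VERSION takes priority, even if BNB_CUDA_VERSION is also set.
        return rocm_override
    if cuda_override:
        # On ROCm, BNB_CUDA_VERSION alone is rejected with a pointer to the right variable.
        # The exception type and message here are assumptions.
        raise RuntimeError(
            "BNB_CUDA_VERSION is not supported on ROCm; set BNB_ROCM_VERSION instead."
        )
    # Default path: derive the shortcode from PyTorch's HIP version (major + minor).
    major, minor = torch_hip_version.split(".")[:2]
    return f"{major}{minor}"
```

Passing the environment as a mapping instead of reading `os.environ` directly keeps the four scenarios named in the tests bullet (default, override, rejection, both set) easy to exercise without touching process state; for example, `resolve_rocm_shortcode("7.0.0", {"BNB_ROCM_VERSION": "72"})` selects the `72` build.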

@matthewdouglas matthewdouglas added this to the v0.50.0 milestone Feb 20, 2026
@github-actions
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@matthewdouglas (Member) left a comment

Thanks for the PR!

@matthewdouglas matthewdouglas merged commit 746cd64 into bitsandbytes-foundation:main Feb 20, 2026
84 of 85 checks passed
