Description
Note
Developers who want to run PyTorch deep learning workloads need to install only the drivers and pip install PyTorch wheels binaries. The runtime package for the Intel® Deep Learning Essentials is installed automatically during the pip installation of the PyTorch wheels binaries.
— Intel
Dr. Suarez found CTranslate2 on stream through cibuildwheel. My guess is that OpenBLAS is deprecated. I have no experience with oneDNN or any other oneAPI resource; the only one I know of is Level Zero, and I haven't used it yet.
Found this; this file may be of interest.
Important
Developers building PyTorch from source code need to install both the driver and Intel Deep Learning Essentials.
— Intel
Instead of the installer shown above, I'm using the standalone installer available for the compiler. If Intel's Deep Neural Network Library and Math Kernel Library are useful, please comment below.
$ ocloc query CL_DEVICE_EXTENSIONS
cl_ext_float_atomics cl_intel_accelerator cl_intel_command_queue_families cl_intel_device_attribute_query cl_intel_driver_diagnostics cl_intel_mem_force_host_memory cl_intel_required_subgroup_size cl_intel_spirv_subgroups cl_intel_split_work_group_barrier cl_intel_subgroup_local_block_io cl_intel_subgroups cl_intel_subgroups_char cl_intel_subgroups_long cl_intel_subgroups_short cl_intel_unified_shared_memory cl_khr_byte_addressable_store cl_khr_create_command_queue cl_khr_device_uuid cl_khr_extended_bit_ops cl_khr_external_memory cl_khr_fp16 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_icd cl_khr_il_program cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_integer_dot_product cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_priority_hints cl_khr_spir cl_khr_spirv_linkonce_odr cl_khr_spirv_no_integer_wrap_decoration cl_khr_subgroup_ballot cl_khr_subgroup_clustered_reduce cl_khr_subgroup_extended_types cl_khr_subgroup_non_uniform_arithmetic cl_khr_subgroup_non_uniform_vote cl_khr_subgroup_shuffle cl_khr_subgroup_shuffle_relative cl_khr_suggested_local_work_size cl_khr_throttle_hints
$ ocloc query OCL_DRIVER_VERSION
1.0.032413
$ ocloc query CL_DEVICE_OPENCL_C_ALL_VERSIONS
"OpenCL C":1.0.0 "OpenCL C":1.1.0 "OpenCL C":1.2.0 "OpenCL C":3.0.0
$ ocloc query CL_DEVICE_OPENCL_C_FEATURES
__opencl_c_atomic_order_acq_rel:3.0.0 __opencl_c_atomic_order_seq_cst:3.0.0 __opencl_c_atomic_scope_all_devices:3.0.0 __opencl_c_atomic_scope_device:3.0.0 __opencl_c_ext_fp16_global_atomic_load_store:3.0.0 __opencl_c_ext_fp16_global_atomic_min_max:3.0.0 __opencl_c_ext_fp16_local_atomic_load_store:3.0.0 __opencl_c_ext_fp16_local_atomic_min_max:3.0.0 __opencl_c_ext_fp32_global_atomic_add:3.0.0 __opencl_c_ext_fp32_global_atomic_min_max:3.0.0 __opencl_c_ext_fp32_local_atomic_add:3.0.0 __opencl_c_ext_fp32_local_atomic_min_max:3.0.0 __opencl_c_generic_address_space:3.0.0 __opencl_c_int64:3.0.0 __opencl_c_integer_dot_product_input_4x8bit:3.0.0 __opencl_c_integer_dot_product_input_4x8bit_packed:3.0.0 __opencl_c_program_scope_global_variables:3.0.0 __opencl_c_subgroups:3.0.0 __opencl_c_work_group_collective_functions:3.0.0
$ ocloc query CL_DEVICE_PROFILE
FULL_PROFILE
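The query output above is long enough that checking for one extension by eye is error-prone. A minimal sketch for splitting it up, assuming the output is pasted in as a plain string (the helper names `parse_extensions` and `parse_features` are made up for illustration and are not part of ocloc):

```python
def parse_extensions(output: str) -> set[str]:
    """Split the space-separated output of `ocloc query CL_DEVICE_EXTENSIONS`."""
    return set(output.split())


def parse_features(output: str) -> dict[str, str]:
    """Map each `name:version` token of CL_DEVICE_OPENCL_C_FEATURES to its version."""
    features = {}
    for token in output.split():
        # Split on the last colon so the feature name stays intact.
        name, _, version = token.rpartition(":")
        features[name] = version
    return features


# Shortened samples copied from the output above.
extensions = parse_extensions("cl_khr_fp16 cl_khr_icd cl_khr_spir cl_intel_subgroups")
features = parse_features("__opencl_c_int64:3.0.0 __opencl_c_subgroups:3.0.0")

print("cl_khr_fp16" in extensions)
print(features["__opencl_c_int64"])
```

With the full strings pasted in, a membership test on the set is enough to tell whether a given extension is reported.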
:: initializing oneAPI environment...
Initializing Visual Studio command-line environment...
Visual Studio version 17.13.6 environment configured.
"C:\Program Files (x86)\Microsoft Visual Studio\2022\BuildTools\"
Visual Studio command-line environment initialized for: 'x64'
: compiler -- latest
: debugger -- latest
: dev-utilities -- latest
: dpl -- latest
: ocloc -- latest
: tbb -- latest
: umf -- latest
:: oneAPI environment initialized ::
C:\Program Files (x86)\Intel\oneAPI>ocloc query SUPPORTED_DEVICES
C:\Program Files (x86)\Intel\oneAPI>
$ icx
Intel(R) oneAPI DPC++/C++ Compiler for applications running on Intel(R) 64, Version 2025.1.0 Build 20250317
Copyright (C) 1985-2025 Intel Corporation. All rights reserved.
icx: error: no input files
$ icpx
icpx: error: no input files
Given that C:\Program Files (x86)\Intel\oneAPI\compiler\2025.1\bin\common_clang64.dll ships with the compiler, does this mean icx/icpx is clang-compatible? Is it usable in other projects? That aside, I would really like to use it if it includes features that facilitate hardware acceleration.
... As a continuous effort, more performance tuning and optimizations will be added into Intel oneAPI LLVM-based compilers and GCC compilers for Intel CPUs AVX-512 and AVX-512-FP16/VNNI ISA and Intel GPUs Gen12 ISA.
— Intel
Because Visual Studio Build Tools 2022 is not available on Linux, compilation with icx or icpx fails there whenever the code needs vcruntime.h. I also noticed that the -std= option, which icx accepts as expected on Linux, seems to be -Qstd= with icx on Windows.
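For a build script that has to run on both systems, the flag difference noted above could be handled with something like the following. This is only a sketch based on what I saw in this thread; `std_flag` is a hypothetical helper, and I have not checked it against Intel's documentation:

```python
import sys


def std_flag(standard: str, platform: str = sys.platform) -> str:
    """Return the language-standard flag for icx on the given platform.

    Assumption taken from the observation above: -Qstd= on Windows,
    -std= everywhere else.
    """
    prefix = "-Qstd=" if platform.startswith("win") else "-std="
    return prefix + standard


print(std_flag("c++17", "win32"))
print(std_flag("c++17", "linux"))
```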
https://intel.github.io/intel-extension-for-pytorch/
Note: The current implementation of the DPC++ extension only supports Linux.
— Intel
As for pufferlib (bbd22d): with device = xpu set in pufferlib/config/ocean/target.ini, Linux reports AssertionError: Torch not compiled with XPU enabled, which suggests XPU support is possible. Windows is officially unsupported as of 2.0; after pip install -e . --break-system-packages, I get LINK : error LNK2001: unresolved external symbol PyInit_ocean\target\binding with this merge commit.
I followed Intel's install instructions through their tutorial and example. Both pip install commands completed successfully, seemingly without hitting any Known Issue:
C:\Program Files (x86)\Intel\oneAPI>python -c "import torch; import intel_extension_for_pytorch as ipex; print(torch.__version__); print(ipex.__version__); [print(f'[{i}]: {torch.xpu.get_device_properties(i)}') for i in range(torch.xpu.device_count())];"
[W530 13:29:29.000000000 OperatorEntry.cpp:161] Warning: Warning only once for all operators, other operators may also be overridden.
Overriding a previously registered kernel for the same operator and the same dispatch key
operator: aten::geometric_(Tensor(a!) self, float p, *, Generator? generator=None) -> Tensor(a!)
registered at C:\actions-runner\_work\pytorch\pytorch\pytorch\build\aten\src\ATen\RegisterSchema.cpp:6
dispatch key: XPU
previous kernel: registered at C:\actions-runner\_work\pytorch\pytorch\pytorch\aten\src\ATen\VmapModeRegistrations.cpp:37
new kernel: registered at I:\frameworks.ai.pytorch.ipex-gpu\build\Release\csrc\gpu\csrc\gpu\xpu\ATen\RegisterXPU_0.cpp:186 (function operator ())
2.7.0+xpu
2.7.10+xpu
C:\Users\jayg8\AppData\Local\Programs\Python\Python313\Lib\site-packages\torch\xpu\__init__.py:60: UserWarning: XPU device count is zero! (Triggered internally at C:\actions-runner\_work\pytorch\pytorch\pytorch\c10\xpu\XPUFunctions.cpp:115.)
return torch._C._xpu_getDeviceCount()
C:\Program Files (x86)\Intel\oneAPI>
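The one-liner above stops at the first failing import, which makes it hard to tell which piece is missing. A more readable sketch of the same check, which reports each step separately (`describe_xpu` is a name made up here; it only uses `torch.xpu.device_count()` and `torch.xpu.get_device_properties()` as in the command above):

```python
import importlib


def describe_xpu() -> list[str]:
    """Collect version and XPU device information without raising on import errors."""
    try:
        torch = importlib.import_module("torch")
    except (ImportError, OSError) as exc:
        return [f"torch import failed: {exc}"]

    lines = [f"torch {torch.__version__}"]

    try:
        ipex = importlib.import_module("intel_extension_for_pytorch")
        lines.append(f"ipex {ipex.__version__}")
    except (ImportError, OSError) as exc:
        lines.append(f"ipex import failed: {exc}")

    # Older torch builds may not ship the torch.xpu module at all.
    xpu = getattr(torch, "xpu", None)
    if xpu is None:
        lines.append("this torch build has no torch.xpu module")
        return lines

    count = xpu.device_count()
    lines.append(f"XPU device count: {count}")
    for i in range(count):
        lines.append(f"[{i}]: {xpu.get_device_properties(i)}")
    return lines


if __name__ == "__main__":
    print("\n".join(describe_xpu()))
```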
(I left torchvision and torchaudio out of both pip install commands, and I'm guessing the Microsoft runtime isn't needed since I'm already using Visual Studio Build Tools 2022.)
On Linux, Intel offers pip, source, and Docker installation options if needed.
May need Level Zero:
garner@linux:~$ python -c "import torch; import intel_extension_for_pytorch as ipex; print(torch.__version__); print(ipex.__version__); [print(f'[{i}]: {torch.xpu.get_device_properties(i)}') for i in range(torch.xpu.device_count())];"
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/usr/local/lib/python3.12/dist-packages/intel_extension_for_pytorch/__init__.py", line 122, in <module>
from .utils._proxy_module import *
File "/usr/local/lib/python3.12/dist-packages/intel_extension_for_pytorch/utils/_proxy_module.py", line 2, in <module>
import intel_extension_for_pytorch._C
ImportError: libze_loader.so.1: cannot open shared object file: No such file or directory
garner@linux:~$
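To check for the missing loader without importing the whole extension, the dynamic linker can be asked directly. A small sketch, assuming the library name from the ImportError above (`can_load` is a hypothetical helper):

```python
import ctypes


def can_load(library: str) -> bool:
    """Return True if the dynamic linker can find and load the named library."""
    try:
        ctypes.CDLL(library)
    except OSError:
        return False
    return True


print("libze_loader.so.1 loadable:", can_load("libze_loader.so.1"))
```

This only shows whether the library is on the loader's search path; it says nothing about whether the version found is the one the extension wants.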
After checking out the level-zero tag v1.9.9 and installing the generated .deb (level-zero_1.9.9+l22.1_amd64.deb):
garner@linux:~$ python -c "import torch; import intel_extension_for_pytorch as ipex; print(torch.__version__); print(ipex.__version__); [print(f'[{i}]: {torch.xpu.get_device_properties(i)}') for i in range(torch.xpu.device_count())];"
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/usr/local/lib/python3.12/dist-packages/intel_extension_for_pytorch/__init__.py", line 122, in <module>
from .utils._proxy_module import *
File "/usr/local/lib/python3.12/dist-packages/intel_extension_for_pytorch/utils/_proxy_module.py", line 2, in <module>
import intel_extension_for_pytorch._C
ImportError: /opt/intel/compiler/2025.1/lib/libur_loader.so.0: version `LIBUR_LOADER_0.10' not found (required by /usr/local/lib/python3.12/dist-packages/intel_extension_for_pytorch/lib/../../../../libsycl.so.8)
garner@linux:~$
There is a pending PyTorch issue; in the meantime I got this:
garner@linux:/opt/puffer$ puffer train puffer_target
Traceback (most recent call last):
File "/usr/local/bin/puffer", line 5, in <module>
from pufferlib.pufferl import main
File "/opt/puffer/pufferlib/pufferl.py", line 28, in <module>
import torch
File "/usr/local/lib/python3.12/dist-packages/torch/__init__.py", line 409, in <module>
from torch._C import * # noqa: F403
^^^^^^^^^^^^^^^^^^^^^^
ImportError: /opt/intel/compiler/2025.1/lib/libur_loader.so.0: version `LIBUR_LOADER_0.10' not found (required by /usr/local/lib/python3.12/dist-packages/torch/lib/../../../../libsycl.so.8)
garner@linux:/opt/puffer$
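The ImportError packs three facts into one line: which library was picked up, which version tag was missing, and which library asked for it. A sketch that pulls them apart, using the message from the traceback above (the regular expression is my own guess at the message layout, not a documented format):

```python
import posixpath
import re

PATTERN = re.compile(
    r"(?P<library>\S+): version `(?P<version>[^']+)' not found "
    r"\(required by (?P<required_by>[^)]+)\)"
)


def parse_version_error(message: str):
    """Return the library, missing version tag, and requiring library, or None."""
    match = PATTERN.search(message)
    return match.groupdict() if match else None


message = (
    "ImportError: /opt/intel/compiler/2025.1/lib/libur_loader.so.0: "
    "version `LIBUR_LOADER_0.10' not found (required by "
    "/usr/local/lib/python3.12/dist-packages/torch/lib/../../../../libsycl.so.8)"
)
info = parse_version_error(message)

# Purely lexical cleanup of the ../ segments; symlinks are not resolved.
resolved = posixpath.normpath(info["required_by"])

print(info["library"])
print(info["version"])
print(resolved)
```

After collapsing the `../` segments, the requiring library is `/usr/local/lib/libsycl.so.8`, while the loader that was picked up lives under `/opt/intel/compiler/2025.1/lib`, so the two come from different installs.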
Note for legacy hardware: Linux Mint currently has intel-opencl-icd (23.43.27642.40-1ubuntu3) instead of 24.35.
Is this as expected?
Processing triggers for libc-bin (2.39-0ubuntu8.4) ...
/sbin/ldconfig.real: /usr/local/lib/libccl.so.1 is not a symbolic link
/sbin/ldconfig.real: /usr/local/lib/libtbb.so.12 is not a symbolic link
/sbin/ldconfig.real: /usr/local/lib/libmpi.so.12 is not a symbolic link
/sbin/ldconfig.real: /usr/local/lib/libtbbbind_2_5.so.3 is not a symbolic link
/sbin/ldconfig.real: /usr/local/lib/libtbbbind.so.3 is not a symbolic link
/sbin/ldconfig.real: /usr/local/lib/libpti_view.so.0.10 is not a symbolic link
/sbin/ldconfig.real: /usr/local/lib/libmpijava.so.1 is not a symbolic link
/sbin/ldconfig.real: /usr/local/lib/libpstloffload.so.1 is not a symbolic link
/sbin/ldconfig.real: /usr/local/lib/libmpicxx.so.12 is not a symbolic link
/sbin/ldconfig.real: /usr/local/lib/libur_adapter_opencl.so.0 is not a symbolic link
/sbin/ldconfig.real: /usr/local/lib/libOpenCL.so.1 is not a symbolic link
/sbin/ldconfig.real: /usr/local/lib/libur_adapter_level_zero.so.0 is not a symbolic link
/sbin/ldconfig.real: /usr/local/lib/libtbbbind_2_0.so.3 is not a symbolic link
/sbin/ldconfig.real: /usr/local/lib/libtbbmalloc.so.2 is not a symbolic link
/sbin/ldconfig.real: /usr/local/lib/libsycl-preview.so.8 is not a symbolic link
/sbin/ldconfig.real: /usr/local/lib/libmpifort.so.12 is not a symbolic link
/sbin/ldconfig.real: /usr/local/lib/libur_loader.so.0 is not a symbolic link
/sbin/ldconfig.real: /usr/local/lib/libtbbmalloc_proxy.so.2 is not a symbolic link
/sbin/ldconfig.real: /usr/local/lib/libsycl.so.8 is not a symbolic link
/sbin/ldconfig.real: /usr/local/lib/libumf.so.0 is not a symbolic link
/sbin/ldconfig.real: /usr/local/lib/libhwloc.so.15 is not a symbolic link
/sbin/ldconfig.real: /usr/local/lib/libtcm.so.1 is not a symbolic link
/sbin/ldconfig.real: /usr/local/lib/libtcm_debug.so.1 is not a symbolic link
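As far as I understand, ldconfig prints this warning when a versioned library name is a regular file rather than a symlink to a fully versioned file. A sketch for listing such files in a directory, assuming that reading is right (`regular_file_sonames` and the filename pattern are made up for illustration):

```python
import os
import re

# Matches names such as libtbb.so.12 or libpti_view.so.0.10.
VERSIONED_SO = re.compile(r"^lib.+\.so\.\d+(\.\d+)*$")


def regular_file_sonames(directory: str) -> list[str]:
    """List versioned shared-object names that are regular files, not symlinks."""
    names = []
    for entry in sorted(os.listdir(directory)):
        path = os.path.join(directory, entry)
        if (
            VERSIONED_SO.match(entry)
            and os.path.isfile(path)
            and not os.path.islink(path)
        ):
            names.append(entry)
    return names


if __name__ == "__main__":
    if os.path.isdir("/usr/local/lib"):
        for name in regular_file_sonames("/usr/local/lib"):
            print(name)
```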
This may be a libsycl issue, since /opt/intel/compiler/2025.1/lib/libur_loader.so.0.11.10 exists. Would that be added here?
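To make the mismatch concrete, the version in the file name can be compared with the version tag from the ImportError. This is only a sketch: the helper names are hypothetical, and I am assuming the tag and the file name use comparable numbering, which I have not confirmed.

```python
import re


def file_version(path: str) -> tuple[int, ...]:
    """Extract the numeric suffix after `.so.` from a shared-library path."""
    match = re.search(r"\.so\.(\d+(?:\.\d+)*)$", path)
    if match is None:
        raise ValueError(f"no version suffix in {path!r}")
    return tuple(int(part) for part in match.group(1).split("."))


def tag_version(tag: str) -> tuple[int, ...]:
    """Extract the trailing version from a tag such as LIBUR_LOADER_0.10."""
    match = re.search(r"_(\d+(?:\.\d+)*)$", tag)
    if match is None:
        raise ValueError(f"no version in {tag!r}")
    return tuple(int(part) for part in match.group(1).split("."))


installed = file_version("/opt/intel/compiler/2025.1/lib/libur_loader.so.0.11.10")
required = tag_version("LIBUR_LOADER_0.10")

# Compare major.minor only; the file name carries an extra patch component.
mismatch = installed[:2] != required

print(installed, required, mismatch)
```

Under that assumption, the installed loader is 0.11 while libsycl.so.8 asks for the 0.10 tag.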
