Extend ._tensor_impl with remaining functions used by dpnp
#2758
Open: vlad-perevezentsev wants to merge 41 commits into include-dpctl-tensor from move_tensor_impl_ext_part_2
Commits (all authored by vlad-perevezentsev):
d4fd805 Rename folder dpctl to dpctl_ext
c040713 Add simplify_iteration_space implementation to libtensor
14b466f Extend codespell ignore list for libtensor
dcc421b Add copy_and_cast kernels to libtensor
5a9c14c Add copy_usm_ndarray_into_usm_ndarray implementation
4f63340 Add pybind11 bindings for dpctl_ext.tensor._tensor_impl
634579c Add CMake build files for dpctl_ext
79d40f2 Add empty __init__ to dpctl_ext/
7949c17 Enable _same_logical_tensors in _tensor_impl
29d6c02 Add device_support_queries to enable default device types
936e719 Enable building and packaging of dpctl_ext
cd85f1e Use _tensor_impl from dpctl_ext.tensor in dpnp
0c6780a Move put() and take() to dpctl_ext/tensor
87e5482 Use put/take from dpctl_ext.tensor in dpnp
b537f30 Move full() to dpctl_ext/tensor
d50f263 Use full and _full_usm_ndarray from dpctl_ext in dpnp
f189dc5 Update .gitignore to ignore .so files in dpctl_ext
f9a1817 Move _zeros_usm_ndarray to dpctl_ext
4b8505a Use _zeros_usm_ndarray from dpctl_ext in dpnp_fill.py
61106b2 Move linear-sequence implementations to dpctl_ext/tensor
a030579 Use _tensor_impl from dpctl_ext in dpnp_utils_fft.py
a1d6fa3 Move tril()/triu() to dpctl_ext/tensor
f1d6e56 Use tril/triu/_tril from dpctl_ext.tensor in dpnp
6680790 Disable pylint no-name-in-module for dpctl_ext
263b717 Add TODO comments
4130c1b Use default_device_complex_type from dpctl_ext on test_array_api_info.py
17ca9ab Remove unused build_dpctl_ext function
79cb2a4 Apply remarks for CMake files
4bf080e Apply remarks for c++ files
cfa6cd6 Remove linear-sequence implementations
e0e50ac Merge move_tensor_impl_ext into move_tensor_impl_ext_part_2
087a2ec Use _tensor_impl from dpctl_ext in dpnp
f4492fb Add missing include
b367c9f Use nested namespace syntax
3113716 Add missing include complex
978afee Add missing memory and queue checks
19e93b9 Update .gitignore to ignore .so files in dpctl_ext
b111e49 Remove unused includes in tensor_ctors.cpp
c082224 Use Python::Module for dpctl_ext static lib to avoid libpython depend…
9e7deb3 Merge move_tensor_impl_ext into move_tensor_impl_ext_part_2
1a736f7 Merge include-dpctl-tensor into move_tensor_impl_ext_part_2
@@ -0,0 +1,326 @@

```python
# *****************************************************************************
# Copyright (c) 2026, Intel Corporation
# All rights reserved.
#
# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions are met:
# - Redistributions of source code must retain the above copyright notice,
#   this list of conditions and the following disclaimer.
# - Redistributions in binary form must reproduce the above copyright notice,
#   this list of conditions and the following disclaimer in the documentation
#   and/or other materials provided with the distribution.
# - Neither the name of the copyright holder nor the names of its contributors
#   may be used to endorse or promote products derived from this software
#   without specific prior written permission.
#
# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
# AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
# IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
# ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE
# LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
# CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
# SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
# INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
# CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
# ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF
# THE POSSIBILITY OF SUCH DAMAGE.
# *****************************************************************************

import operator
from numbers import Number

import dpctl
import dpctl.tensor as dpt
import dpctl.utils
import numpy as np
from dpctl.tensor._data_types import _get_dtype
from dpctl.tensor._device import normalize_queue_device

import dpctl_ext.tensor._tensor_impl as ti


def _cast_fill_val(fill_val, dt):
    """
    Casts the Python scalar `fill_val` to another Python type coercible to the
    requested data type `dt`, if necessary.
    """
    val_type = type(fill_val)
    if val_type in [float, complex] and np.issubdtype(dt, np.integer):
        return int(fill_val.real)
    elif val_type is complex and np.issubdtype(dt, np.floating):
        return fill_val.real
    elif val_type is int and np.issubdtype(dt, np.integer):
        return _to_scalar(fill_val, dt)
    else:
        return fill_val


def _to_scalar(obj, sc_ty):
    """A way to convert object to NumPy scalar type.
    Raises OverflowError if obj can not be represented
    using the requested scalar type.
    """
    zd_arr = np.asarray(obj, dtype=sc_ty)
    return zd_arr[()]


def _validate_fill_value(fill_val):
    """Validates that `fill_val` is a numeric or boolean scalar."""
    # TODO: verify if `np.True_` and `np.False_` should be instances of
    # Number in NumPy, like other NumPy scalars and like Python bools
    # check for `np.bool_` separately as NumPy<2 has no `np.bool`
    if not isinstance(fill_val, Number) and not isinstance(fill_val, np.bool_):
        raise TypeError(
            f"array cannot be filled with scalar of type {type(fill_val)}"
        )


def full(
    shape,
    fill_value,
    *,
    dtype=None,
    order="C",
    device=None,
    usm_type=None,
    sycl_queue=None,
):
    """
    Returns a new :class:`dpctl.tensor.usm_ndarray` having a specified
    shape and filled with `fill_value`.

    Args:
        shape (tuple):
            Dimensions of the array to be created.
        fill_value (int,float,complex,usm_ndarray):
            fill value
        dtype (optional): data type of the array. Can be typestring,
            a :class:`numpy.dtype` object, :mod:`numpy` char string,
            or a NumPy scalar type. Default: ``None``
        order ("C", or "F"):
            memory layout for the array. Default: ``"C"``
        device (optional): array API concept of device where the output array
            is created. ``device`` can be ``None``, a oneAPI filter selector
            string, an instance of :class:`dpctl.SyclDevice` corresponding to
            a non-partitioned SYCL device, an instance of
            :class:`dpctl.SyclQueue`, or a :class:`dpctl.tensor.Device` object
            returned by :attr:`dpctl.tensor.usm_ndarray.device`.
            Default: ``None``
        usm_type (``"device"``, ``"shared"``, ``"host"``, optional):
            The type of SYCL USM allocation for the output array.
            Default: ``"device"``
        sycl_queue (:class:`dpctl.SyclQueue`, optional):
            The SYCL queue to use
            for output array allocation and copying. ``sycl_queue`` and
            ``device`` are complementary arguments, i.e. use one or another.
            If both are specified, a :exc:`TypeError` is raised unless both
            imply the same underlying SYCL queue to be used. If both are
            ``None``, a cached queue targeting default-selected device is
            used for allocation and population. Default: ``None``

    Returns:
        usm_ndarray:
            New array initialized with given value.
    """
    if not isinstance(order, str) or len(order) == 0 or order[0] not in "CcFf":
        raise ValueError(
            "Unrecognized order keyword value, expecting 'F' or 'C'."
        )
    order = order[0].upper()
    dpctl.utils.validate_usm_type(usm_type, allow_none=True)

    if isinstance(fill_value, (dpt.usm_ndarray, np.ndarray, tuple, list)):
        if (
            isinstance(fill_value, dpt.usm_ndarray)
            and sycl_queue is None
            and device is None
        ):
            sycl_queue = fill_value.sycl_queue
        else:
            sycl_queue = normalize_queue_device(
                sycl_queue=sycl_queue, device=device
            )
        X = dpt.asarray(
            fill_value,
            dtype=dtype,
            order=order,
            usm_type=usm_type,
            sycl_queue=sycl_queue,
        )
        return dpt.copy(dpt.broadcast_to(X, shape), order=order)
    else:
        _validate_fill_value(fill_value)

    sycl_queue = normalize_queue_device(sycl_queue=sycl_queue, device=device)
    usm_type = usm_type if usm_type is not None else "device"
    dtype = _get_dtype(dtype, sycl_queue, ref_type=type(fill_value))
    res = dpt.usm_ndarray(
        shape,
        dtype=dtype,
        buffer=usm_type,
        order=order,
        buffer_ctor_kwargs={"queue": sycl_queue},
    )
    fill_value = _cast_fill_val(fill_value, dtype)

    _manager = dpctl.utils.SequentialOrderManager[sycl_queue]
    # populating new allocation, no dependent events
    hev, full_ev = ti._full_usm_ndarray(fill_value, res, sycl_queue)
    _manager.add_event_pair(hev, full_ev)
    return res


def tril(x, /, *, k=0):
    """
    Returns the lower triangular part of a matrix (or a stack of matrices)
    ``x``.

    The lower triangular part of the matrix is defined as the elements on and
    below the specified diagonal ``k``.

    Args:
        x (usm_ndarray):
            Input array
        k (int, optional):
            Specifies the diagonal above which to set
            elements to zero. If ``k = 0``, the diagonal is the main diagonal.
            If ``k < 0``, the diagonal is below the main diagonal.
            If ``k > 0``, the diagonal is above the main diagonal.
            Default: ``0``

    Returns:
        usm_ndarray:
            A lower-triangular array or a stack of lower-triangular arrays.
    """
    if not isinstance(x, dpt.usm_ndarray):
        raise TypeError(
            "Expected argument of type dpctl.tensor.usm_ndarray, "
            f"got {type(x)}."
        )

    k = operator.index(k)

    order = "F" if (x.flags.f_contiguous) else "C"

    shape = x.shape
    nd = x.ndim
    if nd < 2:
        raise ValueError("Array dimensions less than 2.")

    q = x.sycl_queue
    if k >= shape[nd - 1] - 1:
        res = dpt.empty(
            x.shape,
            dtype=x.dtype,
            order=order,
            usm_type=x.usm_type,
            sycl_queue=q,
        )
        _manager = dpctl.utils.SequentialOrderManager[q]
        dep_evs = _manager.submitted_events
        hev, cpy_ev = ti._copy_usm_ndarray_into_usm_ndarray(
            src=x, dst=res, sycl_queue=q, depends=dep_evs
        )
        _manager.add_event_pair(hev, cpy_ev)
    elif k < -shape[nd - 2]:
        res = dpt.zeros(
            x.shape,
            dtype=x.dtype,
            order=order,
            usm_type=x.usm_type,
            sycl_queue=q,
        )
    else:
        res = dpt.empty(
            x.shape,
            dtype=x.dtype,
            order=order,
            usm_type=x.usm_type,
            sycl_queue=q,
        )
        _manager = dpctl.utils.SequentialOrderManager[q]
        dep_evs = _manager.submitted_events
        hev, tril_ev = ti._tril(
            src=x, dst=res, k=k, sycl_queue=q, depends=dep_evs
        )
        _manager.add_event_pair(hev, tril_ev)

    return res


def triu(x, /, *, k=0):
    """
    Returns the upper triangular part of a matrix (or a stack of matrices)
    ``x``.

    The upper triangular part of the matrix is defined as the elements on and
    above the specified diagonal ``k``.

    Args:
        x (usm_ndarray):
            Input array
        k (int, optional):
            Specifies the diagonal below which to set
            elements to zero. If ``k = 0``, the diagonal is the main diagonal.
            If ``k < 0``, the diagonal is below the main diagonal.
            If ``k > 0``, the diagonal is above the main diagonal.
            Default: ``0``

    Returns:
        usm_ndarray:
            An upper-triangular array or a stack of upper-triangular arrays.
    """
    if not isinstance(x, dpt.usm_ndarray):
        raise TypeError(
            "Expected argument of type dpctl.tensor.usm_ndarray, "
            f"got {type(x)}."
        )

    k = operator.index(k)

    order = "F" if (x.flags.f_contiguous) else "C"

    shape = x.shape
    nd = x.ndim
    if nd < 2:
        raise ValueError("Array dimensions less than 2.")

    q = x.sycl_queue
    if k > shape[nd - 1]:
        res = dpt.zeros(
            x.shape,
            dtype=x.dtype,
            order=order,
            usm_type=x.usm_type,
            sycl_queue=q,
        )
    elif k <= -shape[nd - 2] + 1:
        res = dpt.empty(
            x.shape,
            dtype=x.dtype,
            order=order,
            usm_type=x.usm_type,
            sycl_queue=q,
        )
        _manager = dpctl.utils.SequentialOrderManager[q]
        dep_evs = _manager.submitted_events
        hev, cpy_ev = ti._copy_usm_ndarray_into_usm_ndarray(
            src=x, dst=res, sycl_queue=q, depends=dep_evs
        )
        _manager.add_event_pair(hev, cpy_ev)
    else:
        res = dpt.empty(
            x.shape,
            dtype=x.dtype,
            order=order,
            usm_type=x.usm_type,
            sycl_queue=q,
        )
        _manager = dpctl.utils.SequentialOrderManager[q]
        dep_evs = _manager.submitted_events
        hev, triu_ev = ti._triu(
            src=x, dst=res, k=k, sycl_queue=q, depends=dep_evs
        )
        _manager.add_event_pair(hev, triu_ev)

    return res
```
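
For readers without a SYCL device, the scalar-casting rules that `full()` applies before calling `ti._full_usm_ndarray` can be tried with NumPy alone. The sketch below restates `_cast_fill_val` and `_to_scalar` from the diff under the hypothetical names `cast_fill_val` and `to_scalar`; the sample values are illustrative assumptions and are not taken from the PR.

```python
import numpy as np


def to_scalar(obj, sc_ty):
    # Round-trip through a zero-dimensional array to get a NumPy scalar.
    # A Python int far outside the target range (e.g. 2**70 for int64)
    # raises OverflowError; behavior for narrower overflows depends on
    # the NumPy version.
    return np.asarray(obj, dtype=sc_ty)[()]


def cast_fill_val(fill_val, dt):
    # Same branching as `_cast_fill_val` in the diff above.
    val_type = type(fill_val)
    if val_type in (float, complex) and np.issubdtype(dt, np.integer):
        # float/complex into an integer array: keep the truncated real part
        return int(fill_val.real)
    elif val_type is complex and np.issubdtype(dt, np.floating):
        # complex into a real floating array: drop the imaginary part
        return fill_val.real
    elif val_type is int and np.issubdtype(dt, np.integer):
        # int into an integer array: range-check via the NumPy scalar type
        return to_scalar(fill_val, dt)
    return fill_val


print(cast_fill_val(2.7, np.dtype("int32")))       # 2
print(cast_fill_val(1 + 2j, np.dtype("float32")))  # 1.0
print(type(cast_fill_val(7, np.dtype("int16"))))   # <class 'numpy.int16'>
```

Note that a Python `bool` takes the final branch unchanged, because the `val_type is int` test is an identity check on the type and `bool` is a distinct type.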
If we're adding these functions, should we start adding the dpctl test suite as well?
Or would you rather start reusing the functions in dpnp and rely on the dpnp tests?
I think it's sensible to keep the two test suites and merge them later if possible, but it's a choice we have to make.
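
Whichever suite ends up hosting the tests, the shortcut branches in `tril()` and `triu()` (plain copy, all zeros, or kernel call) are one thing a test could pin down. As a device-free sketch, the branch conditions from the diff can be checked against `numpy.tril`/`numpy.triu` as a reference. The helper names `tril_branch`, `triu_branch`, and `check` are hypothetical and not part of the PR; a real test would call the `dpctl_ext.tensor` functions instead.

```python
import numpy as np


def tril_branch(shape, k):
    # Mirrors the branch selection in tril() from the diff.
    nrows, ncols = shape[-2], shape[-1]
    if k >= ncols - 1:
        return "copy"
    elif k < -nrows:
        return "zeros"
    return "kernel"


def triu_branch(shape, k):
    # Mirrors the branch selection in triu() from the diff.
    nrows, ncols = shape[-2], shape[-1]
    if k > ncols:
        return "zeros"
    elif k <= -nrows + 1:
        return "copy"
    return "kernel"


def check(shape):
    # All entries are non-zero, so "zeros" and "copy" are unambiguous.
    x = np.arange(1, int(np.prod(shape)) + 1).reshape(shape)
    for k in range(-shape[-2] - 2, shape[-1] + 3):
        for branch, ref in (
            (tril_branch(shape, k), np.tril(x, k)),
            (triu_branch(shape, k), np.triu(x, k)),
        ):
            if branch == "copy" and not np.array_equal(ref, x):
                return False
            if branch == "zeros" and ref.any():
                return False
    return True


print(check((3, 4)), check((4, 3)), check((2, 3, 3)))
```

Under this check the shortcuts agree with NumPy whenever they fire. The boundaries are slightly conservative: `k == -nrows` for `tril` and `k == ncols` for `triu` still go through the kernel even though the NumPy reference result is all zeros there.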