| 1 | +#1 Example of SYCL extension working with NumPy array input via SYCL buffers |
| 2 | + |
| 3 | + |
| 4 | +#2 Description |
| 5 | + |
| 6 | +A Cython function that expects a 2D array in C-contiguous layout and |
| 7 | +computes column-wise totals using SYCL oneMKL (as a GEMV call with |
| 8 | +a vector of all ones). |
| 9 | + |
| 10 | +This example illustrates compiling a SYCL extension and linking it to oneMKL. |
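
The GEMV formulation described above can be sketched in plain NumPy. This is only an illustration of the arithmetic, not the extension's code: the helper name `columnwise_total_gemv` is hypothetical, and the matrix-vector product here runs on the host through NumPy rather than through oneMKL.

```python
import numpy as np


def columnwise_total_gemv(x):
    """Column-wise totals of a 2D array, written as a matrix-vector product.

    Hypothetical helper for illustration; it mirrors the idea of calling
    GEMV on the transposed matrix with a vector of all ones.
    """
    x = np.ascontiguousarray(x)  # the extension expects C-contiguous input
    if x.ndim != 2:
        raise ValueError("expected a 2D array")
    ones = np.ones(x.shape[0], dtype=x.dtype)
    # (n_cols x n_rows) @ (n_rows,) -> (n_cols,): one total per column
    return x.T @ ones


x = np.random.randn(100, 25)
# The GEMV form agrees with NumPy's own reduction up to rounding error.
assert np.allclose(columnwise_total_gemv(x), np.sum(x, axis=0))
```

Expressing the reduction as GEMV lets a tuned BLAS routine do the work, at the cost of allocating an extra vector of ones.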
| 11 | + |
| 12 | + |
| 13 | +#2 Compiling |
| 14 | + |
| 15 | +``` |
| 16 | +# make sure oneAPI is activated, $ONEAPI_ROOT must be set |
| 17 | +CC=clang CXX=dpcpp python setup.py build_ext --inplace |
| 18 | +``` |
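
Before running the build command, the prerequisites it names can be checked with a small script. This is a minimal sketch, assuming only what the command above states (`ONEAPI_ROOT` set, `clang` and `dpcpp` as the compilers); the function `check_build_env` is a hypothetical helper and is not part of this example's sources.

```python
import os
import shutil


def check_build_env(env=None, which=shutil.which):
    """Return a list of problems found in the build environment.

    Hypothetical pre-flight check: it verifies that ONEAPI_ROOT is set and
    that the compilers named by CC and CXX (defaulting to clang and dpcpp,
    as in the build command) can be found on PATH.
    """
    env = os.environ if env is None else env
    problems = []
    if not env.get("ONEAPI_ROOT"):
        problems.append("ONEAPI_ROOT is not set; activate oneAPI first")
    for var, default in (("CC", "clang"), ("CXX", "dpcpp")):
        compiler = env.get(var, default)
        if which(compiler) is None:
            problems.append(f"{var} compiler {compiler!r} not found on PATH")
    return problems
```

Calling `check_build_env()` with no arguments inspects the current process environment; an empty list means the checks passed, which does not by itself guarantee that the build will succeed.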
| 19 | + |
| 20 | + |
| 21 | +#2 Running |
| 22 | + |
| 23 | +``` |
| 24 | +# SYCL_BE=PI_OPENCL sets the SYCL backend to OpenCL to avoid a |
| 25 | +# transient issue with MKL when using the default Level-0 backend |
| 26 | +(idp) [08:16:12 ansatnuc04 simple]$ SYCL_BE=PI_OPENCL ipython |
| 27 | +Python 3.7.7 (default, Jul 14 2020, 22:02:37) |
| 28 | +Type 'copyright', 'credits' or 'license' for more information |
| 29 | +IPython 7.17.0 -- An enhanced Interactive Python. Type '?' for help. |
| 30 | + |
| 31 | +In [1]: import syclbuffer as sb, numpy as np, dpctl |
| 32 | + |
| 33 | +In [2]: x = np.random.randn(10**4, 2500) |
| 34 | + |
| 35 | +In [3]: %time m1 = np.sum(x, axis=0) |
| 36 | +CPU times: user 22.3 ms, sys: 160 µs, total: 22.5 ms |
| 37 | +Wall time: 21.2 ms |
| 38 | + |
| 39 | +In [4]: %time m = sb.columnwise_total(x) # first time is slower, due to JIT overhead |
| 40 | +CPU times: user 207 ms, sys: 36.1 ms, total: 243 ms |
| 41 | +Wall time: 248 ms |
| 42 | + |
| 43 | +In [5]: %time m = sb.columnwise_total(x) |
| 44 | +CPU times: user 8.89 ms, sys: 4.12 ms, total: 13 ms |
| 45 | +Wall time: 12.4 ms |
| 46 | + |
| 47 | +In [6]: %time m = sb.columnwise_total(x) |
| 48 | +CPU times: user 4.82 ms, sys: 8.06 ms, total: 12.9 ms |
| 49 | +Wall time: 12.3 ms |
| 50 | +``` |
| 51 | + |
| 52 | +Running bench.py: |
| 53 | + |
| 54 | +``` |
| 55 | +========== Executing warm-up ========== |
| 56 | +NumPy result: [1. 1. 1. ... 1. 1. 1.] |
| 57 | +SYCL(Intel(R) Core(TM) i7-10710U CPU @ 1.10GHz) result: [1. 1. 1. ... 1. 1. 1.] |
| 58 | +SYCL(Intel(R) Gen9 HD Graphics NEO) result: [1. 1. 1. ... 1. 1. 1.] |
| 59 | +Times for 'opencl:cpu:0' |
| 60 | +[2.864787499012891, 2.690436460019555, 2.5902308400254697, 2.5802528870408423, 2.538990616973024] |
| 61 | +Times for 'opencl:gpu:0' |
| 62 | +[1.9769684099592268, 2.3491444009705447, 2.293720397981815, 2.391633405990433, 1.9465659779962152] |
| 63 | +Times for NumPy |
| 64 | +[3.4011058019823395, 3.07286038500024, 3.0390414349967614, 3.0305576199898496, 3.002687797998078] |
| 65 | +``` |
| 66 | + |
| 67 | +Running run.py: |
| 68 | + |
| 69 | +``` |
| 70 | +(idp) [09:14:53 ansatnuc04 sycl_buffer]$ SYCL_BE=PI_OPENCL python run.py |
| 71 | +Result computed by NumPy |
| 72 | +[ 0.27170187 -23.36798583 7.31326489 -1.95121928] |
| 73 | +Result computed by SYCL extension |
| 74 | +[ 0.27170187 -23.36798583 7.31326489 -1.95121928] |
| 75 | + |
| 76 | +Running on: Intel(R) Gen9 HD Graphics NEO |
| 77 | +[ 0.27170187 -23.36798583 7.31326489 -1.95121928] |
| 78 | +Running on: Intel(R) Core(TM) i7-10710U CPU @ 1.10GHz |
| 79 | +[ 0.27170187 -23.36798583 7.31326489 -1.95121928] |
| 80 | +``` |