

Vectors
In this exercise you will learn how to use vec to explicitly vectorize your
kernel function.
1.) Use vectors
Now that global memory access is coalesced, another optimization you can make is
to use the vec class to represent the pixels in the image.
To do this first you want to reinterpret the buffer objects as buffers with
the element type float4. You can do this by calling reinterpret on the
buffers and specifying the type float4.
Note that you will have to provide new ranges, which will be the original ranges
divided by 4, because the four channels of each pixel are now represented by the
four elements of a single vector.
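The range arithmetic can be checked on the host before touching the SYCL buffers. The following is a minimal sketch in standard C++ only, assuming the image is stored as rows x (cols * 4) floats with interleaved channels, so that only the innermost dimension is divided by 4. Float4, Extent2D, vectorExtent and viewAsFloat4 are hypothetical stand-ins (for float4, a two-dimensional range, and the buffer's reinterpret member function); they are not part of SYCL.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical stand-in for sycl::float4: four packed floats.
struct Float4 {
  float x, y, z, w;
};
static_assert(sizeof(Float4) == 4 * sizeof(float), "Float4 must be tightly packed");

// Hypothetical stand-in for a two-dimensional range.
struct Extent2D {
  std::size_t rows, cols;
};

constexpr std::size_t kChannels = 4;

// Assumption: the scalar extent is {rows, cols * 4}. Only the innermost
// dimension shrinks when four floats become one Float4.
inline Extent2D vectorExtent(Extent2D scalarExtent) {
  assert(scalarExtent.cols % kChannels == 0);
  return {scalarExtent.rows, scalarExtent.cols / kChannels};
}

// Copies the same bytes into Float4 elements. Calling reinterpret on a SYCL
// buffer yields a view of the same data without a copy; the copy here only
// keeps the sketch within standard C++.
inline std::vector<Float4> viewAsFloat4(const std::vector<float>& data) {
  assert(data.size() % kChannels == 0);
  std::vector<Float4> out(data.size() / kChannels);
  if (!data.empty()) {
    std::memcpy(out.data(), data.data(), data.size() * sizeof(float));
  }
  return out;
}
```

If the new range does not describe exactly the same number of bytes as the original one, the reinterpretation is invalid, which is why the divisibility checks are worth keeping.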
2.) Refactor the kernel function
Now that you are using vector types, you want to replace the representation of
pixels in the kernel function with a float4 and remove any multiplication or
offsets to account for the number of channels.
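The effect of this refactoring on the indexing can be sketched on the host. This is a standard C++ illustration rather than the exercise's kernel: Pixel is a hypothetical stand-in for float4, the loop index plays the role of the work-item id, and the scaling operation is an assumed placeholder for whatever per-pixel work the kernel does.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for sycl::float4, with the one operator the sketch needs.
struct Pixel {
  float r, g, b, a;
};

inline Pixel operator*(Pixel p, float s) {
  return {p.r * s, p.g * s, p.b * s, p.a * s};
}

// Before: one float per channel, so every access needs index * channels + c.
inline void scaleScalar(std::vector<float>& image, std::size_t pixelCount,
                        float factor) {
  constexpr std::size_t channels = 4;
  for (std::size_t i = 0; i < pixelCount; ++i) {  // i stands in for the work-item id
    for (std::size_t c = 0; c < channels; ++c) {
      image[i * channels + c] *= factor;
    }
  }
}

// After: one Pixel per element, so the index addresses a whole pixel and the
// channel multiplication and offset disappear.
inline void scaleVector(std::vector<Pixel>& image, float factor) {
  for (std::size_t i = 0; i < image.size(); ++i) {
    image[i] = image[i] * factor;
  }
}
```

Both functions compute the same values; the vector version simply moves the per-channel bookkeeping into the element type, which is what using float4 does in the kernel.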
Build and execution hints
For DevCloud via JupyterLab, follow these instructions.
For DPC++: instructions.
For AdaptiveCpp: instructions.