

Local Memory Tiling
In this exercise you will learn how to cache global memory in local memory, in tiles that correspond to work-groups, and then compare the performance with and without this caching.
1.) Use local memory
Allocate local memory for the kernel function by creating a local accessor.
The local accessor is created by specifying a range which is the number of
elements to allocate.
Note that local memory is allocated per work-group and each work-group can only access its own local memory.
2.) Cache global memory access in local memory
Cache the global memory required for each work-group in local memory by reading
from the global accessor and then writing to the local accessor.
Remember to use the local id to decide where in local memory to store the data.
The local id can be queried by calling the nd_item member function get_local_id.
Compare the performance with local memory and without local memory.
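One simple way to make this comparison is to time both versions on the host. The helper below is a sketch in standard C++; the name `time_ms` is hypothetical and the exercise may provide its own benchmarking utility, in which case that should be preferred.

```cpp
#include <chrono>
#include <utility>

// Hypothetical helper: run `work` once and return the elapsed wall-clock
// time in milliseconds.
template <typename Work>
double time_ms(Work &&work) {
  const auto start = std::chrono::steady_clock::now();
  std::forward<Work>(work)();
  const auto stop = std::chrono::steady_clock::now();
  return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

When timing a SYCL kernel this way, the callable should both submit the kernel and wait for it to finish (for example by calling `wait()` on the queue), otherwise only the submission is measured. It is also reasonable to run each version several times and ignore the first run, since some implementations compile the kernel on first use.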
Build and execution hints
For DevCloud via JupyterLab follow these instructions.
For DPC++: instructions.
For AdaptiveCpp: instructions.