
Introduction to SYCL


SYCL Kernel Programming - The single_task API

Declare the vector addition kernel

Description

By specification, an unnamed type is an invalid template argument for classes used to name SYCL kernels. Therefore, you need to at least declare the kernel class before using it.

The following examples are valid SYCL kernel names:

// class
class MyKernelClass;
// struct
struct MyKernelStruct;
// enum class
enum class MyKernelEnum : int;
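
A declaration alone is enough because the kernel name is used only as a compile-time tag, and C++ accepts an incomplete (declared but never defined) type as a template argument. The sketch below illustrates this in plain host C++, without any SYCL headers. `launch_once` and `demo_named_launch` are hypothetical names made up for this illustration and are not part of the SYCL API.

```cpp
#include <cassert>
#include <type_traits>

// Forward declarations only: none of these types is ever defined.
class MyKernelClass;
struct MyKernelStruct;
enum class MyKernelEnum : int;

// Hypothetical host-side stand-in for a kernel launcher (NOT the SYCL API).
// KernelName is never instantiated; it only tags the template instantiation.
template <typename KernelName, typename Func>
int launch_once(Func body) {
  body();
  return 1;  // number of times the body was invoked
}

// Distinct names are distinct types, so each launch gets its own instantiation.
static_assert(!std::is_same<MyKernelClass, MyKernelStruct>::value,
              "kernel names must be distinct types");

// Runs one body per declared name and returns the value written by the last one.
int demo_named_launch() {
  int value = 0;
  int runs = 0;
  runs += launch_once<MyKernelClass>([&]() { value = 1; });
  runs += launch_once<MyKernelStruct>([&]() { value = 2; });
  runs += launch_once<MyKernelEnum>([&]() { value = 42; });
  assert(runs == 3);
  return value;
}
```

Calling `demo_named_launch()` from a `main` function runs all three bodies. Deleting one of the forward declarations makes the corresponding call fail to compile, which mirrors why the kernel name has to be declared before it is used.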

Task

Forward declare the vector_addition class functor before the int main entry point of the application.

Location in the source code:

// <<Declare vector_addition kernel function object>>

Hint

class vector_addition;

Implement the vector addition kernel

Description

Now we need to enqueue the vector addition kernel for execution.

A kernel that is executed on one thread using NDRange(1,1,1) can be enqueued using the cl::sycl::single_task API:

single_task<typename kernel_lambda_name>([=](){});

This is equivalent to executing a kernel on a single compute unit with a single work-group of only one work-item. Thus, we can access the values of the accessor objects directly by using the 0th index as follows:

// e.g., assign an accessor element in the kernel body
acc[0] = someValue;
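
The behaviour described above (the body runs exactly once and receives no index, so element 0 is addressed directly) can be imitated on the host. This is only a rough model under stated assumptions: `host_accessor`, `host_single_task`, `assign_kernel` and `demo_single_task` are hypothetical stand-ins invented for the sketch, not SYCL types, and nothing here runs on a device.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical host-only stand-in for an accessor over N elements.
// Like a real accessor, it can be copied into a lambda and still
// refers to the same underlying storage.
template <typename T, std::size_t N>
struct host_accessor {
  T* data;
  T& operator[](std::size_t i) const {
    assert(i < N);  // a single-element buffer only has index 0
    return data[i];
  }
};

// Hypothetical stand-in for single_task: the body is invoked exactly once
// and is given no work-item index.
template <typename KernelName, typename Func>
void host_single_task(Func body) {
  body();
}

class assign_kernel;  // kernel name, declared but never defined

// Writes someValue through the accessor and returns what ended up in storage.
float demo_single_task(float someValue) {
  float storage[1] = {0.0f};
  host_accessor<float, 1> acc{storage};
  // Capture by value, as in the [=] lambda of the tutorial.
  host_single_task<assign_kernel>([=]() { acc[0] = someValue; });
  return storage[0];
}
```

Because the lambda copies the accessor rather than the data, the write to `acc[0]` is visible in `storage` after the call.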

Task

Complete the vector addition kernel code where the accessors of a, b, c make the vectors available for calculating c = a + b.

Location in the source code:

// <<Complete the vector addition computation>>

Hint

c_acc[0] = a_acc[0] + b_acc[0];
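
Each buffer in this exercise holds a single `float4`, so the one addition in the hint adds all four components at once. The host-only sketch below shows the same arithmetic with the input values used in the playground. `float4_like`, `add_single_element` and `demo_vector_addition` are hypothetical names assumed for illustration, and plain pointers stand in for the accessors.

```cpp
#include <cassert>

// Hypothetical four-component vector standing in for sycl::float4.
struct float4_like {
  float x, y, z, w;
};

// Component-wise addition, the behaviour the hint relies on.
inline float4_like operator+(const float4_like& l, const float4_like& r) {
  return {l.x + r.x, l.y + r.y, l.z + r.z, l.w + r.w};
}

// Mirrors the kernel body: read element 0 of both inputs, write element 0
// of the output. Pointers play the role of the read and write accessors.
void add_single_element(const float4_like* a_acc, const float4_like* b_acc,
                        float4_like* c_acc) {
  c_acc[0] = a_acc[0] + b_acc[0];
}

// Uses the same inputs as the playground: a = b = {1, 1, 1, 1}.
float4_like demo_vector_addition() {
  const float4_like a[1] = {{1.0f, 1.0f, 1.0f, 1.0f}};
  const float4_like b[1] = {{1.0f, 1.0f, 1.0f, 1.0f}};
  float4_like c[1] = {{0.0f, 0.0f, 0.0f, 0.0f}};
  add_single_element(a, b, c);
  return c[0];
}
```

With these inputs every component of the result is 2.0, which is the kind of value a validation check at the end of the program can compare against.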

Run it!

#include <iostream>

// The SYCL header
#include <SYCL/sycl.hpp>

namespace sycl = cl::sycl;

// <<Declare vector_addition kernel function object>>

int main(int argc, char **) {
  // <<Setup host memory>>
  // define input vectors
  sycl::float4 a = {1.0, 1.0, 1.0, 1.0};  // input 1
  sycl::float4 b = {1.0, 1.0, 1.0, 1.0};  // input 2
  // define output vector
  sycl::float4 c = {0.0, 0.0, 0.0, 0.0};  // output

  // <<Setup SYCL queue>>
  sycl::queue myQueue(sycl::default_selector{});

  // Begin SYCL scope
  {
    // <<Setup device memory>>
    // define input buffers
    sycl::buffer<sycl::float4, 1> buf_a(&a, sycl::range<1>{1});
    sycl::buffer<sycl::float4, 1> buf_b(&b, sycl::range<1>{1});
    // define output buffer
    sycl::buffer<sycl::float4, 1> buf_c(&c, sycl::range<1>{1});

    // Submit a command group functor for execution on a queue. This functor
    // encapsulates the kernel and the data needed for its execution.
    myQueue.submit([&](sycl::handler &cgh) {
      // <<Request device memory access>>
      // read accessors
      auto a_acc = buf_a.get_access<sycl::access::mode::read>(cgh);
      auto b_acc = buf_b.get_access<sycl::access::mode::read>(cgh);
      // write accessor
      auto c_acc = buf_c.get_access<sycl::access::mode::write>(cgh);

      // Enqueue the kernel for execution using the `single_task` API
      cgh.single_task<class vector_addition>([=]() {
        // <<Complete the vector addition computation>>
        // calculate: c = a+b;
      });
    });
  }
  // End SYCL scope

  // validation checks
  {
    // Simple vector swizzles are available in SYCL for the cl::sycl::vec class
    // for up to width of 4 and are defined as functions, e.g. a.xy()
    // The cl::sycl::vec class works for host as well so the same functionality