MicroZed Chronicles: Getting Started with OpenCL

As I mentioned in last week’s blog, I want to spend some time discussing OpenCL because being able to work with OpenCL is going to be an increasingly important skill for programmable logic solutions.

So, let’s first start by defining OpenCL. OpenCL is an open, royalty-free standard managed by the Khronos Group which allows us to program heterogeneous systems. For devices like the Zynq-7000 SoC, Zynq MPSoC, Alveo accelerator cards, and ACAPs, OpenCL allows us to define parallel computing applications.

The goal of OpenCL is to enable portability of designs across different systems composed of CPUs, GPUs, DSPs, and FPGAs. Of course, while the design is portable, the performance of the design is not.

OpenCL works around a model of a host and device(s). An OpenCL system always contains a single host and one or more devices. In our Xilinx solutions, the host is always an Arm-based processing system for Zynq-7000 SoCs, Zynq MPSoCs, and ACAP devices. For Xilinx Alveo accelerator cards, the host is an x86 processor connected via the PCIe link.

This model allows the host to manage the overall application while the devices implement the necessary algorithms. Developers create kernels which run on the devices and accelerate the application. In the case of Xilinx OpenCL solutions, the device is programmable logic.

The language used to describe the kernels is OpenCL C, which is based on C but has restrictions to enable portability. In the Xilinx ecosystem, we can create kernels using either OpenCL C or C/C++, both of which are compiled using Vitis HLS.
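To make the idea of a C/C++ kernel concrete, here is a minimal sketch of what one might look like. The kernel name vadd and its argument names are illustrative assumptions, not taken from this series. Interface pragmas are left out on the assumption that the Vitis flow applies its default interfaces to the arguments; a real design may well need to specify them.

```cpp
// Minimal sketch of a C++ kernel (hypothetical name "vadd").
// extern "C" prevents C++ name mangling so the kernel keeps a plain name.
extern "C" void vadd(const int* a, const int* b, int* out, int size) {
    // Element-wise addition: each output depends on one pair of inputs only,
    // which is the kind of loop that maps well onto parallel hardware.
    for (int i = 0; i < size; ++i) {
        out[i] = a[i] + b[i];
    }
}
```

Because this is ordinary C++, the function can also be called directly from a small test program on the host to check its behavior before it is built for the device.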

At first, the concept of a host and kernels running on devices might seem daunting; however, it is fairly simple to hit the ground running.

To use the Vitis acceleration flow, which is where we leverage OpenCL, we first need a platform for our target device. This platform defines all the hardware elements required (XSA) and the software components necessary to support an OpenCL application. Xilinx and many third-party board developers make Vitis platforms available. If you are working with a custom board, we have previously covered how to create your own platforms (U96Hw, U96Sw, MicroZed & Genesys ZU).

For this miniseries, I am going to be using the Alveo U50 card. We installed the platform last week as part of the installation of the card.

Now that we understand the concepts around OpenCL and the Vitis platform, we are going to start looking at what is included in the initial configuration of the host program. That is, how we detect whether we have a valid platform and device before moving on to creating the kernels and completing the host application.

The host application at the highest level does three things:

Configures the environment
Loads and executes one or more kernels
Gracefully cleans up and releases resources

In the remainder of this blog we are going to look at the first element which is the configuration of the environment.

To be able to use an OpenCL device and load a kernel, we need to perform the following steps:

Identify the platform
Identify the devices available
Create a context
Create a command queue

The first step identifies that a Xilinx platform has been found. Once that has been correctly identified, we can look for available devices. For this example, we will be looking for the previously installed Alveo U50 card to be detected as the device.

Once we have the correct platform and device, the context and command queue can be created. A context allows the host to manage programs, command queues, memories, and kernels.

With the context created, the command queue can then be created to attach to a device. By using this command queue, the host is able to execute kernels, transfer data between different memories, and synchronize. We can have several different command queues within a single context.

To demonstrate this in Vitis, we can create a new blank project targeting the Alveo U50. This will create the framework for the host and kernel applications in Vitis.

Once the project is created, add a new C++ source file under the host application; this is where we will create our first host application.

#include <stdlib.h>
#include <fstream>
#include <iostream>
#include <string>
#include <vector>
#include <CL/opencl.h>
#include <CL/cl2.hpp>

int main(int argc, char* argv[]) {
    std::vector<cl::Device> devices;
    cl::Device device;
    std::vector<cl::Platform> platforms;
    bool found_device = false;

    // Traverse all platforms to find the Xilinx platform and the
    // targeted device within it
    cl::Platform::get(&platforms);
    for (size_t i = 0; (i < platforms.size()) && (found_device == false); i++) {
        cl::Platform platform = platforms[i];
        std::string platformName = platform.getInfo<CL_PLATFORM_NAME>();
        std::cout << "Platform " << platformName << std::endl;
        if (platformName == "Xilinx") {
            devices.clear();
            platform.getDevices(CL_DEVICE_TYPE_ACCELERATOR, &devices);
            if (devices.size()) {
                // Use the first accelerator reported by the platform
                device = devices[0];
                found_device = true;
                break;
            }
        }
    }

    // No device was assigned, so there is no device name to report here
    if (found_device == false) {
        std::cout << "Error: Unable to find Target Device" << std::endl;
        return EXIT_FAILURE;
    }

    std::cout << "Target Device " << device.getInfo<CL_DEVICE_NAME>() << std::endl;
    // CL_DEVICE_TYPE is a bitfield, so this prints its numeric value
    std::cout << "Device Type " << device.getInfo<CL_DEVICE_TYPE>() << std::endl;
    return EXIT_SUCCESS;
}

With this code created, and having confirmed that XRT is set up, we can execute the application on the host and ensure it detects the platform and device correctly.

You can find the executable under the project hardware folder.

When the executable is run, you will see the Xilinx platform being identified, along with the U50 as the target device.

With a simple host application created, we next need to create our kernels, which we will look at in more detail in the next blog. Once we have completed our kernel design and implementation, we will come back to the host application and look at how to run the overall application using the context and command queue.
