Every major software release brings new capabilities. For example, Vitis and Vivado 2020.2 introduced support for Versal ACAP, and Vitis HLS became the default HLS compiler for both Vivado and Vitis.
One thing you might have missed in the 2020.2 release of Vitis HLS is that Xilinx opened the LLVM Intermediate Representation (IR) layer of Vitis HLS to partners. This allows partners to make additional optimization pragmas available to developers, which can lead to a better quality of results for the developer.
Last week, I noticed that Silexica had released the first plugin that exploits this capability: one that adds a new pragma to perform loop interchange.
The loop interchange pragma enables the tool to exchange an inner loop with an outer loop. Interchanging the loops can eliminate loop-carried dependencies, which enables more unrolling. Interchanging loops can also enable a better initiation interval, thereby increasing throughput.
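To make the idea concrete, here is a minimal hand-written sketch of a loop interchange in plain C++. It does not use the SLX Plugin or its pragma syntax; the function names, array sizes, and the column-sum kernel are all assumptions made purely for illustration. In the original order, every inner iteration updates the same accumulator, so the inner loop carries a dependency. After the interchange, each inner iteration touches a different accumulator.

```cpp
#include <cassert>

// Hypothetical sizes, chosen only for this illustration.
constexpr int ROWS = 4;
constexpr int COLS = 3;

// Original order: the inner loop walks down one column, so every inner
// iteration reads and writes the same out[j] (a loop-carried dependency).
void column_sums_original(const int in[ROWS][COLS], int out[COLS]) {
    for (int j = 0; j < COLS; ++j) out[j] = 0;
    for (int j = 0; j < COLS; ++j) {
        for (int i = 0; i < ROWS; ++i) {
            out[j] += in[i][j];
        }
    }
}

// Interchanged order: the inner loop now walks across one row, so each
// inner iteration updates a different out[j] and the iterations are
// independent of one another.
void column_sums_interchanged(const int in[ROWS][COLS], int out[COLS]) {
    for (int j = 0; j < COLS; ++j) out[j] = 0;
    for (int i = 0; i < ROWS; ++i) {
        for (int j = 0; j < COLS; ++j) {
            out[j] += in[i][j];
        }
    }
}

// Runs both versions on the same input and checks that they agree.
void check_equivalence() {
    const int in[ROWS][COLS] = {{1, 2, 3}, {4, 5, 6}, {7, 8, 9}, {10, 11, 12}};
    int a[COLS];
    int b[COLS];
    column_sums_original(in, a);
    column_sums_interchanged(in, b);
    for (int j = 0; j < COLS; ++j) {
        assert(a[j] == b[j]);
    }
}
```

Both versions compute the same sums because integer addition can be reordered freely here. Whether the interchanged form actually yields more unrolling or a better initiation interval in a given design depends on the tool and the target, so treat this only as a picture of the transformation itself.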
Importantly, the new pragma also incorporates protection to detect whether the transformation is safe and legal. The protection layer will reject the interchange when it can’t be applied, for example if it would lead to an invalid computation.
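As an illustration of why such a check matters, the following sketch shows a loop nest where interchanging the loops changes the result. This is a textbook-style example written by hand, not output from the plugin and not a description of how its protection layer works. The function names and sizes are hypothetical. Each element depends on the value one row up and one column to the right, so swapping the loop order makes some iterations read a value before it has been written.

```cpp
#include <cassert>

// Hypothetical sizes, chosen only for this illustration.
constexpr int N = 3;
constexpr int M = 3;

// Original order: row i-1 is completely updated before row i is visited,
// so a[i-1][j+1] already holds its new value when it is read.
void sweep_original(int a[N][M]) {
    for (int i = 1; i < N; ++i) {
        for (int j = 0; j < M - 1; ++j) {
            a[i][j] = a[i - 1][j + 1] + 1;
        }
    }
}

// Interchanged order: column j is finished before column j+1 is visited,
// so a[i-1][j+1] may still hold its old value when it is read.
void sweep_interchanged(int a[N][M]) {
    for (int j = 0; j < M - 1; ++j) {
        for (int i = 1; i < N; ++i) {
            a[i][j] = a[i - 1][j + 1] + 1;
        }
    }
}

// Returns true when the two loop orders disagree on zero-initialized input.
bool interchange_changes_result() {
    int x[N][M] = {};
    int y[N][M] = {};
    sweep_original(x);
    sweep_interchanged(y);
    return x[2][0] != y[2][0];
}
```

On zero-initialized input, the original order produces 2 in a[2][0], while the interchanged order produces 1, so the two nests are not equivalent. A legality check is there to catch cases of this kind and refuse the transformation.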
Let’s take a look at how we can install the SLX Plugin and explore the initial examples which come with it. We need the following prerequisites to be able to use this plugin:
Vitis HLS 2020.2
The SLX Plugin can be obtained from the Silexica website. This download will contain all the necessary files, documentation, and examples to install and start working with the plugin. To use the plugin, we do not need to have an SLX FPGA license.
Once the plugin is downloaded, the next step is to extract the files from the provided tarball. Installing the plugin and running the examples is quick and easy with the user guide Silexica provides.
In a terminal window, ensure the Vitis HLS setup script has been run if your system does not run it automatically at startup.
In the terminal window, navigate to the top level of the extracted SLX Plugin directory.
Running the command source exports will install the plugin. We are then ready to try out some of the examples provided.
The plugin comes with several example applications:
Discrete Fourier Transform
Brillouin Optical Time Domain Analyzer
We can run any of these examples directly from the command line by running the check.sh script in the terminal.
Running this script will create a new project for the chosen example and generate results that show the difference between a solution that uses the loop interchange pragma and one that does not.
Once the project has completed its run, you will notice a new project directory under your selected example directory along with a file called latency.new.
The latency.new file will contain the performance information when the design is implemented with and without the loop interchange pragma.
I decided to run the provided MRI example to understand more about how the plugin works. Like everything that runs in Vitis HLS, the compilation process is quick, and the results are interesting.
Opening the resulting latency.new file shows that the vanilla implementation takes a considerable number of iterations for Loop N and Loop M. In the vanilla implementation, Loop N is the outer loop and Loop M is the inner loop.
Applying the loop interchange pragma makes Loop M the outer loop and Loop N the inner loop, which significantly improves performance. The outer loop now takes only 11% of the original latency, while the inner loop takes only 44% of the original latency.
This is quite impressive. However, the most impressive thing is that the Vitis HLS LLVM layer is now open to partners, so we can expect a range of new optimization pragmas that will further accelerate HLS development.
As I understand it, Silexica will be implementing more pragmas via the plugin over the coming releases.