MicroZed Chronicles: Vitis AI and the NPU

I am always interested to see new tools and approaches being released into the FPGA world. Recently, I have been working a lot more with Versal devices for clients, in my blogs, and in Hackster projects, especially the Versal AI Edge devices. Within the Versal AI Edge family, we have mostly examined the VE2302 device, available on the Trenz TE0950 or the Alinx VD100. We also recently looked at the VEK280, which contains the VE2802, the largest of the Versal AI Edge devices.



Of course, if we are developing for the AIE or AIE-ML within the Versal range of devices for AI inference, we leverage Vitis AI.


Recently, Vitis AI 5.1 was released as a beta. Vitis AI 5.1 introduces, for the first time, the Neural Processing Unit (NPU), which leverages both the PL and the AIE-ML to deploy neural networks.


Supporting the NPU, there is a software stack which includes the model compilation tools and model deployment APIs.



We will examine creating our own Vitis AI 5.1 example in a future blog. However, I want to explore the NPU in a little more detail.


The NPU is a matrix of heterogeneous processing engines which leverages a combination of the AIE and PL to implement the inference accelerator. The NPU is a general-purpose AI inference accelerator which supports the loading and execution of multiple concurrent models. We can, of course, implement as many NPUs within our design as the resources allow.


This architecture ensures we do not need to generate a new hardware platform to change the neural network we are currently executing.


We are able to implement the NPU within our designs using either a Vitis or Vivado flow. Both flows initially utilise a Vitis Subsystem (VSS). Within the VSS, you will find a combination of the AIE and kernelized PL logic. The VSS, once created in Vitis, can be exported to Vivado using a Vitis Meta Archive (VMA) flow.


To deploy a network onto the NPU, we use the Vitis AI compiler and quantizer to generate snapshots. Snapshots contain the quantized model and instructions for execution by the NPU on the target platform.


Vitis AI 5.1 Beta gives us the ability to deploy CNN-based models which have been developed using PyTorch or TensorFlow. Networks deployed on the NPU within the Versal device can use the INT8 and BF16 data types.
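To give a feel for what these two data types mean for a model's values, here is a minimal sketch in plain Python. It is not the Vitis AI quantizer and uses none of its APIs. The function names are my own, the INT8 scheme shown (symmetric, one scale per tensor) is just one common choice, and the BF16 conversion simply keeps the top 16 bits of a 32-bit float with round-to-nearest-even.

```python
import struct


def quantize_int8(values):
    """Symmetric per-tensor INT8 quantization: returns (integer codes, scale)."""
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    # Round to the nearest code and clamp to the symmetric range [-127, 127].
    codes = [max(-127, min(127, round(v / scale))) for v in values]
    return codes, scale


def dequantize_int8(codes, scale):
    """Map INT8 codes back to approximate real values."""
    return [c * scale for c in codes]


def to_bf16_bits(value):
    """Return the 16-bit BF16 pattern for a float (NaN is not handled)."""
    bits = struct.unpack("<I", struct.pack("<f", value))[0]
    # Round to nearest even on the 16 low bits that are about to be dropped.
    rounding = 0x7FFF + ((bits >> 16) & 1)
    return ((bits + rounding) >> 16) & 0xFFFF


def from_bf16_bits(bits16):
    """Expand a 16-bit BF16 pattern back to a Python float."""
    return struct.unpack("<f", struct.pack("<I", bits16 << 16))[0]


if __name__ == "__main__":
    weights = [1.27, -0.64, 0.0, 0.3]
    codes, scale = quantize_int8(weights)
    print("INT8 codes:", codes, "scale:", scale)
    print("INT8 round trip:", dequantize_int8(codes, scale))
    print("BF16 round trip:", [from_bf16_bits(to_bf16_bits(w)) for w in weights])
```

INT8 stores each value as a small integer plus a shared scale, so the round-trip error is bounded by half a quantization step. BF16 keeps the full float32 exponent range but only 8 bits of significand precision.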


To interact with the NPU at run time, both C++ and Python APIs are provided for application integration.


The reference design is provided on the AMD website and includes not only the source code but also an SD card image which we can use to deploy onto the VEK280 and test out Vitis AI 5.1.


Developing Vitis AI 5.1 solutions requires a Linux host machine running Ubuntu 22.04.4 LTS, 100 GB of disk space, and 64 GB of RAM. The toolchain is based upon the AMD 2025.1 toolchain (PetaLinux, Vivado, and Vitis), coupled with Python, Docker, FFmpeg, and GStreamer.
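Before installing anything, it can be handy to sanity-check the host against these numbers. The following is a small sketch using only the Python standard library. It is my own helper and not part of the Vitis AI tooling, and the executable names in `REQUIRED_TOOLS` are assumptions about how these tools appear on the PATH. It does not check the Ubuntu version or the AMD 2025.1 installation.

```python
import os
import shutil

MIN_DISK_GB = 100  # disk space quoted above
MIN_RAM_GB = 64    # RAM quoted above
# Assumed executable names for the supporting tools listed above.
REQUIRED_TOOLS = ("python3", "docker", "ffmpeg", "gst-launch-1.0")


def evaluate(free_disk_gb, ram_gb, missing_tools):
    """Return a list of human-readable problems (empty if the host looks fine)."""
    problems = []
    if free_disk_gb < MIN_DISK_GB:
        problems.append(f"only {free_disk_gb:.0f} GB free disk, need {MIN_DISK_GB} GB")
    if ram_gb < MIN_RAM_GB:
        problems.append(f"only {ram_gb:.0f} GB RAM, need {MIN_RAM_GB} GB")
    for tool in missing_tools:
        problems.append(f"{tool} not found on PATH")
    return problems


def probe(workdir="."):
    """Measure free disk, physical RAM (decimal GB), and missing tools on a Linux host."""
    free_disk_gb = shutil.disk_usage(workdir).free / 1e9
    ram_gb = os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES") / 1e9
    missing = [t for t in REQUIRED_TOOLS if shutil.which(t) is None]
    return free_disk_gb, ram_gb, missing


if __name__ == "__main__":
    issues = evaluate(*probe())
    print("Host looks OK" if not issues else "\n".join(issues))
```

The measurement (`probe`) is kept separate from the decision (`evaluate`) so the thresholds can be adjusted or tested without touching the machine-specific calls.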


Having a VEK280 in the lab, I thought it would be a good idea to try running the demo application. The demo application provides examples of a ResNet50 model being deployed. We can interact with it using either the VART runner application or an end-to-end application.


Running both is straightforward, and we can see a demonstration of the Vitis AI 5.1 Beta NPU. For this example, we can transfer files between the host machine and the VEK280 using SCP.
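If you want to script those transfers, a minimal sketch might look like the following. The board address, user name, and file paths are hypothetical placeholders, not values from the reference design, so replace them with whatever your VEK280 reports on your network. The helper only builds the `scp` argument list and leaves running it to you.

```python
import subprocess


def scp_command(local_path, remote_path, host, user="root",
                to_board=True, recursive=False):
    """Build an scp argument list for copying to or from the board."""
    remote = f"{user}@{host}:{remote_path}"
    cmd = ["scp"]
    if recursive:
        cmd.append("-r")  # copy whole directories
    # Source comes first, destination second.
    cmd += [local_path, remote] if to_board else [remote, local_path]
    return cmd


if __name__ == "__main__":
    # Hypothetical board address and paths, for illustration only.
    push = scp_command("test_image.jpg", "/home/root/", "192.168.1.50")
    pull = scp_command("results/", "/home/root/results", "192.168.1.50",
                       to_board=False, recursive=True)
    print(" ".join(push))
    print(" ".join(pull))
    # To actually run a transfer, uncomment:
    # subprocess.run(push, check=True)
```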



The next step is to train and create our own model using Vitis AI 5.1 and target the NPU. I will probably include that in a Hackster project soon.



Sponsored by AMD
