
MicroZed Chronicles: A look at the DSP58

We looked at fixed-point math a few weeks ago, and we’ve also previously looked at how we can use the DSP48 in both 7 series and UltraScale+ devices. One thing I’ve been wanting to look at for a while is the DSP58, the updated DSP element within the Versal architecture.

The DSP58 is the sixth generation of AMD FPGA-based DSP blocks. As would be expected, it has evolved to better support the demands of application developers and enables a range of functionality including:

  • 27 x 24 + 58-bit signed multiply accumulate

  • 18 x 18 + 58-bit complex multiplier

  • FP32 single-precision floating-point operation

  • Single instruction, multiple data (SIMD) dual 24-bit or quad 12-bit operations

  • INT8 dot-product capability

  • 58-bit logic unit

  • Pattern detection
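To make the first and fourth items in the list a little more concrete, here is a minimal behavioural sketch in Python. It is not how we would target the device (that is done in HDL), and the function names are hypothetical ones chosen for illustration. It assumes two's-complement wraparound in the 58-bit accumulator and a 48-bit packed word for the SIMD lanes (dual 24-bit or quad 12-bit), with no carry propagating between lanes.

```python
def wrap_signed(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of value as a two's-complement number."""
    mask = (1 << bits) - 1
    value &= mask
    return value - (1 << bits) if value >> (bits - 1) else value


def mac_27x24(a: int, b: int, acc: int) -> int:
    """Signed 27 x 24 multiply added to a 58-bit accumulator (assumed to wrap)."""
    assert -(1 << 26) <= a < (1 << 26), "a must fit in 27 signed bits"
    assert -(1 << 23) <= b < (1 << 23), "b must fit in 24 signed bits"
    return wrap_signed(acc + a * b, 58)


def simd_add(x: int, y: int, lanes: int = 4, total_bits: int = 48) -> int:
    """Lane-wise add of two packed words; carries never cross a lane boundary."""
    lane_bits = total_bits // lanes
    lane_mask = (1 << lane_bits) - 1
    result = 0
    for i in range(lanes):
        shift = i * lane_bits
        lane_sum = ((x >> shift) & lane_mask) + ((y >> shift) & lane_mask)
        result |= (lane_sum & lane_mask) << shift
    return result
```

Calling `simd_add` with `lanes=2` models the dual 24-bit case, while the default `lanes=4` models the quad 12-bit case. A lane that overflows simply wraps within its own 12 or 24 bits.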

This list of capabilities includes several evolutions over the DSP48 while remaining backwards compatible. Along with increased bit widths for the inputs, accumulator, logic unit, and pattern detector (which support increased precision and dynamic range in a single DSP58), there are new modes of operation, including FP32 single-precision floating point, INT8 dot product, and the ability to combine back-to-back DSP58s into complex multipliers.
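As a rough illustration of what an INT8 dot product mode computes, the sketch below multiplies two short vectors of signed 8-bit values element by element, sums the products, and adds the result to a 58-bit accumulator. The three-element vector length used in the example call is my assumption for illustration, as is the function name; check the DSP58 architecture manual for the exact operand widths and vector length before relying on it.

```python
def int8_dot_accumulate(a, b, acc=0, acc_bits=58):
    """Dot product of two equal-length INT8 vectors, added to a wrapping accumulator."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    for value in (*a, *b):
        if not -128 <= value <= 127:
            raise ValueError(f"{value} does not fit in a signed 8-bit integer")

    total = acc + sum(x * y for x, y in zip(a, b))

    # Wrap into the accumulator width as two's complement (assumed behaviour).
    total &= (1 << acc_bits) - 1
    if total >> (acc_bits - 1):
        total -= 1 << acc_bits
    return total


# Hypothetical three-element example; the vector length is an assumption.
result = int8_dot_accumulate([1, 2, 3], [4, 5, 6])
```

Each product needs at most 16 bits, so even long chains of accumulations sit comfortably inside a 58-bit accumulator, which is one reason the wider accumulate path is attractive for machine learning style workloads.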

Within the Versal architecture, DSP58 slices are arranged as super tiles, each of which consists of two rows and two columns of configurable logic blocks next to the DSP58. This provides 64 LUTMs, 64 LUTLs, and 256 flip-flops connected to the DSP58. When configured for complex multiplication, two back-to-back DSP58 super tiles are connected to create a complex super tile. The left-hand tile provides the imaginary result and the right-hand tile provides the real result.
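The arithmetic performed by the complex multiplier can be sketched in a few lines of Python. This is only the maths, with a hypothetical function name, and it assumes 18-bit signed real and imaginary parts on both operands. The comments map each result to the tile described above.

```python
def complex_mult_18(a, b):
    """Multiply two complex numbers given as (real, imag) pairs of 18-bit signed ints."""
    ar, ai = a
    br, bi = b
    for value in (ar, ai, br, bi):
        assert -(1 << 17) <= value < (1 << 17), "operand must fit in 18 signed bits"

    # Real result: produced by the right-hand tile in the description above.
    real = ar * br - ai * bi
    # Imaginary result: produced by the left-hand tile.
    imag = ar * bi + ai * br
    return real, imag


# (3 + 2j) * (1 - 4j)
real, imag = complex_mult_18((3, 2), (1, -4))
```

Each result needs two multiplies and one add or subtract, which is why a pair of DSP58s working together is a natural fit for this operation.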

To get the best performance from the DSP58, we need to pipeline with at least three stages when using the multiplier, or two stages in other configurations. Of course, we should also remember to correctly use the reset to ensure the registers are implemented optimally.
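To see what three pipeline stages mean for latency, here is a small cycle-level sketch in Python. The class is hypothetical and deliberately simplified: it treats the pipeline as a three-deep shift register of products rather than modelling the individual input, multiplier, and output registers, and it assumes all registers reset to zero.

```python
from collections import deque


class PipelinedMultiplier:
    """Toy model of a multiplier with a fixed number of register stages."""

    def __init__(self, stages: int = 3):
        # All pipeline registers start at zero (assumed reset value).
        self.pipe = deque([0] * stages, maxlen=stages)

    def clock(self, a: int, b: int) -> int:
        """Apply one clock edge with inputs (a, b) and return the output register."""
        self.pipe.append(a * b)  # newest product enters, oldest value drops out
        return self.pipe[0]


dsp = PipelinedMultiplier()
outputs = [dsp.clock(a, b) for a, b in [(2, 3), (4, 5), (0, 0), (0, 0)]]
# outputs == [0, 0, 6, 20]
```

The first product appears on the third clock edge, and after that a new result is available every cycle. The pipeline costs latency but not throughput, which is what allows the block to run at its highest clock rate.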

The DSP58 also supports wide multiplexing, as the DSP48E2 does. This capability is very useful for video and networking applications where wide buses need to be multiplexed.

When it comes to implementing solutions which use the DSP58, one of the key methods for ensuring DSP58 utilization is inference. This creates the most portable code, which can be used across a range of solutions. IP Integrator, instantiation, and macros can also be used to leverage the DSP58 within the device.

One good place to get started with instantiation is using the language templates within the Vivado text editor. This provides several templates which are useful as starting points for DSP58 instantiations.

However, if we wish to use the DSP58 in its floating-point format, we cannot use inference from our HDL. Instead, we need to use either instantiation or the floating-point IP core. This is an IP block that we haven’t examined before, so we will look at it in a future blog.

As would be expected, the DSP58 offers developers significantly improved performance over the previous DSP48E2.


Sponsored by AMD Xilinx


