MicroZed Chronicles: You Really Should Use a MicroBlaze V

Adam Taylor
Nov 12

I see a lot of designs across a range of industries and applications. One thing that always stands out is that designs often use complex state machines to implement features such as I²C, SPI, GPIO sequencing, etc.

Often, these FSMs become unwieldy and complex, which adds to bring-up and debugging challenges. Of course, these FSMs also need to be verified in simulation and again on the boards once they arrive.

When I am asked to work with these interfaces in an FPGA, I often replace a complex FSM with a MicroBlaze V. This enables me to drop in a small soft-core processor and then create a simple software application to control those interfaces. This approach makes the software flexible, easy to understand, and simple to modify if there are any last-minute changes or issues—which, as we all know, never happens on projects.

You might think that implementing a MicroBlaze V would be resource-intensive; however, MicroBlaze V is incredibly configurable, which enables me to optimise its instantiation footprint.

I thought it would make for an interesting blog to examine the configuration options of the MicroBlaze V and their impact on device footprint.

The MicroBlaze V is, of course, based on the RISC-V RV32 architecture. However, it is highly customisable, not only with interfaces and peripherals but also with optional elements of the instruction set architecture (ISA).

Before I dive into the different ISA optimisations we can make, I am first going to examine the different overall processor architecture configurations we can select. The configuration I am referring to is the number of pipeline stages used for instruction execution.

Area – In this configuration, the processor is implemented as a three-stage pipeline to ensure a minimal logic resource implementation.
Throughput – This option implements a four-stage pipeline, optimised for computational throughput.
Performance – This option implements a five-stage pipeline, optimised for maximum performance.
Frequency – This option implements an eight-stage pipeline, optimised to achieve the highest possible implementation frequency.

One thing to note is that instructions might take longer than one cycle to fetch (for example, if DDR memory is used). In this case, the memory architecture and the use of tightly coupled caches are critical to ensure performance. We’ll examine caches and performance in another blog soon.

When it comes to ISA optimisations within the MicroBlaze V configuration, we can enable the following:

Code Compression (C) – Replaces many common 32-bit instructions with 16-bit equivalents, reducing program size by about 25–35% with almost no performance penalty. It adds a small decompression block in hardware; the area overhead on FPGAs is negligible, while instruction memory and cache efficiency improve noticeably.
Integer Multiplier (M) – Adds hardware support for integer multiplication and division, greatly speeding up arithmetic-heavy code compared to software-emulated operations. This typically increases the number of DSP elements used in the FPGA.
Floating Point (F) – Introduces single-precision floating-point operations implemented directly in hardware. This increases performance significantly but comes with a large resource-usage overhead.
Atomic Operations (A) – Adds atomic read-modify-write instructions, enabling efficient synchronisation and lock-free data sharing in multi-threaded or interrupt-driven systems. This has a small overhead on logic resources but can be critical for real-time operating systems.
Bit Manipulation (Zba/Zbb/Zbc/Zbs) – These extensions add specialised instructions for common bitwise operations such as shifts, rotates, extracts, and bit counting. They can be very useful for cryptographic and DSP applications.
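
The software toolchain has to be told which of these extensions the core actually implements. As a rough sketch, the options above map onto GCC-style `-march` and `-mabi` values as shown below. The helper names `march_string` and `mabi_string` are hypothetical and not part of any AMD tool, and the exact strings your toolchain accepts should be checked against its documentation. Note that the canonical RISC-V ordering places C after F, so a core with C, M and F enabled is written `rv32imfc`.

```python
# Canonical ordering of the single-letter extensions discussed above.
CANONICAL_ORDER = "imafc"

# The bit-manipulation sub-extensions have multi-letter names and are
# appended after the single letters, separated by underscores.
BITMANIP = ("zba", "zbb", "zbc", "zbs")


def march_string(extensions, bitmanip=False):
    """Build a GCC-style -march value for an RV32 core (illustrative helper).

    `extensions` is any iterable of single-letter names, e.g. "cm" or "CMF".
    """
    enabled = {e.lower() for e in extensions} | {"i"}  # base ISA is always present
    unknown = enabled - set(CANONICAL_ORDER)
    if unknown:
        raise ValueError(f"unsupported extension(s): {sorted(unknown)}")
    letters = "".join(e for e in CANONICAL_ORDER if e in enabled)
    suffix = "".join(f"_{z}" for z in BITMANIP) if bitmanip else ""
    return f"rv32{letters}{suffix}"


def mabi_string(extensions):
    """ilp32f passes floats in FP registers; ilp32 is the soft-float ABI."""
    return "ilp32f" if "f" in {e.lower() for e in extensions} else "ilp32"


if __name__ == "__main__":
    for exts in ("", "c", "m", "mf", "cmf"):
        print(f"-march={march_string(exts)} -mabi={mabi_string(exts)}")
```

Compiling with a `-march` that names an extension the hardware does not implement will produce instructions the core cannot execute, so the string should always follow the hardware configuration rather than the other way round.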

As we enable these pipeline configurations and ISA optimisations, we naturally see an increase in resource requirements in the FPGA.

I therefore set out to run implementations for each pipeline configuration:

with no ISA optimisations (RV32I),
with Code Compression (RV32IC),
with Integer Multiplier (RV32IM),
with Integer Multiplier and Floating Point Support (RV32IMF),
and with Code Compression, Integer Multiplier, and Floating Point Support all enabled (RV32ICMF).

The results can be seen in the table below.

ISA	Optimisation	LUT	FF	BRAM	DSP
RV32I	Area	823	449	8	0
RV32IC	Area	1033	488	8	0
RV32IM	Area	1130	581	8	4
RV32IMF	Area	3855	1698	8	6
RV32ICMF	Area	4095	1737	8	6

RV32I	Throughput	1200	547	8	0
RV32IC	Throughput	1247	587	8	0
RV32IM	Throughput	1577	684	8	4
RV32IMF	Throughput	4044	1814	8	6
RV32ICMF	Throughput	4243	1854	8	6

RV32I	Performance	1235	631	8	0
RV32IC	Performance	1208	671	8	0
RV32IM	Performance	1304	768	8	4
RV32IMF	Performance	4012	1906	8	6
RV32ICMF	Performance	4229	1946	8	6

RV32I	Frequency	1762	1252	8	0
RV32IC	Frequency	1713	1315	8	0
RV32IM	Frequency	1956	1448	8	4
RV32IMF	Frequency	4685	2780	8	6
RV32ICMF	Frequency	4949	2846	8	6

From the table and charts, we can observe that the baseline RV32I core is the most compact, using minimal logic and registers. Enabling the M extension increases area moderately, while the F extension drives a large jump in LUT and FF usage due to the floating-point unit. The C extension adds little extra hardware (about 210 LUTs in the Area configuration, fewer than 50 in Throughput, and marginally fewer LUTs than the baseline in the Performance and Frequency builds) yet greatly reduces code size, making it an efficient enhancement, especially when running from BRAM.
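
To get a feel for what the C extension means for instruction memory, the 25–35% reduction quoted earlier can be applied to a program image. This is only a back-of-the-envelope sketch: the 32 KiB image size is a hypothetical example, and the real saving depends on the code being compiled.

```python
def compressed_size_range(uncompressed_bytes, min_saving_pct=25, max_saving_pct=35):
    """Return (best_case, worst_case) image size in bytes with the C extension.

    Integer arithmetic is used so the result is exact; sizes are rounded down.
    """
    best = uncompressed_bytes * (100 - max_saving_pct) // 100
    worst = uncompressed_bytes * (100 - min_saving_pct) // 100
    return best, worst


if __name__ == "__main__":
    image = 32 * 1024  # hypothetical 32 KiB uncompressed image
    best, worst = compressed_size_range(image)
    print(f"{image} bytes -> between {best} and {worst} bytes with C enabled")
```

When the program runs from BRAM, a saving of this size can be the difference between the image fitting in the memory already allocated and needing more.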

When the C, M, and F extensions are combined (ICMF), LUT usage rises to between roughly 2.8× (Frequency) and 5× (Area) the RV32I baseline. BRAM usage stays at 8 in every build, while DSP usage rises from 0 to 6.
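
The growth from RV32I to RV32ICMF can be worked out per pipeline configuration directly from the table rows. The sketch below copies the LUT and FF figures from the table; the function name `growth` is just for illustration.

```python
# (LUT, FF) for the baseline and the ICMF builds, copied from the results table.
RV32I = {
    "Area": (823, 449),
    "Throughput": (1200, 547),
    "Performance": (1235, 631),
    "Frequency": (1762, 1252),
}
RV32ICMF = {
    "Area": (4095, 1737),
    "Throughput": (4243, 1854),
    "Performance": (4229, 1946),
    "Frequency": (4949, 2846),
}


def growth(config):
    """Return (LUT ratio, FF ratio) of RV32ICMF over RV32I, to two decimals."""
    base_lut, base_ff = RV32I[config]
    full_lut, full_ff = RV32ICMF[config]
    return round(full_lut / base_lut, 2), round(full_ff / base_ff, 2)


if __name__ == "__main__":
    for config in RV32I:
        lut_ratio, ff_ratio = growth(config)
        print(f"{config:<12} LUT x{lut_ratio:.2f}  FF x{ff_ratio:.2f}")
```

The ratio is largest for the Area configuration because its baseline is the smallest, while the absolute cost of the extensions is similar across configurations.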

Let’s dive a little deeper. For the Area configuration, the increase over the base RV32I design in terms of flip-flops and LUTs is shown below:

Variant	ΔLUT vs RV32I	ΔFF vs RV32I
RV32IM	+37%	+29%
RV32IMF	+368%	+278%
RV32IC	+26%	+9%
RV32ICMF	+398%	+287%
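
These percentages can be reproduced from the Area rows of the results table. A minimal sketch, with the figures copied from that table and `percent_increase` as an illustrative helper name:

```python
# (LUT, FF) for each Area build, copied from the results table.
AREA = {
    "RV32I": (823, 449),
    "RV32IC": (1033, 488),
    "RV32IM": (1130, 581),
    "RV32IMF": (3855, 1698),
    "RV32ICMF": (4095, 1737),
}


def percent_increase(variant, baseline="RV32I"):
    """Return (LUT %, FF %) increase of `variant` over `baseline`, rounded."""
    base = AREA[baseline]
    build = AREA[variant]
    return tuple(round(100 * (b - a) / a) for a, b in zip(base, build))


if __name__ == "__main__":
    for name in ("RV32IM", "RV32IMF", "RV32IC", "RV32ICMF"):
        lut_pct, ff_pct = percent_increase(name)
        print(f"{name:<9} LUT +{lut_pct}%  FF +{ff_pct}%")
```

Swapping in the rows for another pipeline configuration gives the equivalent comparison for that build.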

When it comes to configurations, there are a few efficient combinations we can leverage to ensure the best balance between performance and utilisation:

RV32IM (Throughput): excellent balance, with ~1.6K LUTs and 4 DSPs.
RV32IMF (Performance): compute-dense, with 4012 LUTs and 6 DSPs.
RV32ICMF (Area): the largest of the Area builds at 4095 LUTs, but instruction compression adds only 240 LUTs over RV32IMF.
RV32IC: most cost-effective if your workload is integer-only and memory-bandwidth-limited.
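
If the main constraint is logic utilisation, the table can also be searched mechanically. The sketch below picks the build with the fewest LUTs that offers the required extensions within a LUT budget. It deliberately ignores performance and achievable clock frequency, which the table does not capture, and the names `Build` and `smallest_build` are hypothetical.

```python
from typing import NamedTuple, Optional


class Build(NamedTuple):
    isa: str
    opt: str
    lut: int
    ff: int
    dsp: int


# Rows copied from the results table (BRAM is 8 for every build, so it is omitted).
_ROWS = {
    "Area": [("RV32I", 823, 449, 0), ("RV32IC", 1033, 488, 0),
             ("RV32IM", 1130, 581, 4), ("RV32IMF", 3855, 1698, 6),
             ("RV32ICMF", 4095, 1737, 6)],
    "Throughput": [("RV32I", 1200, 547, 0), ("RV32IC", 1247, 587, 0),
                   ("RV32IM", 1577, 684, 4), ("RV32IMF", 4044, 1814, 6),
                   ("RV32ICMF", 4243, 1854, 6)],
    "Performance": [("RV32I", 1235, 631, 0), ("RV32IC", 1208, 671, 0),
                    ("RV32IM", 1304, 768, 4), ("RV32IMF", 4012, 1906, 6),
                    ("RV32ICMF", 4229, 1946, 6)],
    "Frequency": [("RV32I", 1762, 1252, 0), ("RV32IC", 1713, 1315, 0),
                  ("RV32IM", 1956, 1448, 4), ("RV32IMF", 4685, 2780, 6),
                  ("RV32ICMF", 4949, 2846, 6)],
}
BUILDS = [Build(isa, opt, lut, ff, dsp)
          for opt, rows in _ROWS.items()
          for isa, lut, ff, dsp in rows]


def smallest_build(required: str, lut_budget: int) -> Optional[Build]:
    """Return the lowest-LUT build providing every letter in `required`.

    `required` is a string of extension letters such as "M" or "MF".
    Returns None when nothing in the table fits the budget.
    """
    needed = set(required.upper())
    fits = [b for b in BUILDS
            if needed <= set(b.isa[len("RV32"):]) and b.lut <= lut_budget]
    return min(fits, key=lambda b: b.lut, default=None)


if __name__ == "__main__":
    print(smallest_build("M", 2000))   # integer multiply within 2000 LUTs
    print(smallest_build("F", 3000))   # no floating-point build is this small
```

A search like this is only a starting point; the pipeline choice still has to be checked against the timing and throughput the application needs.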

While the exact configuration depends on your application’s needs, leveraging a MicroBlaze V in your design can significantly reduce the need for complex and inflexible FSMs. In its smallest implementation, the footprint is compact (823 LUTs and 449 FFs), yet it still provides the ability to implement many powerful features.

Sponsored by AMD