
MicroZed Chronicles: You Really Should Use a MicroBlaze V

I see a lot of designs across a range of industries and applications. One thing that always stands out is that designs often use complex state machines to implement features such as I²C, SPI, GPIO sequencing, etc.



These FSMs often become unwieldy, which makes bring-up and debugging harder. They also need to be verified in simulation, and again on the boards once they arrive.


When I am asked to work with these interfaces in an FPGA, I often replace a complex FSM with a MicroBlaze V. This enables me to drop in a small soft-core processor and then create a simple software application to control those interfaces. This approach makes the design flexible, easy to understand, and simple to modify if there are any last-minute changes or issues, which, as we all know, never happen on projects.

You might think that implementing a MicroBlaze V would be resource-intensive; however, MicroBlaze V is incredibly configurable, which enables me to optimise its instantiation footprint.


I thought it would make for an interesting blog to examine the configuration options of the MicroBlaze V and their impact on device footprint.


The MicroBlaze V is, of course, based on the RISC-V RV32 architecture. However, it is highly customisable, not only with interfaces and peripherals but also with optional elements of the instruction set architecture (ISA).


Before I dive into the different ISA optimisations we can make, I am first going to examine the different overall processor architecture configurations we can select. The configuration I am referring to is the number of pipeline stages used for instruction execution.


  • Area – In this configuration, the processor is implemented as a three-stage pipeline to ensure a minimal logic resource implementation.

  • Throughput – This option implements a four-stage pipeline, optimised for computational throughput.

  • Performance – This option implements a five-stage pipeline, optimised for maximum performance.

  • Frequency – This option implements an eight-stage pipeline, optimised to achieve the highest possible implementation frequency.


One thing to note is that instructions might take longer than one cycle to fetch (for example, if DDR memory is used). In this case, the memory architecture and the use of tightly coupled caches are critical to ensure performance. We’ll examine caches and performance in another blog soon.


When it comes to ISA optimisations within the MicroBlaze V configuration, we can enable the following:

  • Code Compression (C) – Replaces many common 32-bit instructions with 16-bit equivalents, reducing program size by about 25–35% with almost no performance penalty. It adds a small decompression block in hardware; the area overhead on FPGAs is negligible, while instruction memory and cache efficiency improve noticeably.

  • Integer Multiplier (M) – Adds hardware support for integer multiplication and division, greatly speeding up arithmetic-heavy code compared to software-emulated operations. This typically increases the number of DSP elements used in the FPGA.

  • Floating Point (F) – Introduces single-precision floating-point operations implemented directly in hardware. This increases performance significantly but comes with a large resource-usage overhead.

  • Atomic Operations (A) – Adds atomic read-modify-write instructions, enabling efficient synchronisation and lock-free data sharing in multi-threaded or interrupt-driven systems. This has a small overhead on logic resources but can be critical for real-time operating systems.

  • Bit Manipulation (Zba/Zbb/Zbc/Zbs) – These extensions add specialised instructions for common bitwise operations such as shifts, rotates, extracts, and bit counting. They can be very useful for cryptographic and DSP applications.
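To put the Code Compression figure into context, here is a minimal sketch that turns the quoted 25–35% saving into an estimated program size. The function name and the 32 KiB example image are purely illustrative assumptions; the real saving depends on your code and compiler settings.

```python
def compressed_size_range(code_bytes, min_saving_pct=25, max_saving_pct=35):
    """Estimate program size (bytes) with the C extension enabled.

    Uses the 25-35% saving quoted in the text as a rule of thumb.
    Returns (best_case, worst_case), rounded down to whole bytes.
    """
    if code_bytes < 0:
        raise ValueError("code_bytes must be non-negative")
    best_case = code_bytes * (100 - max_saving_pct) // 100
    worst_case = code_bytes * (100 - min_saving_pct) // 100
    return best_case, worst_case


# Hypothetical 32 KiB application built without compression.
best, worst = compressed_size_range(32 * 1024)
print(f"Estimated size with C enabled: {best} to {worst} bytes")
```

For the assumed 32 KiB image this gives roughly 21 KB to 24 KB, which can be the difference between fitting in the available BRAM or not.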


As we enable these pipeline configurations and ISA optimisations, we naturally see an increase in resource requirements in the FPGA.

I therefore set out to run five implementations for each pipeline configuration:

  • with no ISA optimisations (RV32I),

  • with Code Compression (RV32IC),

  • with Integer Multiplier (RV32IM),

  • with Integer Multiplier and Floating Point support (RV32IMF),

  • and with Code Compression, Integer Multiplier, and Floating Point all enabled (RV32ICMF).


The results can be seen in the table below.


 

ISA        Optimisation   LUT    FF     BRAM   DSP
RV32I      Area           823    449    8      0
RV32IC     Area           1033   488    8      0
RV32IM     Area           1130   581    8      4
RV32IMF    Area           3855   1698   8      6
RV32ICMF   Area           4095   1737   8      6
RV32I      Throughput     1200   547    8      0
RV32IC     Throughput     1247   587    8      0
RV32IM     Throughput     1577   684    8      4
RV32IMF    Throughput     4044   1814   8      6
RV32ICMF   Throughput     4243   1854   8      6
RV32I      Performance    1235   631    8      0
RV32IC     Performance    1208   671    8      0
RV32IM     Performance    1304   768    8      4
RV32IMF    Performance    4012   1906   8      6
RV32ICMF   Performance    4229   1946   8      6
RV32I      Frequency      1762   1252   8      0
RV32IC     Frequency      1713   1315   8      0
RV32IM     Frequency      1956   1448   8      4
RV32IMF    Frequency      4685   2780   8      6
RV32ICMF   Frequency      4949   2846   8      6




From the results, we can observe that the baseline RV32I core in the Area configuration is the most compact, using the fewest LUTs and flip-flops. Enabling the M extension increases area moderately, while the F extension drives a large jump in LUT and FF usage due to the floating-point unit. The C extension adds little extra hardware in most configurations (about 210 LUTs in the Area configuration, and within about 50 LUTs either way in the other three) while reducing code size, making it an efficient enhancement, especially when running from BRAM.

When all of these extensions (ICMF) are combined, LUT usage rises to roughly 3–5× the baseline and flip-flop usage to roughly 2–4×, depending on the pipeline configuration. BRAM usage stays at 8 throughout, while DSP usage goes from 0 to 6.
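The growth of RV32ICMF over the baseline can be checked directly from the table. The sketch below copies the LUT and FF figures from the results above and computes the ratio for each pipeline configuration; the dictionary and function names are my own, chosen only for illustration.

```python
# (LUT, FF) figures copied from the results table in this post.
BASELINE_RV32I = {
    "Area": (823, 449),
    "Throughput": (1200, 547),
    "Performance": (1235, 631),
    "Frequency": (1762, 1252),
}
ALL_EXT_RV32ICMF = {
    "Area": (4095, 1737),
    "Throughput": (4243, 1854),
    "Performance": (4229, 1946),
    "Frequency": (4949, 2846),
}


def growth_factors(baseline, extended):
    """Return {config: (lut_ratio, ff_ratio)} rounded to two decimals."""
    factors = {}
    for config, (base_lut, base_ff) in baseline.items():
        ext_lut, ext_ff = extended[config]
        factors[config] = (round(ext_lut / base_lut, 2),
                           round(ext_ff / base_ff, 2))
    return factors


for config, (lut_x, ff_x) in growth_factors(BASELINE_RV32I,
                                            ALL_EXT_RV32ICMF).items():
    print(f"{config:12s} LUT x{lut_x:.2f}  FF x{ff_x:.2f}")
```

The Area configuration shows the largest relative growth (about 4.98× in LUTs), simply because its baseline is the smallest; the Frequency configuration grows the least in relative terms (about 2.81×).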


Let’s dive a little deeper. The increase over the base RV32I design in terms of flip-flops and LUTs, for the Area configuration, is shown below:

Variant    ΔLUT vs RV32I   ΔFF vs RV32I
RV32IM     +37%            +29%
RV32IMF    +368%           +278%
RV32IC     +26%            +9%
RV32ICMF   +398%           +287%
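These percentages follow from the Area rows of the results table. As a quick worked check, the sketch below recomputes them; the names are illustrative and the figures are copied from the table.

```python
# (LUT, FF) for the Area configuration, copied from the results table.
AREA_RESULTS = {
    "RV32I": (823, 449),
    "RV32IM": (1130, 581),
    "RV32IMF": (3855, 1698),
    "RV32IC": (1033, 488),
    "RV32ICMF": (4095, 1737),
}


def percent_increase(value, base):
    """Percentage increase of value over base, rounded to the nearest integer."""
    return round(100 * (value - base) / base)


base_lut, base_ff = AREA_RESULTS["RV32I"]
for variant, (lut, ff) in AREA_RESULTS.items():
    if variant == "RV32I":
        continue  # the baseline is not compared with itself
    print(f"{variant:9s} LUT +{percent_increase(lut, base_lut)}%  "
          f"FF +{percent_increase(ff, base_ff)}%")
```

The same function can be pointed at the Throughput, Performance, or Frequency rows to see how the relative cost of each extension changes with pipeline depth.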

When it comes to configurations, there are a few efficient combinations we can leverage to ensure the best balance between performance and utilisation:


  • RV32IM (Throughput): excellent balance, at 1577 LUTs and 4 DSPs.

  • RV32IMF (Performance): compute-dense, at 4012 LUTs and 6 DSPs.

  • RV32ICMF (Area): the largest of the Area-optimised builds at 4095 LUTs, but instruction compression is close to free in resource terms, adding about 240 LUTs on top of RV32IMF.

  • RV32IC: most cost-effective if your workload is integer-only and memory-bandwidth-limited.
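If you have a fixed LUT budget, the same table can be searched mechanically. The following is a minimal sketch, not a sizing tool: the function name and the budgets in the usage lines are hypothetical, the LUT figures are the ones reported above, and your own results will vary with device and tool version.

```python
# LUT counts copied from the results table: pipeline -> ISA -> LUTs.
LUTS = {
    "Area": {"RV32I": 823, "RV32IC": 1033, "RV32IM": 1130,
             "RV32IMF": 3855, "RV32ICMF": 4095},
    "Throughput": {"RV32I": 1200, "RV32IC": 1247, "RV32IM": 1577,
                   "RV32IMF": 4044, "RV32ICMF": 4243},
    "Performance": {"RV32I": 1235, "RV32IC": 1208, "RV32IM": 1304,
                    "RV32IMF": 4012, "RV32ICMF": 4229},
    "Frequency": {"RV32I": 1762, "RV32IC": 1713, "RV32IM": 1956,
                  "RV32IMF": 4685, "RV32ICMF": 4949},
}


def smallest_fit(required_ext, lut_budget, pipeline=None):
    """Find the lowest-LUT build that has the required extensions.

    required_ext: string of extension letters, e.g. "M" or "MF".
    lut_budget:   maximum number of LUTs allowed.
    pipeline:     optional pipeline name to restrict the search.
    Returns (luts, isa, pipeline), or None if nothing fits.
    """
    candidates = []
    for pipe, builds in LUTS.items():
        if pipeline is not None and pipe != pipeline:
            continue
        for isa, luts in builds.items():
            extensions = set(isa[len("RV32I"):])  # letters after the base ISA
            if set(required_ext) <= extensions and luts <= lut_budget:
                candidates.append((luts, isa, pipe))
    return min(candidates, default=None)


print(smallest_fit("M", 1500))
print(smallest_fit("MF", 4050, pipeline="Performance"))
print(smallest_fit("F", 1000))
```

With these assumed budgets, the first call picks RV32IM in the Area configuration, the second picks RV32IMF in the Performance configuration, and the third returns `None` because no floating-point build fits in 1000 LUTs.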


While the exact configuration depends on your application’s needs, leveraging a MicroBlaze V in your design can remove the need for complex and inflexible FSMs. In its smallest implementation, the footprint is compact (823 LUTs and 449 flip-flops in the results above), yet it provides the ability to implement many powerful features.



Sponsored by AMD
