MicroZed Chronicles: DFT from Scratch
- May 20
FPGA Horizons London- October 6th and 7th 2026 - get Tickets here.
The $99 Artix UltraScale+ Explorer Board - learn more here
One of the things I am working on is a DSP for FPGA course, which takes you from initial concepts through to hands-on examples and labs.
As part of this I wanted to generate simple labs for the students which allow them to build filters, DFTs and FFTs themselves. Even though in reality we might just drop in an IP core in the interest of time, I think it is good that we should be able to understand the logic and the basics, now more than ever.
This week we are going to look at the DFT and how it allows us to go from the time domain to the frequency domain with ease. I will follow up with a blog on the FFT from scratch as well.
To keep things simple and therefore understandable, we are going to create an example with a DFT size of eight.
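I previously wrote a blog on the frequency domain, which you may want to recap. The equation for the DFT is shown below, where N is the size of the DFT (in this case eight), n is the index of the input sample and k is the index of the output bin.

$$X[k] = \sum_{n=0}^{N-1} x[n]\, e^{-j 2\pi k n / N} = \sum_{n=0}^{N-1} x[n] \left[ \cos\left(\frac{2\pi k n}{N}\right) - j \sin\left(\frac{2\pi k n}{N}\right) \right], \quad k = 0, 1, \ldots, N-1$$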
This means our application is going to take eight input samples per DFT. For a real valued input, bins N/2+1 through N-1 are the conjugate mirrors of bins N/2-1 down to 1, so they contain no new information. That leaves bins 0 to N/2, i.e. five unique bins for an N=8 DFT. To keep this first example as compact as possible, we will calculate bins 0 to 3 and skip the Nyquist bin (bin 4).
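The mirror property can be checked numerically. The following is a minimal sketch in plain Python, assuming a helper called `dft` (my own name, not part of the RTL) that evaluates the DFT definition directly on the test vector used later in this blog.

```python
import cmath


def dft(x):
    """Direct DFT of a sequence, evaluated straight from the definition."""
    n_points = len(x)
    return [
        sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / n_points)
            for n in range(n_points))
        for k in range(n_points)
    ]


# Real-valued test vector (the same one used in the reference script).
samples = [4, 3, 0, -3, -4, -3, 0, 3]
N = len(samples)
spectrum = dft(samples)

# For real input, bin N-k should be the complex conjugate of bin k.
mirrored = all(
    abs(spectrum[N - k] - spectrum[k].conjugate()) < 1e-9
    for k in range(1, N // 2)
)

# Bins 0 to N/2 inclusive carry all of the information.
unique_bins = N // 2 + 1

print("mirror property holds:", mirrored)
print("unique bins:", unique_bins)
```

For N=8 this pairs bin 5 with bin 3, bin 6 with bin 2 and bin 7 with bin 1, which is why only bins 0 to 4 need to be considered.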
Implementing a DFT is straightforward: we just need two nested loops. The outer loop walks through the bins (k), while the inner loop walks through all N input samples for each bin, accumulating into a real and an imaginary register. Each sample is multiplied by a twiddle factor made up of a cosine and a sine term; together these form the complex exponential which performs the rotation in the DFT.
How do we calculate these values? We divide the unit circle of 360 degrees by the size of the DFT, which gives 360 / 8 = 45 degrees. This gives us eight points on the unit circle separated by 45 degrees, and from these we can calculate the twiddle factors we need for the real and imaginary calculations. Of course, this is a simple development and larger DFT sizes would have a larger number of twiddle factors.
To quantise these points on the unit circle, we will use Q10 encoding.
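As a rough sketch of how such a table could be generated, the Python below quantises the eight points on the unit circle. It assumes Q10 means a scale factor of 2^10 = 1024 with rounding to nearest; the names `to_q10`, `cos_table` and `sin_table` are illustrative and may not match the RTL.

```python
import math

N = 8                 # DFT size used in this blog
Q_BITS = 10           # assumption: Q10 means 10 fractional bits
SCALE = 1 << Q_BITS   # 1024


def to_q10(value):
    """Quantise a value in [-1, 1] to a Q10 integer, rounding to nearest."""
    return int(round(value * SCALE))


# One entry every 360 / 8 = 45 degrees around the unit circle.
cos_table = [to_q10(math.cos(2 * math.pi * i / N)) for i in range(N)]
sin_table = [to_q10(math.sin(2 * math.pi * i / N)) for i in range(N)]

print("cos:", cos_table)  # [1024, 724, 0, -724, -1024, -724, 0, 724]
print("sin:", sin_table)  # [0, 724, 1024, 724, 0, -724, -1024, -724]
```

Note that with this scaling +1.0 maps to 1024, which needs a 12-bit signed word. Whether a given implementation uses a wider word or saturates to 1023 is a design choice.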
The nice thing about this table is that it is not developed for a single frequency. It is the lookup for the complex exponential, which means the actual frequency for each bin comes from how fast we walk through the table.
Another thing we need to understand about the unit circle and the inner loop is that bin k corresponds to a frequency of k·fs/N, so as we move up through the bins we are looking for harmonics of the fundamental bin frequency fs/N. This means we traverse the unit circle faster for the higher bins: for bin k, sample n uses table entry (k·n) mod 8, so bin 1 steps through the table one entry at a time while bin 3 steps three entries at a time. Because N is a power of two, the wrap around is just a MOD 8 operation, i.e. keeping the three least significant bits of the index, which is straightforward to implement in logic.
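To make the table walk concrete, here is a small Python sketch that lists the table index used for each sample in each bin. The function name `twiddle_indices` is hypothetical, and the bit mask stands in for the MOD 8 operation described above.

```python
N = 8  # DFT size, a power of two


def twiddle_indices(k, n_points=N):
    """Table index for each sample n of bin k: (k * n) mod N.

    Because N is a power of two, the modulo reduces to masking
    the low bits of the product.
    """
    return [(k * n) & (n_points - 1) for n in range(n_points)]


for k in range(4):
    print(f"bin {k}: {twiddle_indices(k)}")

# bin 0: [0, 0, 0, 0, 0, 0, 0, 0]
# bin 1: [0, 1, 2, 3, 4, 5, 6, 7]
# bin 2: [0, 2, 4, 6, 0, 2, 4, 6]
# bin 3: [0, 3, 6, 1, 4, 7, 2, 5]
```

Bin 0 never moves around the circle (DC), bin 1 completes one revolution over the eight samples, bin 2 completes two and bin 3 completes three.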

When implemented this way the inner loop runs eight times and the outer loop four times, giving 32 multiply accumulate iterations. Each iteration needs both a cosine multiply and a sine multiply, so that is 64 real multiplies in total. How many physical multipliers this maps to in the FPGA depends on whether the design is fully unrolled, pipelined or time multiplexed.
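The loop structure can be modelled in software before writing any RTL. The sketch below is a Python model, not the RTL itself: it assumes a Q10 table scaled by 1024 and the usual negative sign on the sine term, and the names `dft_q10`, `COS_Q10` and `SIN_Q10` are my own.

```python
N = 8          # DFT size
NUM_BINS = 4   # bins 0 to 3, skipping the Nyquist bin

# Assumed Q10 twiddle table (scale factor 1024, rounded to nearest).
COS_Q10 = [1024, 724, 0, -724, -1024, -724, 0, 724]
SIN_Q10 = [0, 724, 1024, 724, 0, -724, -1024, -724]


def dft_q10(samples):
    """Integer model of the nested loop DFT; results are left in Q10."""
    results = []
    multiplies = 0
    for k in range(NUM_BINS):              # outer loop: one pass per bin
        acc_re = 0
        acc_im = 0
        for n in range(N):                 # inner loop: all N samples
            idx = (k * n) & (N - 1)        # walk the table, MOD 8
            acc_re += samples[n] * COS_Q10[idx]
            acc_im -= samples[n] * SIN_Q10[idx]
            multiplies += 2                # one cosine and one sine multiply
        results.append((acc_re, acc_im))
    return results, multiplies


bins, multiplies = dft_q10([4, 3, 0, -3, -4, -3, 0, 3])
for k, (re, im) in enumerate(bins):
    print(f"bin {k}: re={re / 1024:.3f} im={im / 1024:.3f}")
print("real multiplies:", multiplies)
```

With the test vector used in this blog, the model puts almost all of the energy in bin 1 (a Q10 value of 16880, roughly 16.48) and counts 64 real multiplies, matching the figure above.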
The code then outputs the results over the next four clock cycles, one bin and identifier on each clock cycle.
To prove this is working as expected, I first created a simple Python script and plotted the expected output.
from pathlib import Path

import matplotlib.pyplot as plt
import numpy as np


def main() -> None:
    # Eight real-valued input samples, the same vector used by the test bench.
    samples = np.array([4, 3, 0, -3, -4, -3, 0, 3], dtype=float)
    # rfft returns bins 0 to N/2 (five bins for N=8); the RTL computes bins 0 to 3.
    spectrum = np.fft.rfft(samples)

    # Save the plot into a docs directory one level above this script.
    out_dir = Path(__file__).resolve().parents[1] / "docs"
    out_dir.mkdir(parents=True, exist_ok=True)

    fig, ax = plt.subplots(figsize=(7, 3))
    ax.stem(np.arange(len(spectrum)), np.abs(spectrum), basefmt=" ")
    ax.set_title("Reference 8-Point DFT Spectrum")
    ax.set_xlabel("Bin")
    ax.set_ylabel("Magnitude")
    ax.grid(True, alpha=0.3)
    fig.tight_layout()
    fig.savefig(out_dir / "dft_reference.png", dpi=150)


if __name__ == "__main__":
    main()

Following this I created the RTL module and a test bench which take the same input as the Python script.
The results of the test bench can be seen below. The RTL is available on my GitHub.

This correlates with the Python results and shows we have correctly implemented a simple DFT.
When it comes to implementation, this small DFT is compact. However, the DFT does not scale nicely, and a DFT size of eight is not really that useful in practice.

As we scale the DFT size the number of multiplies will increase significantly, which means larger DFTs require significant resources. The better approach is to leverage a Fast Fourier Transform, which we will look at in the next blog.
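To get a feel for the growth, the short Python sketch below extends the counting used earlier in this blog: N/2 bins, N samples per bin and two real multiplies per sample. The function name `direct_dft_multiplies` is hypothetical, and the count assumes the same structure as the eight point example (real input, Nyquist bin skipped).

```python
def direct_dft_multiplies(n_points):
    """Real multiplies for a direct DFT of a real input, bins 0 to N/2 - 1.

    Each of the N/2 bins needs N multiply accumulate iterations, and each
    iteration needs one cosine and one sine multiply.
    """
    num_bins = n_points // 2
    return num_bins * n_points * 2


for size in (8, 64, 256, 1024):
    print(f"N={size:5d}: {direct_dft_multiplies(size):>9d} real multiplies")
```

The count grows with the square of the DFT size, from 64 multiplies at N=8 to just over a million at N=1024.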
All words in this blog were written by a human.




