Adam Taylor
- Mar 13
- 6 min read

MicroZed Chronicles: The CORDIC Algorithm

In last weeks blog, we looked at one of the most important algorithms ever developed: the FFT. In this week’s blog, we’ll be discussing the CORDIC algorithm, which is similar in importance to the FFT.

Short for COordinate Rotation DIgital Computer, the CORDIC algorithm, invented by Jack Volder for the B58 program in 1959, is one of the most important algorithms in an FPGA engineer’s tool box and one that few are aware of. Engineers have almost certainly used results calculated by a scientific calculator like the HP35, and this practice continues with present-day calculators, which use the algorithm for trigonometric and exponential functions. However, it’s critically important to understand the CORDIC algorithm as well. The real beauty of the CORDIC algorithm is that it can be implemented with a very small FPGA footprint and requires only a small lookup table along with logic to perform shifts and additions. Equally important is that the algorithm requires no dedicated multipliers or dividers to implement.

This algorithm is one of the most useful for DSP and industrial control applications and can also implement some very useful mathematical functions depending on its mode and configuration. The CORDIC algorithm can operate in one of three configurations: linear, circular or hyperbolic. Within each of these configurations, the algorithm functions in one of two modes: rotation or vectoring. In rotation mode, the input vector is rotated by a specified angle. In vector mode, the algorithm rotates the input vector to the x axis while recording the angle of rotation required.

The unified CORDIC algorithm can be seen below and covers all three configurations. It has three inputs (X, Y and Z) and how these are initialized at start up depends on the mode of operation (vectoring or rotation)

Where m defines the configuration for either hyperbolic (m = -1), linear (m = 0) or circular (m = 1), the value of ei, which notes the angle of rotation, changes depending upon the configuration. The value of ei is normally implemented as a small lookup table within the FPGA.

di is the direction of rotation which depends upon the mode of operation for rotation mode di = -1 if Zi < 0 else +1, while in vectoring mode di = +1 if Yi < 0 else -1.

When configured in either circular or hyperbolic and using rotation mode, the output results will have gain which can be pre calculated using the number of rotations defined using the following equation.

This gain is typically fed back into the initial setting of the algorithm to remove the need for post scaling of the result.

While the algorithm presented above is very important to the design engineer, it has to be noted that the CORDIC algorithm only operates within a strict convergence zone, which may require the engineer to perform some pre-scaling to ensure the algorithm performs as expected. It’s worth noting that the algorithm will get more accurate with every iteration (serial) or stage (parallel) the engineer decides to implement. A general rule of thumb is that for n bits of precision, n iterations or stages are required.

Convergence

The CORDIC algorithm will only converge (work) across a limited range of input value. For circular configurations of CORDIC algorithms, convergence is guaranteed for the angles below the sum of the angles in the lookup table (i.e. between -99.7 and 99.7 degrees). For angles outside of this, the engineer must use a trigonometric identity to translate one within. This is also true for convergence within the linear configuration. In hyperbolic mode, however, certain iterations must be repeated (4, 13, 40, K… 3K+1) to gain convergence. In this case, the maximum input of Ɵ is approximately 1.118 radians.

Where Are These Used

Like previously stated, CORDICs are used in a wide range of applications from DSP and image processing to industrial control systems. The most basic method of using a CORDIC is to generate sine and cosine waves when coupled with a phase accumulator. The use of the algorithm to generate these waveforms can, if done correctly, result in a high spurious free dynamic range (SFDR). It’s important to note that good SFDR performance is required for most signal processing applications. Within the field of robotics, CORDICs are used in kinematics where the addition of coordinate values with new coordinate values can be easily accomplished by the use of a circular CORDIC in vectoring mode. Within the field of image processing, three dimensional operations such as lighting and vector rotation are perfect candidates for algorithm implementation. However, perhaps the most common use of the algorithm is in the implementation of traditional mathematical functions as shown in table one. Here we see that multipliers, dividers, or more interesting mathematical functions are required in devices where there are no dedicated multipliers or DSP blocks. This means that CORDICs are used in many small industrial controllers to implement mathematical transfer functions. True RMS measurement is one such example.

Implementation within an FPGA

When it comes to implementing a CORDIC within our AMD FPGA, we can implement the CORDIC IP block that comes with Vivado. This module provides a range of configurations to implement vectoring, rotation, and sine and cosine generation along with support for hyperbolic and square root operations.

To conclude this article, we are going to configure the module to generate sine and cosine vectors and simulate it to understand more about its operation. We will extend the test bench to look at other features in a future blog.

When working in the sine and cosine mode, we need to ensure the option for coarse rotation is checked if we want to be able to cover the full circle. If this is not checked, we are limited to the -Pi/4 to Pi/. With it selected, we are able to complete a full circle.

The block diagram looks as below.

With the simple design created, I made a simple test bench which inputs values using the AXIS. The input format of this vector is 3 signed integer bits with the remainder fractional. This allows values between +/- Pi to be input.

While the output vector is formatted as a 1 bits signed integer bit with the remainder fractional, this allows values between -1 and 1 to be output and the sine and cosine outputs are shared on the output vector.

The test bench is very straight forward and uses a counter to count continually between -pi and pi and is applied as the phase input to the CORDIC IP core.

library IEEE;

use IEEE.STD_LOGIC_1164.ALL;

use IEEE.fixed_pkg.all;

use IEEE.math_real.all;

use IEEE.NUMERIC_STD.ALL;

library UNISIM;

use UNISIM.VComponents.all;

entity cordic_tb is

--  Port ( );

end cordic_tb;

architecture Behavioral of cordic_tb is

constant clk_period : time := 10 ns;

constant increment : real := 0.01256637;

signal M_AXIS_DOUT_0_tdata    :  STD_LOGIC_VECTOR ( 31 downto 0 );

signal M_AXIS_DOUT_0_tvalid   :  STD_LOGIC;

signal S_AXIS_PHASE_0_tdata   :  STD_LOGIC_VECTOR ( 15 downto 0 );

signal S_AXIS_PHASE_0_tvalid  :  STD_LOGIC := '0';

signal clk                    :  STD_LOGIC := '0';

signal phase_wheel : real range -MATH_PI  to MATH_PI ;

signal direction :  std_logic := '0';

begin

clk <= not clk after (clk_period/2);

uut: entity work.design_1_wrapper port map (

    M_AXIS_DOUT_0_tdata     =>  M_AXIS_DOUT_0_tdata,

    M_AXIS_DOUT_0_tvalid    =>  M_AXIS_DOUT_0_tvalid,

    S_AXIS_PHASE_0_tdata    =>  S_AXIS_PHASE_0_tdata,

    S_AXIS_PHASE_0_tvalid   =>  S_AXIS_PHASE_0_tvalid,

    aclk_0                  =>  clk

);

process(clk)

begin

    if rising_edge(clk) then

        S_AXIS_PHASE_0_tvalid <= '1';

        S_AXIS_PHASE_0_tdata  <= std_logic_vector(to_sfixed(phase_wheel, 2,-13 )) ;

        if phase_wheel >= MATH_PI and direction = '0' then

            direction <= '1';

        elsif phase_wheel <= -MATH_PI and direction = '1' then

            direction <= '0';

        end if;

        if direction = '1' then

            phase_wheel <= phase_wheel - increment;

        else

            phase_wheel <= phase_wheel + increment;

        end if;

    end if;

end process;

end Behavioral;

Running this in the test bench provides the following output in Vivado simulator.

In a future blog, we will look deeper at exploiting this for various applications, but until then, we now know more about the CORDIC algorithm!

Workshops and Webinars

If you enjoyed the blog why not take a look at the free webinars, workshops and training courses we have created over the years. Highlights include

Professional PYNQ Learn how to use PYNQ in your developments
Introduction to Vivado learn how to use AMD Vivado
Ultra96, MiniZed & ZU1 three day course looking at HW, SW and Petalinux
Arty Z7-20 Class looking at HW, SW and Petalinux
Mastering MicroBlaze learn how to create MicroBlaze solutions
HLS Hero Workshop learn how to create High Level Synthesis based solutions
Perfecting Petalinux learn how to create and work with PetaLinux OS

Embedded System Book

Do you want to know more about designing embedded systems from scratch? Check out our book on creating embedded systems. This book will walk you through all the stages of requirements, architecture, component selection, schematics, layout, and FPGA / software design. We designed and manufactured the board at the heart of the book! The schematics and layout are available in Altium here Learn more about the board (see previous blogs on Bring up, DDR validation, USB, Sensors) and view the schematics here.

Order here

MicroZed Chronicles: The CORDIC Algorithm

Recent Posts