MicroZed Chronicles: Synchronization & Metastability
Over the last week I noticed a few questions on r/FPGA, and from a client, about metastability and synchronizing signals coming into the system. I thought it would make a good topic for a blog, and it is always one of the most interesting elements when we teach our Mission Critical Design Course.
Let’s start at the beginning. All flip flops have a setup and hold window around the active clock edge during which the data must not change. If the data does change in this window, the output of the flip flop can go to an indeterminate state which is neither a logic 0 nor a logic 1. After a recovery time the flip flop output will settle to either logic 0 or logic 1.
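To make the window idea concrete, here is a minimal sketch that flags data transitions landing inside the setup and hold window of a clock edge. The function name and all the timing numbers are made up for illustration (times are in picoseconds); real values come from the device datasheet.

```python
def violates_setup_hold(clock_edges, data_changes, t_setup, t_hold):
    """Return (edge, change) pairs where data changed inside the window.

    The window for each active edge runs from (edge - t_setup) to
    (edge + t_hold). All times share one unit, picoseconds here.
    """
    violations = []
    for edge in clock_edges:
        window_start = edge - t_setup
        window_end = edge + t_hold
        for change in data_changes:
            if window_start < change < window_end:
                violations.append((edge, change))
    return violations


# Hypothetical numbers: 100 MHz clock, 500 ps setup, 200 ps hold.
edges = [10_000, 20_000, 30_000]
changes = [4_000, 19_700, 30_100]

violations = violates_setup_hold(edges, changes, t_setup=500, t_hold=200)
print(violations)
```

The first transition is well clear of any edge. The second lands in the setup window of the 20 ns edge and the third in the hold window of the 30 ns edge, so both are candidates for a metastable event.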
The setup and hold times, along with the recovery time, are specific to each device family; this information is normally defined in datasheets or application notes. In general, as we design our FPGA to meet the timing constraints we define, we do not have to worry about them, because Vivado tries to achieve the performance defined in our constraints. They do of course become a concern when we have timing issues.
However, when we have asynchronous inputs into the FPGA, or multiple clock domains which are asynchronous to each other, we need to carefully consider the design, because violating the setup and hold time results in metastability. We cannot prevent a metastable event from occurring in either case, but we can ensure our design does not end up with incorrect data because of it.
It is obvious that an external input signal will be asynchronous to the internal clock; it is less clear what constitutes being in the same clock domain. There are a few simple rules we can use. Two clocks are in the same clock domain if each is an integer division of a common clock. They are not in the same clock domain if the division is non-integer, or if they come from different sources (even if they have the same clock frequency).
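The rule of thumb above can be written down as a small check. This is only a sketch of the rule as stated here, not how Vivado classifies clocks; the Clock type, its fields and the oscillator names are hypothetical.

```python
from dataclasses import dataclass
from fractions import Fraction


@dataclass(frozen=True)
class Clock:
    source: str       # hypothetical name of the root oscillator
    divide: Fraction  # division ratio applied to that source


def same_clock_domain(a, b):
    """Apply the rule of thumb: common source and integer divisions only."""
    if a.source != b.source:
        # Different sources are asynchronous, even at the same frequency.
        return False
    return a.divide.denominator == 1 and b.divide.denominator == 1


sys_clk = Clock("osc_a", Fraction(1))
half_clk = Clock("osc_a", Fraction(2))     # integer division of osc_a
frac_clk = Clock("osc_a", Fraction(5, 2))  # non-integer division of osc_a
other_clk = Clock("osc_b", Fraction(1))    # same frequency, different source

print(same_clock_domain(sys_clk, half_clk))   # True
print(same_clock_domain(sys_clk, frac_clk))   # False
print(same_clock_domain(sys_clk, other_clk))  # False
```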
When it comes to synchronizing signals into the FPGA or into a different clock domain there are several design choices available. Within the Xilinx world the best way to do this is to use the Xilinx Parameterized Macros (XPM); these macros are created with the intent of supporting CDC and synchronization. The XPM provide a range of macros including:
Single Static Synchronizer – The classic flip flop synchronizer for a single bit
Pulse Data Synchronizer – Transfer a pulse from one domain to the next
Data Bus – Mux, FIFO, Handshake and Gray code-based transfer
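The Gray code-based transfer in the list above relies on a simple property: only one bit changes between consecutive values, so a counter sampled mid-transition reads as either the old or the new count. The sketch below is a plain software illustration of that property, not the XPM implementation, and the function names are my own.

```python
def binary_to_gray(value):
    """Convert a non-negative integer to its reflected Gray code."""
    return value ^ (value >> 1)


def gray_to_binary(gray):
    """Convert a reflected Gray code back to the plain binary value."""
    value = 0
    while gray:
        value ^= gray
        gray >>= 1
    return value


codes = [binary_to_gray(n) for n in range(8)]
print([format(c, "03b") for c in codes])

# Count how many bits differ between each pair of consecutive codes.
bit_changes = [bin(codes[i] ^ codes[i + 1]).count("1") for i in range(7)]
print(bit_changes)
```

Every entry in bit_changes is 1, which is what makes Gray coding a good fit for passing counters (for example FIFO pointers) between domains.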
Rather nicely in Vivado we can now implement XPM structures within IP Integrator to make the CDC crossing visible.
You need to think carefully about which structure you use. Do not, for example, attempt to synchronize multiple data bits using single bit synchronizers, as the transferred bits cannot be guaranteed to be aligned, resulting in corruption. You also need to be aware of recombination, where two or more static signals cross the clock domain and are recombined in a logic function. Differing delays through the synchronizers, due to metastability recovery, can leave the recombined signals misaligned and impact the downstream logic.
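To see why per-bit synchronizers corrupt a bus, consider a simplified model in which every bit that changes is captured independently as either its old or its new value on the receiving clock edge. The function below is a hypothetical illustration that enumerates the values the receiving domain could see; it is a software sketch, not a timing-accurate simulation.

```python
from itertools import product


def possible_captures(old, new, width):
    """List every value the receiver might see during an old -> new change.

    Assumes each changing bit independently lands as old or new.
    """
    changing = [bit for bit in range(width) if ((old ^ new) >> bit) & 1]
    results = set()
    for choice in product((False, True), repeat=len(changing)):
        value = old
        for bit, take_new in zip(changing, choice):
            if take_new:
                value ^= 1 << bit  # this bit has already switched
        results.add(value)
    return sorted(results)


# Binary count 7 -> 8: all four bits change at once.
binary_case = possible_captures(0b0111, 0b1000, width=4)
print(binary_case)

# Same count step in Gray code (0100 -> 1100): a single bit changes.
gray_case = possible_captures(0b0100, 0b1100, width=4)
print(gray_case)
```

In the binary case the receiver could see any of the sixteen 4-bit values, while in the Gray coded case it can only see the old or the new value.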
Despite our best efforts to mitigate CDC issues in our design, we are human and might miss some. As such, we can use the built-in Vivado options to report the clock domain crossings which occur in our design.
We can run a CDC report once synthesis is completed; there is no need to wait for place and route to complete.
In the Tcl console run the command report_cdc. There is a range of options which can be used to write out a file, analyse a specific path or create a waiver.
This will show any clock domain crossings in the design and whether any of them are unsafe or unknown. In the example above you can see there are 6 unsafe and several unknown. If we run the command with the option -details we will see all the paths reported. Clicking on the unsafe or unknown entries will open the paths of concern for inspection.
With a path selected, we can open the schematic viewer which focuses in on the path in question. The issue in this case (contrived for this blog) is that the reset is generated by a different clock.
Knowing this, we can correct the issue by updating the design, inserting the necessary synchronizing structures, or correcting the constraints.
Vivado is good at providing tools which allow us to investigate clock domain crossings and clock interactions. We should be running the CDC report as we implement our designs in Vivado, even if we are "sure" we have no clock domain crossing issues.