Speedgoat IO3xx Ring Buffer Example
This example showcases a ring buffer on the FPGA as well as the DMA data
path to the CPU domain.
The model is called IO3xx_ring_buffer_hdlc.slx.
The list of basic software requirements is provided in the prerequisites section of the
Getting Started page.
In addition, you must install Fixed-Point Designer.
The example uses the following interfaces:
The example uses the following user blocks:
Direct Memory Access (DMA) is used to reduce the latency of data transfers
between the FPGA and the target CPU domain, especially when larger amounts of data
need to be transferred. The use cases differ by transfer direction: FPGA to CPU
(data logging), CPU to FPGA (playback), or bidirectional (coprocessor mode). A
simplified setup including the basic blocks required to discuss the DMA use cases
in a system comprising an FPGA-based I/O module and a Real-Time Target x86 CPU is
illustrated below.
The FPGA I/O Module consists of the I/O channels
(digital, analog, multi-gigabit transceiver, etc.), the FPGA itself, an external RAM
(typically DDR3/DDR4 SDRAM), the design under test (DUT), the DMA engine and the
PCIe Endpoint (used to communicate with the Target x86 CPU).
The Motherboard (x86) consists of the Target CPU
(x86), the System Memory and the Solid-State Drive (SSD) for persistent data
storage.
To test the HDL interface functionality, dedicated
examples are included in the downloaded archive file. To open the examples, navigate
to the corresponding folder. Note that the examples only test I/O channels for which
the loopback test method is possible. The terminal board provided must be wired as
described. The examples do not test I/O channels that require external hardware
(some examples would need a function generator or an oscilloscope for this), but
running this example still confirms that the implementation is set up correctly.
The examples only test interface channels that are provided by the
base functionality of the I/O module. Please note that the examples provided have
been color coded. The green colored subsystem (FPGA domain) is the part of the model
which is actually compiled using HDL Coder and ultimately runs on the FPGA. The FPGA
domain usually has a sample frequency in the range of 100 MHz and is set in the
HDL Workflow Advisor (FPGA Synthesis Software
Settings). The blue blocks (CPU domain) which surround the green
subsystem are interfaces to the processor section of the model. The CPU domain
usually has a sample frequency in the range of 1 kHz. The interrupt subsystem
has been given another color (magenta), as its functionality is asynchronous to both
the processor and FPGA. The interrupt source can be selected in the generated model
in the Interrupt Setup block once the model has been run through the HDL Coder
Workflow Advisor.
In this use case the data is buffered into the on-board RAM of the programmable
FPGA module and is only transferred using DMA when triggered by a defined event
(such as an error) on the FPGA. The event or trigger for transferring into the x86
CPU domain is flexible and can be modeled by the user within the design under test
(DUT) algorithm. The example implemented here is an error-event debug case. The user
signal is monitored and, when a signal anomaly (for example, a spike) is detected,
the DUT stops logging and initiates a DMA transfer of the data into the x86 domain.
The setup for
the ring buffer use case can be illustrated as follows:
The data is logged from the DUT in the external RAM (1) in a ring buffer fashion
(2) until an event (error) occurs, whereupon the RAM readout DMA is triggered and
the data is transferred to the x86 System Memory (3).
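The three steps above can be sketched as a small software model. This is only an illustrative Python sketch, not the FPGA implementation: the function name log_until_event, the list standing in for the external RAM, and the plain copy standing in for the DMA transfer are all assumptions made for this illustration.

```python
def log_until_event(samples, events, depth):
    """Illustrative model of event-triggered ring-buffer logging.

    samples: iterable of values produced by the DUT (hypothetical data)
    events:  iterable of booleans, True when the error event occurs
    depth:   number of entries in the ring buffer (stands in for the RAM depth)
    """
    ram = [0] * depth          # (1) stands in for the external RAM
    wr = 0                     # next write address
    for sample, event in zip(samples, events):
        ram[wr] = sample
        wr = (wr + 1) % depth  # (2) wrap around: oldest entry is overwritten
        if event:
            break              # the event stops the logging
    system_memory = list(ram)  # (3) stands in for the DMA transfer to the CPU
    return system_memory, wr
```

For example, with ten samples 0..9, an event on sample 6 and a depth of 4, the model returns the memory contents [4, 5, 6, 3] and the write address 3: the newest data has wrapped around and overwritten the oldest entries, which is why the data must be realigned after readout.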
Open the example model by navigating to the
folder containing the "*.slx" model file and double clicking the file. If the
example is provided as a Simulink Project, navigate to the corresponding example
folder and extract the Simulink project zip file. Then double-click the "*.prj" icon
to open the project. After opening the project, open the model by double clicking
the "*.slx" file. The model is shown as follows:
The design under test (DUT) subsystem is shown as follows:
Data emulation models three sine waves with
configurable amplitude and frequency (number of samples of the sine wave look-up
table). The output of the sine generation subsystem is three sine wave signals that
will be logged. The third sine wave passes through the fault_injection subsystem,
which emulates a failure case on this user signal (it adds a significant value to
the user sine wave at a configurable time). The configuration parameters of the sine
waves are declared in the data dictionary file provided.
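The data emulation can be approximated in software as follows. This is a minimal Python sketch using floating-point values rather than the fixed-point types of the model. The names sine_lut, generate and inject_fault are hypothetical, and the fault is assumed to be a single-sample offset, which the description above does not specify.

```python
import math


def sine_lut(amplitude, num_samples):
    """Build one period of a sine wave as a look-up table."""
    return [amplitude * math.sin(2 * math.pi * k / num_samples)
            for k in range(num_samples)]


def generate(lut, n):
    """Read the look-up table cyclically to produce n output samples."""
    return [lut[k % len(lut)] for k in range(n)]


def inject_fault(signal, fault_index, fault_value):
    """Add fault_value to one sample (assumed single-sample spike)."""
    out = list(signal)
    if 0 <= fault_index < len(out):
        out[fault_index] += fault_value
    return out
```

The number of table entries sets the period of the generated wave in samples, so a shorter table gives a higher frequency at a fixed sample rate.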
Failure detection checks the sin3_fault user
signal for anomalies and, if there is a failure (for example, a spike), asserts the
fault_flag output signal. This flag indicates that the
failure case occurred and that the logging to the ring buffer can be stopped. The
Post Logging subsystem delays the fault_flag signal and introduces a
configurable amount of post-event logging.
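The interplay of failure detection and post-event logging can be sketched as below. The detection criterion used here (a latched threshold on the sample-to-sample difference) is an assumption for illustration only; the criterion used in the model is not described in this section. The function names are hypothetical.

```python
def detect_fault(signal, threshold):
    """Return a per-sample fault_flag list.

    The flag latches to True once the absolute difference between two
    consecutive samples exceeds threshold (assumed criterion).
    """
    flags = []
    latched = False
    prev = signal[0] if signal else 0.0
    for x in signal:
        if abs(x - prev) > threshold:
            latched = True
        flags.append(latched)
        prev = x
    return flags


def post_logging(fault_flag, post_samples):
    """Delay fault_flag by post_samples steps.

    Logging stops on the delayed flag, so post_samples additional
    samples are recorded after the event.
    """
    n = len(fault_flag)
    delay = min(post_samples, n)
    return [False] * delay + fault_flag[:n - delay]
```

With the signal [0, 1, 2, 3, 50, 4, 5, 6] and a threshold of 10, the flag rises at index 4; with a post-logging delay of 2 the stop signal rises at index 6, so two further samples are logged after the spike.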
The DDR RAM write controller handles the AXI4
Master signal interface to the External RAM. The DDR write controller block,
implemented as a MATLAB function state machine, manages the
addressing and signaling to the External RAM. The block has two mask parameters:
Burst Size, which specifies the size in 64-bit words of an AXI4 Master transaction to
the External RAM, and FIFO_size, which defines the size of the buffer used to
pre-store the data in the FIFO; the next burst is initiated once enough data is
pre-stored. The FIFO_sys subsystem deals with the backpressure introduced on
the AXI4 Master interface, which requires a buffer to pre-store the data before a
burst is issued.
The DDR RAM Monitoring subsystem is used to
monitor the various status flags from the DDR Write Controller, which can help
identify any issues writing to the external RAM.
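The role of the Burst Size and FIFO_size parameters can be illustrated with a simplified software model. This Python sketch is an assumption-laden illustration, not the MATLAB function state machine of the model: the class BurstWriter, the ready argument standing in for AXI4 backpressure, and the overflow flag standing in for a monitored status flag are all hypothetical, and a whole burst is written in a single step for brevity.

```python
from collections import deque


class BurstWriter:
    """Illustrative FIFO plus burst write controller."""

    def __init__(self, burst_size, fifo_size, ram_depth):
        self.burst_size = burst_size  # words per burst transaction
        self.fifo_size = fifo_size    # capacity of the pre-store FIFO
        self.fifo = deque()
        self.ram = [0] * ram_depth    # stands in for the external RAM
        self.addr = 0                 # next RAM write address
        self.overflow = False         # hypothetical status flag

    def push(self, word):
        """Pre-store one word; flag an overflow if the FIFO is full."""
        if len(self.fifo) >= self.fifo_size:
            self.overflow = True
            return
        self.fifo.append(word)

    def step(self, ready):
        """Issue one burst if the RAM is ready and enough data is buffered."""
        if not ready or len(self.fifo) < self.burst_size:
            return False
        for _ in range(self.burst_size):
            self.ram[self.addr] = self.fifo.popleft()
            self.addr = (self.addr + 1) % len(self.ram)  # ring buffer wrap
        return True
```

The FIFO must be large enough to absorb the words that arrive while the interface is not ready; otherwise the overflow flag in this sketch is set and data is lost, which is the kind of condition a monitoring subsystem would expose.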
If the model is started by clicking the green run button in the model's toolbar,
the following data may be observed in the Simulation Data Inspector (SDI) as
illustrated below. The first subplot shows the three sine waves, including
sine3_fault with the injected spike. The second subplot shows the timing counter value.
The third subplot shows the address signal of the AXI4 Master interface and
illustrates the ring buffer’s behavior. The fourth subplot shows the
fault detection signals (including fault_detected) that finally
initiate the DMA data transfer.
Target CPU Driver
On the target CPU, the DMA data is extracted with the DMA read
external RAM block, which is located in the read RAM enabled subsystem. The enabled
subsystem ensures that the DMA is only enabled once the logging on the
FPGA has actually finished (pci_RAM_write_done signal). The offset input port
of the DMA block is the address of the RAM being read at each sample
step. The RAM is read out starting from address zero up to the specified logging
depth (RAM_depth parameter). Owing to the ring buffer setup, the
realignment of the data is then done in the post-analysis script. The
DMA transfer size is pre-set to the variable frame_size, which is
defined in the Simulink data dictionary.
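The frame-wise readout and the subsequent realignment can be sketched as follows. This is a Python illustration only and does not reproduce the post-analysis script provided with the example. It assumes that every logged record carries a timing counter value that increases monotonically and does not itself wrap, so the oldest record is the one with the smallest counter. The function names and the record layout (counter, value) are hypothetical.

```python
def read_ram_by_frames(ram, ram_depth, frame_size):
    """Emulate the readout: one frame per step at offsets 0, frame_size, ..."""
    data = []
    for offset in range(0, ram_depth, frame_size):
        data.extend(ram[offset:min(offset + frame_size, ram_depth)])
    return data


def realign(records):
    """Rotate the records so that the oldest one comes first.

    records: list of (counter, value) tuples in RAM address order.
    """
    if not records:
        return []
    oldest = min(range(len(records)), key=lambda i: records[i][0])
    return records[oldest:] + records[:oldest]
```

For a RAM holding [(4, 'e'), (5, 'f'), (6, 'g'), (3, 'd')], reading with a depth of 4 and a frame size of 2 returns the records in address order, and realign then yields [(3, 'd'), (4, 'e'), (5, 'f'), (6, 'g')].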
This example does not require any test wiring.
Running HDL Workflow Advisor
Before the example can be deployed and run on the real-time target machine, you
need to run through the HDL Coder Workflow Advisor steps to generate
HDL code and an FPGA bitstream using HDL Coder (see FPGA Synthesis Software
Settings).
New: Reference design parameters, set at step 1.2,
now control which interfaces will be available to target in step 1.3 of the
workflow. This has reduced the total number of reference designs and the list of
interfaces available. Please remember to select the front plug-in and rear plug-in
setting that is appropriate for your module, as well as the Aurora settings that
should be used for your model (if applicable). These additional reference design
parameter settings are further described in the interface sections to which they
apply.
New: Prior to running the Workflow Advisor, be
sure to double-click the Select Module block in the demo model. If more than one of
your modules supports the model (due to available interface compatibility), a pop-up
is displayed prompting you to select the module you would like to target. If only a
single module is installed, and provided it is compatible, it is automatically
selected when the block is double-clicked.
Upon completion, a newly generated model containing the Simulink Real-Time
interface subsystem appears. At first sight, this subsystem resembles the FPGA
subsystem. However, inside, the Simulink algorithm has been removed and replaced
with blocks that the real-time application will use to communicate with the FPGA
during simulation execution. The newly generated model is now ready to be deployed
to a real-time target machine. To download the FPGA bitstream and the Simulink model
to the target, click the Build Model button on the
Simulink Editor toolbar. The real-time application loads on the Speedgoat target
machine and the FPGA algorithm bitstream loads on the FPGA. If you are using I/O
lines, check that you have connected the lines to the external hardware under test.
Please note that some example models have Global Delay
Balancing intentionally disabled. If an error about
delay balancing is displayed in step 2.3 of the HDL Coder Workflow Advisor, it can be safely
ignored by checking the Ignore warnings checkbox.
Running the Generated Model
Once the model has been successfully built using the HDL Workflow Advisor, it can be
tested by running the IO3xx_ring_buffer_hdlc_run.m script provided. The script
configures, builds and runs the model and retrieves the logged data after the run.
Once the model has been downloaded and the target application has been started, the
script will read out the logged data directly with SDI and illustrate them in a plot
as shown below.