Speedgoat IO3xx Playback Example
This example showcases DMA playback functionality: data originates in the CPU
and is sampled out at a fixed sample rate on the
FPGA.
Model Name
The model is called IO3xx_playback_hdlc.slx.
Required Toolboxes
The basic software requirements are listed in the prerequisites section of the
Getting Started page.
In addition, you must install Fixed-Point Designer.
Interfaces
The example uses the following interfaces:
User Blocks
The example uses the following user blocks:
DMA Overview
Direct Memory Access (DMA) is used to reduce the latency of data transferred
between the FPGA and the target CPU domain, especially when larger amounts of data
need to be transferred. The use cases differ by transfer direction: FPGA to CPU
(data logging), CPU to FPGA (playback), or both directions (coprocessor mode).
A simplified setup with the basic blocks needed to discuss the DMA use cases,
in a system comprising an FPGA-based I/O module and a Real-Time Target x86 CPU,
is illustrated below.

The FPGA I/O Module consists of the I/O channels
(digital, analog, multi-gigabit transceiver, etc.), the FPGA itself, an external RAM
(typically DDR3/DDR4 SDRAM), the design under test (DUT), the DMA engine and the
PCIe Endpoint (used to communicate with the Target x86 CPU).
The Motherboard (x86) consists of the Target CPU
(x86), the System Memory and the Solid-State Drive (SSD) for persistent data
storage.
Example
To test the HDL interface functionality, dedicated
examples are included in the downloaded archive file. To open the examples, navigate
to the corresponding folder.
The examples only test I/O channels for which a loopback test is possible, and only
interface channels that are provided by the base functionality of the I/O module.
The terminal board provided must be wired as described. I/O channels that require
external hardware (for some examples, a function generator or an oscilloscope) are
not tested, but running this example still provides sufficient confirmation that
the implementation is set up correctly.
The examples provided are color coded. The green subsystem (FPGA domain) is the part
of the model that is compiled using HDL Coder and ultimately runs on the FPGA. The
FPGA domain usually has a sample frequency in the range of 100 MHz, which is set in
the HDL Workflow Advisor (FPGA Synthesis Software Settings). The blue blocks (CPU
domain) that surround the green subsystem are interfaces to the processor section
of the model. The CPU domain usually has a sample frequency in the range of 1 kHz.
The interrupt subsystem is colored magenta, as its functionality is asynchronous to
both the processor and the FPGA. The interrupt source can be selected in the
Interrupt Setup block of the generated model once the model has been run through
the HDL Coder Workflow Advisor.
In this use case, the data is streamed directly from the CPU through the PCIe Endpoint
and the DMA engine to the design under test (DUT) on the FPGA. The setup for the Direct
Stream use case can be illustrated as follows:

DMA Playback
In DMA playback, the FPGA initiates the DMA transfer by triggering an
interrupt to the x86 CPU. Since the DMA write driver is placed in the Interrupt
subsystem, the DMA transfer starts as soon as the interrupt is received on the CPU
side.
The example represents a small playback application in which data originates from
the CPU and is finally output on the analog output of the FPGA. The analog output
interface is sampled at a fixed but configurable rate and therefore must
always have new data available to pass on to the AO interface. The data itself is
passed on from the CPU through DMA, and a small FIFO in the design under test (DUT)
buffers enough data to bridge the gap while new data is requested from the
CPU and passed to the FPGA through DMA.
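The buffering requirement described above can be estimated with a short model. The following is a minimal sketch in Python, not part of the Speedgoat example. It assumes the AO side consumes exactly one sample per tick and that each refill arrives a fixed number of ticks after it is requested. The function names, the 1 ms refill latency and the frame size of 48 samples are hypothetical values chosen for illustration.

```python
def samples_needed(sample_rate_hz: int, refill_latency_us: int) -> int:
    """Samples the AO interface consumes while one refill is in flight (rounded up)."""
    # Integer ceiling division avoids floating-point rounding surprises.
    return -(-sample_rate_hz * refill_latency_us // 1_000_000)


def count_underruns(start_level: int, frame_size: int,
                    latency_ticks: int, n_ticks: int) -> int:
    """Tick-based model of the playback FIFO (assumed behavior, for illustration).

    Each tick the AO side pops one sample. A refill of frame_size samples is
    always in flight and lands every latency_ticks ticks.
    """
    level = start_level
    countdown = latency_ticks
    underruns = 0
    for _ in range(n_ticks):
        if level > 0:
            level -= 1          # one sample leaves towards the AO interface
        else:
            underruns += 1      # FIFO empty: the AO side has nothing to output
        countdown -= 1
        if countdown == 0:
            level += frame_size  # requested frame arrives from the CPU
            countdown = latency_ticks
    return underruns


if __name__ == "__main__":
    # 48 kHz output rate (from the text) and a hypothetical 1 ms refill latency.
    print(samples_needed(48_000, 1_000))
    # Pre-filling with a full latency's worth of samples avoids underruns;
    # pre-filling with only half of that does not.
    print(count_underruns(48, 48, 48, 1_000))
    print(count_underruns(24, 48, 48, 1_000))
```

In this model the FIFO must hold at least as many samples as are consumed during one refill latency before output starts. The model has no upper limit on the fill level, so it does not cover the backpressure mechanism.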
Open the example model by navigating to the
folder containing the "*.slx" model file and double-clicking the file. If the
example is provided as a Simulink Project, navigate to the corresponding example
folder and extract the Simulink project zip file. Then double-click the "*.prj" icon
to open the project. After opening the project, open the model by double-clicking
the "*.slx" file. The model is shown as follows:

The design under test (DUT) subsystem is shown as follows:

The DUT_IO3xx_DMA_Playback subsystem, which is the part
intended for HDL code generation, consists of an AO_Triggering
and a Fifo Buffer subsystem.
AO_Triggering generates trigger pulses at the
sample frequency configured through the
PCIe_CounterValue_DA_Trigger port from the CPU. The
subsystem only starts generating trigger pulses once its Enable port
is asserted, that is, when the Fifo Buffer has received enough data to get started.
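How a counter value relates to the output sample frequency can be sketched as follows. This is a hedged illustration in Python, assuming that the trigger is derived by dividing the FPGA clock with a free-running wrap-around counter. The exact semantics of PCIe_CounterValue_DA_Trigger (for example, whether it counts to N or N-1) are not stated here, and the function names are hypothetical. The 100 MHz and 48 kHz figures are the typical values quoted elsewhere in this example.

```python
def counter_value(fpga_clock_hz: int, sample_rate_hz: int) -> int:
    """Nearest integer number of FPGA clock cycles per output sample."""
    return round(fpga_clock_hz / sample_rate_hz)


def achieved_rate_hz(fpga_clock_hz: int, counter: int) -> float:
    """Sample rate actually produced by an integer clock divider."""
    return fpga_clock_hz / counter


def trigger_pulses(counter: int, n_cycles: int) -> list[int]:
    """Clock-cycle indices at which a wrap-around counter would emit a pulse."""
    return [c for c in range(n_cycles) if c % counter == counter - 1]


if __name__ == "__main__":
    n = counter_value(100_000_000, 48_000)
    print(n)                                 # divider for roughly 48 kHz
    print(achieved_rate_hz(100_000_000, n))  # slightly above 48 kHz
    print(trigger_pulses(4, 12))             # small divider to show the pattern
```

Because the divider is an integer, the achieved rate generally differs slightly from the requested one. With the values above, the divider is 2083 and the resulting rate is about 48.008 kHz.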
Fifo Buffer implements a small FIFO that handles
the incoming data from the DMA interface and transfers it to the AO interface. The
FIFO is needed to bridge the gap until new data is received from the CPU. Every time
the DMA frame completes, new data is requested by initiating an interrupt. The
tReady port of the AXI4-Stream interface is used to apply
backpressure on the incoming DMA data stream to avoid overloading the FIFO.
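The backpressure behavior can be modeled in a few lines. The following Python sketch is an assumption-laden behavioral model, not the HDL implementation. The class name, the depth, and the use of TLAST to mark the end of a DMA frame are illustrative assumptions; only the role of tReady is taken from the description above.

```python
from collections import deque


class PlaybackFifo:
    """Behavioral model of a FIFO fed by an AXI4-Stream-like interface."""

    def __init__(self, depth: int):
        self.depth = depth
        self.data = deque()

    @property
    def t_ready(self) -> bool:
        # Deasserted while the FIFO is full, so the DMA stream must wait.
        return len(self.data) < self.depth

    def stream_in(self, t_valid: bool, t_data: int, t_last: bool):
        """Process one beat. Returns (accepted, request_new_frame)."""
        if t_valid and self.t_ready:
            self.data.append(t_data)
            # Assumption: the last beat of a frame raises the interrupt
            # that asks the CPU for the next frame.
            return True, t_last
        return False, False

    def pop(self):
        """AO side takes one sample per trigger pulse (None if empty)."""
        return self.data.popleft() if self.data else None


if __name__ == "__main__":
    fifo = PlaybackFifo(depth=2)
    print(fifo.stream_in(True, 10, False))  # accepted
    print(fifo.stream_in(True, 11, False))  # accepted, FIFO now full
    print(fifo.stream_in(True, 12, True))   # rejected: tReady is low
    print(fifo.pop())                       # AO side frees one slot
    print(fifo.stream_in(True, 12, True))   # accepted, frame complete
```

A beat is transferred only when both valid and ready are high, so a full FIFO stalls the DMA stream instead of dropping samples.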
Simulation
The simulation of the model is started by clicking the green run button in the
model toolbar. The simulation illustrates in SDI (Simulation Data Inspector) how the
FIFO is first filled and then how the first few samples are output on the AO
interface (AO_Trigger, AO_Data).
Target CPU Driver
On the target CPU, the DMA write is performed by the DMA write
block, which is located in the INTA Write
subsystem. The block itself is placed in a Function-Call Subsystem, as the DMA
transfer is initiated on the CPU side via interrupts. The size of the DMA transfer
is configured in the frame_size variable, defined in the
IO3xx_playback.sldd data dictionary.
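The frame size determines how often the FPGA has to interrupt the CPU for new data. The following Python sketch shows the arithmetic under the assumption that one DMA frame carries frame_size output samples and that one interrupt is raised per frame. The value 480 is a hypothetical example and not the value stored in the data dictionary.

```python
def frame_period_s(frame_size: int, sample_rate_hz: int) -> float:
    """Playback time covered by one DMA frame, in seconds."""
    return frame_size / sample_rate_hz


def interrupts_per_second(frame_size: int, sample_rate_hz: int) -> float:
    """Interrupt rate if each completed frame requests exactly one new frame."""
    return sample_rate_hz / frame_size


if __name__ == "__main__":
    frame_size = 480       # hypothetical value, for illustration only
    sample_rate = 48_000   # analog output rate quoted in this example
    print(frame_period_s(frame_size, sample_rate))
    print(interrupts_per_second(frame_size, sample_rate))
```

Under these assumptions, a larger frame lowers the interrupt load on the CPU but increases the amount of data each transfer has to move.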
Running HDL Workflow Advisor
Before the example can be deployed and run on the real-time target machine, you
need to run through the HDL Coder Workflow Advisor steps to generate
HDL code and an FPGA bitstream using HDL Coder (FPGA Synthesis Software
Settings).
New: Reference design parameters, set at step 1.2
now control which interfaces will be available to target in step 1.3 of the
workflow. This has reduced the total number of reference designs, and the list of
interfaces available. Please remember to select the front plug-in and rear plug-in
setting that is appropriate for your module, as well as the Aurora settings that
should be used for your model (if applicable). These additional reference design
parameter settings are further described in the interface sections for which they
are relevant.
New: Prior to running the workflow advisor, be
sure to double-click the Select Module block in the demo model. If one or more of
your modules support the model (due to available interface compatibility), a pop-up
is displayed prompting you to select the module you would like to target. If only a
single module is installed, and provided it is compatible, it is automatically
selected when the block is double-clicked.
Upon completion, a newly generated model containing the Simulink Real-Time
interface subsystem appears. At first sight, this subsystem resembles the FPGA
subsystem. However, inside, the Simulink algorithm has been removed and replaced
with blocks that the real-time application will use to communicate with the FPGA
during simulation execution. The newly generated model is now ready to be deployed
to a real-time target machine. To download the FPGA bitstream and the Simulink model
to the target, click the Build Model button on the
Simulink Editor toolbar. The real-time application loads on the Speedgoat target
machine and the FPGA algorithm bitstream loads on the FPGA. If you are using I/O
lines, check that you have connected the lines to the external hardware under test.
Please note that some example models have Global Delay
Balancing intentionally disabled. If a message about
delay balancing is displayed in step 2.3 of the HDL Coder Workflow Advisor, it can be safely
ignored by checking the Ignore warnings checkbox.
Running the Example
To run the generated model, run the IO3xx_playback_run.m example script.
The script configures, builds and runs the model. Once the model has been
downloaded, the target application has been started and the run is complete,
the script reads the logged data directly from
the Simulation Data Inspector and shows it in a plot.
To see the actual signal output on the analog interface, connect an external
oscilloscope. The signal illustrated in the image below, which is the data logged on
the CPU, is undersampled, as the data is only retrieved with a CPU sample time of
1 ms. The data on the analog output port, however, is sampled at a rate of
48 kHz.
Logged data of the top-level model
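The effect of logging a 48 kHz output stream at a 1 ms CPU sample time can be reproduced with a short Python sketch. It assumes that the CPU log simply captures every 48th output sample. The test tones of 100 Hz and 1 kHz and the function names are hypothetical and are not taken from the example model.

```python
import math

OUTPUT_RATE_HZ = 48_000  # analog output rate quoted in the text
LOG_RATE_HZ = 1_000      # CPU sample time of 1 ms


def playback_samples(tone_hz: float, n: int) -> list[float]:
    """n samples of a sine tone as it would appear on the analog output."""
    return [math.sin(2 * math.pi * tone_hz * i / OUTPUT_RATE_HZ) for i in range(n)]


def cpu_log(samples: list[float]) -> list[float]:
    """Keep only the samples that coincide with a CPU step (every 48th)."""
    step = OUTPUT_RATE_HZ // LOG_RATE_HZ
    return samples[::step]


if __name__ == "__main__":
    slow = cpu_log(playback_samples(100, 480))    # 10 ms of a 100 Hz tone
    fast = cpu_log(playback_samples(1_000, 480))  # 10 ms of a 1 kHz tone
    print(len(slow))                      # 10 logged points for 480 output samples
    print(max(abs(v) for v in fast))      # the 1 kHz tone looks flat in the log
```

A tone at the logging rate is always captured at the same phase, so it appears as a constant in the logged data. This is why an oscilloscope is needed to see the real analog output.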