Speedgoat IO3xx Direct Stream Example

This example showcases data streamed directly out of the design under test (DUT) through the DMA engine and PCIe Endpoint to the system memory.

Model Name

The model is called IO3xx_direct_stream_hdlc.slx.

Supported Modules

  • IO332-200k

  • IO333-325k, IO333-410k, IO333-325k-SFP, IO333-410k-SFP

  • IO334-325k, IO335-325k

  • IO342-1450k, IO342-1080k

Required Toolboxes

The basic software requirements are listed in the prerequisites section of the Getting Started page.

In addition, you must also install the Fixed-Point Designer.


The example uses the following interfaces:

User Blocks

The example uses the following user blocks:

DMA Overview

Direct Memory Access (DMA) is used to reduce the latency of data transfers between the FPGA and the target CPU domain, especially when larger amounts of data need to be transferred. The use cases differ by transfer direction: FPGA to CPU (data logging), CPU to FPGA (playback), or bidirectional (coprocessor mode). The simplified setup illustrated below includes the basic blocks required to discuss the DMA use cases in a system comprising an FPGA-based I/O module and a Real-Time Target x86 CPU.

The FPGA I/O Module consists of the I/O channels (digital, analog, multi-gigabit transceiver, etc.), the FPGA itself, an external RAM (typically DDR3/DDR4 SDRAM), the design under test (DUT), the DMA engine and the PCIe Endpoint (used to communicate with the Target x86 CPU).

The Motherboard (x86) consists of the Target CPU (x86), the System Memory and the Solid-State Drive (SSD) for persistent data storage.


To test the HDL interface functionality, dedicated examples are included in the downloaded archive file. To open the examples, navigate to the corresponding folder. Note that the examples only test I/O channels for which the loopback test method is possible, and only interface channels that are provided by the base functionality of the I/O module. The terminal board provided must be wired as described. Examples do not test I/O channels that require external hardware (for some examples, a function generator or an oscilloscope is required), but running this example will still provide sufficient confirmation that this implementation is set up correctly.

Please note that the examples provided are color-coded. The green subsystem (FPGA domain) is the part of the model that is compiled using HDL Coder and ultimately runs on the FPGA. The FPGA domain usually has a sample frequency in the range of 100 MHz, which is set in the HDL Workflow Advisor (FPGA Synthesis Software Settings). The blue blocks (CPU domain) that surround the green subsystem are interfaces to the processor section of the model. The CPU domain usually has a sample frequency in the range of 1 kHz. The interrupt subsystem has been given another color (magenta), as its functionality is asynchronous to both the processor and the FPGA. The interrupt source can be selected in the Interrupt Setup block of the generated model once the model has been run through the HDL Coder Workflow Advisor.

In this use case, the data is directly streamed out of the DUT through the DMA engine and PCIe Endpoint to the System Memory. The setup for the Direct Stream use case can be illustrated as follows:

DMA direct stream

In the DMA direct stream, the FPGA initiates the DMA transfer to pass the data to the x86 System Memory. The scatter-gather descriptor chain consists of two alternating descriptors, each pointing to the other, to achieve a ping-pong buffer setup in system memory. This allows the target CPU to process one data buffer while the other data buffer is being filled by the DMA engine.
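The ping-pong arrangement described above can be illustrated with a small behavioural sketch in Python. It is not the Speedgoat driver or the DMA engine's actual descriptor format; the `Descriptor` class, its fields, and the `stream_frames` helper are hypothetical names chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass(eq=False)
class Descriptor:
    """Hypothetical descriptor: one buffer plus a link to the next descriptor."""
    name: str
    buffer: List[int] = field(default_factory=list)
    next: Optional["Descriptor"] = None


def make_ping_pong() -> Descriptor:
    """Build two descriptors that point at each other and return the first."""
    ping = Descriptor("ping")
    pong = Descriptor("pong")
    ping.next = pong
    pong.next = ping
    return ping


def stream_frames(frames: List[List[int]]) -> List[Tuple[str, int]]:
    """Model the engine filling one buffer while the CPU handles the other.

    Returns a log of (buffer_name, frame_sum) pairs as seen by the CPU.
    """
    current = make_ping_pong()
    processed = []
    for frame in frames:
        # The DMA engine writes a complete frame into the current buffer.
        current.buffer = list(frame)
        filled = current
        # The engine follows the link, so the next frame lands in the other buffer.
        current = current.next
        # The CPU processes the buffer that was just filled.
        processed.append((filled.name, sum(filled.buffer)))
    return processed


log = stream_frames([[1, 2], [3, 4], [5, 6]])
print(log)
```

Because each descriptor links to the other, the chain never terminates: consecutive frames alternate between the two buffers for as long as data is streamed.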

The example represents a small data logging application in which two data channels (two sine waves) are generated in the FPGA and then logged into the CPU domain.

Open the example model by navigating to the folder containing the "*.slx" model file and double-clicking the file. If the example is provided as a Simulink Project, navigate to the corresponding example folder and extract the Simulink project zip file. Then double-click the "*.prj" icon to open the project. After opening the project, open the model by double-clicking the "*.slx" file. The model is shown as follows:

The design under test (DUT) subsystem is shown as follows:

The DUT_IO3xx_direct_stream_DMA subsystem, which is the part intended for HDL code generation, consists of a Data emulation and a DMA - data logging part.

Data emulation models two sine waves with configurable amplitude and frequency (number of samples of the sine wave lookup table). The data output signal of the sine generation subsystem is a vector of the two sine waves. These are the two data channels to log. The configuration parameters of the sine waves are declared in the data dictionary file provided.
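As a rough illustration of how a lookup-table sine source relates table length to frequency, the following Python sketch builds one table per channel and reads one entry per step. It assumes one table entry is output per step, so a table of N samples yields a period of N steps; the function names and parameter values are hypothetical and are not taken from the example's data dictionary.

```python
import math
from typing import List


def make_sine_table(amplitude: float, num_samples: int) -> List[float]:
    """One full sine period stored in num_samples table entries."""
    return [amplitude * math.sin(2 * math.pi * k / num_samples)
            for k in range(num_samples)]


def sine_channels(tables: List[List[float]], n_steps: int) -> List[List[float]]:
    """Return one vector per step, holding one sample from each channel's table."""
    return [[table[step % len(table)] for table in tables]
            for step in range(n_steps)]


# Two channels with different (assumed) amplitudes and table lengths.
tables = [make_sine_table(1.0, 8), make_sine_table(2.0, 4)]
vectors = sine_channels(tables, n_steps=8)
print(vectors[1])
```

In this model, halving the table length doubles the output frequency for a fixed step rate, which is why the number of samples acts as the frequency parameter.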

DMA - data logging handles the logging by interfacing with the DMA engine. The serializer subsystem interlaces the incoming data vector, in this case two sine waves, into a single data stream, because the DMA engine interface consists of a single-channel data stream. Valid input data may therefore only be passed at a rate that allows the serializer to serialize the data vector completely before the next data samples are passed. If data is passed too early, an error is flagged on the err port.

The Stream data subsystem handles the AXI4-Stream signal interface to the DMA engine. It receives the serialized data stream (valid_in, data_in ports), controls the AXI4-Stream signals (tValid, tData, tLast and tReady), creates data frames according to the configurable frameSize parameter and indicates if the internal data FIFO overflows (full port).
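The interlacing and framing steps can be sketched behaviourally in Python. This is a simplified model, not the generated HDL: it ignores tReady backpressure, the internal FIFO, and the err and full ports, and it assumes tLast is asserted on the last word of each frame of `frame_size` words. The function names are hypothetical.

```python
from typing import Iterable, Iterator, List, Tuple


def serialize(vectors: Iterable[List[int]]) -> Iterator[int]:
    """Interlace channel vectors: [a0, b0], [a1, b1] -> a0, b0, a1, b1."""
    for vector in vectors:
        for sample in vector:
            yield sample


def frame_stream(samples: Iterable[int], frame_size: int) -> Iterator[Tuple[int, bool]]:
    """Yield (tData, tLast) beats, marking the last word of every frame."""
    for index, sample in enumerate(samples):
        t_last = (index + 1) % frame_size == 0
        yield sample, t_last


# Four two-channel samples packed into frames of four words each.
channel_vectors = [[1, 10], [2, 20], [3, 30], [4, 40]]
beats = list(frame_stream(serialize(channel_vectors), frame_size=4))
print(beats)
```

With two channels, the serializer needs two output words per input vector, which is the reason new input vectors cannot arrive on every clock cycle.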


The simulation of the model is started by clicking the green run button in the model toolbar. The following data will then be observed in the Simulation Data Inspector (SDI).

Simulation results of the DMA top-level model. The upper plot shows the two data channels and the lower plot shows the interlaced data stream.

Target CPU Driver

On the target CPU, the DMA data is read with the DMA read block, which is located in the DMA subsystem. The block itself is placed in a Function-Call Subsystem, as the read is triggered on the CPU side by the DMA interrupt. Instead of triggering only a portion of the model with the DMA interrupt, the entire model can also be triggered by the DMA interrupt. The size of the DMA transfer is configured in the frame_size variable, defined in the IO3xx_direct_stream_hdlc.sldd data dictionary.
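Once a frame has arrived in system memory, the interlaced words have to be split back into channels. The Python sketch below shows one way to do that, assuming the words in a frame alternate channel 1, channel 2, channel 1, and so on, and that the frame length is a multiple of the channel count. The `deinterleave` helper is hypothetical and is not part of the DMA read block.

```python
from typing import List


def deinterleave(frame: List[int], num_channels: int = 2) -> List[List[int]]:
    """Split an interlaced frame into one list of samples per channel."""
    if len(frame) % num_channels != 0:
        raise ValueError("frame length must be a multiple of num_channels")
    # Channel ch owns every num_channels-th word, starting at offset ch.
    return [frame[ch::num_channels] for ch in range(num_channels)]


channels = deinterleave([1, 10, 2, 20, 3, 30, 4, 40])
print(channels)
```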

Running HDL Workflow Advisor

Before the example can be deployed and run on the real-time target machine, you need to run through the HDL Coder Workflow Advisor steps to generate HDL code and an FPGA bitstream using HDL Coder (FPGA Synthesis Software Settings).

New: Reference design parameters, set at step 1.2, now control which interfaces will be available to target in step 1.3 of the workflow. This has reduced the total number of reference designs and the list of interfaces available. Please remember to select the front plug-in and rear plug-in setting that is appropriate for your module, as well as the Aurora settings that should be used for your model (if applicable). These additional reference design parameter settings are further described in the interface sections for which they are relevant.

New: Prior to running the workflow advisor, be sure to double-click the Select Module block in the demo model. If more than one of your modules supports the model (due to available interface compatibility), a pop-up will prompt you to select the module you would like to target. If only a single module is installed, and provided it is compatible, it will be automatically selected when the block is double-clicked.

Upon completion, a newly generated model containing the Simulink Real-Time interface subsystem appears. At first sight, this subsystem resembles the FPGA subsystem. However, inside, the Simulink algorithm has been removed and replaced with blocks that the real-time application will use to communicate with the FPGA during execution. The newly generated model is now ready to be deployed to a real-time target machine.

To download the FPGA bitstream and the Simulink model to the target, click the Build Model button on the Simulink Editor toolbar. The real-time application loads on the Speedgoat target machine and the FPGA algorithm bitstream loads on the FPGA. If you are using I/O lines, check that you have connected the lines to the external hardware under test.

Please note that some example models have Global Delay Balancing intentionally disabled. If an error about delay balancing is displayed in step 2.3 of the HDL Coder Workflow Advisor, it can be safely ignored by checking the Ignore warnings checkbox.

Running the Example

To run the generated model, run the IO3xx_direct_stream_hdlc_run.m example script. The script configures, builds and runs the model. After the run is complete, the script reads the logged data directly from the Simulation Data Inspector and displays it in a plot.

Logged DMA data of the top-level model