Speedgoat IO3xx DMA read
The Speedgoat IO3xx DMA read driver block transfers data by DMA from the FPGA
to the CPU in a streamed style.
Direct Memory Access (DMA) is used to reduce the latency of data transferred
between the FPGA and the target CPU domain, especially if larger amounts of data
need to be transferred. There are different use cases depending on whether the data
transfer is in the direction FPGA to CPU (data logging), or in the direction CPU to
FPGA (playback) or bidirectional (coprocessor mode). A simplified setup including
the basic blocks required to discuss the DMA use cases in a system comprising an
FPGA-based I/O module and a Real-Time Target x86 CPU is illustrated below.
The FPGA I/O Module consists of the I/O channels
(digital, analog, multi-gigabit transceiver, etc.), the FPGA itself, an external RAM
(typically DDR3/DDR4 SDRAM), the design under test (DUT), the DMA engine and the
PCIe Endpoint (used to communicate with the Target x86 CPU).
The Motherboard (x86) consists of the Target CPU
(x86), the System Memory and the Solid-State Drive (SSD) for persistent data storage.
The DMA read block allows the transfer of data from the FPGA directly into the CPU
RAM using an AXI4-Stream interface from the FPGA design under test (DUT) model. In
the CPU, a large ring buffer is pre-allocated to receive this data (15 x 1 MB).
This ring buffer is filled during the model step. When the FPGA model asserts a
"last" signal, or when the current buffer is completely full, the driver advances to
the next buffer. At each model execution, the first N values stored in the buffer are
output on the Data port, where N is defined by the DMA Transfer Size parameter. All
completed buffers are cleared, and the total number of values that have been read out
is written to the Transferred Words port, if this port is enabled. If fewer than N
values have been read, the values that were read are presented and the vector is
zero-padded. A partially filled buffer is not cleared. As a result, the Transferred
Words port can report more or fewer values than the DMA Transfer Size parameter
specifies, depending on how the DUT that provides the data is modelled. Data may also
be overwritten before it has been transferred if care is not taken in the DUT model
and more than the total buffer size (15 x 1 MB) is streamed out of the FPGA model.
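The output behavior described above can be illustrated with a small Python model. This is a minimal sketch, not the driver's implementation: the function name `present_step` and the representation of a completed buffer as a Python list are assumptions made purely for illustration.

```python
from typing import List, Tuple


def present_step(completed_words: List[int], transfer_size: int) -> Tuple[List[int], int]:
    """Hypothetical model of what the block presents in one model step.

    completed_words: the 32-bit values read out of a completed buffer
                     (closed by "last" or by filling up).
    transfer_size:   the DMA Transfer Size parameter N, in 32-bit words.
    Returns (data_port_vector, transferred_words).
    """
    # Only the first N values reach the Data port.
    data = list(completed_words[:transfer_size])
    # If fewer than N values were read, zero-pad the vector to length N.
    data += [0] * (transfer_size - len(data))
    # Transferred Words reports everything that was read out, which may be
    # smaller or larger than N.
    return data, len(completed_words)


short_data, short_count = present_step([10, 11, 12], transfer_size=5)
long_data, long_count = present_step(list(range(8)), transfer_size=5)
print(short_data, short_count)  # [10, 11, 12, 0, 0] 3
print(long_data, long_count)    # [0, 1, 2, 3, 4] 8
```

In the first call the DUT delivered fewer words than the transfer size, so the vector is zero-padded and the count is below N. In the second call it delivered more, so the count exceeds N while the Data port still carries only N values.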
The DMA write interface is an AXI4-Stream output from the DUT design. The required
interfaces in the FPGA design are a data output of uint32 data type, a tvalid output
of boolean data type, a tlast output of boolean data type and a tready input of
boolean data type. The tready input can be used to create backpressure so that the
model does not send data when the DMA engine is not ready to accept it. For more
information about the AXI4-Stream interface, refer to the MathWorks documentation
Model Design for AXI4-Stream Interface Generation.
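The handshake between these four signals can be sketched as a cycle-by-cycle simulation. This is a simplified Python illustration of the general AXI4-Stream rule that a word is transferred only in a cycle where tvalid and tready are both high; the function name `stream_frame` and the tready pattern are hypothetical, and the sketch does not model the actual DMA engine.

```python
from typing import List, Tuple


def stream_frame(words: List[int], tready_pattern: List[bool]) -> List[Tuple[int, bool]]:
    """Simulate a source driving data/tvalid/tlast against a given tready pattern.

    Returns the (data, tlast) beats accepted by the sink, in order.
    """
    accepted = []
    index = 0
    for tready in tready_pattern:
        if index >= len(words):
            break  # frame finished: the source would deassert tvalid
        tvalid = True                      # the source has a word to offer
        tlast = index == len(words) - 1    # mark the final word of the frame
        if tvalid and tready:
            # Handshake: the word is transferred in this cycle.
            accepted.append((words[index], tlast))
            index += 1
        # Otherwise (backpressure) the source holds the same word.
    return accepted


beats = stream_frame([1, 2, 3], [True, False, True, False, False, True])
print(beats)  # [(1, False), (2, False), (3, True)]
```

No word is lost while tready is low, because the source keeps presenting the same word until the handshake completes.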
This driver block has no input ports.
- Data
The Data port supplies the data streamed from the FPGA to the
model running on the CPU. The output data type is uint32. The output
is a vector whose length matches the DMA transfer size setting. This data is
acquired from the FPGA as an AXI4-Stream. For more information about the
AXI4-Stream interface, refer also to the MathWorks documentation Model Design for
AXI4-Stream Interface Generation.
- Transferred Words
This optional uint32 output port reports the total number of data values
transferred successfully during the previous model step. Note that this
value is expressed in words, where a word is a 32-bit value.
Tab: Engine Setup
FPGA Module Identifier - Unique identifier for FPGA I/O
1 (default) | n
Enter a unique number. This setting must match the setting of the
corresponding generated design under test (DUT) subsystem. This is
usually only relevant in a multi-module model, as otherwise the default
value 1 is applied. The ordinal sequence of the module identifiers also
corresponds to the ordinal sequence of the modules' positions on the PCIe
bus. In this way, modules of the same device type can be uniquely
identified. For this reason, if multiple modules of the same type are
installed in the target machine and one module is not in use within a given
model, a bus and slot must be specified for the modules which should be used.
DMA transfer size (32 bit words) - Frame size of the DMA
The length of the output port vector (uint32) is configured according
to the transfer size setting. Please note that this value expresses the
transfer size in words (32-bit values).
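Because the transfer size is given in 32-bit words while the ring buffer is described in megabytes (15 x 1 MB), a quick conversion can help when sizing a design. The following Python sketch assumes that 1 MB means 2**20 bytes, which the documentation does not state explicitly; the constant and function names are illustrative only.

```python
WORD_BYTES = 4                    # one word is a 32-bit value
BUFFER_BYTES = 1 * 1024 * 1024    # assumption: 1 MB = 2**20 bytes
NUM_BUFFERS = 15                  # the ring buffer is 15 x 1 MB


def transfer_bytes(transfer_size_words: int) -> int:
    """Convert a DMA transfer size in 32-bit words to bytes."""
    return transfer_size_words * WORD_BYTES


words_per_buffer = BUFFER_BYTES // WORD_BYTES
total_ring_words = NUM_BUFFERS * words_per_buffer

print(transfer_bytes(1024))  # 4096
print(words_per_buffer)      # 262144
print(total_ring_words)      # 3932160
```

Under this assumption, a DUT that streams more than `total_ring_words` words before they are read out would exceed the total ring buffer size.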
checked | not checked (default)
Use this toggle switch to select polling mode. Polling mode means that
one of the CPU cores is dedicated to checking for the completion of the
(C)DMA transfer. In this mode, the model execution is triggered when the
polling criterion is met. The behavior of the model is identical to a
model whose execution is driven by an interrupt, in that the model
operates independently of the system timer. This mode provides the
lowest possible latencies and the best performance because
the latency and time cost of an interrupt service routine are avoided.
Note that, depending on the data structure or device being
polled, execution may not be deterministic; there could be
more jitter in the actual execution time than there would be if the
system timer were used. For more details on polling, consult the
MathWorks execution modes documentation. Only one block (of
any type) may be used for polling, because only one polling function may be
registered with Simulink Real-Time.
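The general idea of polling, as opposed to waiting for an interrupt, can be sketched as a busy-wait loop. This Python fragment is a conceptual illustration only and does not represent the Simulink Real-Time polling mechanism: `poll_until_done`, `fake_done` and the `max_polls` limit are hypothetical names introduced for this sketch.

```python
from typing import Callable


def poll_until_done(transfer_done: Callable[[], bool], max_polls: int) -> int:
    """Repeatedly check a completion condition instead of waiting for an interrupt.

    Returns the number of checks needed; raises TimeoutError after max_polls.
    """
    for polls in range(1, max_polls + 1):
        if transfer_done():
            return polls  # polling criterion met: model execution would start here
    raise TimeoutError("transfer did not complete within max_polls checks")


# Stand-in for a hardware status check: reports completion on the 4th check.
state = {"checks": 0}


def fake_done() -> bool:
    state["checks"] += 1
    return state["checks"] >= 4


polls_needed = poll_until_done(fake_done, max_polls=10)
print(polls_needed)  # 4
```

The number of checks before completion varies with when the data arrives, which is one way to picture why execution timing in polling mode can show more jitter than timer-driven execution.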
Tab: Status Outputs
Enable Word Count
Select or clear this check box to enable or disable the Transferred Words
status port described in the ports section.