Video Transcript
In this video, I want to introduce Speedgoat’s FPGA solutions. I’ll explain why using FPGAs can benefit your work and show you what Speedgoat FPGAs can do. Most importantly, I’ll show you how you can include FPGAs in your workflow. Finally, I’ll introduce how FPGAs work and what makes them so powerful.
Speedgoat real-time test systems are equipped with powerful multicore CPUs capable of handling most performance-demanding applications.
For some applications, though, you may need to accelerate your algorithms and offload them from the CPU to an FPGA. For example, you may want to control highly dynamic systems, or access and process multiple high-bandwidth signals simultaneously at ultra-low latencies. High-fidelity plant models may also need to run on high-performance FPGAs. You may also use FPGAs as a cost-effective deployment path for your final application on an ASIC.
Now, let’s have a look at what Speedgoat FPGAs can do.
You can target them directly from your Simulink model, either by configuring an FPGA driver block and simply hitting “Run on target”, or with the programmable workflow using HDL Coder when you plan to run parts of your algorithm on the FPGA. You even have a streamlined path for deploying Simscape models to FPGAs via the Simscape HDL Workflow.
Sometimes, only the I/O of your model needs to run faster. For instance, you may want to generate or capture fast PWM or encoder sensor signals. And even if you are not an FPGA expert, there is no need to worry: Speedgoat offers ready-to-use FPGA I/O and protocol functionalities, so you can focus on your application.
You can connect multiple FPGAs using lowest-latency links and operate several of them on a single or on multiple real-time systems. You can also use FPGAs to synchronize multiple real-time systems, and to synchronize data acquisition with other I/O modules.
Now, let’s have a look at Speedgoat’s FPGA workflows.
Programmable FPGAs allow you to offload both parts of your algorithm and signal I/O to the FPGA, using the HDL Coder workflow from within Simulink. Speedgoat provides you with ready-to-program I/O and protocol driver blocks. So the workflow doesn’t necessarily become more complicated, because you can start rapidly from hardware-proven example models. Ultimately, you have more flexibility for your advanced use cases.
Speedgoat Simulink-programmable FPGAs can be installed in your test system like any other I/O module. But how can you tell if your applications require an FPGA performance boost?
Model sample rates are a very good measure to decide whether FPGAs are required or not. If your application has a sample time larger than 1 ms, you can fully rely on the CPU. For sample times down to 250 µs, we suggest checking for fast I/O modules. If you tackle even shorter sample times, configurable I/O modules may be needed; Speedgoat will help you qualify that. If sample times are below 50 µs, it starts to become worthwhile to run parts of the I/O on programmable FPGAs. And for even lower sample times, both algorithms and I/O need to run on a Simulink-programmable FPGA.
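To make these thresholds a bit more concrete, here is a minimal sketch in Python of the rule of thumb just described. The function name, return strings, and the example sample time are purely illustrative and not part of any Speedgoat API.

```python
def recommended_target(sample_time_s: float) -> str:
    """Map a model sample time (in seconds) to a suggested execution
    target, following the rough thresholds mentioned in the video.
    Illustrative only, not a Speedgoat API."""
    if sample_time_s > 1e-3:        # slower than 1 ms
        return "run everything on the multicore CPU"
    if sample_time_s > 250e-6:      # 1 ms down to 250 us
        return "check for fast I/O modules"
    if sample_time_s > 50e-6:       # 250 us down to 50 us
        return "consider configurable I/O modules"
    # below 50 us: FPGA territory; the lower the sample time, the more of
    # the model (first the I/O, then the algorithm) belongs on the FPGA
    return "move I/O, and eventually the algorithm, to a Simulink-programmable FPGA"


# Example: a 20 us control loop
print(recommended_target(20e-6))
```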
Let’s now check what FPGAs are and why they are so performant.
An FPGA is an integrated circuit composed of configurable logic blocks. These blocks have dedicated functions, for example:
- On-chip RAM, called “block RAM”, to efficiently store specific data types.
- DSP slices, which efficiently implement multipliers.
- Look-up tables and flip-flops to implement logic functions (see the short sketch after this list).
- IP blocks, which are pre-verified building blocks for common tasks such as memory access.
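To give a feel for what a look-up table does, here is a minimal sketch in Python: it models a 2-input LUT as a small truth table, which is conceptually how FPGA fabric realizes arbitrary logic functions. The helper name and the XOR example are purely illustrative.

```python
# A 2-input look-up table (LUT) is just a 4-entry truth table: the two
# input bits select one stored output bit. Any 2-input logic function can
# be "programmed" by choosing the four stored bits.

def make_lut2(truth_table):
    """truth_table[(a, b)] -> output bit, for all four input combinations."""
    def lut(a, b):
        return truth_table[(a, b)]
    return lut

# Program the LUT as an XOR gate (an arbitrary illustrative choice).
xor_lut = make_lut2({(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0})

for a in (0, 1):
    for b in (0, 1):
        print(a, b, "->", xor_lut(a, b))
```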
An FPGA is a piece of electronic hardware, and the term is an acronym for “field-programmable gate array”. “Field programmable” means you can configure the interconnects using a hardware description language (HDL). FPGAs are reconfigured or reprogrammed via synthesized HDL, called a bitstream. An FPGA also has input/output (I/O) interfaces, which allow it to interface with other hardware devices. I/O signals can be digital, coming for example from ADCs, or from the CPU via the PCI bridge.
Let us summarize the key advantages of FPGAs before showing an example. An FPGA is programmable hardware. By programming it, you build a custom processor.
In contrast, a CPU executes instructions and is programmed with software. An FPGA can be tailored for very high throughput, such as processing sensor data. Assuming you have enough logic cells available, all tasks can run in parallel. FPGAs also have ultra-low latencies: for one thing, FPGAs do not require an operating system. Also, communication does not have to go via generic buses such as USB or PCI Express (PCIe), because FPGAs have their own I/O interfaces. Another important factor is determinism: assuming the FPGA is described correctly, deterministic processing behavior can be achieved. Computations are conducted in hardware, independent of background processes or scheduler priorities.
Let us walk through an example and process a camera output. Simplistically speaking, a frame or image is a grid of pixels. In our example, let us assume one with 8×8 pixels, and an algorithm that performs some processing steps at the pixel level.
Let’s compare execution times qualitatively for different compute architectures. On a single-core CPU, the pixels are processed sequentially, one by one. We assume that each algorithmic step takes one clock cycle to run on one pixel. Nowadays, most CPUs have multiple cores that speed up processing. FPGAs, due to their parallel architecture, can run tasks concurrently; theoretically, all pixels can be processed at the same time. Allow me a quick disclaimer to state some key assumptions behind these visualizations: we are neglecting that CPUs typically have faster clock cycles than FPGAs, and we do not consider CPU idle cycles, for example while accessing memory. For understanding the concept, we believe this does not matter; for actually programming FPGAs there are, of course, more topics to consider.
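As a rough sanity check on this comparison, here is a minimal Python sketch that counts clock cycles for a single processing step under the stated assumptions (one cycle per pixel, clock-speed differences ignored). The core count is an arbitrary illustrative choice.

```python
pixels = 8 * 8   # the 8x8 example frame

# Single-core CPU: the step runs on one pixel per cycle, sequentially.
single_core_cycles = pixels                  # 64 cycles

# Multicore CPU: work split across cores (perfect scaling assumed).
cores = 4                                    # illustrative core count
multi_core_cycles = pixels // cores          # 16 cycles

# FPGA with enough logic cells: all 64 pixels processed at the same time.
fpga_cycles = 1

print(single_core_cycles, multi_core_cycles, fpga_cycles)
```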
Programming FPGAs is essentially programming hardware. This allows a lot of freedom, such as streaming and pipelining of instructions. So, let us imagine our algorithm consists of three steps. CPUs operate sequentially, so the first operation needs to run on the entire image before the second one can start. Now let us consider the same algorithm running on the FPGA. Streaming data in and pipelining allow parallel execution of program instructions, so the three operations on a pixel complete in just three clock cycles, resulting in a much lower latency. Additionally, on an FPGA, we can execute the operation on all available logic cells concurrently and obtain a significant increase in throughput. So, in summary, FPGAs allow high data throughput while keeping ultra-low latency.
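To make the streaming and pipelining idea concrete, here is a minimal Python sketch that counts cycles for a three-stage pipeline fed by the 8×8 pixel stream, and reports latency and total cycles. It is a conceptual model under the one-cycle-per-stage assumption, not generated HDL.

```python
PIXELS = 8 * 8   # the 8x8 example frame, streamed one pixel per cycle
STAGES = 3       # three algorithmic steps, each mapped to one pipeline stage

# Fully sequential execution: every pixel passes through every stage,
# one operation at a time.
sequential_cycles = PIXELS * STAGES                    # 192 cycles

# Pipelined execution: the first result is ready after the pipeline fills,
# and afterwards one finished pixel leaves the pipeline every cycle.
first_result_latency = STAGES                          # 3 cycles for the first pixel
pipelined_total = PIXELS + (STAGES - 1)                # 66 cycles for the whole frame

print("sequential:", sequential_cycles, "cycles")
print("first pixel ready after:", first_result_latency, "cycles")
print("pipelined frame total:", pipelined_total, "cycles")
```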
Thanks for staying until the end of this info-packed video. We hope you gained some valuable insights about Speedgoat FPGA technology, and we think we gave you some helpful answers on WHY it is worth exploring FPGAs and HOW they work. Thanks for watching! For more information and learning content, I invite you to check out our website, speedgoat.com.