
FPGA Programming Fundamentals for Every LabVIEW Developer


Introduction


Learning to program, let alone optimize, an FPGA application can take years, if not decades, of digital engineering experience. While the technology continues to evolve, certain fundamental concepts remain essential across all FPGA platforms.


  • Timing paradigms and clocking

  • Data types and structures

  • Memory and data transfer

  • Compilation and debugging


While this article does not intend to outline all of the considerations and benefits of FPGA-based application development, it does seek to provide context for these concepts, both generally and with particular attention to FPGA application development in LabVIEW. Hopefully you’ll walk away with some insight into common programming tips and tradeoffs so that you can get your application up and running and refine it over time.


 


Section 1: Why Use an FPGA?


When developing on an FPGA, you are doing more than providing instructions to be executed on a pre-defined chip; you are actually customizing the FPGA chip itself. The benefit of this is that the application can achieve very high loop rates, parallelism, and responsiveness to I/O and logic states with minimal software overhead. Depending on your experience, this may be review, so you may want to skip down to some of the other sections covering pipelining, parallel loop operations, or host synchronization. But if you want the full story, read on.


FPGAs, or Field Programmable Gate Arrays, are digital circuits that can be modified time and again for different applications or updates to existing applications. For test and measurement applications, FPGAs can be programmed to perform tasks more traditionally handled by a processor – anywhere from a simple microcontroller to a multicore CPU. While not all tasks make sense to execute on an FPGA, applications requiring inline signal processing for minimal latency, high hardware-timed reliability, and/or high-speed, deterministic control are good candidates for FPGA integration.



Closed loop control timing comparison for FPGA and CPU schemes
System diagram showing generic timing capabilities for FPGA and CPU control loops

Below are some of the most common use cases for FPGAs in control, monitoring, and test applications.


  • Reduce data processing time → sub-µs loop rates

  • Customizable algorithms and filtering → hardware-timed execution

  • Lower level memory control → DMA streaming with low overhead

  • Custom protocol support and decoding → digital communications and integration

  • Complex and deterministic triggers → customize responsiveness

 

Now, if you know you need any of the system or performance benefits listed above, an alternative to an FPGA is an ASIC. An ASIC, or Application-Specific Integrated Circuit, is a chip that has a predefined set of functionality that can be programmatically accessed through an API.


Depending on the application requirements and planned deployment volume, selecting or building an ASIC for some set of functionality may be the right design decision, but the core limitation is that once they are built, the underlying hardware cannot be modified. From the moment the ASIC is fabricated, you take it as it is.


And here is the beauty of FPGA-based development. You can not only modify the application software running on top of the FPGA, but also modify the hardware circuit implemented on the FPGA itself. This gives development teams massive flexibility to adapt to changing application requirements, bugs, and new features over time.



 


Section 2: FPGA Programming Basics


HDLs, or Hardware Description Languages, are used to program an FPGA chip. To oversimplify what programming a circuit means: an HDL program is compiled and pushed to the FPGA target, which takes in that compiled program and configures the logic gates on the chip. The result is a temporarily static instantiation of a “personality” on the FPGA that incorporates:

    

Core blocks of FPGA fabric
FPGA fabric overview highlighting core blocks available across platforms.

  • Memory blocks – data storage in user-defined RAM

  • Logic blocks – Logic, arithmetic, DSP algorithms, etc.

  • I/O blocks – connections between external circuits (e.g., sensors, processors, other FPGAs) and logic blocks

  • Interconnections – connections between multiple I/O blocks, common in any integrated hardware application

 

If the developer wants to make a tweak to some block or repurpose the FPGA entirely, they must modify the HDL source code, re-compile, and re-push the compiled code to the FPGA. Pretty neat, right?


Yes, very neat, though there are some caveats:


  • When compared to an ASIC, FPGAs are typically more power hungry, less performant, and higher cost for post-design deployment, though the application design cycles are typically far shorter, thereby lowering total engineering cost.

  • In terms of abstraction, HDLs resemble assembly languages, which are quite low in the compute hierarchy. To put it another way, they are not easy to program.

  • While there are some early AI HDL copilot tools out there, they seem to be far from maturity in truly expediting the digital software design process.

 

While there are numerous HDLs available, the two most common are VHDL and Verilog. While there are some savants out there, unless you have been using these languages for some time or have a library of existing IP at your disposal, the learning curve on these languages is both steep and long. Don’t forget your oxygen tank.


This is where LabVIEW FPGA comes in. It provides programming access points at a higher level in the abstraction hierarchy targeting developers who see the benefit of using an FPGA but don’t have expertise in HDL development and validation. While LabVIEW is a full-featured graphical programming language with full IDE support, the FPGA Module extends some of the core tenets to FPGA application development, making high-performance, low-latency FPGA-based systems more accessible to a wider swath of engineering teams. Again, if you’re an experienced HDL programmer, all the power to you.


LabVIEW FPGA

The remainder of this article intends to provide additional details on how FPGA-based application development can be simplified in LabVIEW and some tips to use along the way.



Want to see some application examples where LabVIEW FPGA shines?



 


Section 3: FPGA Clocking


If you’re developing on an FPGA, you almost certainly have some critical design considerations around closed loop control and timing. While FPGAs typically run at lower rates than CPUs and GPUs, the level of control and parallelism they provide can enable very complex orchestration of processes. LabVIEW FPGA provides a number of tools to control timing in your application, but here we’ll cover Single-Cycle Time Loops and derived (divided) clocks.


The Single-Cycle Timed Loop is a While loop intended to execute all of the functionality inside it within one clock cycle of the FPGA. In this code example, all of the functionality added to the loop would need to execute within 5 ns (one period of the 200 MHz clock). If you try to compile and all of the functionality cannot be executed on the FPGA within that interval, LabVIEW will throw a timing violation, suggesting that you optimize the code, remove functionality, or lower the clock rate.
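To make the timing budget concrete, here is a rough sketch in Python (not LabVIEW) of the check the compiler effectively performs; the per-operation delays are invented illustration values, not real fabric timings.

```python
# Conceptual sketch (not LabVIEW): checking whether a serial chain of
# operations fits inside one cycle of a Single-Cycle Timed Loop.
# The per-operation delays are invented illustration values.

CLOCK_HZ = 200_000_000           # 200 MHz SCTL clock
BUDGET_NS = 1e9 / CLOCK_HZ       # 5 ns available per cycle

# Hypothetical combinational delays (ns) for operations placed in the loop
OP_DELAYS_NS = {"read_io": 1.2, "add": 0.8, "multiply": 2.5, "compare": 0.7}

def fits_in_one_cycle(ops, budget_ns=BUDGET_NS):
    """Return (total_delay_ns, ok) for a serial chain of operations."""
    total = sum(OP_DELAYS_NS[op] for op in ops)
    return total, total <= budget_ns

total, ok = fits_in_one_cycle(["read_io", "add", "multiply"])
# 4.5 ns total: fits within the 5 ns budget
total2, ok2 = fits_in_one_cycle(["read_io", "add", "multiply", "compare"])
# 5.2 ns total: the compiler would report a timing violation
```

Lowering the clock rate raises `BUDGET_NS`, which is exactly the "lower the clock rate" escape hatch described above.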



Timing diagrams and clocking in LabVIEW FPGA loops
A Single-Cycle Timed Loop executing at 200 MHz

Derived clocks enable developers to easily create loops in different clock domains in the same application. This gives the developer timing flexibility in which functionality gets executed at which rate, empowering them to parallelize processes, reserve FPGA resources for higher-speed tasks, and avoid timing violations. The clock configuration utility in LabVIEW FPGA allows for any combination of integer multipliers and divisors.
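As a quick sketch of that relationship, assuming only the integer multiplier/divisor rule from the configuration utility:

```python
# Sketch of how a derived clock frequency follows from the integer
# multiplier/divisor settings in the LabVIEW FPGA clock utility.

def derived_clock_hz(parent_hz: float, multiplier: int, divisor: int) -> float:
    """f_derived = f_parent * multiplier / divisor (integers only)."""
    if multiplier < 1 or divisor < 1:
        raise ValueError("multiplier and divisor must be positive integers")
    return parent_hz * multiplier / divisor

# A 35 MHz clock from a 200 MHz parent: 200 * 7 / 40 = 35
f = derived_clock_hz(200e6, 7, 40)
```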



Create different FPGA clock domains through clock dividers
FPGA Derived Clock configuration utility showing a 35 MHz clock derived from the 200 MHz parent clock

 


Section 4: Numeric Data Types


In LabVIEW FPGA applications, there are three main numeric data types: integers, fixed point, and single-precision floating point. It is non-trivial to decide which data type to use for different scenarios, so here’s a simplifying outline of the choices.


Integers – Our understanding of integers dates back to grade school and then was bolstered when we first learned to program (ANSI C for me). You can use integers when there is no need for precision beyond the decimal, but the story goes deeper than that. Integers can be a good choice for numeric representation if you have the following requirements:


  • Bit manipulations, such as masking, shifting, or inverting.

  • Packing of multiple 8- or 16-bit integers into 32- or 64-bit words. This can be helpful with data sharing as it minimizes the overhead associated with each numeric read/write.

  • Choosing between calibrated fixed-point or uncalibrated integer I/O node outputs
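As a sketch of the packing idea in Python (the bit widths are chosen for illustration):

```python
# Sketch: packing two 16-bit integers into one 32-bit word to reduce
# per-sample read/write overhead, then unpacking on the other side.

def pack_u16_pair(hi: int, lo: int) -> int:
    """Pack two unsigned 16-bit values into a single 32-bit word."""
    assert 0 <= hi <= 0xFFFF and 0 <= lo <= 0xFFFF
    return (hi << 16) | lo

def unpack_u16_pair(word: int) -> tuple:
    """Recover the two 16-bit values from a 32-bit word."""
    return (word >> 16) & 0xFFFF, word & 0xFFFF

word = pack_u16_pair(0x1234, 0xABCD)   # -> 0x1234ABCD
hi, lo = unpack_u16_pair(word)
```

One 32-bit transfer now carries two samples, halving the number of reads/writes the transport mechanism must handle.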

 

Fixed-point – As opposed to floating-point numbers, which have relative precision (the decimal point can “float”), fixed-point numbers have absolute precision (the decimal point is set).

 

  • Resource-efficient arithmetic

  • You’re using High Throughput Math functions*

  • Default datatype with C Series analog I/O

  • Watch out for data saturation and LSB (least significant bit) underflow errors.
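A small Python sketch of the absolute-precision idea, using a hypothetical unsigned &lt;word length, integer length&gt; format; it also shows the saturation behavior the last bullet warns about:

```python
# Sketch of fixed-point quantization: a <word_length, integer_length>
# format has an absolute step size (LSB) of 2**(integer_length - word_length).

def to_fixed_point(value: float, word_length: int, integer_length: int) -> float:
    """Quantize value to the nearest representable unsigned fixed-point
    number, saturating at the range limits instead of wrapping."""
    lsb = 2.0 ** (integer_length - word_length)   # smallest step
    max_val = 2.0 ** integer_length - lsb         # largest representable value
    clipped = min(max(value, 0.0), max_val)       # saturation
    return round(clipped / lsb) * lsb

# <16, 4> format: range [0, 16), step 2**-12
q = to_fixed_point(3.141592653589793, 16, 4)
err = abs(q - 3.141592653589793)                  # bounded by half an LSB
sat = to_fixed_point(100.0, 16, 4)                # saturates at 16 - 2**-12
```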

 

Single-precision floating point – This numeric datatype in LabVIEW FPGA provides 24 bits of precision with a variable position of the decimal. Naturally, arithmetic operations on floating-point numbers are more resource intensive than those on integers or fixed-point numbers, though there are a couple of interesting use cases:


  • High dynamic range data paths. Many analog I/O channels have multiple ranges, where the best precision and accuracy come from the range whose maximum value is closest to the measurement or setpoint. Oftentimes changing ranges means changing digits of precision, meaning your FPGA design needs to be flexible across that set of ranges.

  • Prototype algorithms and designs quickly without losing precision, and worry about resource optimization later.

 

Oh, and don’t forget to watch out for data type tradeoffs. Extra resources for arithmetic can scale quickly when performing operations on large arrays and other data structures. Sometimes precision is worth the extra FPGA resources and sometimes it is not. Only you as the system developer can determine that.



 


Section 5: Pipelining


Pipelining is an extremely powerful paradigm applicable across computing architectures. Pipelining is a process by which parallel execution of operations (or instructions at the chip level) can occur, thereby increasing throughput and operational clock frequency for a given amount of resource utilization. This means that assuming there is FPGA fabric available, transforming a non-pipelined design to a pipelined design implies you can increase the clock speed for a given set of process operations. If you don’t have a need to execute faster or increase throughput, pipelining may not be worth your trouble.


Implementing a pipelined design is aided by feedback nodes in LabVIEW FPGA. Feedback nodes incorporate a data register under the hood such that data can be shared between loop iterations at each step along the process.
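The register-per-stage idea can be sketched in plain Python; the three stages (scale, offset, square) are invented for illustration, and each register variable plays the role of a feedback node:

```python
# Conceptual simulation of a 3-stage pipeline: registers (like LabVIEW
# feedback nodes) hold each stage's result between loop iterations, so
# all stages work on different samples during the same cycle.

def pipelined(samples):
    """Apply scale -> offset -> square as a 3-stage pipeline.
    Each output lags its input by the pipeline depth (3 cycles)."""
    depth = 3
    r1 = r2 = r3 = 0                       # stage registers (feedback nodes)
    out = []
    for x in samples + [0] * (depth - 1):  # extra cycles flush the pipeline
        # The right-hand side reads the registers' previous-cycle values,
        # so all three stages operate concurrently on different samples.
        r1, r2, r3 = x * 2, r1 + 1, r2 ** 2
        out.append(r3)
    return out[depth - 1:]                 # discard pipeline-fill outputs

result = pipelined([1, 2, 3])              # ((x*2)+1)**2 per sample
```

Because each cycle now contains only one stage's worth of logic instead of the whole chain, the clock can run faster, which is the throughput gain described above; the cost is the added latency of the pipeline depth.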



Pipelining increases FPGA processing throughput
Pipelined data flow using feedback nodes to share data between loop cycles. The result is a higher throughput algorithm.

 


Section 6: High Throughput Math Functions


These specialized functions in the LabVIEW FPGA API are implemented with pipelining under the hood, thereby saving significant development and debugging time compared to custom-designed functions. While they may not be as performant as alternatives composed in lower-level HDLs, they tend to work pretty well for out-of-the-box functionality.


  • The API includes trigonometric, exponential, logarithmic, and polar operations, in addition to (no pun intended) basic arithmetic operations.

  • These functions operate on fixed-point numbers.

  • Some of these functions have Throughput controls that strive to meet the data throughput level you specify.
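The throughput control boils down to simple arithmetic; here is a sketch with illustrative cycles-per-sample figures rather than real function timings:

```python
# Sketch: relating a function's cycles-per-sample to the sample rate it
# can sustain at a given clock. The cycle counts are illustrative only.

def max_sample_rate_hz(clock_hz: float, cycles_per_sample: int) -> float:
    """A function that accepts one sample every N cycles sustains clock/N."""
    return clock_hz / cycles_per_sample

# A fully pipelined core (1 cycle/sample) vs. a hypothetical 8-cycle
# iterative core, both clocked at 200 MHz
fast = max_sample_rate_hz(200e6, 1)   # 200 MS/s
slow = max_sample_rate_hz(200e6, 8)   # 25 MS/s
```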



High throughput math functions in LabVIEW FPGA
High Throughput function palette in LabVIEW FPGA. These functions are implemented with pipelining under the hood, thereby helping increase out-of-the-box throughput.

 


Section 7: Parallel Loop Operations


As previously described, you may require multiple loops running on your FPGA, such as the need to have different processes running at different loop rates to optimize FPGA resource utilization. There are a number of mechanisms available in LabVIEW FPGA used to communicate between these various loops.


  • Local variables, global variables, register items – These mechanisms are used for communicating the latest data without buffering. Because of this, they are subject to pesky race conditions that arise when you have multiple writers and 1-N readers of that memory space. Also, because there is no buffering, they are generally not a good fit for data streaming, which typically requires a lossless communication mode.

    • Local variables – access scope to a single VI

    • Global variables – access scope to multiple VIs

    • Register items – Because you can generate a reference to a register, you can re-use subVIs that access different registers (through the provided reference) given different calling conditions. This makes them more flexible in practice than local and global variables.


  • FIFOs and handshake items – These “first in, first out” data structures are the bread and butter of lossless data communication because they have allocated memory to buffer data. This is useful when the producer of the data and the consumer of the data do not always run at the same rate. Because all memory blocks in the FPGA are allocated at compile time, it is possible for these data structures to run out of memory. FIFOs can have multiple writers and multiple readers accessing the same data buffer, whereas handshake items are single writer / single reader.


  • Memory items – Block memory and lookup tables (LUTs) provide mechanisms for re-writeable data storage that can be accessed across your FPGA application. They are lossy and therefore a poor choice for data streaming, though widely flexible otherwise.
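The compile-time-allocated FIFO behavior described above (fixed depth, lossless until the buffer fills) can be sketched in plain Python:

```python
# Sketch of a fixed-depth FIFO like those allocated at FPGA compile time:
# writes to a full buffer fail (time out) rather than growing the buffer.

from collections import deque

class FixedFifo:
    def __init__(self, depth: int):
        self.buf = deque()
        self.depth = depth            # fixed at "compile time", cannot grow

    def write(self, item) -> bool:
        """Return False (overflow/timeout) when the FIFO is full."""
        if len(self.buf) >= self.depth:
            return False
        self.buf.append(item)
        return True

    def read(self):
        """Return the oldest element, or None when empty (underflow)."""
        return self.buf.popleft() if self.buf else None

fifo = FixedFifo(depth=4)
writes = [fifo.write(i) for i in range(6)]   # the last two writes overflow
first = fifo.read()                          # 0: first in, first out
```

Sizing the depth to cover the worst-case gap between producer and consumer rates is what keeps the stream lossless in practice.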


We’re starting to get into some rather non-trivial concepts here with various caveats and implications, meaning care and caution must be exercised when choosing which data communication mechanism should be used for different pieces of functionality across complex FPGA applications.



 


Section 8: Host Synchronization


Depending on the application, you may need to send data between an FPGA and an external processor, such as a Real-Time processor or a Windows host PC, for datalogging, further processing, or visibility in a UI. One tool to help with this data sharing is a Direct Memory Access (DMA) FIFO. These data structures provide an efficient mechanism for data streaming which minimizes the processor overhead for lossless data fetching.



DMA FIFO in LabVIEW FPGA enables lossless data communication
Direct Memory Access (DMA) FIFO in LabVIEW FPGA with a warning indicator for underflow conditions.

In this code snippet, the FPGA is taking accelerometer data, converting it to single precision, and writing it to a lossless DMA FIFO. What is not shown here is the corresponding FIFO Read function call that would be made asynchronously from code running external to the FPGA.
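The split between the FPGA-side writer and the asynchronous host-side reader can be mimicked in plain Python; this is a conceptual sketch, not the NI host API:

```python
# Conceptual sketch (plain Python, not the NI host API): one side writes
# into a bounded buffer while the host drains it asynchronously,
# mirroring the DMA FIFO write/read split described above.

import queue
import threading

dma_fifo = queue.Queue(maxsize=1024)   # depth fixed up front, like a DMA FIFO

def fpga_side(samples):
    """Stands in for the FPGA loop writing converted samples."""
    for s in samples:
        dma_fifo.put(float(s))         # convert and write losslessly
    dma_fifo.put(None)                 # sentinel: acquisition finished

received = []

def host_side():
    """Stands in for the host's asynchronous FIFO Read loop."""
    while True:
        s = dma_fifo.get()
        if s is None:
            break
        received.append(s)

t = threading.Thread(target=host_side)
t.start()
fpga_side(range(5))
t.join()
```

The key point is that neither side waits on the other cycle by cycle; the buffer absorbs rate differences, which is what keeps processor overhead low.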



Section 9: Compilation


There exists a world of information and complexity on this topic, but here we’ll highlight some core points to help you take your design implemented in LabVIEW FPGA and get it running on real hardware.


While LabVIEW code intended to run on Windows does not necessarily require a compilation step, LabVIEW FPGA code always does. The output of a LabVIEW FPGA compilation is a bitfile, which can be referenced from calling programs and re-used across similar FPGA targets. When you build your code, LabVIEW presents you with a number of configuration and optimization options which must be selected (or implicitly accepted in common click-through practice) before kicking off the compilation. From there, LabVIEW generates intermediate files which are then passed to a Xilinx compiler (don’t worry, you don’t need to install these tools separately).


Lastly, you have a few different options of where the code is actually compiled. You can do so locally on the development machine, on a networked server, or on an NI-provided cloud service. If you’re just getting started with LabVIEW FPGA, compiling locally is probably the easiest option, but once you get up and running with the tool chain, the cloud compile service is pretty neat and offloads a ton of work from your machine.



FPGA compile options
Compile options in the LabVIEW FPGA build specifications

 


Section 10: Debugging


Ever write perfect code the first time around? Me neither. Debugging is a fact of life, but you don’t need me to tell you that.


Because FPGA code can take a long time to compile, you don’t necessarily want to go through that step every time you make a small tweak to an algorithm or want to test out a new subVI. Given this, LabVIEW FPGA offers a few different execution modes which can be very helpful in debugging. I’ve only ever used Simulation (Simulated I/O) when not going through a full compilation.

 


FPGA debugging options
FPGA execution modes options relevant for simulation and debugging.

With that said, verifying full system functionality with I/O and memory simulated on the FPGA is generally a bad idea. Here are a few tactics for debugging on real hardware:

 

  • Host visibility: Use indicators and FIFOs to pass data from key areas up to the host to get quicker top-level visibility into different pieces of lower-level functionality in subVIs that are otherwise difficult to access. In final deployment applications, be sure to remove unnecessary indicators and data structures as they consume limited resources.

  • Performance benchmarking: Use sequence structures and tick counters to benchmark timing in critical sections of code. From experience, this is often an iterative process where you ratchet up loop rates until timing violations occur or you identify areas of the code that can be further optimized through pipelining or other tactics.

  • Xilinx Toolkit: Use the Xilinx ChipScope toolkit to probe, trigger, and view internal FPGA signals on FlexRIO targets.
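The tick-to-time conversion behind the benchmarking tactic above is simple arithmetic; here is a rollover-safe Python sketch, with an illustrative clock rate and counter width:

```python
# Sketch: converting tick-counter deltas into elapsed time, as when
# benchmarking a critical section between two tick counter reads.

def ticks_to_us(start_tick: int, end_tick: int, clock_hz: float,
                counter_bits: int = 32) -> float:
    """Elapsed microseconds between two tick counter reads, tolerating
    one rollover of a free-running counter of the given width."""
    delta = (end_tick - start_tick) % (1 << counter_bits)
    return delta * 1e6 / clock_hz

elapsed = ticks_to_us(1_000, 2_000, 40e6)    # 1000 ticks @ 40 MHz = 25 us
wrapped = ticks_to_us(0xFFFFFFFF, 99, 40e6)  # rollover: 100 ticks = 2.5 us
```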



 


Conclusion


Whether you’re assessing FPGA technology fit, trying to choose a future-proofed platform, or developing a system, the concepts outlined in this article are intended to bolster your knowledge so you can make the best decisions possible for your application. While specifically focused on fundamentals applied through LabVIEW FPGA, the concepts discussed span chipsets, programming languages, and application requirements.


  • Timing paradigms and clocking

  • Data types and structures

  • Memory and data transfer

  • Compilation and debugging

 

If you’re interested in learning more about design patterns associated with common processes, such as analog data streaming, custom triggers, and serial protocol decoding, you can review this article.



Ready to enhance your FPGA development skills or want help selecting a technology?




