Reproducible research data doesn't just happen

The instrumentation and automation problems quietly draining your lab's time, results, and discovery potential

Quentin Smith

June 17, 2026

5 minutes

Time and again, we've seen brilliant researchers and enterprising grad students get stuck, not on their fundamental research challenges or dissertation way-finding, but on the instrumentation setups underpinning their experiments. Regardless of the specific STEM domain, there is an ongoing challenge in developing and sustaining reliable, reproducible measurement and control systems. How often do experiments need to be re-run not because of a flawed (null) hypothesis, but because the data wasn't captured cleanly the first (or second, or third...) time around?

From the PI's perspective, it's no secret that grad student labor is relatively inexpensive. But the experimental resources being consumed, the lab management overhead, and the time lost without good, validated results are where the real costs come in. A semester of noisy data doesn't just waste time; it delays publication, funding, and the next experiment that this one was meant to build on.

Some of the problems are predictable, which means they're solvable

One of the things I love most about working with academics and researchers is the foundational science this group undertakes to seed both breakthrough innovations and the everyday technology that benefits industry and society at large. While I'm just an engineer versed in measurement and automation, the scientific discovery being undertaken across this group is humbling for me. With that said, after working across STEM labs in fields ranging from plasma physics to bioengineering, the tripping points start to look familiar.

Instrumentation setups - Without experience and training, optimizing for configuration settings, cable management, signal integrity, and synchronization is downright tough. A measurement chain is only as good as its weakest link, and in a lab environment with sensitive signals and no gold standard to compare against, that link is often something as unglamorous as a grounding issue or non-ideal cable selection.

Manual measurements and processes - For getting initial data, manual measurement processes are simple and easy, but when you're trying to sweep across conditions while holding some variables as controls, human-in-the-loop steps introduce unacceptable measurement-to-measurement variance. Every manual trigger, every eyeballed step, every hand-recorded data point is an opportunity for compounding error. And by the time you're analyzing results, you may not know where the uncertainty came from, which makes it hard to do better the next time around.
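To make the contrast with a hand-run sweep concrete, here is a minimal Python sketch of an automated one. It is an illustration under stated assumptions, not a recommended architecture: `read_voltage` is a hypothetical stand-in for whatever driver call your instrument actually exposes, and the swept variables (temperature and drive level) are invented for the example. The point is only that every condition is visited in a fixed order, with the same settling delay, and every reading is stored alongside the conditions that produced it.

```python
import csv
import io
import itertools
import time


def read_voltage(temperature_c, drive_level):
    """Hypothetical stand-in for a real instrument read; swap in your driver call."""
    return 0.01 * temperature_c + 0.5 * drive_level


def run_sweep(temperatures_c, drive_levels, settle_s=0.0, repeats=3):
    """Visit every combination in a fixed order and log each reading with its conditions."""
    rows = []
    for temperature_c, drive_level in itertools.product(temperatures_c, drive_levels):
        time.sleep(settle_s)  # identical settling delay at every point, not an eyeballed pause
        for repeat in range(repeats):
            rows.append({
                "temperature_c": temperature_c,
                "drive_level": drive_level,
                "repeat": repeat,
                "timestamp_s": time.time(),
                "voltage_v": read_voltage(temperature_c, drive_level),
            })
    return rows


def rows_to_csv(rows):
    """Serialize the sweep so no data point is ever transcribed by hand."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Example: 2 temperatures x 3 drive levels x 3 repeats per point.
rows = run_sweep([20, 40], [0.1, 0.2, 0.3])
csv_text = rows_to_csv(rows)
```

Because each row carries its own conditions, repeat index, and timestamp, variance between repeats can be traced to a specific point in the sweep rather than guessed at afterward.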

The reproducibility gap - There's often a meaningful difference, if not an ocean, between the data you intend to collect and the data you can actually publish. Reproducing an experimental result requires that your measurement setup, timing, calibration state, and data pipeline are all documented and repeatable, and not just "roughly the same as last time." Reviewers are increasingly scrutinizing methodology, and rightfully so in an era when rigor and accountability are increasingly being passed to AI.

Overreliance on AI - LLMs are obviously useful for navigating public documentation, troubleshooting, and processing data. I'd also be lying if I told you that I did not use AI, in part, to write this post, though I swear it is far from pure slop. But at the end of the day, AI cannot fix a bad measurement. "Garbage in, garbage out" is truer now than ever. There's also a subtler cost: when AI unknowingly, and possibly incorrectly, patches over a measurement problem, the researcher or student doesn't actually learn what went wrong and what can be improved.
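One way to move past "roughly the same as last time" is to save a small manifest with every run. The Python sketch below is a hedged illustration: the field names, instrument settings, and calibration values are all hypothetical, and a real lab would record whatever its own setup requires. It bundles the settings, calibration state, and pipeline version, then fingerprints them so two runs can be compared at a glance.

```python
import hashlib
import json

SECTIONS = ("instrument_settings", "calibration", "pipeline_version")


def build_run_manifest(instrument_settings, calibration, pipeline_version):
    """Bundle the state needed to repeat a run, plus a fingerprint for quick comparison."""
    manifest = {
        "instrument_settings": instrument_settings,
        "calibration": calibration,
        "pipeline_version": pipeline_version,
    }
    # Sorted keys give a canonical form, so dict ordering never changes the hash.
    canonical = json.dumps(manifest, sort_keys=True, separators=(",", ":"))
    manifest["fingerprint"] = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return manifest


def describe_differences(manifest_a, manifest_b):
    """List the sections whose contents differ between two runs."""
    return [name for name in SECTIONS if manifest_a[name] != manifest_b[name]]


# Hypothetical values, for illustration only.
manifest_a = build_run_manifest(
    {"sample_rate_hz": 100000, "input_range_v": 10.0},
    {"last_calibrated": "2026-01-15", "offset_v": 0.0012},
    "1.4.0",
)
manifest_b = build_run_manifest(
    {"input_range_v": 10.0, "sample_rate_hz": 50000},
    {"last_calibrated": "2026-01-15", "offset_v": 0.0012},
    "1.4.0",
)
changed_sections = describe_differences(manifest_a, manifest_b)
```

Here `changed_sections` comes back as `["instrument_settings"]`, because only the sample rate differs between the two runs. Storing the manifest next to the data file means a later reader can tell whether two datasets were taken under the same documented conditions.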

Automated data acquisition and control system designed for the lab

A real problem, a hard constraint, a better way

About a year ago, a plasma science research group based at a certain university in Massachusetts came to us with a hard problem. They were running a high-power experiment on a multi-million-dollar piece of equipment, and their existing PLC-based protection system was too slow to prevent equipment damage in a fault condition. Specifically, the PLC was responding in ~150 ms, which was an order of magnitude too slow considering their process rate.

Lab equipment requiring control software and protection hardware

Now, to be clear, this is a slightly different challenge than those I mentioned earlier associated with manual measurements and data variance, but the fact still held that they needed a new system that could reliably and deterministically trigger off of non-trivial (i.e., beyond basic level threshold) conditions.
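For readers wondering what a condition "beyond basic level threshold" might look like, here is a small Python sketch of the logic only. It is not the research group's actual condition, which is not described here, and Python is nowhere near fast enough for this kind of protection. The specific rule (signal above a level and rising faster than a slope limit for several consecutive samples) and all parameter names are assumptions chosen for illustration.

```python
def should_trip(samples, level_limit, slope_limit, sample_period_s, consecutive=2):
    """Return the sample index at which protection would assert, or None.

    Trips only when the signal is above level_limit AND rising faster than
    slope_limit for `consecutive` samples in a row. A single noisy sample,
    or a high but steady signal, does not trip.
    """
    run_length = 0
    for i in range(1, len(samples)):
        slope = (samples[i] - samples[i - 1]) / sample_period_s
        if samples[i] > level_limit and slope > slope_limit:
            run_length += 1
            if run_length >= consecutive:
                return i
        else:
            run_length = 0  # condition broken, start counting again
    return None


# Hypothetical traces sampled every 1 ms.
fast_fault = [0.0, 0.1, 0.2, 1.2, 2.4, 3.6]
slow_ramp = [0.0, 0.5, 1.0, 1.5, 2.0]
high_but_flat = [2.0, 2.0, 2.0, 2.0]

trip_index = should_trip(fast_fault, level_limit=1.0, slope_limit=800.0, sample_period_s=1e-3)
```

With these made-up traces, only `fast_fault` trips; the slow ramp and the steady high signal both return `None`. Requiring consecutive samples is one simple way to avoid tripping on a single noise spike, at the cost of a few extra sample periods of latency.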

The Cyth engineering team helped the research group develop a proof-of-concept using NI CompactRIO and LabVIEW FPGA to assert a protection output in under 5 µs, roughly 30,000 times faster than the ~150 ms response of their existing solution. Also, these were not eyeballed metrics; they were thoroughly tested and quantified using a high-speed oscilloscope, thereby proving that the mechanism was performant and trustworthy enough to protect their equipment.

Benchtop proof of concept, including process control hardware, software, and performance validation instrumentation

What did this mean for the research team? Well, first off, they are plasma scientists, not control engineers, and this type of side project would have drained time and resources away from the core experiments they were undertaking. To put it simply, it freed them up to focus on the science. The right automated instrumentation setup doesn't just improve the measurement or process, it removes constraints limiting the efficiency and discovery potential of the research team.

How Cyth collaborates with research labs

When a research team engages with us, the conversation starts with the measurement or automation problem, where we can bring our experience and tooling to the table, so that you can take that next step in evolving your science. This is how our partnership methodology translates in practice:

Start with the problem and match the tools to the experiment. What are you actually trying to measure or automate? What are the timing requirements? What does "good data" look like for your project? The answers to those questions determine the instruments and system architecture, not the other way around. For example, a high-accuracy DC measurement and a fast, closed-loop control system need completely different hardware and software approaches.
Design for reproducibility from the outset. Factors like signal integrity, calibration, synchronization, and data processing routines need to be designed in and not bolted on. A measurement system that can't reliably reproduce results is a liability for your lab.
Leave the lab with something they own. We know that research groups, just like engineering teams in industry, are not static. Grad students graduate, postdocs rotate out. A system that only one person understands and can evolve is a liability. Hence the importance we place on documentation, training, and system maintainability.

Learn about the NI Laboratory Accelerator Program

Before your next experiment, three questions worth asking

Iteration is the engine of research. But iteration only works when the initial conditions are solid and the data feedback is trustworthy; otherwise you run the risk of getting bogged down solving the wrong problems or, even worse, diverging from where you need to go.

Do you have the right tools for the job, and does the team know how to use them well?
This obviously goes beyond having access to the hardware: is it configured correctly for the specific application, and is the team equipped to automate and optimize it?
Is this a quick & dirty proof of concept or does the system need to be bulletproof?
To be clear, both are legitimate in different scenarios. A rough PoC that informs your next research step or where to invest experimental resources can be incredibly valuable. The risk is when a PoC-quality setup unthinkingly becomes the full experimental setup that publication-critical results are based on.
What is the cost of automation and what is the cost of NOT automating?
Put another way, how reproducible is your setup, really? Manual processes and ad hoc software (dare I even say vibe-coded scripts held together with desperation and caffeine) are real obstacles to the professional-grade reproducibility required for peer-reviewed science. If you couldn't hand your measurement setup to someone else tomorrow and have them get the same results, that just might be a problem worth solving.

Working on a measurement or automation challenge?

We're happy to think through it with you, whether it's a quick sanity check, instrument recommendation, or full system design. For inspiration and context, check out our repo of past engineering projects.

Talk to an engineer
