A Technological Assessment of the Current State of Machine Vision in Industry
As more of our clients come to us looking for help solving complex problems that require machine vision as part of the solution, we find ourselves repeatedly pushing against the current limitations of the industry. Often this requires us to work with traditional suppliers to modify their current product line; sometimes it requires integrating our own hardware and software into client solutions. In this short series of articles, we look at: 1) how to evaluate current product solutions in terms of the real problems industry is trying to solve, 2) the current state of integrated product lines and how they perform in relation to the types of problems they can solve, and 3) the theoretical capabilities of readily available, albeit not integrated, technology components and the new problems that such a product could solve if it were available on the market.

The machine vision landscape has thus far been dominated by traditional industrial camera suppliers and their integrators. These solution providers fulfill the needs of many automated assembly lines, which use machine vision to perform guidance, gauging, inspection, and identification tasks. Due to the limitations of those solutions, human workers are still required to perform many tedious operations. Produce processing plants still employ human inspectors and sorters, metal foundries use highly skilled technicians to inspect castings and forgings, and agricultural growers rely on in-field crews for weeding and picking, all despite growing labor shortages across these industries. Why has machine vision not yet eased the labor requirements of these and other similar tasks?

Until recently, machine vision performed well only in real-time applications with consistent subjects, environments, and lighting – in short, traditional industrial applications. Basic techniques such as thresholding, blob detection, and pixel counting work exceptionally well for inspecting well-lit machined components on conveyor belts. For targets such as natural commodities (e.g. fruits and vegetables) or highly varied but predictable form factors (e.g. flash off complex molded or composite components), these vision algorithms are notoriously difficult to tune. The difficulties are exacerbated by the dynamic backgrounds and inconsistent lighting in environments such as farms and urban settings.
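To make the three basic techniques named above concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not how any particular vendor's camera implements them: the tiny grayscale frame, the fixed threshold of 128, and the function names are all hypothetical, and blob detection is done here with a simple 4-connected flood fill.

```python
from collections import deque


def threshold(image, level):
    """Binary mask: 1 where a pixel is brighter than `level`, else 0."""
    return [[1 if px > level else 0 for px in row] for row in image]


def count_pixels(mask):
    """Pixel counting: total number of foreground pixels in the mask."""
    return sum(sum(row) for row in mask)


def find_blobs(mask):
    """Blob detection: sizes of 4-connected foreground regions, in scan order."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    sizes = []
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] == 0 or seen[r][c]:
                continue
            # Breadth-first flood fill from this unvisited foreground pixel.
            queue = deque([(r, c)])
            seen[r][c] = True
            size = 0
            while queue:
                y, x = queue.popleft()
                size += 1
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] == 1 and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            sizes.append(size)
    return sizes


# Hypothetical 5x6 grayscale frame (0-255): bright parts on a dark belt.
frame = [
    [10,  12, 200, 210,  11,   9],
    [11, 205, 220, 215,  10,  12],
    [ 9,  10,  12,  11,  10, 190],
    [12,  11,  10,  13, 195, 200],
    [10,   9,  11,  12,  10,  11],
]

mask = threshold(frame, 128)      # assumed fixed threshold
print(count_pixels(mask))         # foreground pixel count
print(find_blobs(mask))           # size of each detected blob
```

The single hard-coded threshold is exactly what makes this approach fragile: it separates part from background only while lighting and background stay consistent, which is why such algorithms become hard to tune for natural commodities and outdoor scenes.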

Recent technological advances in complex image analysis -- particularly those using neural networks and GPU-based image processing -- have enabled solutions for real-time applications requiring human-like visual perception. These cutting-edge vision systems address many of the limitations of traditional techniques; however, implementing them requires expertise in data science. With the growth of machine vision technology to address such difficult challenges, the landscape of industrial machine vision is now more complex and crowded than ever.
Navigating the Machine Vision Landscape
The scattered landscape of available cameras, integrators, and data science consultants makes finding and selecting a viable solution incredibly difficult. By observing the problem space against the backdrop of two principal axes, it’s possible to see the general strengths and weaknesses of each offering with respect to an end-application. The two axes of this Machine Vision Landscape are problem difficulty and decision throughput.
Problem Difficulty

We often think of problem difficulty as being analogous to the training level of a human required to solve a problem or perform a task. Using this analogy, we can rate problems from “line worker” through “subject matter expert” (SME) to “PhD”. For example, the task of aligning a widget before it is conveyed into a cutting appliance might be a line worker-level task. On the other end of the spectrum, deciding whether or not a fruit tree is infected with a particular fungal disease based on photographs might require the expertise of a highly trained PhD specialist. Somewhere between those extremes, an SME might be tasked with the x-ray inspection of a complex aluminum casting for internal flaws.
Understanding the complexity of a vision problem is a critical first step in determining if machine vision is a viable solution. Current state-of-the-art vision systems can generally solve problems up to around the SME level for real-time applications. They are improving steadily and are beginning to address “PhD-level” observation tasks.
Decision Throughput
Decision throughput, together with the closely related measure of latency, defines how fast decisions need to be made. Predicting coastal erosion is at the long end of the time scale -- a system capable of solving this problem could take days or weeks to perform its analysis, yet still be viable. Inspection of assemblies for missing parts could be on the order of a few units per minute. Sorting of ripe/unripe fruit during mechanical harvesting requires decisions to be made tens or hundreds of times per second.

Understanding the throughput requirements of a problem further refines the machine vision solution space. It might be acceptable to wait a few minutes or an hour for an x-ray image to be processed for a diagnosis, but a robot sorter in a produce processing plant needs to make decisions multiple times a second to avoid being a bottleneck in the production line. In such real-time applications, milliseconds can mean the difference between a solution and a technology not worth talking about.
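The arithmetic behind that last point can be sketched in a few lines of Python. This is a minimal illustration, assuming a serial pipeline in which each decision must finish before the next begins; the stage names and millisecond figures are hypothetical, not measurements of any real system.

```python
def decision_budget_ms(decisions_per_second):
    """Time available per decision, in milliseconds, at a required rate."""
    if decisions_per_second <= 0:
        raise ValueError("decisions_per_second must be positive")
    return 1000.0 / decisions_per_second


def meets_budget(stage_times_ms, decisions_per_second):
    """True if the summed stage times fit inside the per-decision budget.

    Assumes stages run one after another with no pipelining or overlap.
    """
    return sum(stage_times_ms) <= decision_budget_ms(decisions_per_second)


# Hypothetical stage times in ms: image capture, analysis, actuation.
stages = [2.0, 6.0, 1.5]

print(decision_budget_ms(100))    # budget at 100 decisions per second
print(meets_budget(stages, 100))  # 9.5 ms of work against that budget
print(meets_budget(stages, 200))  # same work at double the required rate
```

Under these assumed numbers the same 9.5 ms of processing fits a 100-decisions-per-second line but not a 200-decisions-per-second one, which is the sense in which a few milliseconds decide whether a solution is viable.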
Choosing a Solution
Real problems require real hardware. With that in mind, any machine vision solution should start with only rugged, industrial-grade products. Because this restriction narrows the market, choosing the right solution requires making the right trade-offs.

For a given camera or processor, there is generally a direct trade-off between problem difficulty and decision throughput. More difficult decisions (especially those that require more information, such as multiple images or high resolution) require more time to make. Difficult problems can be solved faster by using a higher-performance camera, or a system that utilizes computer hardware in addition to the camera. Easy problems can be solved quickly on less expensive, smaller, more portable hardware. In some cases, application-specific optimizations – such as novel software algorithms – can enable less expensive hardware to perform above its class.
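One way to reason about this trade-off is to screen candidate hardware against the required decision rate and pick the least expensive option that keeps up. The Python sketch below is only an illustration: the three options, their relative costs, and their per-decision times are invented placeholder numbers, not benchmarks of real products.

```python
from dataclasses import dataclass


@dataclass
class Option:
    name: str
    relative_cost: float    # hypothetical cost, in arbitrary units
    ms_per_decision: dict   # difficulty label -> assumed ms per decision


# Placeholder figures chosen only to show the shape of the trade-off.
OPTIONS = [
    Option("smart camera", 1.0, {"easy": 5.0, "hard": 400.0}),
    Option("camera + industrial PC", 3.0, {"easy": 2.0, "hard": 60.0}),
    Option("camera + GPU computer", 6.0, {"easy": 1.0, "hard": 8.0}),
]


def cheapest_viable(options, difficulty, decisions_per_second):
    """Least expensive option fast enough for the required rate, else None."""
    budget_ms = 1000.0 / decisions_per_second
    viable = [o for o in options if o.ms_per_decision[difficulty] <= budget_ms]
    return min(viable, key=lambda o: o.relative_cost, default=None)


for difficulty, rate in [("easy", 100), ("hard", 10), ("hard", 100)]:
    choice = cheapest_viable(OPTIONS, difficulty, rate)
    print(difficulty, rate, choice.name if choice else "no viable option")
```

With these assumed numbers, an easy problem at 100 decisions per second fits on the least expensive option, while a hard problem at the same rate only fits on the most expensive one. An application-specific optimization would show up here as a lower `ms_per_decision` entry for a cheaper option.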
In our next article, we'll assess commercially available products at the forefront of the machine vision space against the backdrop of problem difficulty and decision throughput. We will also analyze the current state-of-the-art component sets and determine whether the technology is ready for integration into industrial applications. By looking at technology in these two ways, we can see not just which problems are being solved, but which problems technology would be capable of solving if the right product existed.
Motivo Engineering is an innovation engineering firm headquartered in California, USA, that has executed numerous projects in mobility, aerospace, and AgTech. Motivo’s unique innovation framework and phased product development approach have reduced the risk in transformative product development for audacious visionaries. Motivo projects range from innovation and intellectual property development to low-volume manufacturing of these transformative products. Several Motivo clients are now leveraging these technology solutions for additional future revenue through licensing or by selling these unique products.