Why Robots Need to See
Updated: Apr 21, 2022
In April 2019, Elon Musk famously told attendees at Tesla’s Autonomy Day that LiDAR is a “fool’s errand” and that anyone relying on it is “doomed.” Today, the debate continues.
Recently, the same debate has emerged in the mobile robot market, where traditional 2D LiDARs have been the prevailing navigation sensor for decades. Autonomous mobile robot (AMR) manufacturers including Canvas Technology (acquired by Amazon) and Seegrid have developed AMRs with varying degrees of vision-based navigation. The trend towards vision is being driven by the need for:
3D visual perception,
Increased robustness, and
Lower cost.
Ultimately, to achieve truly intelligent autonomous behavior, navigation systems need to deliver human-level, 3D visual perception. For example, because they can detect texture and color, cameras are able to distinguish between the edge of a surface and a line. This can create significant safety advantages for many types of robots because the robot can use this visual information to precisely navigate along a marked path, just the way a human would. This capability is useful in warehouses and manufacturing facilities where pedestrian paths are often defined with lines and floor markings.
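As a rough illustration of following a marked path, the sketch below finds a bright floor line in a tiny grayscale frame and reports how far it sits from the image centre. It is a toy under stated assumptions, not any vendor's method: the function name `line_offset`, the brightness threshold, and the sample frame are all hypothetical, and a real system would work on calibrated color images.

```python
def line_offset(image, threshold=200):
    """Return the mean horizontal offset (in pixels) of bright "line" pixels
    from the image centre, or None if no pixel reaches the threshold.

    `image` is a list of rows of grayscale values (0-255). The threshold is
    an assumed value chosen for this toy frame.
    """
    width = len(image[0])
    centre = (width - 1) / 2
    # Collect the column index of every pixel bright enough to be floor marking.
    cols = [c for row in image for c, v in enumerate(row) if v >= threshold]
    if not cols:
        return None
    return sum(cols) / len(cols) - centre


# Hypothetical 3x5 frame: a painted line drifting to the right of centre.
frame = [
    [30, 30, 30, 255, 30],
    [30, 30, 30, 255, 30],
    [30, 30, 255, 30, 30],
]
offset = line_offset(frame)  # positive: line is right of centre
```

A positive offset would tell a controller to steer right to re-centre on the line. A 2D LiDAR has no equivalent signal here, because paint on a flat floor does not change the measured range.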
Camera-based systems can read signs and symbols that alert both humans and robots to temporary closures, wet floors, and detours. Using object recognition techniques, they can find, validate, and precisely engage pallets and other loads. Vision-based navigation systems can also work in both indoor and outdoor environments – opening up new use cases and applications. These are just a few examples of the benefits of true 3D perception.
Another advantage of vision-based navigation is the ability to handle challenging environments where 2D LiDARs lose robustness. The classic example is a logistics warehouse where rows of racks and shelving systems are repeated throughout the facility. Cameras can see natural features on the ceiling, floor, and far into the distance on the other side of the facility. But the 2D ‘slice’ of the world that a LiDAR can see is simply not enough to distinguish between the different, repetitive features in these environments. As a result, LiDAR-based robots can get confused or even completely lost. These same challenges also apply to open and highly dynamic environments like cross-docking and open warehousing facilities. The ‘slice’ that a LiDAR-based robot saw and interpreted during its last visit may now be open space – or something else altogether.
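The ambiguity described above can be sketched with a toy one-dimensional model. Everything here is an assumption made for illustration (the aisle pitch, the landmark names and positions, and all function names); it is not a real localization algorithm. The point is only that a scan signature which repeats every aisle yields several equally good position candidates, while unique ceiling features seen by a camera narrow them down to one.

```python
AISLE_PITCH = 3.0   # metres between identical rack rows (hypothetical)
N_AISLES = 5        # number of repeated aisles in the toy map

# Hypothetical unique ceiling features and their positions along the corridor.
CEILING_LANDMARKS = {"skylight": 1.0, "vent": 7.5, "sign": 13.0}


def lidar_signature(x):
    """Toy 2D scan: distances to the rack uprights behind and ahead.
    It depends only on the position within one aisle period."""
    offset = x % AISLE_PITCH
    return (round(offset, 3), round(AISLE_PITCH - offset, 3))


def camera_observation(x, view_range=4.0):
    """Toy camera: signed offsets to the ceiling landmarks within view."""
    return {name: round(pos - x, 3)
            for name, pos in CEILING_LANDMARKS.items()
            if abs(pos - x) <= view_range}


def candidates_from_lidar(signature, step=0.5):
    """All grid positions in the map whose scan signature matches."""
    n = int(N_AISLES * AISLE_PITCH / step)
    return [i * step for i in range(n) if lidar_signature(i * step) == signature]


def disambiguate(candidates, observation):
    """Keep only candidates whose predicted camera view matches."""
    return [c for c in candidates if camera_observation(c) == observation]


true_x = 7.0
cands = candidates_from_lidar(lidar_signature(true_x))   # one match per aisle
resolved = disambiguate(cands, camera_observation(true_x))
```

In this model the scan alone leaves one candidate per aisle, and the camera view leaves only the true position. Real systems face noise and partial views, so the filtering would be probabilistic rather than an exact match.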
In the autonomous vehicle (AV) market, the importance of safety and an overall higher cost structure make it feasible for manufacturers to incorporate high-end 3D LiDARs, along with cameras and additional sensors. Although the costs have come down over the past few years, the total system cost for perception continues to be many thousands of dollars.
Those costs are prohibitive in the robotics space. System costs of only hundreds of dollars are supportable in logistics and manufacturing, and in service and consumer robotics that number drops to tens of dollars and lower. These cost restrictions have robot manufacturers seeking less expensive alternatives to 3D LiDAR. Camera-based vision systems are inherently up to the challenge since they can ‘see’ and digitize everything in their field of view. Leveraging economies-of-scale from other industries, even cameras costing under $20 provide enough resolution and field-of-view to support robust localization, obstacle detection, and even higher levels of perception.
However, converting the large volume of data from cameras into 3D artificial perception on low-cost hardware is a monumental technical challenge that requires a unique combination of AI, computer vision, and sensor fusion. RGo’s Perception Engine allows mobile robot manufacturers to rapidly productize these new and powerful capabilities today.
With all of this said, there remains significant value in traditional sensor modalities including LiDAR. Recent advancements in low-cost MEMS 3D LiDARs are encouraging and, when combined with cameras, could add cost-effective robustness and rich 3D mapping capabilities.
But Elon was right in saying that cameras and computer vision should serve as the foundation of any mobile robot navigation system. The next few years will certainly see dynamic changes as the state-of-the-art evolves with advances in both the AV and robotics industries.