Robots won’t be truly autonomous until they can cope in everyday environments, where unexpected things happen.
Your autonomous car is hurtling down a country road when a tractor pulls out of a farm gate in front of you. Are you sure your car has seen it? How confident are you it will know what to do to avoid a collision?
Our lives will depend on robots knowing not only where they are but also how to react to what they see around them. It is, perhaps, the biggest challenge in the whole field of robotic vision.
A whole new world opens up to challenge a robot from the minute it steps out of the lab or off the factory floor, and it’s a world without the certainties that a fixed place on an assembly line provides.
Before it can exercise any autonomy, a robot must first know exactly where it is in a complex and changing environment.
How it achieves that is just one of the key challenges being addressed at the Australian Centre for Robotic Vision. And while many systems of orientation are being trialled elsewhere, the Centre’s approach is to use vision.
“Part of the compelling reason for using cameras on robots – for tasks like autonomous driving, for instance – is because humans use sight,” says Ian Reid, professor of computer science, the Centre’s deputy director and one of its leading researchers.
“We have engineered the world to take advantage of that sensing capability,” he says.
Reid is discussing the advantages of using vision-based sensing systems for robot navigation in preference to set-ups involving laser-based light detection and ranging (LIDAR).
Cameras and LIDARs deliver very different types of data into the front end of robotic systems. Both have their uses, but robots receiving images through cameras use the same type of positioning and reference data as their human designers. Essentially, they are ‘seeing’ the world as people do.
“The camera will tell you about really useful photonic information out there in the world,” Reid says. “For instance, a LIDAR would struggle to read texture or writing on an object. It’s possible, but it’s not really how most people use it and it’s certainly not what LIDAR is designed for – it’s designed for making 3D measurements in the world.
“A camera tells you indirectly about 3D measurements in the world, but it also tells you directly about writing, lettering, texture, and colour. These are all very useful bits of information that tell you not just about the geometry of an object but also about its role in a scene.”
Static floor-based robots of the type typically seen in factory production lines do not need sophisticated location and mapping capabilities. They are bolted to the ground, usually inside a cubicle. Essentially, neither their position nor environment ever changes.
As soon as robots are let loose to navigate around the factory, mapping and navigation become much more complex. A factory, however, is still essentially a closed box, albeit one with unpredictable elements such as people and things moving within it.
Mapping and navigating a truly open system – the congested road network of a major city, for instance – represents another order of magnitude of complexity again.
But both scenarios present the same baseline problem for developers.
“How do you best deploy your limited computational and communications resources so you do the best job?” Reid asks. “There’s a lot of computing power required by some of these things. One approach is to be very careful with your algorithms, and to try to develop smarter algorithms.
“Because what you’ve got in any vision problem is a huge amount of very redundant data.”
It’s a situation familiar to one of Reid’s colleagues, Vincent Lui, a research fellow with the ARC Centre of Excellence in Robotic Vision.
Lui and his supervisor, Centre Chief Investigator Tom Drummond, seek to refine and improve simultaneous localisation and mapping (SLAM). This is the computational process whereby a robot must construct a map of an unknown environment while at the same time keeping track of its position within it.
Mapping large areas, in which both the robot and the environment are moving, means dealing with an ever-increasing number of parameters, Lui says. The information coming in through the robot’s sensors is typically fuzzy and noisy, because even closed worlds are dirty and loud places. This requires ever more detailed corrections if the resulting internal map is to be accurate.
“That means that the costs associated with doing the optimisations become ever higher,” Lui says. “For a robot, having to be able to work in real time, such that it can respond in a short and sensible amount of time, is just critical. And so the real challenge is how do you keep all of these things computationally efficient? It’s not just about responsiveness, it’s also critical when thinking about power consumption.
“A robot that is more efficient in using these algorithms means that it can operate on lower power. That means you can operate using a mobile phone, or a drone, and so forth.”
One of the most promising ways to reduce the energetic and economic costs associated with multiple and repeated corrections is to have the robot make fewer of them.
To this end, Lui and his colleagues have been working on a refinement of the SLAM concept, dubbed MO-SLAM, or multi-object SLAM.
In standard SLAM, if a robot-mapping sensor encounters something it’s seen before but in a different place – a second chair, for instance, in a set of six identical ones – it will treat it as an entirely new entity, and laboriously map it.
Building up memory banks full of the same model, made afresh every time, clearly takes up time, energy and storage space.
With MO-SLAM, the system is geared to recognise duplicates without repeating the recording process. It also uses them to supply more usable data for its operational map, without having to deal with the large amounts of redundant noise that accompany the sensor inputs.
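To make the storage saving concrete, here is a toy sketch in Python. It is not the Centre’s MO-SLAM implementation: the class names, the point counts and the assumption that each object arrives already recognised (with a label such as "chair") are all hypothetical, chosen only to show why keeping one shared model plus a position per copy is cheaper than mapping every copy from scratch.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

Point = Tuple[float, float, float]


@dataclass
class NaiveMap:
    """Hypothetical map that stores a full point model for every object seen."""
    objects: List[List[Point]] = field(default_factory=list)

    def add(self, points: List[Point]) -> None:
        # Every sighting is treated as a brand-new entity and stored in full.
        self.objects.append(list(points))

    def stored_points(self) -> int:
        return sum(len(obj) for obj in self.objects)


@dataclass
class SharedModelMap:
    """Hypothetical map that keeps one model per object type, plus a position per copy."""
    models: Dict[str, List[Point]] = field(default_factory=dict)
    instances: List[Tuple[str, Point]] = field(default_factory=list)

    def add(self, label: str, points: List[Point], position: Point) -> None:
        # Assumption: recognition has already happened and produced `label`.
        if label not in self.models:
            self.models[label] = list(points)  # store the shape only once
        self.instances.append((label, position))  # each copy costs one position

    def stored_points(self) -> int:
        return sum(len(model) for model in self.models.values())


def build_demo_maps(n_chairs: int = 6, points_per_chair: int = 500):
    """Map the same made-up chair several times with both strategies."""
    chair = [(float(i), 0.0, 0.0) for i in range(points_per_chair)]
    naive, shared = NaiveMap(), SharedModelMap()
    for k in range(n_chairs):
        naive.add(chair)
        shared.add("chair", chair, (float(k), 0.0, 0.0))
    return naive, shared


if __name__ == "__main__":
    naive, shared = build_demo_maps()
    print("naive map stores", naive.stored_points(), "points")
    print("shared map stores", shared.stored_points(), "points for",
          len(shared.instances), "chairs")
```

With six identical chairs of 500 points each, the naive map holds 3,000 points while the shared map holds 500 points and six positions. A real system would also have to do the hard part this sketch skips: recognising that two noisy views show the same kind of object.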
The inside of a robot’s brain is a very frantic place, with its sensors delivering far more data than it needs to perform its functions.
The challenge for researchers is to find ways to do ever more with ever less information and power.
And in that challenge lies perhaps the strongest reason for using cameras in robotic systems instead of other, more complex mechanics.
“A camera can be incredibly small, but provides a huge amount of data just from the light that enters the camera’s lens, whereas a LIDAR needs a substantial amount of power before it can project its laser and collect data.”