What do popular ‘human’ games like Jenga and Pick Up Sticks have in common with training a robot to grasp and manipulate objects in the real world?
The answer comes in a ground-breaking Australian Centre for Robotic Vision project that has left other global research standing still in the complex task of visual grasp detection in real-world clutter.
“The idea behind it is actually quite simple,” says PhD Researcher Doug Morrison, who, in 2018, turned heads by creating the open-source Generative Grasping Convolutional Neural Network (GG-CNN), which enables robots to grasp moving objects in cluttered spaces more accurately and quickly.
“Our aim at the Centre is to create truly useful robots able to see and understand like humans. So, in this project, instead of a robot looking and thinking about how best to grasp objects from clutter while at a standstill, we decided to help it move and think at the same time.
“A good analogy is how we ‘humans’ play games like Jenga or Pick Up Sticks. We don’t sit still, stare, think, and then close our eyes and blindly grasp at objects to win a game. We move and crane our heads around, looking for the easiest target to pick up from a pile.”
As outlined in a research paper presented at the 2019 International Conference on Robotics and Automation in Montreal, the project’s ‘active perception’ approach is the first in the world to focus on real-time grasping by stepping away from a static camera position or fixed data-collection routines.
It is also unique in the way it builds up a ‘map’ of grasps in a pile of objects, which continually updates as the robot moves. This real-time mapping predicts the quality and pose of grasps at every pixel in a depth image, all at a speed fast enough for closed-loop control at up to 30 Hz.
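To make the idea of a per-pixel grasp ‘map’ concrete, here is a minimal sketch in Python. It assumes the network's output is already available as three same-sized grids (grasp quality, gripper angle and gripper width, one value per pixel) and simply picks the best-scoring pixel. The names `Grasp` and `best_grasp` and the grid layout are illustrative assumptions, not the project's actual code.

```python
from typing import List, NamedTuple

Grid = List[List[float]]  # one value per pixel of the depth image


class Grasp(NamedTuple):
    """A hypothetical grasp candidate read off the per-pixel maps."""
    row: int
    col: int
    quality: float  # predicted chance of success, 0.0 to 1.0
    angle: float    # gripper rotation in radians
    width: float    # gripper opening in pixels


def best_grasp(quality: Grid, angle: Grid, width: Grid) -> Grasp:
    """Return the grasp at the pixel with the highest predicted quality."""
    best = None
    for r, row in enumerate(quality):
        for c, q in enumerate(row):
            if best is None or q > best.quality:
                best = Grasp(r, c, q, angle[r][c], width[r][c])
    if best is None:
        raise ValueError("empty grasp map")
    return best


# Toy 2x2 example: the bottom-left pixel has the highest quality.
quality_map = [[0.1, 0.4], [0.9, 0.2]]
angle_map = [[0.0, 0.5], [1.2, -0.3]]
width_map = [[20.0, 35.0], [28.0, 40.0]]
target = best_grasp(quality_map, angle_map, width_map)
```

In a closed-loop setting, a selection step like this would be repeated on each new depth frame, so the chosen target can change while the arm is still moving.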
“The beauty of our active perception approach is that it’s smarter and faster than static, single viewpoint grasp detection methods, thanks to our GG-CNN, which is 10 times faster than other systems,” Doug said.
“We strip out lost time by making the act of reaching towards an object a meaningful part of the grasping pipeline rather than just a mechanical necessity.
“Like humans, this allows the robot to change its mind on the go in order to select the best object to grasp and remove from a messy pile of others.”
Doug has tested and validated his active perception approach at the Centre’s QUT-based Lab in ‘tidy-up’ trials using a robotic arm to remove 20 objects, one at a time, from a pile of clutter. His approach achieved an 80 per cent success rate when grasping in clutter, up 12 per cent on traditional single viewpoint grasp detection methods.
The genius comes in his development of a Multi-View Picking (MVP) controller, which selects multiple informative viewpoints for an eye-in-hand camera while reaching to a grasp, revealing high-quality grasps hidden from a static viewpoint.
“Our approach directly uses entropy in the grasp pose estimation to influence control, which means that by looking at a pile of objects from multiple viewpoints on the move, a robot is able to reduce uncertainty caused by clutter and occlusions.
“It also feeds into safety and efficiency by enabling a robot to know what it can and can’t grasp effectively. This is important in the real world, particularly if items are breakable, like glass or china tableware messily stacked in a washing-up tray with other household items.”
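As a rough illustration of how uncertainty can steer where a camera looks next, the sketch below scores candidate viewpoints by the total entropy of the grasp estimates they would observe. It is a simplified stand-in, not the MVP controller itself. It assumes each map cell stores a list of grasp-quality observations, treats their average as a success probability, and treats unseen cells as maximally uncertain. All names here (`cell_entropy`, `next_viewpoint`, the viewpoint labels) are hypothetical.

```python
import math
from typing import Dict, List, Tuple

Cell = Tuple[int, int]


def binary_entropy(p: float) -> float:
    """Entropy in bits of a success/failure estimate with probability p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))


def cell_entropy(observations: List[float]) -> float:
    """Uncertainty of one cell; a cell never observed counts as 1 bit."""
    if not observations:
        return 1.0
    return binary_entropy(sum(observations) / len(observations))


def next_viewpoint(grasp_map: Dict[Cell, List[float]],
                   visible: Dict[str, List[Cell]]) -> str:
    """Pick the viewpoint whose visible cells carry the most uncertainty."""
    def total_entropy(cells: List[Cell]) -> float:
        return sum(cell_entropy(grasp_map.get(c, [])) for c in cells)
    return max(visible, key=lambda name: total_entropy(visible[name]))


# Toy example: the "right" view covers an ambiguous cell and an unseen one.
grasp_map = {(0, 0): [0.95, 0.9], (0, 1): [0.5, 0.55], (1, 1): []}
visible = {"left": [(0, 0)], "right": [(0, 1), (1, 1)]}
choice = next_viewpoint(grasp_map, visible)
```

Cells the robot is already confident about contribute little to the score, so a rule like this favours views of occluded or ambiguous parts of the pile. A practical controller would also weigh such information gain against the cost of moving the arm.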
Doug’s next step, as part of the Centre’s ‘Grasping with Intent’ project funded by a US$70,000 Amazon Research Award, moves from safe and effective grasping into the realm of meaningful vision-guided manipulation.
“In other words, we want a robot to not only grasp an object, but do something with it; basically, to usefully perform a task in the real world,” Doug said.
“Take, for example, setting a table, stacking a dishwasher or safely placing items on a shelf without them rolling or falling off.”
Doug also has his sights set on fast-tracking how a robot actually learns to grasp physical objects.
Instead of using ‘human’ household items, he wants to create a truly challenging training data set of weird, adversarial shapes.
“It’s funny because some of the objects we’re looking to develop in simulation could better belong in a futuristic science fiction movie or alien world – and definitely not anything humans would use on planet Earth!”
There is, however, method in this scientific madness. As Doug explains, training robots to grasp using ‘human’ items is neither efficient nor beneficial.
“At first glance a stack of human household items might look like a diverse data set, but most are pretty much the same. For example, cups, jugs, flashlights and many other objects all have handles, which are grasped in the same way and do not demonstrate difference or diversity in a data set.
“We’re exploring how to put evolutionary algorithms to work to create new, weird, diverse and different shapes that can be tested in simulation and also 3D printed.
“A robot won’t get smarter by learning to grasp similar shapes. A crazy, out-of-this-world data set of shapes will enable robots to quickly and efficiently grasp anything they encounter in the real world.”
Did you know? Researchers from the Australian Centre for Robotic Vision are this week (4-8 November 2019) leading workshops at the International Conference on Intelligent Robots and Systems (IROS 2019) in Macau, including one focused on autonomous object manipulation. Centre Research Fellow Jürgen ‘Juxi’ Leitner is a speaker in that workshop, focusing on grasping with intent, and is the organiser of a separate manipulation workshop about bridging the gap between the research community and industry.
In another hot topic on the conference program, Centre researchers will delve into why robots, like humans, can suffer from overconfidence in a workshop on the importance of uncertainty for deep learning in robotics, as previewed in this Xinhua News article. Read more about the Centre’s creation of a world-first Robotic Vision Challenge to help robots sidestep the pitfalls of overconfidence.
Shelley Thomas, Communications Specialist
Australian Centre for Robotic Vision
P: +61 7 3138 4265 | M: +61 416 377 444 | E: firstname.lastname@example.org
About The Australian Centre for Robotic Vision
The Australian Centre for Robotic Vision is an ARC Centre of Excellence, funded for $25.6 million over seven years to form the largest collaborative group of its kind. It generates internationally impactful science and new technologies that will transform important Australian industries and provide solutions to some of the hard challenges facing Australia and the globe. Formed in 2014, the Centre is the world’s first research centre specialising in robotic vision. Its researchers are on a mission to develop new robotic vision technologies that expand the capabilities of robots, giving robots the ability to see and understand for the sustainable well-being of people and the environments we live in. The Centre has assembled an interdisciplinary research team from four leading Australian research universities: QUT, The University of Adelaide (UoA), The Australian National University (ANU), and Monash University, as well as CSIRO’s Data61 and overseas universities and research organisations including the French national research institute for digital sciences (INRIA), Georgia Institute of Technology, Imperial College London, the Swiss Federal Institute of Technology Zurich (ETH Zurich), University of Toronto, and the University of Oxford.
Australian Centre for Robotic Vision
2 George Street Brisbane, 4001
+61 7 3138 7549