A human child is able to reliably grasp objects after one year, and takes around four years to acquire more sophisticated precision grasps. However, networked robots can instantaneously share their experience with one another, so if we dedicate 14 separate robots to the job of learning grasping in parallel, we can acquire the necessary experience much faster. Research scientists at Google are working on implementing this concept.

While initially the grasps are executed at
random and succeed only rarely, each day the latest experiences are used to train a deep convolutional neural network (CNN) to predict the outcome of a grasp, given a camera image and a potential motor command.
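
To make the prediction step concrete, here is a minimal sketch of such a network in PyTorch. The architecture, layer sizes, and the assumption of a 7-dimensional motor command are illustrative choices, not the model actually used in this work; the key idea is simply that the network scores an (image, command) pair with a probability of grasp success.

```python
import torch
import torch.nn as nn

class GraspPredictionCNN(nn.Module):
    """Scores a candidate motor command against the current camera image,
    returning the predicted probability of a successful grasp.
    (Hypothetical architecture; layer sizes are illustrative.)"""

    def __init__(self, command_dim=7):
        super().__init__()
        # Convolutional trunk: extract features from the RGB camera image.
        self.conv = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=5, stride=2), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=5, stride=2), nn.ReLU(),
            nn.Conv2d(64, 64, kernel_size=3, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),  # -> (batch, 64, 1, 1)
        )
        # Head: combine image features with the motor command and score them.
        self.head = nn.Sequential(
            nn.Linear(64 + command_dim, 64), nn.ReLU(),
            nn.Linear(64, 1),
        )

    def forward(self, image, command):
        feats = self.conv(image).flatten(1)            # (batch, 64)
        x = torch.cat([feats, command], dim=1)         # append motor command
        return torch.sigmoid(self.head(x)).squeeze(1)  # success probability
```

Training then reduces to binary classification: each day's grasp attempts supply (image, command, succeeded-or-not) triples, and the network is fit with a standard cross-entropy loss.
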
This CNN is then deployed on the robots the following day, in the inner loop of a servoing mechanism that continually adjusts the robot’s motion to maximize the predicted chance of a successful grasp. In essence, the robot is constantly predicting, by observing the motion of its own hand, which kind of subsequent motion will maximize its chances of success. The result is continuous feedback: what we might call hand-eye coordination.
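
The servoing inner loop can likewise be sketched in a few lines. A minimal version, reusing the hypothetical GraspPredictionCNN above, samples random candidate motor commands, scores each against the current camera image, and returns the highest-scoring one; a more faithful implementation would refine the candidates with an optimizer such as the cross-entropy method rather than a single round of random sampling.

```python
import torch

def servo_step(model, image, num_candidates=64, command_dim=7):
    """One iteration of the servoing loop: propose candidate motor
    commands and return the one the CNN rates most likely to succeed.
    (Sketch; commands are assumed normalized to [-1, 1].)"""
    with torch.no_grad():
        candidates = torch.rand(num_candidates, command_dim) * 2 - 1
        # Score every candidate against the same camera image.
        images = image.unsqueeze(0).expand(num_candidates, -1, -1, -1)
        scores = model(images, candidates)
    best = scores.argmax()
    return candidates[best], scores[best]

```

Run once per control cycle, this yields the feedback loop described above: the robot briefly executes the chosen command, observes the new image of its hand, and repeats.
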

Neural networks have made great strides in
allowing us to build computer programs that can process images, speech, text, and even
draw pictures. However, introducing actions and control adds
considerable new challenges, since every decision the network makes will affect what it sees
next. Overcoming these challenges will bring us
closer to building systems that understand the effects of their actions in the world. If we can bring the power of large-scale machine
learning to robotic control, perhaps we will come one step closer to solving fundamental
problems in robotics and automation.
