"Self discovery enables robot social cognition: Are you my teacher?" was written by Kaipa, Bongard, and Meltzoff of the Universities of Vermont and Washington. It can be found for free online here.
Problem/Background
As they say, imitation is the sincerest form of flattery. For a robot, however, the question is who to imitate. Breazeal/Scassellati (2002) and Dautenhahn/Nehaniv (2002) have studied how robots can learn by watching, but the question of who to imitate is still unsolved. In 1997, Schaal showed a robot 30 seconds of pole-balancing video, which increased its learning rate. This was one of the earlier experiments in the field, and more work has been done each year since. The paper does a great job of summarizing it.
The goal in this experiment is to have a mobile robot:
- Figure out what joints it has, and how it is able to move
- Figure out what other robot is similar enough to make a good teacher
- Figure out what individual actions imitate the teacher
- Actually perform said imitation
There has been prior work done in self-discovery. The robot is told about the parts that make it up: how much each joint can move, and the shape, size, and density of its materials. The robot discovers possible position combinations through the technique used in "Resilient Machines Through Continuous Self-Modeling" (Bongard/Zykov/Lipson (2006)).
A hill climbing algorithm is used to search the space of self models after all of the min/max parameter values have been found. Each self model is encoded as a single genome. The genome is mutated (a Gaussian shift in a random direction) and the resulting model is evaluated for error. Lower-error models are kept, while higher-error models are thrown away.
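The mutate-and-keep-the-better loop described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' code: the genome here is a plain list of numbers, `model_error` is a made-up stand-in for the real self-model error, and the names `hill_climb`, `sigma`, `lower`, and `upper` are hypothetical.

```python
import random

def hill_climb(error_fn, genome, lower, upper, sigma=0.1, iterations=2000, seed=0):
    """Minimal hill climber: mutate one genome, keep the child only if its error is lower."""
    rng = random.Random(seed)
    best = list(genome)
    best_err = error_fn(best)
    for _ in range(iterations):
        # Gaussian shift on every parameter, clamped to the known min/max values.
        child = [
            min(hi, max(lo, g + rng.gauss(0.0, sigma)))
            for g, lo, hi in zip(best, lower, upper)
        ]
        child_err = error_fn(child)
        if child_err < best_err:
            # Lower-error model is kept; otherwise the child is thrown away.
            best, best_err = child, child_err
    return best, best_err

# Toy stand-in for the self-model error (an assumption for this sketch):
# squared distance between the genome and two hypothetical "true" parameters.
TRUE_PARAMS = [0.30, 0.45]

def model_error(genome):
    return sum((g - t) ** 2 for g, t in zip(genome, TRUE_PARAMS))

start = [0.5, 0.5]
best, err = hill_climb(model_error, start, lower=[0.0, 0.0], upper=[1.0, 1.0])
print(best, err)
```

In the real system the error would come from comparing the model's predictions against what the robot actually senses, which is far more expensive to evaluate than this toy function.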
Left and right cameras were used to capture data from a teacher robot, following the process that Kaipa/Bongard/Meltzoff used in "Combined Structure and Motion Extraction from Visual Data Using Evolutionary Active Learning" (2009).
A 3x2 neural network is used to control the output of the student motors. The output of the teacher is mapped to a hypothetical output of the student, and the network weights are trained. The question of which teacher nodes map to which student nodes is solved during the error reduction step. It is worth noting that, because of the error reduction model, the student never performs the exact maneuver that the teacher does.
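To make the controller idea concrete, here is a rough sketch of a small network whose weights are tuned to reduce imitation error. Several things are assumptions made for illustration and do not come from the paper: "3x2" is read as three inputs feeding two motor outputs, the inputs are a sine, a cosine, and a bias term, the target trajectory is invented, and the weights are trained with the same kind of hill climber the post mentions elsewhere. The teacher-to-student node mapping is not modeled here.

```python
import math
import random

def forward(weights, inputs):
    """Two tanh output units, each with three input weights (assumed layout)."""
    return [math.tanh(sum(w * x for w, x in zip(row, inputs))) for row in weights]

def imitation_error(weights, samples):
    """Mean squared difference between student outputs and mapped teacher targets."""
    total = 0.0
    for inputs, targets in samples:
        outputs = forward(weights, inputs)
        total += sum((o - t) ** 2 for o, t in zip(outputs, targets))
    return total / len(samples)

def train(samples, sigma=0.2, iterations=3000, seed=1):
    """Hill-climb the six weights: keep a mutated copy only if its error is lower."""
    rng = random.Random(seed)
    weights = [[0.0] * 3 for _ in range(2)]
    best_err = imitation_error(weights, samples)
    for _ in range(iterations):
        candidate = [[w + rng.gauss(0.0, sigma) for w in row] for row in weights]
        err = imitation_error(candidate, samples)
        if err < best_err:
            weights, best_err = candidate, err
    return weights, best_err

# Invented training data: one cycle of a two-joint motion, standing in for
# teacher motion already mapped into the student's joint space.
samples = []
for step in range(20):
    phase = 2 * math.pi * step / 20
    inputs = [math.sin(phase), math.cos(phase), 1.0]
    targets = [0.5 * math.sin(phase), 0.5 * math.cos(phase)]
    samples.append((inputs, targets))

initial_err = imitation_error([[0.0] * 3 for _ in range(2)], samples)
weights, final_err = train(samples)
print(initial_err, final_err)
```

The search only drives the error down rather than to zero, which fits the point above that the student approximates the teacher's maneuver instead of reproducing it exactly.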
Future Work (and why you care)
Very simple models were used here, but many of them were chained in series to perform the task. Whenever a choice of optimization method came up, hill climbing was used. The joints were simple, and there were only two of them. The cameras, the teacher, and the rest of the setup were simple as well. Even so, the conclusion is a powerful one: a robot that doesn't even know its own body may be able to learn from another robot or a human teacher.