Machine Learning for Robot Locomotion: Grounded Learning and Adaptive Parameter Learning

Dr. Peter Stone

Robot locomotion remains a challenging problem despite the advances in the field. To gain more insight into the topic, a talk on “Machine Learning for Robot Locomotion: Grounded Learning and Adaptive Parameter Learning” was organized as part of the fourth latent colloquium on 30th September 2021. The talk was delivered by Prof. Peter Stone, who is the David Bruton, Jr. Centennial Professor and Associate Chair of Computer Science, as well as Director of Texas Robotics, at the University of Texas at Austin. Dr. Stone began by noting that his research lab works in areas such as autonomous agents, multiagent systems, robotics and reinforcement learning, and that it pursues both fundamental and applied problems around a central research question: to what degree can autonomous intelligent agents learn in the presence of teammates and/or adversaries in real time? Next, he spoke about RoboCup Soccer, in which robots are trained to play soccer with the long-term aim of beating the human World Cup champions by the year 2050. He showed several videos of how the robots have fared at playing soccer over the years and discussed the various issues in robot locomotion that are being worked on. He explained that applying reinforcement learning directly on physical robots to improve their performance poses various challenges, which makes reinforcement learning in simulation tempting: it allows thousands of trials in parallel, requires no supervision, performs automatic resets and does not risk breaking robots. However, he pointed out that the major issue here is that policies learned in simulation often fail in the real world.
He said that this problem can be addressed by two approaches: one is to inject noise from the real world into the simulated environment, and the other is to take data from the real world and use it to make the simulator more realistic. He then talked about the Grounded Simulation Learning method, developed by his research group, which follows the latter approach. Describing how the method works, he said that it first executes the current policy in the real world, gathers real-world state-action trajectories from those executions, then performs simulator grounding to bring the simulator closer to the real world, and finally uses the grounded simulator to improve the policy with standard learning methods in simulation. He added that robots trained in this way were tested by the team in different settings, and the trained robot was found to be the fastest stable walking robot.
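The grounding loop described above can be sketched on a toy problem. Everything here is a hypothetical stand-in, not Prof. Stone's actual system: a 1-D "real robot" with an unknown gain, a parametric simulator whose gain is fitted to the recorded trajectory (the grounding step), and a one-parameter policy improved in the grounded simulator.

```python
import numpy as np

REAL_GAIN = 0.8          # unknown to the learner; stands in for real-robot dynamics

def real_step(x, a):
    """'Real world' transition (hypothetical 1-D dynamics for illustration)."""
    return x + REAL_GAIN * a

def sim_step(x, a, gain):
    """Parametric simulator; `gain` is the grounding parameter to fit."""
    return x + gain * a

def collect_real_trajectory(policy_action, n_steps=20):
    """Steps 1-2: execute the policy on the 'robot', record (s, a, s') tuples."""
    x, traj = 0.0, []
    for _ in range(n_steps):
        x_next = real_step(x, policy_action)
        traj.append((x, policy_action, x_next))
        x = x_next
    return traj

def ground_simulator(traj):
    """Step 3: fit the simulator gain so simulated transitions match the
    recorded real ones (closed-form least squares for this linear toy)."""
    xs = np.array([t[0] for t in traj])
    acts = np.array([t[1] for t in traj])
    nxt = np.array([t[2] for t in traj])
    return float(np.sum(acts * (nxt - xs)) / np.sum(acts ** 2))

def improve_policy_in_sim(gain, target_delta=1.0):
    """Step 4: optimize the one-parameter policy in the grounded simulator.
    Here a grid search finds the action that moves closest to `target_delta`."""
    candidates = np.linspace(0.1, 3.0, 300)
    errs = [abs(sim_step(0.0, a, gain) - target_delta) for a in candidates]
    return float(candidates[int(np.argmin(errs))])

traj = collect_real_trajectory(policy_action=1.0)
gain = ground_simulator(traj)          # recovers the real gain from data
action = improve_policy_in_sim(gain)   # policy tuned in the grounded simulator
```

The point of the sketch is the order of operations: the policy is only ever improved in simulation, but because the simulator was grounded on real trajectories first, the improved action also performs well when executed in the "real world".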

Talking about his other work, Adaptive Planner Parameter Learning (APPL), Prof. Stone said that the motivation for this work came from the fact that deploying an autonomous navigation system in a new environment is not as straightforward as it may seem, even though humans can adapt effortlessly. The research was based on the central question of whether we can squeeze more robust performance out of existing navigation systems using limited human interaction and learning. He said that the team proposed a behaviour cloning approach to tune any navigation system, in which planner parameters are learned from a demonstration using supervised learning. Elaborating on the procedure, he said that it involves collecting a demonstration of a human joysticking the robot, performing automatic segmentation of the demonstration, using a black-box optimization method to find a set of optimal planner parameters for each segment, and finally using supervised learning to train a context predictor. The results, he said, showed that traversal time and behaviour cloning loss are significantly better for APPL than for the alternatives. He also talked about the other versions of APPL, i.e. APPLI, APPLR and APPLE, and concluded the talk by briefly touching upon other works of his research group.
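The four-stage pipeline described above can be sketched with toy stand-ins. All names and dynamics here are assumptions for illustration only, not the actual APPL implementation: a fake teleoperated demonstration in two "contexts", segmentation on sensor-feature jumps, random search as the black-box optimizer of a single planner parameter against a behaviour-cloning loss, and a nearest-centroid rule as the context predictor.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical joysticked demonstration: each step records a sensor feature
# (min obstacle distance, m) and the speed the human commanded (m/s).
demo = [(3.0 + rng.normal(0, 0.1), 1.5) for _ in range(30)] + \
       [(0.5 + rng.normal(0, 0.05), 0.4) for _ in range(30)]

def segment(demo, gap=1.0):
    """Automatic demonstration segmentation: split where the sensor feature jumps."""
    segs, cur = [], [demo[0]]
    for prev, step in zip(demo, demo[1:]):
        if abs(step[0] - prev[0]) > gap:
            segs.append(cur)
            cur = []
        cur.append(step)
    segs.append(cur)
    return segs

def planner_speed(max_speed, obstacle_dist):
    """Stand-in navigation planner: speed capped by the tunable parameter."""
    return min(max_speed, 0.9 * obstacle_dist)

def tune_segment(seg, n_samples=200):
    """Black-box optimization (random search) of `max_speed`, minimizing the
    behaviour-cloning loss against the demonstrated speeds in this segment."""
    best_p, best_loss = None, float("inf")
    for _ in range(n_samples):
        p = rng.uniform(0.1, 3.0)
        loss = np.mean([(planner_speed(p, d) - v) ** 2 for d, v in seg])
        if loss < best_loss:
            best_p, best_loss = p, loss
    return best_p

segments = segment(demo)
params = [tune_segment(s) for s in segments]              # one parameter per context
centroids = [np.mean([d for d, _ in s]) for s in segments]

def predict_params(obstacle_dist):
    """Context predictor (nearest-centroid stand-in for a learned model):
    pick the tuned parameter of the most similar demonstrated context."""
    return params[int(np.argmin([abs(obstacle_dist - c) for c in centroids]))]
```

At deployment time, `predict_params` is queried with the current sensor reading and the underlying planner is reconfigured with the returned parameter set, which is the division of labour the talk described: the planner still navigates, while the learned component only tunes it.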

The video is available on our YouTube channel: Link.


Robot Locomotion, Machine Learning