WildLMa: Long Horizon Loco-Manipulation in the Wild
Ri-Zhao Qiu, Yuchen Song, Xuanbin Peng, Sai Aneesh Suryadevara, Ge Yang, Minghuan Liu, Mazeyu Ji, Chengzhe Jia, Ruihan Yang, Xueyan Zou, Xiaolong Wang
2024-11-25

Summary
This paper presents WildLMa, a framework that enables quadruped robots to perform long-horizon loco-manipulation tasks in real-world environments, such as picking up objects while navigating around obstacles.
What's the problem?
Robots deployed in diverse, real-world settings face several challenges: they must generalize across different object arrangements, execute long-horizon tasks without step-by-step guidance, and perform manipulation more complex than pick-and-place. Current robotic systems often struggle with these demands, especially when the robot must move and manipulate objects at the same time.
What's the solution?
WildLMa addresses these challenges with three main components: (1) a learned low-level controller, adapted for whole-body teleoperation through virtual reality (VR); (2) WildLMa-Skill, a library of generalizable skills acquired through imitation learning; and (3) WildLMa-Planner, which lets a language-model planner coordinate those skills over long-horizon tasks. This design lets the robot learn from only tens of demonstrations and adapt to new situations more effectively. For instance, it can help a robot clean up trash or rearrange items on a shelf by breaking the task into manageable steps, as sketched below.
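To make the skill-planner interface concrete, here is a minimal sketch of how an LLM planner could sequence skills from a library. It is an illustration under assumptions, not the paper's implementation: the SkillLibrary class, the prompt format, and the llm callable are all hypothetical.

```python
# Hypothetical sketch of an LLM-driven skill planner in the spirit of
# WildLMa-Planner. Skill names, the prompt, and the llm() callable are
# illustrative, not the paper's actual interface.
from typing import Callable, Dict, List


class SkillLibrary:
    """Maps skill names (exposed to the LLM) to executable policies."""

    def __init__(self) -> None:
        self._skills: Dict[str, Callable[[str], bool]] = {}

    def register(self, name: str, policy: Callable[[str], bool]) -> None:
        self._skills[name] = policy

    def names(self) -> List[str]:
        return list(self._skills)

    def execute(self, name: str, arg: str) -> bool:
        """Run one skill; returns True on success so the caller can replan."""
        return self._skills[name](arg)


def plan_and_run(task: str, library: SkillLibrary,
                 llm: Callable[[str], str]) -> None:
    # Ask the LLM to decompose the task into calls against the skill library.
    prompt = (
        f"Task: {task}\n"
        f"Available skills: {', '.join(library.names())}\n"
        "Reply with one 'skill_name: argument' per line."
    )
    for line in llm(prompt).splitlines():
        name, _, arg = line.partition(":")
        name = name.strip()
        if name not in library.names():
            continue  # skip lines that do not name a valid skill
        if not library.execute(name, arg.strip()):
            break  # a failed skill would trigger replanning in a full system
```

Registering each visuomotor policy under a short name keeps the planner's action space small and auditable, which is the main appeal of exposing learned skills through an interface like this.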
Why it matters?
This research is significant because it pushes the boundaries of what robots can do in everyday environments. By improving how robots learn and perform tasks, WildLMa could lead to practical applications in homes and workplaces, making robots more useful for chores and other activities. Ultimately, this line of work aims to make capable mobile robots affordable and accessible.
Abstract
"In-the-wild" mobile manipulation aims to deploy robots in diverse real-world environments, which requires the robot to (1) have skills that generalize across object configurations; (2) be capable of long-horizon task execution in diverse environments; and (3) perform complex manipulation beyond pick-and-place. Quadruped robots with manipulators hold promise for extending the workspace and enabling robust locomotion, but existing results do not investigate such a capability. This paper proposes WildLMa, with three components to address these issues: (1) adaptation of a learned low-level controller for VR-enabled whole-body teleoperation and traversability; (2) WildLMa-Skill -- a library of generalizable visuomotor skills acquired via imitation learning or heuristics; and (3) WildLMa-Planner -- an interface of learned skills that allows LLM planners to coordinate skills for long-horizon tasks. We demonstrate the importance of high-quality training data by achieving a higher grasping success rate than existing RL baselines using only tens of demonstrations. WildLMa exploits CLIP for language-conditioned imitation learning that empirically generalizes to objects unseen in training demonstrations. Besides extensive quantitative evaluation, we qualitatively demonstrate practical robot applications, such as cleaning up trash in university hallways or outdoor terrains, operating articulated objects, and rearranging items on a bookshelf.
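As a rough illustration of the CLIP-based language conditioning described in the abstract, the sketch below feeds a frozen CLIP text embedding of an instruction into a small policy head alongside visual features. This is a minimal sketch assuming the standard Hugging Face transformers CLIP API; the policy architecture, feature dimensions, and action space are hypothetical, not the paper's.

```python
# Sketch: conditioning a visuomotor policy on a frozen CLIP text embedding.
# Only the CLIP calls follow the standard Hugging Face transformers API;
# the policy head, feature dims, and action space are illustrative guesses.
import torch
import torch.nn as nn
from transformers import CLIPModel, CLIPProcessor

clip = CLIPModel.from_pretrained("openai/clip-vit-base-patch32").eval()
proc = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")


class LanguageConditionedPolicy(nn.Module):
    def __init__(self, visual_dim: int = 512, text_dim: int = 512,
                 action_dim: int = 7):
        super().__init__()
        self.head = nn.Sequential(
            nn.Linear(visual_dim + text_dim, 256), nn.ReLU(),
            nn.Linear(256, action_dim),  # e.g., end-effector deltas + gripper
        )

    def forward(self, visual_feat: torch.Tensor,
                text_feat: torch.Tensor) -> torch.Tensor:
        # Concatenate visual and language features before the policy head.
        return self.head(torch.cat([visual_feat, text_feat], dim=-1))


# Freeze the text encoder: the instruction is embedded once per task.
with torch.no_grad():
    tokens = proc(text=["pick up the red cup"], return_tensors="pt",
                  padding=True)
    text_feat = clip.get_text_features(**tokens)  # shape (1, 512)

policy = LanguageConditionedPolicy()
visual_feat = torch.randn(1, 512)  # stand-in for robot camera features
action = policy(visual_feat, text_feat)
```

Because the text encoder stays frozen, an instruction naming an object absent from the demonstrations still lands in the embedding space CLIP learned during pretraining, which is one plausible mechanism behind the generalization to unseen objects reported above.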