LoGoPlanner: Localization Grounded Navigation Policy with Metric-aware Visual Geometry
Jiaqi Peng, Wenzhe Cai, Yuqiang Yang, Tai Wang, Yuan Shen, Jiangmiao Pang
2025-12-23
Summary
This paper introduces LoGoPlanner, a system that helps robots navigate complex, real-world environments without getting stuck or lost. It's a step toward robots that can move around independently in places they haven't been specifically programmed for.
What's the problem?
Traditionally, robots navigate by breaking the process into separate steps: seeing the world, figuring out where they are, building a map, and then planning a path. Each step can introduce small errors, and these errors accumulate over time, leading to inaccurate navigation. Recent 'end-to-end' approaches simplify this by having the robot learn to go directly from what it sees to how it moves, but they still often rely on accurate knowledge of the robot's own position, which can be hard to obtain and limits how well the robot adapts to new situations.
What's the solution?
LoGoPlanner solves this by creating a system that learns to navigate while *also* figuring out where it is, all at the same time. It does this in three main ways: first, it learns to understand the scale of the environment from images, giving it a sense of distance. Second, it remembers what the environment looked like in the past to build a detailed understanding of obstacles. Finally, it uses this understanding of the environment to make better decisions about where to go, reducing errors and improving obstacle avoidance.
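The three ideas above form a pipeline: estimate metric-scaled geometry from images, accumulate it into a memory of the surroundings, and condition the planner on that memory. Here is a minimal, hypothetical sketch of that flow; the class names, the per-ray depth representation, and the scoring rule are all illustrative stand-ins, not the paper's actual architecture.

```python
import math
from collections import deque

class MetricGeometryEncoder:
    """Hypothetical stand-in for the finetuned visual-geometry backbone:
    turns a relative-depth observation into absolute metric distances."""
    def __init__(self, scale=2.0):
        self.scale = scale  # learned metric scale factor (assumed, not from the paper)

    def encode(self, observation):
        # observation: list of relative depths per viewing ray -> metres
        return [d * self.scale for d in observation]

class GeometryMemory:
    """Rolling buffer of past metric geometry, giving the policy
    dense awareness of nearby obstacles (toy fusion, sketch only)."""
    def __init__(self, horizon=8):
        self.buffer = deque(maxlen=horizon)

    def update(self, geometry):
        self.buffer.append(geometry)

    def fused(self):
        # naive fusion: per-ray minimum over history = closest obstacle seen
        return [min(rays) for rays in zip(*self.buffer)]

class Policy:
    """Toy planner conditioned on the fused geometry: pick the ray that
    best trades off obstacle clearance against heading toward the goal."""
    def plan(self, fused_geometry, goal_bearing):
        n = len(fused_geometry)
        def score(i):
            bearing = (i / (n - 1)) * math.pi - math.pi / 2  # map rays to [-90°, +90°]
            return fused_geometry[i] - abs(bearing - goal_bearing)
        return max(range(n), key=score)

encoder = MetricGeometryEncoder(scale=2.0)
memory = GeometryMemory(horizon=4)
# two fake 3-ray depth scans: the centre ray sees a close obstacle
for obs in ([1.0, 0.2, 1.0], [1.0, 0.3, 0.9]):
    memory.update(encoder.encode(obs))
best_ray = Policy().plan(memory.fused(), goal_bearing=0.0)
print(best_ray)  # a side ray, since the centre ray is blocked
```

The point of the sketch is the dependency order: the policy never consumes raw pixels or an external pose estimate; it consumes geometry that the encoder has already grounded in metric scale, which is the error-propagation reduction the paper argues for.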
Why it matters?
This research is important because it allows robots to navigate more reliably in unpredictable environments. By reducing the reliance on perfect positioning information and improving how robots understand their surroundings, LoGoPlanner makes robots more adaptable and capable of operating in the real world, potentially leading to advancements in areas like delivery services, search and rescue, and automated exploration.
Abstract
Trajectory planning in unstructured environments is a fundamental and challenging capability for mobile robots. Traditional modular pipelines suffer from latency and cascading errors across perception, localization, mapping, and planning modules. Recent end-to-end learning methods map raw visual observations directly to control signals or trajectories, promising greater performance and efficiency in open-world settings. However, most prior end-to-end approaches still rely on separate localization modules that depend on accurate sensor extrinsic calibration for self-state estimation, thereby limiting generalization across embodiments and environments. We introduce LoGoPlanner, a localization-grounded, end-to-end navigation framework that addresses these limitations by: (1) finetuning a long-horizon visual-geometry backbone to ground predictions with absolute metric scale, thereby providing implicit state estimation for accurate localization; (2) reconstructing surrounding scene geometry from historical observations to supply dense, fine-grained environmental awareness for reliable obstacle avoidance; and (3) conditioning the policy on implicit geometry bootstrapped by the aforementioned auxiliary tasks, thereby reducing error propagation. We evaluate LoGoPlanner in both simulation and real-world settings, where its fully end-to-end design reduces cumulative error while metric-aware geometry memory enhances planning consistency and obstacle avoidance, leading to more than a 27.3% improvement over oracle-localization baselines and strong generalization across embodiments and environments. The code and models are publicly available on the project page: https://steinate.github.io/logoplanner.github.io/.