GeoAgent: Learning to Geolocate Everywhere with Reinforced Geographic Characteristics

Modi Jin, Yiming Zhang, Boyuan Sun, Dingwen Zhang, MingMing Cheng, Qibin Hou

2026-02-16

Summary

This paper introduces GeoAgent, an AI model that reasons about geographic locations in a human-like way and can narrow its answer down to a fine-grained address.

What's the problem?

Current AI models that try to solve problems step-by-step, a technique called 'chain-of-thought,' often struggle with geography. They're usually trained using data created by other AIs, which isn't always accurate when it comes to real-world locations and geographic details. This means they can make mistakes that a person with basic geographic knowledge wouldn't, and their reasoning isn't always logical from a geographic standpoint.

What's the solution?

The researchers created a new geolocation dataset called GeoSeek, whose step-by-step reasoning was annotated by geography experts and professional players rather than generated by another AI. They also developed two new ways to 'reward' the AI during training: a 'geo-similarity reward' that encourages answers that make sense geographically, and a 'consistency reward', assessed by a separate consistency agent, that checks the AI's reasoning stays logical and doesn't contradict itself. Together these help GeoAgent learn to think about locations correctly and explain its answers in a way that humans can easily follow.
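To make the idea of a geographically aware reward concrete, here is a minimal sketch. It is not the paper's formulation, which this summary does not specify. It assumes the prediction and the ground truth are both available as latitude/longitude pairs, and it uses a hypothetical exponential decay over great-circle distance; the function names and the `scale_km` parameter are illustrative assumptions.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    # Clamp to 1.0 to guard against floating-point overshoot before asin.
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


def geo_similarity_reward(pred, truth, scale_km=750.0):
    """Hypothetical reward in (0, 1]: 1.0 for an exact hit, decaying with distance.

    `pred` and `truth` are (latitude, longitude) tuples in degrees.
    `scale_km` is an assumed decay scale, not a value from the paper.
    """
    distance = haversine_km(*pred, *truth)
    return math.exp(-distance / scale_km)


paris = (48.8566, 2.3522)
lyon = (45.7640, 4.8357)
tokyo = (35.6762, 139.6503)

near_miss = geo_similarity_reward(lyon, paris)   # wrong city, same country
far_miss = geo_similarity_reward(tokyo, paris)   # wrong continent
```

Under this sketch a guess of Lyon for a photo taken in Paris earns more than a guess of Tokyo, whereas a plain right-or-wrong reward would score both misses identically. The consistency reward is not sketched here because it depends on a separate agent judging the reasoning text.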

Why does it matter?

This work is important because it improves AI's ability to reason about real-world locations while keeping that reasoning understandable to people. Potential applications include better navigation apps, more accurate delivery services, and AI assistants that can give helpful directions or information about places.

Abstract

This paper presents GeoAgent, a model capable of reasoning in close alignment with humans and deriving fine-grained address conclusions. Previous RL-based methods have achieved breakthroughs in performance and interpretability, but concerns remain because of their reliance on AI-generated chain-of-thought (CoT) data and training strategies, which conflict with geographic characteristics. To address these issues, we first introduce GeoSeek, a new geolocation dataset comprising CoT data annotated by geographic experts and professional players. We further explore the inherent characteristics of geographic tasks and propose a geo-similarity reward and a consistency reward, the latter assessed by a consistency agent, to assist training. These rewards encourage the model to converge towards correct answers from a geographic perspective while ensuring the integrity and consistency of its reasoning process. Experimental results show that GeoAgent outperforms existing methods and a series of general VLLMs across multiple granularities, while generating reasoning that closely aligns with that of humans.
