Embodied Web Agents: Bridging Physical-Digital Realms for Integrated Agent Intelligence
Yining Hong, Rui Sun, Bingxuan Li, Xingcheng Yao, Maxine Wu, Alexander Chien, Da Yin, Ying Nian Wu, Zhecan James Wang, Kai-Wei Chang
2025-06-19
Summary
This paper introduces Embodied Web Agents, a new kind of AI system that combines the ability to sense and act in the physical world with the ability to search and reason over information from the web.
What's the problem?
Most AI agents today can either work with information online or interact with the physical world, but rarely both together. This makes it hard for them to carry out real-world tasks that need both digital knowledge and physical action.
What's the solution?
The researchers built Embodied Web Agents that can move around and manipulate objects in a realistic 3D environment while also browsing and using live web applications for information. They created a unified testing platform where these agents perform tasks like cooking using online recipes, navigating using maps, and shopping by combining online and physical experiences.
Why does it matter?
This matters because many everyday tasks require both physical presence and knowledge from the internet. Agents that combine the two can interact with the world in a way closer to how humans do, which could help AI assist with complex tasks in daily life and across industries.
Abstract
Embodied Web Agents integrate physical interaction with web-scale reasoning, and a novel benchmark environment is used to assess their cross-domain intelligence.