CowPilot: A Framework for Autonomous and Human-Agent Collaborative Web Navigation
Faria Huq, Zora Zhiruo Wang, Frank F. Xu, Tianyue Ou, Shuyan Zhou, Jeffrey P. Bigham, Graham Neubig
2025-01-31

Summary
This paper introduces CowPilot, a framework that lets people and AI agents work together to navigate websites. It aims to make online tasks easier by combining the strengths of human users and AI agents.
What's the problem?
AI web agents, which are supposed to help users do things online automatically, often struggle with complex real-world tasks and understanding what users really want. This means they can't always work on their own, but they still have useful abilities that could help people.
What's the solution?
The researchers created CowPilot, a system that lets humans and AI work as a team when browsing the web. CowPilot suggests what to do next, but users can pause, reject, or change these suggestions if needed. Users can also hand control back to the AI whenever they want. In case studies on five common websites, the collaborative mode achieved the highest success rate, 95%, while humans performed only 15.2% of the total steps.
Why does it matter?
This matters because it shows a new way for people and AI to work together online that's more successful than either working alone. It could make many online tasks much easier and faster for people. Also, CowPilot can help researchers study how humans and AI interact, which could lead to even better AI assistants in the future. By finding a balance between human control and AI help, CowPilot points towards a future where technology enhances our abilities instead of just trying to replace them.
Abstract
While much work on web agents emphasizes the promise of autonomously performing tasks on behalf of users, in reality, agents often fall short on complex tasks in real-world contexts and modeling user preference. This presents an opportunity for humans to collaborate with the agent and leverage the agent's capabilities effectively. We propose CowPilot, a framework supporting autonomous as well as human-agent collaborative web navigation, and evaluation across task success and task efficiency. CowPilot reduces the number of steps humans need to perform by allowing agents to propose next steps, while users are able to pause, reject, or take alternative actions. During execution, users can interleave their actions with the agent by overriding suggestions or resuming agent control when needed. We conducted case studies on five common websites and found that the human-agent collaborative mode achieves the highest success rate of 95% while requiring humans to perform only 15.2% of the total steps. Even with human interventions during task execution, the agent successfully drives up to half of task success on its own. CowPilot can serve as a useful tool for data collection and agent evaluation across websites, which we believe will enable research in how users and agents can work together. Video demonstrations are available at https://oaishi.github.io/cowpilot.html
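The propose/override loop and the human-step share described in the abstract can be illustrated with a minimal sketch. Everything below is hypothetical and is not CowPilot's actual code or API: the names `Step`, `collaborate`, and `human_step_fraction`, the scripted plan, and the reviewer function are assumptions made only to show how an agent suggestion can be accepted or replaced by a human action, and how the fraction of human-performed steps could then be computed.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Step:
    actor: str   # "agent" or "human"
    action: str  # free-text description of the web action


def collaborate(
    propose: Callable[[List[Step]], Optional[str]],
    review: Callable[[str, List[Step]], Optional[str]],
    max_steps: int = 20,
) -> List[Step]:
    """Interleave agent suggestions with optional human overrides.

    propose(trace) returns the agent's next suggested action, or None when done.
    review(suggestion, trace) returns None to accept the suggestion,
    or a replacement action that the human performs instead.
    """
    trace: List[Step] = []
    for _ in range(max_steps):
        suggestion = propose(trace)
        if suggestion is None:
            break
        override = review(suggestion, trace)
        if override is None:
            trace.append(Step("agent", suggestion))
        else:
            trace.append(Step("human", override))
    return trace


def human_step_fraction(trace: List[Step]) -> float:
    """Share of executed steps that were performed by the human."""
    if not trace:
        return 0.0
    return sum(step.actor == "human" for step in trace) / len(trace)


# Hypothetical scripted agent: proposes one planned action per step.
plan = ["click search box", "type 'laptop'", "click wrong filter", "click first result"]


def scripted_agent(trace: List[Step]) -> Optional[str]:
    return plan[len(trace)] if len(trace) < len(plan) else None


def scripted_human(suggestion: str, trace: List[Step]) -> Optional[str]:
    # The human rejects one bad suggestion and acts instead; otherwise accepts.
    return "click price filter" if suggestion == "click wrong filter" else None


trace = collaborate(scripted_agent, scripted_human)
print([step.actor for step in trace])
print(human_step_fraction(trace))
```

In this toy run the human takes over one of four steps, so the human-step fraction is 0.25. The 15.2% figure reported in the abstract comes from the authors' own case studies, not from this sketch.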