Building the Web for Agents: A Declarative Framework for Agent-Web Interaction

Sven Schultze, Meike Verena Kietzmann, Nils-Lucas Schönfeld, Ruth Stock-Homburg

2025-11-17

Summary

This paper introduces a new system called VOIX designed to help AI programs, often called 'agents,' interact with websites more reliably and securely.

What's the problem?

Currently, AI agents have a hard time using websites because websites are built for humans, not machines. An agent has to *guess* which actions a page allows by looking at an interface designed for people. Those guesses are often wrong, which makes interactions brittle, inefficient, and insecure. It's like trying to read someone's mind instead of having clear instructions.

What's the solution?

VOIX solves this by letting website developers tell AI agents explicitly what they can do. Developers add two special tags, `<tool>` and `<context>`, to the website's code. A `<tool>` tag defines an action the agent can take, and a `<context>` tag exposes the relevant state of the page, creating a direct, machine-readable 'contract' between the website and the AI. Importantly, the system protects user privacy by keeping the user's conversation with the agent separate from the website.
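
To make the idea of a declared 'contract' concrete, here is a minimal sketch of how an agent-side program might read such tags from a page instead of guessing from the visual interface. It is an illustration only, not the VOIX implementation: the summary names only the `<tool>` and `<context>` tags, so the `name` and `description` attributes, the sample page, and the helper `describe_page` are all assumptions made for this example.

```python
from html.parser import HTMLParser


class ToolContextParser(HTMLParser):
    """Collects <tool> and <context> elements from an HTML string.

    Attribute names such as `name` and `description` are assumptions
    for this sketch; the source only names the two tags.
    """

    def __init__(self):
        super().__init__()
        self.tools = []      # one dict per <tool> element
        self.contexts = []   # one dict per <context> element
        self._open = None    # element whose text is currently being collected

    def handle_starttag(self, tag, attrs):
        if tag in ("tool", "context"):
            self._open = {"attrs": dict(attrs), "text": ""}
            target = self.tools if tag == "tool" else self.contexts
            target.append(self._open)

    def handle_data(self, data):
        # Text outside <tool>/<context> (the human-facing UI) is ignored.
        if self._open is not None:
            self._open["text"] += data

    def handle_endtag(self, tag):
        if tag in ("tool", "context") and self._open is not None:
            self._open["text"] = self._open["text"].strip()
            self._open = None


# Hypothetical page: the button is for humans, the tags are for agents.
PAGE = """
<main>
  <context name="cart">2 items, total 18.50 EUR</context>
  <tool name="add_to_cart" description="Add a product to the cart">
  </tool>
  <button>Buy now</button>
</main>
"""


def describe_page(html):
    """Return the declared actions and state, ignoring everything else."""
    parser = ToolContextParser()
    parser.feed(html)
    parser.close()
    return {
        "tools": [t["attrs"] for t in parser.tools],
        "contexts": {c["attrs"].get("name"): c["text"] for c in parser.contexts},
    }


if __name__ == "__main__":
    print(describe_page(PAGE))
```

In this sketch the agent never has to interpret the "Buy now" button. It sees only what the developer chose to declare, which is the shift in control the summary describes.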

Why does it matter?

This work is important because it's a step towards a future 'Agentic Web' where AI agents can seamlessly and safely work with websites to help us with tasks. By making these interactions more reliable and secure, it opens the door for more powerful and helpful AI applications on the internet, improving how humans and AI collaborate.

Abstract

The increasing deployment of autonomous AI agents on the web is hampered by a fundamental misalignment: agents must infer affordances from human-oriented user interfaces, leading to brittle, inefficient, and insecure interactions. To address this, we introduce VOIX, a web-native framework that enables websites to expose reliable, auditable, and privacy-preserving capabilities for AI agents through simple, declarative HTML elements. VOIX introduces <tool> and <context> tags, allowing developers to explicitly define available actions and relevant state, thereby creating a clear, machine-readable contract for agent behavior. This approach shifts control to the website developer while preserving user privacy by disconnecting the conversational interactions from the website. We evaluated the framework's practicality, learnability, and expressiveness in a three-day hackathon study with 16 developers. The results demonstrate that participants, regardless of prior experience, were able to rapidly build diverse and functional agent-enabled web applications. Ultimately, this work provides a foundational mechanism for realizing the Agentic Web, enabling a future of seamless and secure human-AI collaboration on the web.
