Safe and Scalable Web Agent Learning via Recreated Websites
Hyungjoo Chae, Jungsoo Park, Alan Ritter
2026-03-17
Summary
This paper introduces VeriEnv, a new way to train AI agents to interact with websites. It tackles the challenges of training these agents in the real world by creating safe, controllable, and verifiable simulated web environments.
What's the problem?
Training AI agents to use websites is difficult because the real internet is risky: websites can be unpredictable, it's hard to reset them to a clean state, and it's tough to know whether the agent actually succeeded at its task. Relying on the agent's own judgment, or asking another AI to grade its performance, isn't reliable enough for consistent learning.
What's the solution?
The researchers built VeriEnv, which uses powerful language models to automatically clone real websites into environments where agents can learn. These simulated websites aren't just visual copies; they are fully functional, and agents interact with them through code. Because the environment exposes controlled internal access, whether the agent completed a task can be checked by a program, rather than by guesswork or another AI's opinion. Agents can even generate their own tasks and verify their own success, enabling continuous self-improvement.
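To make the idea of a programmatically verifiable reward concrete, here is a minimal sketch in Python. The class and function names are hypothetical illustrations (the VeriEnv SDK itself is unreleased), but they show the key point: because the cloned environment's internal state is inspectable, task success can be checked deterministically by code instead of by an LLM judge.

```python
# Hypothetical sketch of a cloned-website environment with a
# deterministic, programmatic reward check. All names here are
# illustrative assumptions, not the actual VeriEnv SDK.

class FakeShopEnv:
    """A toy cloned shopping site whose internal state is inspectable."""

    def __init__(self):
        self.cart = []    # internal state a verifier can read directly
        self.orders = []  # completed orders

    def add_to_cart(self, item):
        # Agent-facing action: put an item in the cart.
        self.cart.append(item)

    def checkout(self):
        # Agent-facing action: finalize the cart as an order.
        self.orders.append(list(self.cart))
        self.cart = []


def verify_purchase(env, required_item):
    """Deterministic reward: 1.0 iff a completed order contains the item."""
    return 1.0 if any(required_item in order for order in env.orders) else 0.0


# One agent rollout, then a programmatic check -- no LLM judge needed.
env = FakeShopEnv()
env.add_to_cart("usb-c cable")
env.checkout()
print(verify_purchase(env, "usb-c cable"))  # 1.0: task completed
print(verify_purchase(env, "keyboard"))     # 0.0: task not completed
```

The same pattern lets agents self-generate tasks: a task is just a goal paired with a verifier function like `verify_purchase`, so rewards stay deterministic no matter who proposed the task.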
Why it matters?
VeriEnv is important because it enables safer and more efficient training of web agents. By removing the risks of the real internet and providing clear, verifiable feedback, agents can learn faster and become more reliable. This could lead to better AI assistants for tasks like online shopping, research, and automating web-based processes, as well as more capable agents that adapt and learn on their own.
Abstract
Training autonomous web agents is fundamentally limited by the environments they learn from: real-world websites are unsafe to explore, hard to reset, and rarely provide verifiable feedback. We propose VeriEnv, a framework that treats language models as environment creators, automatically cloning real-world websites into fully executable, verifiable synthetic environments. By exposing controlled internal access via a Python SDK, VeriEnv enables agents to self-generate tasks with deterministic, programmatically verifiable rewards, eliminating reliance on heuristic or LLM-based judges. This design decouples agent learning from unsafe real-world interaction while enabling scalable self-evolution through environment expansion. Through experiments on web agent benchmarks, we show that agents trained with VeriEnv generalize to unseen websites, achieve site-specific mastery through self-evolving training, and benefit from scaling the number of training environments. Code and resources will be released at https://github.com/kyle8581/VeriEnv upon acceptance.