UFO2: The Desktop AgentOS
Chaoyun Zhang, He Huang, Chiming Ni, Jian Mu, Si Qin, Shilin He, Lu Wang, Fangkai Yang, Pu Zhao, Chao Du, Liqun Li, Yu Kang, Zhao Jiang, Suzhen Zheng, Rujia Wang, Jiaxu Qian, Minghua Ma, Jian-Guang Lou, Qingwei Lin, Saravan Rajmohan, Dongmei Zhang
2025-04-22
Summary
This paper presents UFO2, an AgentOS for Windows desktops. It is not a replacement operating system but a layer that lets AI agents control and automate tasks on Windows by combining multimodal language models with direct access to native APIs.
What's the problem?
The problem is that most AI systems cannot easily interact with desktop software or carry out complicated, multi-step tasks on a computer on their own, which limits how much real work they can take off a user's hands.
What's the solution?
The researchers created UFO2, which connects multimodal language models to Windows' native APIs and uses hybrid control detection to identify the interface elements an agent can act on. This makes it possible for AI agents to handle a wide range of tasks, from managing files to running programs, reliably and efficiently.
Why does it matter?
This matters because it opens up new possibilities for automating everyday work on computers, making people more productive and allowing AI to help with complex or repetitive tasks that would otherwise take up a lot of time.
Abstract
UFO2, an AgentOS for Windows, integrates multimodal LLMs with native APIs and hybrid control detection to enable robust and scalable desktop automation.
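The abstract does not say how hybrid control detection works internally. One plausible reading is that controls reported by a native accessibility API are combined with controls found by a vision model, and that overlapping detections are deduplicated. The Python sketch below illustrates that idea only. The `Control` type, the `merge_controls` function, and the 0.5 overlap threshold are hypothetical names and values chosen for illustration, not taken from UFO2.

```python
from dataclasses import dataclass
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom) in screen pixels


@dataclass(frozen=True)
class Control:
    """A detected UI control (hypothetical type, for illustration only)."""
    name: str
    box: Box
    source: str  # "api" or "vision"


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    left, top = max(a[0], b[0]), max(a[1], b[1])
    right, bottom = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, right - left) * max(0, bottom - top)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def merge_controls(
    api_controls: List[Control],
    vision_controls: List[Control],
    iou_threshold: float = 0.5,  # assumed value, not from the paper
) -> List[Control]:
    """Keep every API-reported control, then add only those vision
    detections that do not overlap an API control above the threshold."""
    merged = list(api_controls)
    for candidate in vision_controls:
        if all(iou(candidate.box, known.box) < iou_threshold
               for known in api_controls):
            merged.append(candidate)
    return merged


if __name__ == "__main__":
    api = [Control("Save", (0, 0, 100, 40), "api")]
    vision = [
        Control("button", (2, 1, 101, 41), "vision"),    # duplicates "Save"
        Control("icon", (200, 200, 240, 240), "vision"),  # seen only by vision
    ]
    for control in merge_controls(api, vision):
        print(control.source, control.name)
```

In this sketch the API-reported controls take priority because they carry reliable names and types, while vision detections fill in elements the API does not expose, such as custom-drawn widgets.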