Natural-Language Agent Harnesses

Linyue Pan, Lexiao Zou, Shuo Guo, Jingchen Ni, Hai-Tao Zheng

2026-03-30

Summary

This paper explores a new way to build and manage the 'harness' that controls AI agents: instead of burying the control logic in complex code, it writes that logic in natural language so it is easier to read, edit, and reuse.

What's the problem?

Currently, the instructions that tell an AI agent *how* to act – things like how to interact with tools or follow specific rules – are usually tangled up within the agent's core programming. This makes it difficult to change those instructions, compare different approaches, or even use the same instructions with different AI models. It's like trying to swap out the engine in a car without being able to access the engine compartment easily.

What's the solution?

The researchers developed 'Natural-Language Agent Harnesses' (NLAHs), which define how the agent should behave in editable natural language rather than in controller code. They also created an 'Intelligent Harness Runtime' (IHR), a shared runtime that takes these natural-language instructions and executes them with the AI agent. This separates the control logic from the agent itself, making it more flexible and portable.
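The separation described above can be illustrated with a small Python sketch. It is not the paper's implementation: the names `Harness`, `Runtime`, `ModelAdapter`, `echo_model`, and `shout_model` are all hypothetical, and the "models" are stubs so the example runs on its own. It only shows the general idea of keeping control logic as text and handing the same text to different backends through a thin adapter.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# All names here are hypothetical and for illustration only;
# they are not taken from the paper or from any real library.


@dataclass
class Harness:
    """Control logic kept as editable natural-language text."""
    name: str
    instructions: str


# An adapter is any callable that sends a prompt to a backend and returns text.
ModelAdapter = Callable[[str], str]


@dataclass
class Runtime:
    """Toy stand-in for a shared runtime: pairs a harness with an adapter."""
    adapter: ModelAdapter
    log: List[str] = field(default_factory=list)  # prompts sent so far

    def run(self, harness: Harness, task: str) -> str:
        # Combine the harness text with the task and pass it to the backend.
        prompt = f"{harness.instructions}\n\nTask: {task}"
        self.log.append(prompt)
        return self.adapter(prompt)


def echo_model(prompt: str) -> str:
    """Stub backend: echoes the last line of the prompt."""
    return f"[echo] {prompt.splitlines()[-1]}"


def shout_model(prompt: str) -> str:
    """Second stub backend: upper-cases the last line of the prompt."""
    return prompt.splitlines()[-1].upper()


# The same harness text is reused unchanged with two different backends.
harness = Harness(
    name="careful-coder",
    instructions="Plan before editing. Run the tests after each change.",
)
result_a = Runtime(echo_model).run(harness, "fix the failing test")
result_b = Runtime(shout_model).run(harness, "fix the failing test")
print(result_a)
print(result_b)
```

In this sketch, changing the agent's behavior means editing the `instructions` string, while swapping the model means passing a different adapter; neither change touches the other.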

Why does it matter?

This work is important because it makes it easier to experiment with and improve how AI agents are controlled. Because the harness is written in natural language, its control logic can be read and edited without digging through controller code. This could lead to faster progress in building more reliable and effective AI systems, and it makes different control strategies easier to share and compare.

Abstract

Agent performance increasingly depends on harness engineering, yet harness design is usually buried in controller code and runtime-specific conventions, making it hard to transfer, compare, and study as a scientific object. We ask whether the high-level control logic of an agent harness can instead be externalized as a portable executable artifact. We introduce Natural-Language Agent Harnesses (NLAHs), which express harness behavior in editable natural language, and Intelligent Harness Runtime (IHR), a shared runtime that executes these harnesses through explicit contracts, durable artifacts, and lightweight adapters. Across coding and computer-use benchmarks, we conduct controlled evaluations of operational viability, module ablation, and code-to-text harness migration.
