AgroBench: Vision-Language Model Benchmark in Agriculture
Risa Shinoda, Nakamasa Inoue, Hirokatsu Kataoka, Masaki Onishi, Yoshitaka Ushiku
2025-08-01
Summary
This paper introduces AgroBench, a new benchmark designed to test how well vision-language models perform on agricultural tasks, such as identifying different plants and weeds.
What's the problem?
Current vision-language models struggle to identify fine-grained details in agricultural images, especially when distinguishing weeds from crops, a distinction that is critical for farming.
What's the solution?
AgroBench addresses this with a carefully labeled, expert-annotated dataset whose categories target these challenging identification tasks, so models can be evaluated and improved specifically for agriculture.
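To make the evaluation idea concrete, here is a minimal sketch of a multiple-choice benchmark scoring loop with per-category accuracy, the kind of metric such a benchmark would report. The item fields (`category`, `question`, `options`, `answer`), the toy data, and the `predict` callable are illustrative assumptions, not the actual AgroBench format or API.

```python
# Sketch of a multiple-choice evaluation loop with per-category accuracy.
# Item schema and data below are hypothetical, not real AgroBench entries.

def evaluate(dataset, predict):
    """Return per-category accuracy for a predict(question, options) model."""
    correct, total = {}, {}
    for item in dataset:
        cat = item["category"]
        pred = predict(item["question"], item["options"])
        correct[cat] = correct.get(cat, 0) + (pred == item["answer"])
        total[cat] = total.get(cat, 0) + 1
    return {c: correct[c] / total[c] for c in total}

# Tiny toy dataset to exercise the loop (placeholder content).
toy = [
    {"category": "weed_id", "question": "Which weed is shown?",
     "options": ["pigweed", "ryegrass"], "answer": "pigweed"},
    {"category": "weed_id", "question": "Which weed is shown?",
     "options": ["pigweed", "ryegrass"], "answer": "ryegrass"},
]

always_first = lambda q, opts: opts[0]  # dummy baseline: always picks option 0
print(evaluate(toy, always_first))  # {'weed_id': 0.5}
```

Reporting accuracy per expert-annotated category, rather than one aggregate number, is what lets a benchmark like this expose weaknesses in specific tasks such as weed identification.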
Why does it matter?
Better vision-language models for agriculture can help farmers monitor and manage crops more effectively, leading to higher yields and more sustainable farming practices.
Abstract
AgroBench evaluates vision-language models across agricultural tasks, revealing areas for improvement in fine-grained identification, particularly weed identification, with expert-annotated categories.