JarvisArt: Liberating Human Artistic Creativity via an Intelligent Photo Retouching Agent
Yunlong Lin, Zixu Lin, Kunjie Lin, Jinbin Bai, Panwang Pan, Chenxin Li, Haoyu Chen, Zhongdao Wang, Xinghao Ding, Wenbo Li, Shuicheng Yan
2025-06-25
Summary
This paper introduces JarvisArt, an intelligent agent powered by a multi-modal large language model (MLLM) that helps users retouch their photos by interpreting what they want and using multiple editing tools to make detailed adjustments.
What's the problem?
Photo retouching is complex: using many tools effectively takes skill, and existing AI models such as GPT-4o do not always understand or fulfill the user’s artistic intentions.
What's the solution?
The researchers created JarvisArt to act like a smart assistant that understands user requests and coordinates different photo editing functions in Lightroom to retouch photos more accurately and creatively.
Why does it matter?
This matters because it makes photo editing easier and more accessible, helping people bring their artistic ideas to life more effectively, even if they are not expert editors.
Abstract
JarvisArt, an MLLM-driven agent, achieves superior photo retouching by understanding user intent and coordinating multiple retouching tools in Lightroom, outperforming GPT-4o on a novel benchmark.