HOComp: Interaction-Aware Human-Object Composition

Dong Liang, Jinyuan Jia, Yuhao Liu, Rynson W. H. Lau

2025-07-23

Summary

This paper introduces HOComp, a new method that combines multi-modal large language models (MLLMs) with appearance-preservation techniques to compose images in which humans and objects interact naturally while keeping their appearances consistent.

What's the problem?

When compositing a human and an object into a single image, it is difficult to make their interaction look physically plausible and to keep the appearance of each element consistent with the rest of the scene. Existing approaches often produce unnatural poses or mismatched appearances.

What's the solution?

The researchers developed HOComp, which uses MLLMs to reason about how the human and object should interact, and applies appearance-preservation strategies to keep each element's look consistent during composition. Together, these components produce images in which humans and objects fit together harmoniously.

Why it matters?

This matters because it enables more realistic and visually coherent human-object scenes, which benefits applications such as virtual reality, gaming, advertising, and general content creation, where composited images must look natural.

Abstract

HOComp is a novel method that leverages MLLMs and appearance-preservation techniques to generate harmonious human-object interactions with consistent appearances in composited images.