On the rankability of visual embeddings

Ankit Sonthalia, Arnas Uselis, Seong Joon Oh

2025-07-08

Summary

This paper examines whether visual embedding models, which convert images into numerical vectors, also arrange images in order along continuous attributes such as age or brightness. The authors call this property rankability. It indicates that the models capture meaningful information beyond just grouping similar images.

What's the problem?

While these models are good at grouping similar images, it is less clear whether they can also sort images into a meaningful sequence along a specific continuous trait. That ability is important for applications such as ranking photos by how old people look or how crowded a scene is.

What's the solution?

The researchers studied popular visual embedding models and found that many of them naturally capture continuous features along linear directions in their embedding space. They showed that these ranking directions can be found with just a few examples, making it easier to order images by an attribute without extensive training.
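To make the idea of a "ranking direction" concrete, here is a minimal sketch in Python. It is not the authors' code: it uses synthetic vectors in place of real image embeddings, assumes the attribute lies along a single hidden axis plus noise, and swaps in plain ridge regression as one simple way to estimate a direction from 20 labeled examples. The names fit_rank_direction, rank_images, spearman, and make_synthetic are hypothetical helpers introduced only for this illustration.

```python
import numpy as np

def fit_rank_direction(embeddings, attribute_values, ridge=1.0):
    """Estimate a unit vector w so that embeddings @ w increases with the attribute.

    Uses ridge regression on centered data (an illustrative choice,
    not necessarily the method used in the paper).
    """
    X = embeddings - embeddings.mean(axis=0)
    y = attribute_values - attribute_values.mean()
    gram = X.T @ X + ridge * np.eye(X.shape[1])
    w = np.linalg.solve(gram, X.T @ y)
    return w / np.linalg.norm(w)

def rank_images(embeddings, direction):
    """Return image indices sorted from lowest to highest projected score."""
    return np.argsort(embeddings @ direction)

def spearman(a, b):
    """Spearman rank correlation for tie-free 1-D arrays."""
    ranks_a = np.argsort(np.argsort(a))
    ranks_b = np.argsort(np.argsort(b))
    return float(np.corrcoef(ranks_a, ranks_b)[0, 1])

# Synthetic stand-in for real embeddings (assumption): the attribute,
# e.g. apparent age in years, is encoded along one hidden axis plus noise.
rng = np.random.default_rng(0)
dim = 8
hidden_axis = rng.normal(size=dim)
hidden_axis /= np.linalg.norm(hidden_axis)

def make_synthetic(n):
    attr = rng.uniform(0.0, 80.0, size=n)
    noise = rng.normal(scale=0.5, size=(n, dim))
    return attr[:, None] * hidden_axis + noise, attr

few_emb, few_attr = make_synthetic(20)     # the "few examples" with labels
test_emb, test_attr = make_synthetic(500)  # unlabeled images to be ranked

direction = fit_rank_direction(few_emb, few_attr)
order = rank_images(test_emb, direction)
rho = spearman(test_emb @ direction, test_attr)
print(f"Spearman correlation on held-out synthetic data: {rho:.3f}")
```

With real data, the synthetic arrays would be replaced by embeddings from a pretrained vision encoder and a handful of attribute labels. How well the ordering holds up then depends on how rankable that encoder's embeddings actually are for the attribute in question.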

Why does it matter?

This matters because being able to rank images automatically using embeddings opens up new possibilities for fast and simple image sorting and searching in apps like photo albums, online shopping, and content recommendation systems.

Abstract

Visual embedding models often capture continuous, ordinal attributes along linear directions, enabling image ranking with minimal supervision.
