Key Features

Action-conditioned world model for robotics and embodied agents.
Uses unified 2D kinematic skeleton conditioning across robot embodiments.
Covers four robot embodiments and the MANO hand model, according to the project description.
Fine-tuned from Cosmos-Predict2.5-2B on robot teleoperation and egocentric human video.
Supports policy-evaluation style workflows through predicted visual futures.
Provides a public paper, code repository, and Hugging Face model link.
Designed to reduce embodiment fragmentation with a common conditioning format.
Useful for studying robot-world prediction, action planning, and cross-embodiment transfer.

OSCAR is designed to predict future visual observations conditioned on actions, across different embodiments. Because every embodiment is expressed through the same 2D skeleton interface, one model can handle robots and hands in a common control representation, instead of requiring a separate world model for each hardware form.
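As a rough illustration of what a shared 2D skeleton interface could look like, the sketch below projects 3D joint positions into the image with a pinhole camera and draws the bones into a single-channel map. This is not OSCAR's actual conditioning code: the function names, the intrinsics matrix `K`, the three-joint arm, and the bone list are all hypothetical and chosen only to keep the example small.

```python
import numpy as np

def project_joints(joints_3d, K):
    """Project (N, 3) camera-frame points to (N, 2) pixel coordinates (pinhole model)."""
    joints_3d = np.asarray(joints_3d, dtype=float)
    uvw = joints_3d @ K.T               # homogeneous pixel coordinates, shape (N, 3)
    return uvw[:, :2] / uvw[:, 2:3]     # divide by depth

def rasterize_skeleton(joints_2d, bones, height, width, samples=64):
    """Draw each bone as a line of ones in an (height, width) float map."""
    canvas = np.zeros((height, width), dtype=np.float32)
    t = np.linspace(0.0, 1.0, samples)[:, None]
    for a, b in bones:
        # Sample points along the segment between the two joints.
        pts = (1.0 - t) * joints_2d[a] + t * joints_2d[b]
        cols = np.round(pts[:, 0]).astype(int)
        rows = np.round(pts[:, 1]).astype(int)
        inside = (rows >= 0) & (rows < height) & (cols >= 0) & (cols < width)
        canvas[rows[inside], cols[inside]] = 1.0
    return canvas

# Hypothetical camera intrinsics and a three-joint arm (assumptions, not from the project).
K = np.array([[100.0, 0.0, 32.0],
              [0.0, 100.0, 32.0],
              [0.0, 0.0, 1.0]])
arm_joints = [(0.0, 0.0, 1.0), (0.1, 0.0, 1.0), (0.1, 0.1, 1.0)]
arm_bones = [(0, 1), (1, 2)]

joints_2d = project_joints(arm_joints, K)
conditioning = rasterize_skeleton(joints_2d, arm_bones, height=64, width=64)
```

Under these assumptions, a different robot or a hand would only change the joint list and bone list; the output map keeps the same shape, which is the sense in which the conditioning format is common across embodiments.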


OSCAR is aimed at robotics researchers who want a visual world model for policy evaluation, data processing, and embodiment-transfer experiments. The page links to an arXiv paper, the GitHub code, and a Hugging Face model, which makes the work practical to reproduce.
