Key Features

Supports promptable monocular 3D object detection.
Handles open-vocabulary categories with text prompts.
Accepts point and box prompts for spatial guidance.
Can use optional depth to improve geometric estimates.
Targets in-the-wild scene understanding.
Useful for robotics, AR, mapping, and embodied AI.
Combines semantic prompting with 3D geometry reasoning.
Provides a public research reference for flexible 3D detection.

The system combines open-vocabulary prompting with monocular 3D geometry reasoning. Text prompts specify categories, point or box prompts provide spatial guidance, and optional depth can improve geometric estimates. The technical challenge is inferring 3D position, extent, and object identity from limited visual evidence while remaining flexible across categories not fixed at training time.
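To make the role of optional depth concrete, here is a minimal Python sketch of how a point or box prompt plus a depth value pins down a 3D position under a standard pinhole camera model. This is not WildDet3D's API: the `Prompt` class, the `prompt_pixel` and `backproject` helpers, and the intrinsics values (`fx`, `fy`, `cx`, `cy`) are all hypothetical names and numbers chosen for illustration. Without depth, a monocular model has to estimate the `depth` value from image evidence alone, which is the hard part described above.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Prompt:
    """Hypothetical prompt container (illustration only, not WildDet3D's API)."""
    text: Optional[str] = None                                # open-vocabulary category
    point: Optional[Tuple[float, float]] = None               # (u, v) pixel
    box: Optional[Tuple[float, float, float, float]] = None   # (u_min, v_min, u_max, v_max)


def prompt_pixel(prompt: Prompt) -> Tuple[float, float]:
    """Return the pixel a spatial prompt refers to: the point, else the box center."""
    if prompt.point is not None:
        return prompt.point
    if prompt.box is not None:
        u_min, v_min, u_max, v_max = prompt.box
        return ((u_min + u_max) / 2.0, (v_min + v_max) / 2.0)
    raise ValueError("prompt has no spatial component (text-only prompt)")


def backproject(u: float, v: float, depth: float,
                fx: float, fy: float, cx: float, cy: float) -> Tuple[float, float, float]:
    """Pinhole back-projection of pixel (u, v) at a given depth to camera coordinates."""
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return (x, y, depth)


# Assumed example values: a 640x480 image with focal length 500 px.
FX, FY, CX, CY = 500.0, 500.0, 320.0, 240.0

chair = Prompt(text="chair", box=(400.0, 200.0, 440.0, 280.0))
u, v = prompt_pixel(chair)                       # box center: (420.0, 240.0)
center_3d = backproject(u, v, 2.0, FX, FY, CX, CY)
print(chair.text, center_3d)
```

With these assumed numbers, the box center lies 100 px to the right of the principal point, so at a depth of 2 m the object center sits about 0.4 m to the right of the optical axis. A full 3D detection would also need the box extent and orientation, which this sketch does not cover.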


WildDet3D is valuable for robotics, AR, mapping, embodied AI, and scene-understanding systems. It enables more flexible 3D perception because users can prompt for objects instead of relying only on a closed detector label set.
