CARLA-Air: Fly Drones Inside a CARLA World -- A Unified Infrastructure for Air-Ground Embodied Intelligence
Tianle Zeng, Hanxuan Chen, Yanci Wen, Hong Zhang
2026-04-01
Summary
This paper introduces CARLA-Air, a new simulation environment that combines realistic city driving with accurate drone flight, all happening in the same virtual world at the same time.
What's the problem?
Today's open-source simulators for cars and drones are separate. Driving simulators lack aerial dynamics, and multirotor simulators lack realistic ground scenes. Bridging two separate simulators adds synchronization overhead and cannot guarantee that both sides see the same world state at the same moment, which makes it hard to test systems where cars and drones must work together.
What's the solution?
The researchers created CARLA-Air by combining the CARLA driving simulator with the AirSim drone simulator *within* a single Unreal Engine process. Ground and aerial agents advance on the same physics tick and share one rendering pipeline, so their states and sensor data stay aligned in time and space. The platform keeps the native CARLA and AirSim Python APIs and ROS 2 interfaces, so existing code runs without modification, and it can capture up to 18 sensor modalities synchronously at each tick.
Why it matters?
This is important because more applications now involve drones and ground vehicles interacting, such as delivery services or coordinated search and rescue operations. CARLA-Air provides a realistic and efficient way to develop and test the 'brains' (artificial intelligence) for these kinds of systems, including cooperation, navigation, perception, dataset construction, and reinforcement-learning policy training. It also keeps the AirSim flight stack evolving after its upstream project was archived.
Abstract
The convergence of low-altitude economies, embodied intelligence, and air-ground cooperative systems creates growing demand for simulation infrastructure capable of jointly modeling aerial and ground agents within a single physically coherent environment. Existing open-source platforms remain domain-segregated: driving simulators lack aerial dynamics, while multirotor simulators lack realistic ground scenes. Bridge-based co-simulation introduces synchronization overhead and cannot guarantee strict spatial-temporal consistency. We present CARLA-Air, an open-source infrastructure that unifies high-fidelity urban driving and physics-accurate multirotor flight within a single Unreal Engine process. The platform preserves both CARLA and AirSim native Python APIs and ROS 2 interfaces, enabling zero-modification code reuse. Within a shared physics tick and rendering pipeline, CARLA-Air delivers photorealistic environments with rule-compliant traffic, socially-aware pedestrians, and aerodynamically consistent UAV dynamics, synchronously capturing up to 18 sensor modalities across all platforms at each tick. The platform supports representative air-ground embodied intelligence workloads spanning cooperation, embodied navigation and vision-language action, multi-modal perception and dataset construction, and reinforcement-learning-based policy training. An extensible asset pipeline allows integration of custom robot platforms into the shared world. By inheriting AirSim's aerial capabilities -- whose upstream development has been archived -- CARLA-Air ensures this widely adopted flight stack continues to evolve within a modern infrastructure. Released with prebuilt binaries and full source: https://github.com/louiszengCN/CarlaAir
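The contrast the abstract draws between bridge-based co-simulation and a shared physics tick can be illustrated with a toy timing model. The sketch below is not CARLA-Air code and uses neither the CARLA nor the AirSim API; the class and function names (`Agent`, `run_shared_tick`, `run_bridged`, `max_skew`) and the step sizes are hypothetical, chosen only to show how two independent clocks can produce timestamp skew that a single shared clock avoids.

```python
from dataclasses import dataclass, field
from fractions import Fraction


@dataclass
class Agent:
    """Hypothetical agent that records the simulation time of each capture."""
    name: str
    stamps: list = field(default_factory=list)

    def capture(self, sim_time):
        self.stamps.append(sim_time)


def run_shared_tick(agents, dt, steps):
    """One clock advances; every agent samples at exactly the same time."""
    t = Fraction(0)
    for _ in range(steps):
        t += dt
        for agent in agents:
            agent.capture(t)


def run_bridged(car, drone, dt_car, dt_drone, steps):
    """Two independent clocks; a bridge pairs the latest sample from each.

    Assumption for illustration: the drone simulator steps as far as it can
    without overshooting the car simulator's current time.
    """
    t_car = Fraction(0)
    t_drone = Fraction(0)
    for _ in range(steps):
        t_car += dt_car
        car.capture(t_car)
        while t_drone + dt_drone <= t_car:
            t_drone += dt_drone
        drone.capture(t_drone)


def max_skew(a, b):
    """Largest timestamp difference between paired captures."""
    return max(abs(x - y) for x, y in zip(a.stamps, b.stamps))


# Shared tick: both agents sample on one 50 ms clock (illustrative value).
car_s, drone_s = Agent("car"), Agent("drone")
run_shared_tick([car_s, drone_s], Fraction(1, 20), steps=6)

# Bridged: car steps at 50 ms, drone at 30 ms (illustrative values).
car_b, drone_b = Agent("car"), Agent("drone")
run_bridged(car_b, drone_b, Fraction(1, 20), Fraction(3, 100), steps=6)

print("shared-tick skew:", max_skew(car_s, drone_s))  # 0
print("bridged skew:", max_skew(car_b, drone_b))      # 1/50
```

In this toy model, paired captures in the bridged setup differ by up to 20 ms of simulated time, while the shared-tick setup has zero skew by construction. Real bridges also add communication latency, which the sketch does not model.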