RAPTOR: A Foundation Policy for Quadrotor Control

Jonas Eschmann, Dario Albani, Giuseppe Loianno

2025-09-17

Summary

This paper introduces a new method, RAPTOR, for creating a single, adaptable control system for quadrotors – those little flying robots. Instead of needing to be trained separately for each type of quadrotor, this one system can quickly adapt to control a wide variety of quadrotors, including ones it has never seen before.

What's the problem?

Current robotic control systems, especially those trained with Reinforcement Learning, are very good at one specific task in one specific environment. However, they break down when things change even a little, like moving from a computer simulation to the real world (the so-called Sim2Real gap) or switching to a slightly different robot. This is because they essentially 'overfit' to the original conditions and must be completely retrained after even minor changes, which is time-consuming and impractical.

What's the solution?

The researchers developed RAPTOR, which combines several techniques. First, they trained 1,000 different 'teacher' policies, each specialized for a different simulated quadrotor. Then, they 'distilled' the knowledge from all these teachers into a single 'student' policy. This student policy has a recurrent hidden layer that lets it quickly adapt to quadrotors it has never seen before, a process called In-Context Learning. It is also surprisingly small: a three-layer network with only 2,084 parameters.
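To make that architecture concrete, here is a minimal PyTorch sketch of a tiny recurrent policy of roughly this shape. The layer sizes, the GRU cell, and the observation/action dimensions are illustrative assumptions, not the paper's exact design; the key idea it demonstrates is the recurrent hidden state, which is the only place the policy can accumulate information about the current quadrotor's dynamics while flying.

```python
import torch
import torch.nn as nn

class RecurrentPolicy(nn.Module):
    """Tiny recurrent control policy, roughly the shape the paper describes.

    Layer sizes here are illustrative; the paper reports a three-layer
    network with 2,084 parameters and recurrence in the hidden layer.
    """

    def __init__(self, obs_dim: int = 18, hidden_dim: int = 16, act_dim: int = 4):
        super().__init__()
        self.input_layer = nn.Linear(obs_dim, hidden_dim)
        # Recurrence in the hidden layer: the hidden state is where the
        # policy stores what it has inferred about the current platform.
        self.rnn = nn.GRUCell(hidden_dim, hidden_dim)
        self.output_layer = nn.Linear(hidden_dim, act_dim)

    def forward(self, obs: torch.Tensor, h: torch.Tensor):
        x = torch.tanh(self.input_layer(obs))
        h = self.rnn(x, h)                         # in-context adaptation state
        action = torch.tanh(self.output_layer(h))  # normalized motor commands
        return action, h

policy = RecurrentPolicy()
h = torch.zeros(1, 16)      # reset the hidden state at the start of each flight
obs = torch.zeros(1, 18)    # placeholder observation (state estimate, setpoint, ...)
action, h = policy(obs, h)  # h now carries information about this quadrotor
```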

Why it matters?

This work is important because it represents a step towards more versatile and practical robots. Instead of needing to reprogram a robot every time you change something about it, you could use a system like RAPTOR that can adapt on the fly. This makes robots more useful in real-world situations where conditions are constantly changing and where you might want to use the same control system across a fleet of different robots.

Abstract

Humans are remarkably data-efficient when adapting to new unseen conditions, like driving a new car. In contrast, modern robotic control systems, like neural network policies trained using Reinforcement Learning (RL), are highly specialized for single environments. Because of this overfitting, they are known to break down even under small differences like the Simulation-to-Reality (Sim2Real) gap and require system identification and retraining for even minimal changes to the system. In this work, we present RAPTOR, a method for training a highly adaptive foundation policy for quadrotor control. Our method enables training a single, end-to-end neural-network policy to control a wide variety of quadrotors. We test 10 different real quadrotors from 32 g to 2.4 kg that also differ in motor type (brushed vs. brushless), frame type (soft vs. rigid), propeller type (2/3/4-blade), and flight controller (PX4/Betaflight/Crazyflie/M5StampFly). We find that a tiny, three-layer policy with only 2084 parameters is sufficient for zero-shot adaptation to a wide variety of platforms. The adaptation through In-Context Learning is made possible by using a recurrence in the hidden layer. The policy is trained through a novel Meta-Imitation Learning algorithm, where we sample 1000 quadrotors and train a teacher policy for each of them using Reinforcement Learning. Subsequently, the 1000 teachers are distilled into a single, adaptive student policy. We find that within milliseconds, the resulting foundation policy adapts zero-shot to unseen quadrotors. We extensively test the capabilities of the foundation policy under numerous conditions (trajectory tracking, indoor/outdoor, wind disturbance, poking, different propellers).
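As a rough illustration of the Meta-Imitation Learning step from the abstract, the sketch below distills a set of per-quadrotor teacher policies into one recurrent student by rolling the student out in each teacher's simulated platform and regressing the student's actions onto the teacher's. The environment interface, the mean-squared-error loss, the `initial_hidden()` helper, and the choice to roll out under the student's own actions (a common DAgger-style choice) are all assumptions for illustration; the paper's actual training recipe may differ.

```python
import torch

def distillation_step(student, teachers, envs, optimizer, rollout_len=256):
    """One gradient step of the distillation phase: for each sampled
    quadrotor, query its RL-trained teacher for target actions and
    regress the recurrent student onto them along a rollout.

    The env interface (reset/step returning observation tensors) and
    initial_hidden() are hypothetical, for illustration only.
    """
    total_loss = torch.zeros(())
    for teacher, env in zip(teachers, envs):
        obs = env.reset()
        h = student.initial_hidden()           # blank adaptation state
        for _ in range(rollout_len):
            with torch.no_grad():
                target = teacher(obs)          # expert action for this platform
            action, h = student(obs, h)        # student adapts via hidden state
            total_loss = total_loss + ((action - target) ** 2).mean()
            obs = env.step(action)             # roll out under the student
    optimizer.zero_grad()
    total_loss.backward()
    optimizer.step()
```

Because the loss is accumulated across many different platforms in every step, the student cannot memorize any single quadrotor's dynamics; it is pushed to use its hidden state to identify and adapt to whichever platform it is currently controlling.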