Diverse Controllable Diffusion Policy with Signal Temporal Logic

Yue Meng, Chuchu Fan

2025-03-06

Summary

This paper presents a new way to create realistic simulations for self-driving cars and robots that interact with humans. It combines formal, mathematically specified rules with generative AI models.

What's the problem?

Current simulators struggle to create diverse and realistic behaviors for virtual cars and pedestrians that also follow traffic rules. Rule-based systems are rigid and need careful tuning, while AI-based systems that learn from real data are not designed to follow rules explicitly. Real-world data also records only one outcome per situation, which makes it hard for these systems to learn varied behaviors.

What's the solution?

The researchers combined a formal language called Signal Temporal Logic (STL) with AI models called Diffusion Models. They first used STL to describe traffic rules and calibrated those rules on real-world data. They then generated a large amount of synthetic but realistic data that follows the rules. Finally, they trained an AI model on this data to produce diverse, rule-following behaviors. They tested their method on a real-world dataset and found that it produced the most varied rule-following behaviors while running much faster than other methods.

Why does it matter?

Better simulations can help develop safer self-driving cars and robots that interact with humans. With more realistic and varied simulated scenarios, these systems can be tested and improved more effectively before they are deployed in the real world, potentially saving lives and making autonomous systems more reliable.

Abstract

Generating realistic simulations is critical for autonomous system applications such as self-driving and human-robot interactions. However, current driving simulators still struggle to generate controllable, diverse, and rule-compliant behaviors for road participants: rule-based models cannot produce diverse behaviors and require careful tuning, whereas learning-based methods imitate the policy from data but are not designed to follow the rules explicitly. In addition, real-world datasets are by nature "single-outcome", which makes it hard for learning methods to generate diverse behaviors. In this paper, we leverage Signal Temporal Logic (STL) and Diffusion Models to learn a controllable, diverse, and rule-aware policy. We first calibrate the STL on the real-world data, then generate diverse synthetic data using trajectory optimization, and finally learn the rectified diffusion policy on the augmented dataset. On the NuScenes dataset, our approach achieves the most diverse rule-compliant trajectories among the baselines, with a runtime 1/17 that of the second-best approach. In closed-loop testing, our approach reaches the highest diversity and rule satisfaction rate and the lowest collision rate. Our method can generate varied characteristics conditioned on different STL parameters at test time. A case study on human-robot encounter scenarios shows that our approach can generate diverse and close-to-oracle trajectories. The annotation tool, augmented dataset, and code are available at https://github.com/mengyuest/pSTL-diffusion-policy.
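
To make the idea of a parameterized STL rule concrete, here is a minimal sketch of how a rule can be scored on a trajectory with the standard quantitative ("robustness") semantics of STL: "always" takes the worst-case margin over time, "eventually" takes the best-case margin, and a conjunction takes the minimum. The rule, the function names, and the parameters `v_max` and `goal_radius` are hypothetical and chosen for illustration only; they are not taken from the paper or its code.

```python
import numpy as np

def always(margins):
    """Robustness of G(margin >= 0): the worst-case margin over time."""
    return float(np.min(margins))

def eventually(margins):
    """Robustness of F(margin >= 0): the best-case margin over time."""
    return float(np.max(margins))

def rule_robustness(speeds, dist_to_goal, v_max, goal_radius):
    """Robustness of the illustrative rule
    G(speed <= v_max) AND F(dist_to_goal <= goal_radius).
    A positive value means the trajectory satisfies the rule."""
    stay_under_limit = always(v_max - speeds)
    reach_goal = eventually(goal_radius - dist_to_goal)
    return min(stay_under_limit, reach_goal)  # conjunction = minimum

# A toy trajectory (hypothetical values): speed in m/s, distance to goal in m.
speeds = np.array([4.0, 5.0, 6.0, 5.5])
dist_to_goal = np.array([9.0, 6.0, 3.0, 0.5])

print(rule_robustness(speeds, dist_to_goal, v_max=8.0, goal_radius=1.0))  # 0.5
print(rule_robustness(speeds, dist_to_goal, v_max=5.0, goal_radius=1.0))  # -1.0
```

The same trajectory satisfies the rule under one parameter setting and violates it under another, which is the sense in which STL parameters can act as a control knob for the generated behavior.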
