
DiffusionLane: Diffusion Model for Lane Detection

Kunyang Zhou, Yeqin Shao

2025-10-28


Summary

This paper introduces DiffusionLane, a new method for detecting road lanes that borrows the denoising technique diffusion models use to generate images.

What's the problem?

Existing lane detection systems often struggle to identify lanes accurately, especially when conditions change, such as lighting or road markings. They can also represent lanes too coarsely for precise detection, and they tend to rely heavily on the initial guesses (lane anchors) they are given about where lanes might be.

What's the solution?

DiffusionLane tackles this by framing lane detection as a process of gradually removing noise. During training, it takes the parameters of the true lanes (their starting points and angles), adds random noise to them, and teaches the model to progressively 'denoise' these noisy lane anchors back into precise detections. To get better lane anchors, the researchers combine two types of decoders, one looking at the big picture and one focusing on details, and they add an auxiliary head during training that gives the encoder extra supervision for learning lane features. Essentially, the model learns to start with a blurry idea of the lanes and sharpen it into a clear lane marking.
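To make the idea concrete, here is a minimal, hypothetical sketch of the noising-and-denoising loop on lane anchor parameters. The (x, y, angle) layout, the noise schedule, and the function names are illustrative assumptions, not the paper's actual code:

```python
import math
import torch

def add_noise(anchors: torch.Tensor, t: int, num_steps: int = 1000) -> torch.Tensor:
    """Training: corrupt ground-truth anchor parameters (x, y, angle) at timestep t."""
    alpha = 1.0 - t / num_steps  # toy linear noise schedule, for illustration only
    noise = torch.randn_like(anchors)
    return math.sqrt(alpha) * anchors + math.sqrt(1.0 - alpha) * noise

@torch.no_grad()
def detect_lanes(model, image_features, num_anchors: int = 20, steps: int = 4):
    """Inference: start from random anchors and let the model refine them step by step."""
    anchors = torch.randn(num_anchors, 3)  # one (x, y, angle) triple per anchor
    for step in reversed(range(1, steps + 1)):
        t = step / steps
        anchors = model(image_features, anchors, t)  # predict cleaner anchors
    return anchors
```

In the actual method, each refinement step is conditioned on image features from the encoder and, per the paper, decoded with both a global-level and a local-level decoder.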

Why it matters?

This research matters because DiffusionLane shows significant improvements over existing lane detection methods on several standard benchmarks. It is more accurate and generalizes better across different road conditions, which is crucial for self-driving cars and advanced driver-assistance systems that depend on reliable lane keeping. The fact that it performs well even with lightweight backbone networks such as MobileNetV4 suggests it could run in systems where computational resources are limited.

Abstract

In this paper, we present a novel diffusion-based model for lane detection, called DiffusionLane, which treats the lane detection task as a denoising diffusion process in the parameter space of the lane. Firstly, we add Gaussian noise to the parameters (the starting point and the angle) of ground-truth lanes to obtain noisy lane anchors, and the model learns to refine the noisy lane anchors progressively to obtain the target lanes. Secondly, we propose a hybrid decoding strategy to address the poor feature representation of the encoder that results from the noisy lane anchors. Specifically, we design a hybrid diffusion decoder that combines global-level and local-level decoders to produce high-quality lane anchors. Then, to improve the feature representation of the encoder, we employ an auxiliary head in the training stage that uses learnable lane anchors to enrich the supervision on the encoder. Experimental results on four benchmarks, Carlane, Tusimple, CULane, and LLAMAS, show that DiffusionLane possesses strong generalization ability and promising detection performance compared to previous state-of-the-art methods. For example, DiffusionLane with ResNet18 surpasses existing methods by at least 1% accuracy on the domain adaptation dataset Carlane. Besides, DiffusionLane achieves an 81.32% F1 score on CULane with MobileNetV4, 96.89% accuracy on Tusimple with ResNet34, and a 97.59% F1 score on LLAMAS with ResNet101. Code will be available at https://github.com/zkyntu/UnLanedet.
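As a rough illustration of how the two training signals described in the abstract could be combined, here is a hedged sketch of one training step. The module names (encoder, diffusion_decoder, aux_head) and the L1 losses are assumptions for illustration, not the paper's actual objective, and add_noise is the toy helper sketched earlier:

```python
import torch
import torch.nn.functional as F

def training_step(encoder, diffusion_decoder, aux_head, image, gt_anchors, t):
    """One illustrative training step: denoising loss plus auxiliary supervision."""
    feats = encoder(image)

    # Diffusion branch: learn to recover the true anchors from noisy ones.
    noisy_anchors = add_noise(gt_anchors, t)
    refined = diffusion_decoder(feats, noisy_anchors, t)
    diffusion_loss = F.l1_loss(refined, gt_anchors)

    # Auxiliary branch: learnable lane anchors give the encoder richer supervision.
    aux_pred = aux_head(feats)
    aux_loss = F.l1_loss(aux_pred, gt_anchors)

    return diffusion_loss + aux_loss
```

The auxiliary head is used only during training; at test time the model relies on the hybrid diffusion decoder to refine noisy anchors into the final lanes.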