Through the Perspective of LiDAR: A Feature-Enriched and Uncertainty-Aware Annotation Pipeline for Terrestrial Point Cloud Segmentation
Fei Zhang, Rob Chancia, Josie Clapp, Amirhossein Hassanzadeh, Dimah Dera, Richard MacKenzie, Jan van Aardt
2025-10-14
Summary
This paper introduces a new method for automatically labeling 3D scans of forests, specifically mangrove forests, to help with ecological studies. It aims to reduce the amount of time and effort researchers spend manually identifying different parts of the forest in the scans.
What's the problem?
Currently, accurately identifying things like trees, ground, and vegetation in 3D laser scans (called semantic segmentation) requires a lot of painstaking manual labeling. This is slow, expensive, and limits how much forest area can be studied. It's hard to get enough labeled data to train computers to do this automatically, and it's unclear which characteristics of the scans are most helpful for accurate identification.
What's the solution?
The researchers developed a system that combines several techniques to make labeling easier and more efficient. First, they transform the 3D scan into a 2D image-like representation. Then, they add extra information (features) to each pixel in the image. They use an ensemble of neural networks to automatically predict labels and to flag areas where the networks disagree, i.e., where the prediction is uncertain. This uncertainty then guides a human annotator to label only the ambiguous parts, reducing the overall labeling effort. Finally, they created a new dataset of mangrove forests called Mangrove3D and tools to visualize the results in 2D and 3D.
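The first step above, turning a 3D scan into a 2D image-like grid, is a standard spherical (range-image) projection. The sketch below illustrates the idea with NumPy; the function name, grid size, and details are illustrative assumptions, not the authors' exact implementation.

```python
import numpy as np

def spherical_project(points, h=64, w=1024):
    """Map Nx3 points (x, y, z) onto an h x w spherical grid.

    Each point's azimuth picks a column and its elevation picks a row,
    so the cloud becomes a 2D image whose pixels can then be enriched
    with extra feature channels. Grid size (h, w) is an assumption.
    Returns per-point (row, col) indices and ranges; when rasterizing,
    collisions are typically resolved by keeping the nearest point.
    """
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    r = np.linalg.norm(points, axis=1)           # range to sensor
    azimuth = np.arctan2(y, x)                   # [-pi, pi]
    elevation = np.arcsin(z / r)                 # [-pi/2, pi/2]
    col = ((azimuth + np.pi) / (2 * np.pi) * w).astype(int) % w
    row = ((np.pi / 2 - elevation) / np.pi * h).astype(int)
    row = row.clip(0, h - 1)
    return row, col, r
```

A point straight ahead of the sensor, e.g. `(1, 0, 0)`, lands in the middle row and middle column of the grid; back-projecting 2D labels to 3D simply reverses this per-point index lookup.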
Why it matters?
This work is important because it provides a way to quickly and accurately map and monitor forests using 3D scans. By reducing the need for manual labeling, it makes large-scale ecological studies more feasible. The research also identifies which features of the 3D scans are most important for accurate identification, and shows that only a relatively small amount of labeled data is needed to achieve good results. This can be applied to other types of forests and environments, helping with conservation efforts and understanding ecosystem changes.
Abstract
Accurate semantic segmentation of terrestrial laser scanning (TLS) point clouds is limited by costly manual annotation. We propose a semi-automated, uncertainty-aware pipeline that integrates spherical projection, feature enrichment, ensemble learning, and targeted annotation to reduce labeling effort while sustaining high accuracy. Our approach projects 3D points to a 2D spherical grid, enriches pixels with multi-source features, and trains an ensemble of segmentation networks to produce pseudo-labels and uncertainty maps, the latter guiding annotation of ambiguous regions. The 2D outputs are back-projected to 3D, yielding densely annotated point clouds supported by a three-tier visualization suite (2D feature maps, 3D colorized point clouds, and compact virtual spheres) for rapid triage and reviewer guidance. Using this pipeline, we build Mangrove3D, a semantic segmentation TLS dataset for mangrove forests. We further evaluate data efficiency and feature importance to address two key questions: (1) how much annotated data are needed and (2) which features matter most. Results show that performance saturates after ~12 annotated scans, geometric features contribute the most, and compact nine-channel stacks capture nearly all discriminative power, with the mean Intersection over Union (mIoU) plateauing at around 0.76. Finally, we confirm the generalization of our feature-enrichment strategy through cross-dataset tests on ForestSemantic and Semantic3D. Our contributions include: (i) a robust, uncertainty-aware TLS annotation pipeline with visualization tools; (ii) the Mangrove3D dataset; and (iii) empirical guidance on data efficiency and feature importance, thus enabling scalable, high-quality segmentation of TLS point clouds for ecological monitoring and beyond. The dataset and processing scripts are publicly available at https://fz-rit.github.io/through-the-lidars-eye/.
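The ensemble's dual output of pseudo-labels and uncertainty maps can be sketched as follows: average the members' per-pixel class probabilities, take the argmax as the pseudo-label, and use the entropy of the averaged distribution as the uncertainty score. This is a common recipe for deep ensembles; the function name and the entropy choice are illustrative assumptions, not necessarily the paper's exact formulation.

```python
import numpy as np

def ensemble_pseudo_labels(probs):
    """probs: (M, H, W, C) softmax maps from M ensemble members.

    Returns per-pixel pseudo-labels (argmax of the mean distribution)
    and predictive entropy, which is high where members disagree --
    those pixels are the candidates routed to a human annotator.
    """
    mean = probs.mean(axis=0)                          # (H, W, C)
    pseudo = mean.argmax(axis=-1)                      # (H, W)
    entropy = -(mean * np.log(mean + 1e-12)).sum(-1)   # (H, W)
    return pseudo, entropy
```

When two members put all their mass on different classes, the averaged distribution is uniform and the entropy peaks at log C, so the pixel is flagged for review; unanimous pixels score near zero and keep their pseudo-label.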