Deep learning with simulated laser scanning data for 3D point cloud classification

Laser scanning is an active remote sensing technique applied in many disciplines to acquire state-of-the-art spatial measurements. Semantic labeling is often necessary to extract information from the raw point cloud. Deep learning methods constitute a data-hungry solution for the semantic segmentation of point clouds. In this work, we investigate the use of simulated laser scanning for training deep learning models, which are subsequently applied to real data. We show that a deep learning model trained purely on virtual laser scanning data can, when evaluated on real data, produce results comparable to a model trained on real data. For leaf-wood segmentation of trees, a KPConv model trained on virtual data achieves 93.7% overall accuracy, while the model trained on real data reaches 94.7%. In urban contexts, a KPConv model trained on virtual data achieves 74.1% overall accuracy on real validation data, while the model trained on real data achieves 82.4%. In terms of generalization to unseen real data, our models outperform both the state-of-the-art model FSCT and a baseline model trained on points randomly sampled from the tree mesh surface. From our results, we conclude that combining laser scanning simulation with deep learning is a cost-effective alternative to real data acquisition and manual labeling in the domain of geospatial point cloud analysis. The strengths of this approach are that (a) a large amount of diverse laser scanning training data can be generated quickly and without the need for expensive equipment, (b) the simulation configurations can be adapted so that the virtual training data have characteristics similar to the targeted real data, and (c) the whole workflow can be automated through procedural scene generation.

Keywords: Virtual laser scanning, LiDAR simulation, Point clouds, Machine learning, Point-wise classification, Leaf-wood segmentation