CPU and GPU oriented optimizations for LiDAR data processing
Digital Terrain Models (DTMs) can be obtained accurately from LiDAR point clouds, but the time required to process those clouds can be prohibitive. This paper describes several optimization techniques applied to the Overlap Window Method (OWM), a key component of DTM applications. OWM was originally implemented in R, which severely limits the size of the LiDAR point cloud that can be processed. We have ported the code to C++, significantly optimized the data structure to minimize memory accesses, and developed parallel implementations for commodity CPU and GPU devices using oneAPI libraries and tools. The resulting CPU and GPU versions are up to 19x and 83x faster, respectively, than an OpenMP baseline that uses eight CPU cores. Most importantly, the proposed CPU and GPU optimizations can also benefit other LiDAR-based algorithms, where careful selection of the data structure, the parallelization strategy, and memory access reduction techniques can yield significant performance improvements.
keywords: LiDAR data processing, Digital Terrain Model, Tree data structures, Parallel optimization, GPU, SYCL, CUDA, oneAPI