Efficient GPU Asynchronous Implementation of a Watershed Algorithm Based on Cellular Automata

The watershed transform is a widely used method for unsupervised image segmentation and is especially suitable for low-contrast images. In this paper we show that a cellular-automaton-based algorithm for computing the watershed transform is well suited to the most recent GPU architectures, particularly when the synchronization rules are relaxed. Specifically, we compare a synchronous and an asynchronous implementation of the algorithm. The results show high speedups for both implementations, with the higher speedup obtained by the asynchronous one, indicating the potential of this kind of algorithm for new architectures based on hundreds of cores.

Keywords: CUDA, cellular automata, GPU, image segmentation, watershed transform