Comparison of a massive and diverse collection of classifiers for oil spill detection in SAR images

We present a comparison of the largest collection of classifiers considered so far in the literature, composed of 428 methods belonging to 41 very different families. This collection, much larger than the one in our previous work (Fernández-Delgado et al. in J Mach Learn Res 15:3133–3181, 2014), includes 320 ensembles (varying the base and meta-classifiers), alongside Support Vector Machines, Bayesian classifiers, Neural Networks, Discriminant Analysis, Multivariate Adaptive Regression Splines, Random Forests, Decision Trees and many others. The comparison is carried out on the detection of oil spills in Synthetic Aperture Radar (SAR) images taken from satellites. SAR images have proved very useful to maritime surveillance agencies for detecting routine offshore operational discharges, which, contrary to common belief, are one of the largest causes of hydrocarbon marine pollution, rather than tanker and oil platform catastrophes. After the SAR images are segmented to select oil spill candidates, the classifiers use the features extracted from these candidates to discard the frequent and expensive look-alikes (false positives) caused by natural phenomena. Test experiments revealed that the RotationForest ensemble of MultilayerPerceptron base classifiers, applying Kernel PCA to the original data, achieves the best accuracy and Cohen κ (87.1 % and 71.0 %, respectively) with a low frequency of false positives (5.13 %).
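For readers unfamiliar with the reported measures, the following minimal Python sketch shows how accuracy, Cohen κ and a false-positive frequency can be computed from a two-class confusion matrix (oil spill vs. look-alike). It is an illustration only, not the evaluation code used in this work: the function name `binary_metrics` and the example counts are hypothetical, and the false-positive frequency is assumed here to be the share of false positives among all test patterns.

```python
def binary_metrics(tp, fp, fn, tn):
    """Accuracy, Cohen kappa and false-positive frequency from 2x2 counts.

    tp/fn: true oil spills classified as spill / as look-alike.
    fp/tn: look-alikes classified as spill / as look-alike.
    """
    n = tp + fp + fn + tn
    if n == 0:
        raise ValueError("empty confusion matrix")

    # Observed agreement between predicted and true labels.
    accuracy = (tp + tn) / n

    # Agreement expected by chance, from the row and column marginals.
    p_spill = ((tp + fp) / n) * ((tp + fn) / n)
    p_lookalike = ((fn + tn) / n) * ((fp + tn) / n)
    p_chance = p_spill + p_lookalike

    # Kappa is undefined when chance agreement is 1; report 0 in that case.
    kappa = 0.0 if p_chance == 1.0 else (accuracy - p_chance) / (1.0 - p_chance)

    # Assumption: false positives counted as a fraction of all patterns.
    fp_frequency = fp / n
    return accuracy, kappa, fp_frequency


# Hypothetical counts, chosen only to illustrate the arithmetic.
acc, kappa, fp_freq = binary_metrics(tp=40, fp=5, fn=10, tn=45)
print(f"accuracy={acc:.3f}  kappa={kappa:.3f}  fp_frequency={fp_freq:.3f}")
```

Because κ subtracts the agreement expected by chance, it can be much lower than the accuracy when the classes are unbalanced, which is why both figures are reported together.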

keywords: Oil spill detection, Synthetic Aperture Radar satellite images, Rotation Forests, Artificial Neural Networks, Support Vector Machines, Decision Trees, Bagging, Boosting, Kernel Principal Component Analysis, Classifier ensembles, Multivariate Adaptive