On the use of feature selection to improve the detection of sea oil spills in SAR images
Fast and effective oil spill detection systems are crucial to ensure a proper response to environmental emergencies caused by hydrocarbon pollution on the ocean’s surface. Typically, these systems uncover not only oil spills, but also a high number of look-alikes. The feature extraction is a critical and computationally intensive phase where each detected dark spot is independently examined. Traditionally, detection systems use an arbitrary set of features to discriminate between oil spills and look-alikes phenomena. However, Feature Selection (FS) methods based on Machine Learning (ML) have proved to be very useful in real domains for enhancing the generalization capabilities of the classifiers, while discarding the existing irrelevant features. In this work,we present a generic and systematic approach, based on FS methods, for choosing a concise and relevant set of features to improve the oil spill detection systems. We have compared five FS methods: Correlation-based feature selection (CFS), Consistency-based filter, Information Gain, ReliefF and Recursive Feature Elimination for Support Vector Machine (SVM-RFE). They were applied on a 141-input vector composed of features from a collection of outstanding studies. Selected features were validated via a Support Vector Machine (SVM) classifier and the results were compared with previous works. Test experiments revealed that the classifier trained with the 6-input feature vector proposed by SVM-RFE achieved the best accuracy and Cohen’s kappa coefficient (87.1% and 74.06% respectively). This is a smaller feature combination with similar or even better classification accuracy than previous works. The presented finding allows to speed up the feature extraction phase without reducing the classifier accuracy. Experiments also confirmed the significance of the geometrical features since 75.0% of the different features selected by the applied FS methods as well as 66.67% of the proposed 6-input feature vector belong to this category.
keywords: Feature Selection, SVM-RFE, Classification, Support Vector Machines, Synthetic Aperture Radar, oil spill detection