Dynamic Saliency from Adaptive Whitening

This paper describes a unified framework for static and dynamic saliency detection based on whitening both the chromatic features and the spatio-temporal frequency components of the input. The approach is grounded in statistical adaptation to the input data, resembling the early coding of the human visual system. Our model, AWS-D, outperforms state-of-the-art models in predicting human eye fixations during free viewing of videos from three open-access datasets (task-free). As an assessment measure, we used an adaptation of the shuffled-AUC metric to spatio-temporal stimuli, together with a permutation test. Under this criterion, AWS-D not only reaches the highest AUC values but also maintains statistically significant AUC scores for longer periods of time (more frames) across all videos used in the test. The model also reproduces psychophysical results from pop-out experiments, in agreement with human behavior.
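The core operation the abstract refers to — whitening — can be illustrated with a minimal sketch. The snippet below is not the paper's implementation; it shows generic ZCA whitening of feature samples (e.g., pixels × color channels), which decorrelates the dimensions and normalizes their variance, the adaptive step the framework applies to both chromatic and spatio-temporal features. The function name `whiten` and the epsilon regularizer are illustrative choices.

```python
import numpy as np

def whiten(features, eps=1e-8):
    """ZCA-whiten feature samples (illustrative sketch, not AWS-D itself).

    features: (n_samples, n_dims) array, e.g. pixels x color channels.
    Returns data with (approximately) identity covariance.
    """
    X = features - features.mean(axis=0)            # center each dimension
    cov = X.T @ X / (X.shape[0] - 1)                # sample covariance
    evals, evecs = np.linalg.eigh(cov)              # eigendecomposition
    # ZCA transform: rotate, rescale by 1/sqrt(eigenvalue), rotate back
    W = evecs @ np.diag(1.0 / np.sqrt(evals + eps)) @ evecs.T
    return X @ W
```

After whitening, conspicuous points are those far from the origin in the decorrelated space, since the transform adapts to whatever correlations the current input exhibits.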
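The evaluation criterion mentioned above, shuffled AUC, scores a saliency map by treating fixated locations as positives and fixations borrowed from other frames or videos as negatives, which discounts center bias. A minimal sketch of the per-frame score follows; the function name and the (row, col) coordinate convention are assumptions, and the spatio-temporal adaptation and permutation test used in the paper are not reproduced here.

```python
import numpy as np

def shuffled_auc(sal_map, fixations, other_fixations):
    """Shuffled-AUC sketch for one frame (illustrative, not the paper's code).

    sal_map: 2-D saliency map.
    fixations: (n, 2) int array of fixated (row, col) positions on this frame.
    other_fixations: (m, 2) int array of fixations taken from other
        frames/videos, used as negatives to discount center bias.
    """
    pos = sal_map[fixations[:, 0], fixations[:, 1]]
    neg = sal_map[other_fixations[:, 0], other_fixations[:, 1]]
    # Rank-based AUC (Mann-Whitney): P(pos > neg) + 0.5 * P(pos == neg)
    diff = pos[:, None] - neg[None, :]
    return float(np.mean(diff > 0) + 0.5 * np.mean(diff == 0))
```

A score of 0.5 means the map is no better at actual fixations than at fixations from unrelated frames; per-frame scores can then be compared over time to see how long a model stays significantly above chance.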