Spatio-Temporal Object Detection from UAV On-Board Cameras
We propose a new two-stage spatio-temporal object detection framework that improves detection precision by exploiting temporal information. First, a short-term proposal linking and aggregation method improves box features. Then, a long-term attention module further enhances the short-term aggregated features by adding long-term spatio-temporal information. This module takes object trajectories into account to effectively exploit long-term relationships between proposals in arbitrarily distant frames. Many videos recorded from UAV on-board cameras have a high density of small objects, making the detection problem very challenging. Our method takes advantage of spatio-temporal information to address these issues, increasing detection robustness. We compared our method with state-of-the-art video object detectors on two publicly available datasets focused on UAV-recorded videos. Our approach outperforms previous methods on both datasets.
keywords: Object detection, Spatio-temporal features, CNN
Publication: Congress
November 8, 2021
Daniel Cores, Victor M. Brea, Manuel Mucientes - 10.1007/978-3-030-89131-2_13 - 978-3-030-89130-5