Mapping the Unknown: A New Approach to Open-World Video Recognition
Intelligent agents must continually (re)map the changing environment in which they operate in order to remain adaptive and efficient. In open-world recognition (OWR), a system has to detect newly emerging categories, recognize new instances of known categories, and continually update its knowledge from the mostly unannotated data streams it receives. In this work, we propose a hybrid method for OWR that combines deep feature embeddings with dynamic ensemble methods to continuously reshape boundaries in feature space. Our approach can adapt to patterns at the border of what is already known (concept drift), detect and create models for new categories, recover from mistakes, and mitigate catastrophic forgetting, even in semi-supervised contexts. As an application use case, we consider semi-supervised video face recognition, where spatio-temporal coherence is harnessed to augment data. Our experiments show that the system responds adequately to unknowns, adding models for new identities and improving its performance.
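The open-world loop described in the abstract (recognize known categories, flag unknowns, create a model for each new category, and keep updating from an unlabeled stream) can be illustrated with a toy sketch. This is not the authors' method: it swaps the deep feature embeddings and dynamic ensembles for plain 2-D points and a nearest-centroid rule with a distance threshold. The class name `OpenWorldRecognizer`, the `novelty_threshold` parameter, and the `category_N` labels are all hypothetical and chosen only for illustration.

```python
import math


class OpenWorldRecognizer:
    """Toy open-world recognizer (illustrative assumption, not the paper's model).

    Keeps one running-mean centroid per known category. A sample farther than
    `novelty_threshold` from every centroid is treated as a new category.
    """

    def __init__(self, novelty_threshold):
        self.novelty_threshold = novelty_threshold
        self.centroids = {}  # label -> centroid coordinates
        self.counts = {}     # label -> number of samples absorbed so far
        self._next_id = 0

    def _nearest(self, x):
        """Return (label, distance) of the closest centroid, or (None, inf)."""
        best_label, best_dist = None, math.inf
        for label, centroid in self.centroids.items():
            dist = math.dist(x, centroid)
            if dist < best_dist:
                best_label, best_dist = label, dist
        return best_label, best_dist

    def observe(self, x):
        """Assign an unlabeled sample to a category, creating one if needed."""
        label, dist = self._nearest(x)
        if label is None or dist > self.novelty_threshold:
            # Unknown: open a new category seeded with this sample.
            label = f"category_{self._next_id}"
            self._next_id += 1
            self.centroids[label] = list(x)
            self.counts[label] = 1
            return label
        # Known: move the centroid toward the sample (incremental mean),
        # so the category can follow gradual drift in the stream.
        n = self.counts[label] + 1
        self.counts[label] = n
        centroid = self.centroids[label]
        self.centroids[label] = [c + (xi - c) / n for c, xi in zip(centroid, x)]
        return label


# Usage: two well-separated groups arriving interleaved in one stream.
rec = OpenWorldRecognizer(novelty_threshold=1.0)
stream = [(0.0, 0.0), (0.2, 0.1), (5.0, 5.0), (5.1, 4.9), (0.1, 0.3)]
labels = [rec.observe(x) for x in stream]
print(labels)
```

A single fixed threshold is the simplest possible novelty test. The abstract's points about recovering from mistakes and mitigating catastrophic forgetting are not addressed by this sketch.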
Publication: Congress
March 31, 2025
César D. Parga, Xosé M. Pardo, Carlos V. Regueiro - 10.1007/978-3-031-78189-6_10 - 978-3-031-78189-6