Mapping the Unknown: A New Approach to Open-World Video Recognition

Intelligent agents must strive to (re)map the constantly changing environment in which they operate, in order to remain adaptive and efficient. In open-world recognition (OWR) a system has to: detect new emerging categories, recognize new instances of known categories, and continually update knowledge based on the data streams it receives, mostly unannotated. In this work, we propose a hybrid method to deal with OWR that combines deep feature embedings with dynamic ensemble methods for a continuous reshaping of boundaries in feature space. Our approach is flexible to update to patterns in the border of what is already known (concept-drift), detect and create models for new categories, recover from mistakes, and mitigate catastrophic forgetting, even in semi-supervised contexts. As an application use case, we have considered the problem of semi-supervised video face recognition, where the spatial-temporal coherence is harnessed to augment data. Our experiments shown that the system responds adequately to the unknowns, adding models for new identities, and improving its performance.

keywords: