MixUDA: From Synthetic to Real Object Detection

Object detection has made remarkable progress in recent years, driven by advances in deep learning and the availability of large-scale annotated datasets. However, these methods often require extensive labeled data, which may not be available for specialized or emerging applications. This limitation has generated interest in Unsupervised Domain Adaptation (UDA), which transfers knowledge from a labeled source domain to an unlabeled target domain with a different distribution. This study addresses UDA between synthetic and real-world data. We propose a methodology for generating synthetic datasets with AirSim and Unreal Engine, enabling the creation of highly customizable and diverse datasets. We also propose a domain adaptation technique, MixUDA, that maximizes the utility of the synthetic dataset to improve a model's performance in the real domain. MixUDA is a UDA approach built on a Mean Teacher architecture. It combines pseudo-labels with two image-mixing operations, pseudo-mosaic and pseudo-mixup, to achieve a smooth and progressive transition from the synthetic to the real domain. In our experiments, MixUDA surpasses the state-of-the-art models D3T and MixPL by 1.18 and 4 AP points, respectively, approaching the performance of oracle models trained directly on the target domain. These findings suggest that synthetic datasets have significant potential for addressing data scarcity and improving model generalization, and they point to promising directions for further work in this area.

Keywords: Synthetic dataset, Unsupervised Domain Adaptation