Master thesis : Is the use of synthetic datasets a solution to improve object detection models on real data?
Tilkin, Nicolas
Supervisor(s) : Louppe, Gilles
Defense date : 27-Jan-2023 • Permanent URL : http://hdl.handle.net/2268.2/16762
Details
Title : | Master thesis : Is the use of synthetic datasets a solution to improve object detection models on real data? |
Translated title : | [fr] L'utilisation de données synthétiques permet-elle d'améliorer la performance de modèles de détection d'objets sur des données réelles? |
Author : | Tilkin, Nicolas |
Defense date : | 27-Jan-2023 |
Supervisor(s) : | Louppe, Gilles |
Jury member(s) : | Debruyne, Christophe; Van Droogenbroeck, Marc; Rebbouh, Leila |
Language : | English |
Number of pages : | 74 |
Keywords : | [en] Deep learning [en] Object detection [en] Synthetic data |
Discipline(s) : | Engineering, computing & technology > Computer science |
Institution(s) : | Université de Liège, Liège, Belgium |
Degree : | Master : civil engineer in data science, with a specialized focus |
Faculty : | Master theses of the Faculty of Applied Sciences |
Abstract
[en] In recent years, object detection models have leveraged deep learning architectures to improve performance in many problems. However, these techniques require
a large amount of high quality labelled data in order to reach their full potential,
and obtaining such data may prove to be an arduous task. In this context, this
work explores the possibility of using entirely synthetically generated and labelled
images to train an object detection model. In particular, we examine which factors
of variation in the synthetic data best transfer to real data. Unsurprisingly, models
trained only on synthetic data perform significantly worse than models trained on
real data. We explore whether the synthetic images can be enhanced using filtering
and generative models, but find the results to be inconclusive. In a setting where
both real and synthetic data are available, we experiment to find out how these
should be combined to improve performance in the real domain. We find that the
synthetic and real datasets should be combined into a single training dataset, and
that the object detection model trained in this fashion significantly outperforms the
model trained on real data only.
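
As a rough illustration of the last finding, the sketch below merges a real and a synthetic dataset into a single shuffled training list. It is not taken from the thesis: the `Sample` type, the `combine_datasets` helper, the file names, and the dataset sizes are all hypothetical, and the abstract does not state the mixing ratio or the tooling actually used.

```python
import random
from typing import List, Tuple

# Hypothetical sample type: (image path, domain tag). Not from the thesis.
Sample = Tuple[str, str]


def combine_datasets(real: List[Sample],
                     synthetic: List[Sample],
                     seed: int = 0) -> List[Sample]:
    """Merge real and synthetic samples into one shuffled training list."""
    combined = list(real) + list(synthetic)
    # A seeded generator keeps the shuffle reproducible across runs.
    random.Random(seed).shuffle(combined)
    return combined


# Placeholder file names and sizes, chosen only for illustration.
real = [(f"real_{i}.png", "real") for i in range(3)]
synthetic = [(f"synth_{i}.png", "synthetic") for i in range(5)]

train = combine_datasets(real, synthetic)
print(len(train), "training samples")
```

A detector would then be trained on `train` as one dataset, rather than on the synthetic and real samples in separate stages.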
The Université de Liège does not guarantee the scientific quality of these student works or the accuracy of all the information they contain.