Master thesis : Is the use of synthetic datasets a solution to improve object detection models on real data?
Tilkin, Nicolas
Promotor(s) : Louppe, Gilles
Date of defense : 27-Jan-2023 • Permalink : http://hdl.handle.net/2268.2/16762
Details
Title : | Master thesis : Is the use of synthetic datasets a solution to improve object detection models on real data? |
Translated title : | [fr] L'utilisation de données synthétiques permet-elle d'améliorer la performance de modèles de détection d'objets sur des données réelles? |
Author : | Tilkin, Nicolas |
Date of defense : | 27-Jan-2023 |
Advisor(s) : | Louppe, Gilles |
Committee's member(s) : | Debruyne, Christophe; Van Droogenbroeck, Marc; Rebbouh, Leila |
Language : | English |
Number of pages : | 74 |
Keywords : | [en] Deep learning; Object detection; Synthetic data |
Discipline(s) : | Engineering, computing & technology > Computer science |
Institution(s) : | Université de Liège, Liège, Belgique |
Degree: | Master : ingénieur civil en science des données, à finalité spécialisée |
Faculty: | Master thesis of the Faculté des Sciences appliquées |
Abstract
[en] In recent years, object detection models have leveraged deep learning architectures to improve performance on many problems. However, these techniques require
a large amount of high-quality labelled data in order to reach their full potential,
and obtaining such data may prove to be an arduous task. In this context, this
work explores the possibility of using entirely synthetically generated and labelled
images to train an object detection model. In particular, we examine which factors
of variation in the synthetic data best transfer to real data. Unsurprisingly, models
trained only on synthetic data perform significantly worse than models trained on
real data. We explore whether the synthetic images can be enhanced using filtering
and generative models, but find the results to be inconclusive. In a setting where
both real and synthetic data are available, we experiment to find out how these
should be combined to improve performance in the real domain. We find that the
synthetic and real datasets should be combined into a single training dataset, and
that the object detection model trained in this fashion significantly outperforms the
model trained on real data only.
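The combination strategy reported above (merging real and synthetic samples into a single training dataset) can be illustrated with a minimal sketch. This is not the thesis's implementation: the `Sample` record, the `combine_datasets` helper, the file paths, and the box format are all hypothetical names chosen for illustration, and the abstract does not state the mixing ratio or sampling scheme, so a plain concatenation followed by a seeded shuffle is assumed here.

```python
import random
from dataclasses import dataclass
from typing import List, Sequence, Tuple

# Hypothetical box format: (x_min, y_min, x_max, y_max), normalised to [0, 1].
Box = Tuple[float, float, float, float]


@dataclass(frozen=True)
class Sample:
    """One labelled image (hypothetical record, not taken from the thesis)."""
    image_path: str
    boxes: Tuple[Box, ...]
    source: str  # "real" or "synthetic"


def combine_datasets(real: Sequence[Sample],
                     synthetic: Sequence[Sample],
                     seed: int = 0) -> List[Sample]:
    """Merge both domains into one training list and shuffle it.

    Assumption: every sample is kept exactly once and no reweighting
    between domains is applied.
    """
    combined = list(real) + list(synthetic)
    random.Random(seed).shuffle(combined)  # seeded, so the order is reproducible
    return combined


# Toy data standing in for the two domains.
real = [Sample(f"real/{i}.jpg", ((0.1, 0.1, 0.5, 0.5),), "real")
        for i in range(3)]
synthetic = [Sample(f"synthetic/{i}.png", ((0.2, 0.2, 0.6, 0.6),), "synthetic")
             for i in range(5)]

train_set = combine_datasets(real, synthetic)
```

A detector would then iterate over `train_set` as a single dataset, so each training batch can contain images from both domains.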
The University of Liège does not guarantee the scientific quality of these students' works or the accuracy of all the information they contain.