One-Shot Learning for Face Recognition
Brieven, Géraldine
Supervisor(s): Louppe, Gilles
Defense date: 26-jui-2019/27-jui-2019 • Permanent URL: http://hdl.handle.net/2268.2/6795
Details
Title: One-Shot Learning for Face Recognition
Author: Brieven, Géraldine
Defense date: 26-jui-2019/27-jui-2019
Supervisor(s): Louppe, Gilles
Jury member(s): Geurts, Pierre; Van Lishout, François
Language: English
Number of pages: 58
Keywords: [en] deep learning; [en] one-shot learning; [en] face recognition
Discipline(s): Engineering, computing & technology > Computer science
Research center(s): University of Liège
Research project title: Research on Deep Learning
Target audience: Researchers; Students; General public
Institution(s): Université de Liège, Liège, Belgium
Degree: Master's degree in computer science engineering, professional focus in "intelligent systems"
Faculty: Theses of the Faculty of Applied Sciences
Abstract
[en] This master's thesis was written to obtain the degree of Master of Science in Computer Science Engineering. It addresses a deep-learning research topic, focusing in particular on one-shot learning applied to face recognition. In this work, face recognition is framed as the task of identifying the person shown in a given picture. This classification problem fits the one-shot learning setting well, since the training set contains only a few instances of each face: the model is expected to quickly integrate new people's faces from little data, which is precisely the challenge of one-shot learning. Four main phases are defined to perform the face recognition task.
First, databases of face images have to be acquired. Here, Labeled Faces in the Wild (LFW), Celebrities in Frontal-Profile in the Wild (CFP), the cropped FaceScrub set and part of CASIA-WebFace are exploited, together with a few smaller extra databases of ordinary (i.e. not famous) people, in order to come closer to real conditions.
From these databases, a training set is built from fewer than 10,000 face pictures, a rather limited quantity compared with standard face datasets, which sometimes reach hundreds of millions of images.
Once the data has been collected, the face images are processed: the face is detected in each picture, aligned, and the picture is finally cropped. Note that face detection relies on an external, already trained model.
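The cropping step can be sketched as follows. The bounding box is assumed to come from the external face detector mentioned above (the coordinates here are made up), and the nearest-neighbour resize stands in for whatever interpolation the actual pipeline uses:

```python
import numpy as np

def crop_and_resize(img, box, size=105):
    """Crop a detected face and resize it to a square network input.

    img  : H x W (grayscale) image as a NumPy array
    box  : (x, y, w, h) bounding box returned by the face detector
    size : side length of the square patch fed to the network
    """
    x, y, w, h = box
    face = img[y:y + h, x:x + w]
    # nearest-neighbour resize: pick a source row/column for each target pixel
    rows = np.arange(size) * h // size
    cols = np.arange(size) * w // size
    return face[rows][:, cols]

# hypothetical detector output on a 200 x 200 picture
img = np.random.rand(200, 200)
patch = crop_and_resize(img, box=(50, 40, 80, 90))
```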
Next, the focus is on efficiently extracting features from the face patches so that, in some embedding space, in-class instances (pictures of the same person) are brought closer together and out-of-class instances (pictures of different people) are pushed further apart. To this end, a Siamese network is trained on pairs of faces to perform well on the verification task.
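The abstract does not name the pair loss used; a common choice for training Siamese networks on pairs, shown here as a NumPy sketch under that assumption, is the contrastive loss, which pulls same-person embeddings together and pushes different-person embeddings apart up to a margin:

```python
import numpy as np

def contrastive_loss(emb_a, emb_b, same, margin=1.0):
    """Contrastive loss over a batch of embedding pairs.

    emb_a, emb_b : (batch, dim) embeddings of the two faces in each pair
    same         : (batch,) 1 if both faces show the same person, else 0
    """
    d = np.linalg.norm(emb_a - emb_b, axis=1)            # Euclidean distance
    pos = same * d ** 2                                  # pull genuine pairs together
    neg = (1 - same) * np.maximum(margin - d, 0) ** 2    # push impostors past the margin
    return float(np.mean(pos + neg))

a = np.array([[0.0, 0.0], [0.0, 0.0]])
b = np.array([[0.0, 0.0], [2.0, 0.0]])
same = np.array([1, 0])
loss = contrastive_loss(a, b, same)  # 0.0: pair 1 coincides, pair 2 is already past the margin
```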
Once a good Siamese network has been trained, it supports the last step, classification, which consists in assigning an identity to an input face image (the probe) by computing similarities between this input picture and the identified face images contained in a gallery. The final predicted identity is the one whose gallery images the probe was closest to during the comparison process.
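A minimal sketch of this gallery lookup (the names and the min-distance scoring rule are illustrative assumptions; the thesis may aggregate the per-instance similarities differently):

```python
import numpy as np

def rank_identities(probe, gallery_embs, gallery_ids, k=10):
    """Rank gallery identities by embedding distance to a probe.

    probe        : (dim,) embedding of the input face
    gallery_embs : (n, dim) embeddings of the identified gallery faces
    gallery_ids  : length-n list of identity labels (e.g. 8 instances each)
    k            : length of the returned ranking (top-10 in the thesis)
    """
    dists = np.linalg.norm(gallery_embs - probe, axis=1)
    best = {}  # identity -> distance of its closest gallery instance
    for dist, ident in zip(dists, gallery_ids):
        best[ident] = min(best.get(ident, np.inf), dist)
    return sorted(best, key=best.get)[:k]

gallery_embs = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [9.0, 9.0]])
gallery_ids = ["alice", "alice", "bob", "carol"]
ranking = rank_identities(np.array([0.05, 0.0]), gallery_embs, gallery_ids, k=2)
```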
Besides these four main steps, two extra steps are defined to support learning from little data. First, the Siamese network can be pretrained as the encoder of an autoencoder that aims to reproduce input faces. Second, a data augmentation process is defined, employing a StyleGAN to derive synthetic face instances.
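The pretraining idea can be illustrated with a deliberately tiny linear autoencoder in NumPy (the real encoder is a convolutional Siamese branch; the shapes, learning rate and step count here are arbitrary toy values):

```python
import numpy as np

def pretrain_autoencoder(X, hidden=8, lr=0.1, steps=500):
    """Train a one-layer linear autoencoder; return the encoder and the loss curve."""
    rng = np.random.default_rng(0)
    W_enc = rng.normal(scale=0.1, size=(X.shape[1], hidden))  # later reused as the Siamese branch
    W_dec = rng.normal(scale=0.1, size=(hidden, X.shape[1]))
    losses = []
    for _ in range(steps):
        Z = X @ W_enc           # encode
        err = Z @ W_dec - X     # reconstruction error
        losses.append(np.mean(err ** 2))
        # gradients of the mean squared reconstruction error
        g_dec = Z.T @ err * (2 / err.size)
        g_enc = X.T @ (err @ W_dec.T) * (2 / err.size)
        W_dec -= lr * g_dec
        W_enc -= lr * g_enc
    return W_enc, losses

X = np.random.default_rng(1).normal(size=(64, 32))  # stand-ins for flattened face images
W_enc, losses = pretrain_autoencoder(X)
```

Gradient descent on the reconstruction error drives the loss down, and the learned encoder weights are then the starting point for the Siamese training above.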
Finally, regarding performance on the face recognition task, a top-10 accuracy of 84% is obtained when a probe is identified against a gallery of 200 people (each identity represented by 8 instances). In addition, the Siamese network reaches an F1-score of 87% on the verification task. The autoencoder and the synthetic data improve performance only when very little data is initially available (typically fewer than 500 face pictures).
File(s)
Document(s)
Description: Report (Full Content)
Size: 17.57 MB
Format: Adobe PDF
Description: Summary
Size: 437.64 kB
Format: Adobe PDF
Cite this thesis
The University of Liège does not guarantee the scientific quality of these student works nor the accuracy of all the information they contain.