Faculté des Sciences appliquées

ChatBot with GANs

Castillo Lenz, Sergio Miguel ULiège
Promotor(s) : Ittoo, Ashwin ULiège
Date of defense : 22-Jan-2021
Title : ChatBot with GANs
Translated title : [fr] ChatBot avec GANs
Author : Castillo Lenz, Sergio Miguel ULiège
Date of defense  : 22-Jan-2021
Advisor(s) : Ittoo, Ashwin ULiège
Committee's member(s) : Hiard, Samuel ULiège
Louppe, Gilles ULiège
Language : English
Keywords : [en] machine learning
[en] gan
[en] auto-encoders
[en] daily dialog
[en] deep learning
Discipline(s) : Engineering, computing & technology > Computer science
Target public : Researchers
Professionals of domain
Institution(s) : Université de Liège, Liège, Belgique
Degree: Master en sciences informatiques, à finalité spécialisée en "intelligent systems"
Faculty: Master thesis of the Faculté des Sciences appliquées


[en] Since their introduction in 2014 [Goodfellow et al., 2014], Generative Adversarial Networks (GANs) have evolved considerably, to the point where they can generate realistic images for a given context. These improvements, in both complexity and stability, have enabled successful applications of GAN frameworks in computer vision and transfer learning. By contrast, GANs have seen few successful applications in Natural Language Processing (NLP), where Transformer-based models such as Bidirectional Encoder Representations from Transformers (BERT) and Generative Pre-Training (GPT) remain the state of the art for various NLP tasks.

Given this situation, this thesis investigates why GANs remain underused for NLP tasks. To that end, we explore several proposals from the literature in the area of dialog systems, using the Daily Dialog dataset, a human-written, multi-turn dialog set reflecting daily human communication.

Moreover, we investigate the influence of the embedding layer in the proposed GAN models. First, we test pre-trained word-level embeddings, such as Stanford's GloVe and spaCy embeddings. Second, we train the model with our own word embeddings, learned from the Daily Dialog dataset with the Word2Vec algorithm. Third, we explore using BERT as a source of contextualized word embeddings. These experiments show that pre-trained embeddings not only accelerate convergence during training but also improve the quality of the samples produced by the model, to some extent delaying the onset of mode collapse.
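
The difference between starting from pre-trained vectors and starting from random ones can be illustrated with a minimal sketch. It is not taken from the thesis: the function name `build_embedding_matrix`, the toy vocabulary, and the vector values are all hypothetical, and the `pretrained` dictionary merely stands in for vectors that would be loaded from GloVe, spaCy, or a Word2Vec model.

```python
import random


def build_embedding_matrix(vocab, pretrained, dim, seed=0):
    """Build one embedding row per vocabulary word.

    Words found in `pretrained` reuse their existing vector; all other
    words fall back to a small random vector.
    Returns the matrix (list of rows) and the number of pre-trained hits.
    """
    rng = random.Random(seed)
    matrix = []
    hits = 0
    for word in vocab:
        vector = pretrained.get(word)
        if vector is not None and len(vector) == dim:
            matrix.append(list(vector))
            hits += 1
        else:
            # Out-of-vocabulary word: random initialization.
            matrix.append([rng.gauss(0.0, 0.1) for _ in range(dim)])
    return matrix, hits


# Hypothetical stand-in for real pre-trained vectors.
pretrained = {
    "hello": [0.1, 0.2, 0.3],
    "thanks": [0.4, 0.5, 0.6],
}
vocab = ["<pad>", "hello", "thanks", "umbrella"]

matrix, hits = build_embedding_matrix(vocab, pretrained, dim=3)
print(f"{hits} of {len(vocab)} rows initialized from pre-trained vectors")
```

In a real model the resulting matrix would typically be copied into the embedding layer's weights before training, either frozen or fine-tuned; which of the two the thesis uses is not stated in this abstract.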

In conclusion, despite their limited success in NLP, GAN-trained models offer an interesting approach to training: the generator G can produce different but potentially correct responses and is not penalized for failing to produce the single most likely sequence of words. This mirrors an important characteristic of the human learning process. Overall, this thesis explores proposals made to address the drawbacks of the GAN architecture in NLP and opens the door to further progress in the area.
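
The contrast drawn here can be made concrete by writing the two training objectives side by side. The abstract does not give the exact losses used in the thesis, so what follows is only the standard textbook form: the maximum-likelihood loss of a sequence model, and the original minimax objective of [Goodfellow et al., 2014]. The symbols $x$ (dialog context), $y$ (reference response), and $z$ (generator noise input) are notation assumed for this illustration.

```latex
% Maximum-likelihood training: the model p_theta is scored against one
% reference response y = (y_1, ..., y_T) for the context x.
\mathcal{L}_{\mathrm{MLE}}(\theta)
  = -\sum_{t=1}^{T} \log p_\theta\left(y_t \mid y_{<t},\, x\right)

% Adversarial training: the generator G is scored by the discriminator D,
% not by token-level agreement with a single reference.
\min_G \max_D \;
  \mathbb{E}_{y \sim p_{\mathrm{data}}}\left[\log D(y)\right]
  + \mathbb{E}_{z \sim p_z}\left[\log\left(1 - D(G(z))\right)\right]
```

Under the first objective, any response that differs from the reference token by token incurs a loss, even when it is a plausible reply. Under the second, a response is rewarded whenever the discriminator finds it indistinguishable from real data, which is the property the paragraph above refers to.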



Access TFE_scastillo.pdf
Size: 2.44 MB
Format: Adobe PDF
Access abstract_TFE_scastillo(1).pdf
Size: 66.41 kB
Format: Adobe PDF


  • Castillo Lenz, Sergio Miguel ULiège Université de Liège > Master sc. informatiques, à fin.


Committee's member(s)

  • Hiard, Samuel ULiège Université de Liège - ULiège > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Dép. d'électric., électron. et informat. (Inst.Montefiore)
  • Louppe, Gilles ULiège Université de Liège - ULiège > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Big Data

All documents available on MatheO are protected by copyright and subject to the usual rules for fair use.
The University of Liège does not guarantee the scientific quality of these students' works or the accuracy of all the information they contain.