Automatic Abstractive Text Summarization : A deeper look into convolutional sequence-to-sequence networks
Vermeylen, Valentin
Supervisor(s): Ittoo, Ashwin ; Doloris, Samy
Defense date: 6-Sep-2021/7-Sep-2021 • Permanent URL: http://hdl.handle.net/2268.2/13292
Details
Title: | Automatic Abstractive Text Summarization : A deeper look into convolutional sequence-to-sequence networks |
Translated title: | [fr] Synthétisation Abstractive et Automatique de Textes : Un examen des réseaux séquence-vers-séquence convolutionnels |
Author: | Vermeylen, Valentin |
Defense date: | 6-Sep-2021/7-Sep-2021 |
Supervisor(s): | Ittoo, Ashwin
Doloris, Samy |
Jury member(s): | Fontaine, Pascal
Gribomont, Pascal |
Language: | English |
Number of pages: | 65 |
Keywords: | [en] abstractive summarization [en] convolutional sequence-to-sequence |
Discipline(s): | Engineering, computing & technology > Computer science |
Funding organization(s): | NRB |
Target audience: | General public |
Institution(s): | Université de Liège, Liège, Belgium |
Degree: | Master in computer science and engineering (ingénieur civil en informatique), specialized focus in "intelligent systems" |
Faculty: | Master's theses of the Faculty of Applied Sciences |
Abstract
[en] As the amount of information produced every day keeps increasing, the demand for summaries containing only the most salient parts of a text continues to grow. Although it is already possible to extract parts of a text and glue them together, we usually prefer fluent, human-like summaries.
That is the concern of Automatic Abstractive Summarization, a subfield of Artificial Intelligence. Although the task is typically solved using recurrent neural networks, that architecture comes with several challenges, the biggest being the amount of time and computational power required to train the models. Fortunately, another, less computationally intensive paradigm exists, based on convolutional networks, even though it has not been studied as extensively.
This thesis is concerned with that convolutional framework and explores questions and assumptions that have not previously been addressed, such as the advantages and drawbacks of using pretrained embeddings, or the tradeoff between performance gains and the added complexity of mechanisms such as reinforcement learning or pointer-generator networks. Experiments on the abstractiveness of the models, their fine-tuning on a different dataset, and their ability to capture long-distance dependencies are also performed, using both the CNN/DailyMail and XSUM datasets.
Those experiments show that adding convolutional blocks to the model is worthwhile only up to a certain point, and that using pretrained embeddings and the pointer-generator network implemented in this work is advisable. Applying reinforcement learning at the end of model training is also advisable.
Finally, the thesis concludes with additional experiments that could be carried out in future work, as well as practical advice on using abstractive summarization to summarize general terms and conditions.
The Université de Liège does not guarantee the scientific quality of these student works or the accuracy of all the information they contain.