Automatic Abstractive Text Summarization : A deeper look into convolutional sequence-to-sequence networks
Vermeylen, Valentin
Promotor(s) : Ittoo, Ashwin ; Doloris, Samy
Date of defense : 6-Sep-2021/7-Sep-2021 • Permalink : http://hdl.handle.net/2268.2/13292
Details
Title : | Automatic Abstractive Text Summarization : A deeper look into convolutional sequence-to-sequence networks |
Translated title : | [fr] Synthétisation Abstractive et Automatique de Textes : Un examen des réseaux séquence-vers-séquence convolutionnels |
Author : | Vermeylen, Valentin |
Date of defense : | 6-Sep-2021/7-Sep-2021 |
Advisor(s) : | Ittoo, Ashwin ; Doloris, Samy |
Committee's member(s) : | Fontaine, Pascal ; Gribomont, Pascal |
Language : | English |
Number of pages : | 65 |
Keywords : | [en] abstractive summarization [en] convolutional sequence-to-sequence |
Discipline(s) : | Engineering, computing & technology > Computer science |
Funders : | NRB |
Target public : | General public |
Institution(s) : | Université de Liège, Liège, Belgique |
Degree: | Master en ingénieur civil en informatique, à finalité spécialisée en "intelligent systems" |
Faculty: | Master thesis of the Faculté des Sciences appliquées |
Abstract
[en] As the amount of information produced every day continually increases, the desire for summaries containing only the most salient parts of texts continues to gain traction. Even though it is already possible to extract parts of texts and glue them together, we usually prefer fluent, human-like summaries.
That is the concern of the Artificial Intelligence subfield of Automatic Abstractive Summarization. Although the task is typically solved using recurrent neural networks, that architecture comes with several challenges, the biggest being the amount of time and computational power required to train the models. Fortunately, another less computationally intensive paradigm exists, based on convolutional networks, even though it has not been as extensively studied.
This thesis is concerned with that convolutional framework and explores questions and assumptions that have not previously been examined, such as the advantages and drawbacks of using pretrained embeddings, or the tradeoff between performance gains and the added complexity of mechanisms such as reinforcement learning or pointer-generation. Experiments on the abstractiveness of the models, their fine-tuning on a different dataset, and their ability to capture long-distance dependencies are also performed, using both the CNN/DailyMail and XSUM datasets.
Those experiments show that adding convolutional blocks to the model is worthwhile up to a certain point, and that both pretrained embeddings and the pointer-generator network implemented in this work are advisable. Reinforcement learning is also advisable at the end of model training.
Finally, the thesis concludes with additional experiments that could be carried out in future work, as well as practical advice on using abstractive summarization to summarize general terms and conditions.
The University of Liège does not guarantee the scientific quality of these students' works or the accuracy of all the information they contain.