Energy-based Multi-Modal Attention
Werenne, Aurélien
Promotor(s) : Marée, Raphaël
Date of defense : 9-Sep-2019/10-Sep-2019 • Permalink : http://hdl.handle.net/2268.2/7854
Details
Committee's member(s) : Geurts, Pierre; Louppe, Gilles; Embrechts, Jean-Jacques
Language : English
Number of pages : 74
Keywords : [en] Multimodal, Deep Learning, Attention, Robustness
Discipline(s) : Engineering, computing & technology > Computer science
Target public : Researchers; Professionals of domain; Students; General public
Complementary URL : https://github.com/Werenne/energy-based-multimodal-attention
Institution(s) : Université de Liège, Liège, Belgique
Degree : Master in computer science engineering, with a specialized focus in "intelligent systems"
Faculty : Master thesis of the Faculté des Sciences appliquées
Abstract
[en] A multi-modal neural network exploits information from different channels and in different forms (e.g., images, text, sounds, sensor measurements), in the hope that the information carried by each mode is complementary, so as to improve the network's predictions. Nevertheless, in realistic situations, varying levels of perturbation can affect the data of each mode, which may degrade the quality of the inference process. An additional difficulty is that these perturbations vary between modes and on a per-sample basis. This work presents a solution to this problem. The three main contributions are described below.
First, a novel attention module is designed, analysed, and implemented. This attention module is constructed to help multi-modal networks handle perturbed modes.
Secondly, two new regularizers are developed to improve the generalization of the robustness gain to modes failing more severely than those seen in the training set.
Lastly, a unified multi-modal attention module is presented, combining the main types of attention mechanisms in the deep learning literature with our module. We suggest that this unified module could be coupled with a prediction model, enabling the latter to face unexpected situations and improving the extraction of the relevant information from the data.
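To illustrate the general idea of energy-based attenuation over modes (a minimal NumPy sketch, not the thesis's actual formulation: the function names, the scalar per-mode energies, and the softmax-over-negative-energies weighting rule are all illustrative assumptions), a module of this kind can down-weight modes whose data appear heavily perturbed:

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D array
    e = np.exp(x - np.max(x))
    return e / e.sum()

def attend(mode_features, energies, temperature=1.0):
    """Fuse per-mode feature vectors with weights derived from scalar
    'energies': a mode judged more perturbed (higher energy) receives
    a smaller attention weight via a softmax over negative energies."""
    weights = softmax(-np.asarray(energies, dtype=float) / temperature)
    fused = sum(w * f for w, f in zip(weights, mode_features))
    return fused, weights

# Three modes with identical features; the third is assigned a high
# energy, standing in for a heavily perturbed input channel.
feats = [np.ones(4), np.ones(4), np.ones(4)]
fused, w = attend(feats, energies=[0.0, 0.0, 5.0])
```

Here the perturbed third mode ends up with a much smaller weight than the two clean modes, while the weights still sum to one; the per-sample aspect described above would correspond to recomputing the energies for every input.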