Faculté des Sciences appliquées
MASTER THESIS
VIEW 70 | DOWNLOAD 344

Master thesis : Sparse hypernetworks for multitasking

Cubélier, François ULiège
Promotor(s) : Geurts, Pierre ULiège
Date of defense : 27-Jun-2022/28-Jun-2022 • Permalink : http://hdl.handle.net/2268.2/14574
Details
Title : Master thesis : Sparse hypernetworks for multitasking
Author : Cubélier, François ULiège
Date of defense  : 27-Jun-2022/28-Jun-2022
Advisor(s) : Geurts, Pierre ULiège
Committee's member(s) : Wehenkel, Louis ULiège; Louveaux, Quentin ULiège
Language : English
Number of pages : 73
Keywords : [en] deep learning; hypernetworks; multitasking; meta-models
Discipline(s) : Engineering, computing & technology > Computer science
Target public : Researchers; Professionals of domain; Student
Complementary URL : https://github.com/francoisCub/multitasking-hnet
Institution(s) : Université de Liège, Liège, Belgium
Degree: Master en ingénieur civil en informatique, à finalité spécialisée en "intelligent systems"
Faculty: Master thesis of the Faculté des Sciences appliquées

Abstract

[en] Machine learning researchers have long been interested in creating less narrow artificial intelligence. Meta-models, i.e. models capable of producing other models, could be a key ingredient for building new, highly multitask-capable models. Hypernetworks, which are neural networks that produce the parameters of other neural networks, can be used as meta-models. However, because modern neural networks have so many parameters, it is not trivial to build hypernetworks with the large output size required to produce all the parameters of another neural network. Current solutions, such as chunked hypernetworks, which split the target parameter space into parts and reuse the same model to produce each part, achieve good results in practice and scale independently of the maximal size of the layers in the target model. However, they seem unsatisfactory because they split the target model parameters into chunks arbitrarily. In this work, we propose a new scalable architecture for building hypernetworks, which consists of a sparse MLP with hidden layers of exponentially growing size. After testing different variations of this architecture, we compare it with chunked hypernetworks on multitasking computer vision benchmarks. We show that sparse hypernetworks can match the performance of chunked hypernetworks, although they were slightly behind on more complex problems. We also show that linear sparse hypernetworks outperformed both their non-linear version and chunked hypernetworks when inferring new models for new tasks with a pretrained task-conditioned hypernetwork. This may indicate that linear sparse hypernetworks have better generalization properties than more complex hypernetworks. In addition to proposing this sparse architecture, and as a preamble to this work, we also review the literature on hypernetworks and propose a typology of hypernetworks.
Even though the results obtained are promising, there are still many ways to improve sparse hypernetworks and, more generally, hypernetworks as a whole, which can be explored in future research.
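
To make the scaling argument in the abstract more concrete, the following is a rough parameter-count sketch, not the thesis implementation. It compares a fully connected output layer, a simplified chunked head, and a sparse MLP whose layer widths grow by a constant factor. All function names, the layer sizes, the chunk embedding, and the fixed fan-in sparsity pattern are assumptions made for illustration; the thesis may define its sparsity differently.

```python
# Back-of-the-envelope parameter counts for three hypernetwork output designs.
# All sizes and the sparsity pattern are illustrative assumptions,
# not values or definitions taken from the thesis.


def dense_head_params(hidden, n_target):
    """Fully connected output layer: hidden * n_target weights plus biases."""
    return hidden * n_target + n_target


def chunked_head_params(hidden, n_target, chunk_size, chunk_emb_dim):
    """Shared head emitting chunk_size parameters per call, plus one
    learned embedding per chunk (simplified, assumed design)."""
    n_chunks = -(-n_target // chunk_size)  # ceiling division
    head = hidden * chunk_size + chunk_size
    embeddings = n_chunks * chunk_emb_dim
    return head + embeddings


def sparse_growing_widths(in_dim, n_target, growth=2):
    """Layer widths that grow by `growth` until they reach n_target."""
    if in_dim < 1 or growth < 2:
        raise ValueError("in_dim must be >= 1 and growth must be >= 2")
    widths = [in_dim]
    while widths[-1] < n_target:
        widths.append(min(widths[-1] * growth, n_target))
    return widths


def sparse_growing_params(in_dim, n_target, growth=2, fan_in=4):
    """Assumes every unit reads at most `fan_in` units of the previous
    layer and has one bias (a hypothetical sparsity pattern)."""
    widths = sparse_growing_widths(in_dim, n_target, growth)
    total = 0
    for prev, cur in zip(widths, widths[1:]):
        total += cur * (min(fan_in, prev) + 1)
    return total


if __name__ == "__main__":
    n_target = 1_000_000  # hypothetical target-network size
    print("dense  :", dense_head_params(64, n_target))
    print("chunked:", chunked_head_params(64, n_target, 10_000, 32))
    print("sparse :", sparse_growing_params(64, n_target))
```

Under these assumptions, the dense count scales with the product of the hidden width and the target size, while the sparse count grows roughly linearly with the target size, because the layer widths form a geometric series dominated by the last layers. The chunked head stays small but depends on a chunk size that has to be chosen by hand, which is the arbitrariness the abstract refers to.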


File(s)

Document(s)

File: master_thesis_hypernetworks.pdf
Size: 1.42 MB
Format: Adobe PDF
File: master_thesis_hypernetworks_summary.pdf
Size: 221.52 kB
Format: Adobe PDF

Annexe(s)

File: complete_hypernetwork_types.JPG
Size: 60.35 kB
Format: JPEG
File: hypernetwork.pdf
Size: 33.61 kB
Format: Adobe PDF
File: sparse_hypernetwork_example.pdf
Size: 28.87 kB
Format: Adobe PDF

Author

  • Cubélier, François ULiège Université de Liège > Master ingé. civ. info., à fin.

Promotor(s)

  • Geurts, Pierre ULiège

Committee's member(s)

  • Wehenkel, Louis ULiège Université de Liège - ULiège > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Méthodes stochastiques
  • Louveaux, Quentin ULiège Université de Liège - ULiège > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Systèmes et modélisation : Optimisation discrète










All documents available on MatheO are protected by copyright and subject to the usual rules for fair use.
The University of Liège does not guarantee the scientific quality of these students' works or the accuracy of all the information they contain.