Faculté des Sciences appliquées
MASTER THESIS
VIEW 45 | DOWNLOAD 2

Master thesis : Text Autocomplete System with Language Models for Amanote

Fery, Loïs ULiège
Promotor(s) : Ittoo, Ashwin ULiège
Date of defense : 26-Jan-2024 • Permalink : http://hdl.handle.net/2268.2/19566
Details
Title : Master thesis : Text Autocomplete System with Language Models for Amanote
Author : Fery, Loïs ULiège
Date of defense  : 26-Jan-2024
Advisor(s) : Ittoo, Ashwin ULiège
Committee's member(s) : Louppe, Gilles ULiège
Debruyne, Christophe ULiège
Language : English
Number of pages : 75
Discipline(s) : Engineering, computing & technology > Computer science
Institution(s) : Université de Liège, Liège, Belgique
Degree: Master en ingénieur civil en informatique, à finalité spécialisée en "intelligent systems"
Faculty: Master thesis of the Faculté des Sciences appliquées

Abstract

[en] This thesis explores the design of an autocomplete system based on language modeling, intended for integration into Amanote, a note-taking application for slides and syllabuses whose primary audience is students. The system is designed to generate real-time suggestions that help students take notes by reducing repetitive typing. The thesis also discusses the possibility of deploying the system locally on the user's PC, eliminating the need for a server.

We discuss each stage of the system design: corpus gathering, candidate model selection, model training and fine-tuning, model evaluation, suggestion generation, and deployment. In particular, we discuss the gathering and analysis of two datasets used to train and evaluate our system: one composed of Amanote user notes and the other composed of articles from several academic disciplines. We also conduct several experiments on the candidate models to identify the most suitable ones for deployment in the application.

The results show that student notes tend to be less formal than classical texts and that a large portion of them contain many abbreviations and spelling mistakes. Moreover, our experiments tend to show the effectiveness of large-scale pre-training for the autocompletion task in the context of note-taking. Another noteworthy finding is that character-level tokenization may be effective for this task. Overall, we find the results promising and are confident that our system could be useful to Amanote users. Our findings also indicate that a local deployment of the system may be achievable, although it comes with some challenges.

In essence, this thesis contributes to the advancement of autocomplete systems and to the broader goal of enhancing accessibility to neural language models by focusing on their local deployment, thereby reducing reliance on external servers.
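
To make the idea of character-level autocompletion mentioned in the abstract more concrete, here is a minimal sketch. It is not the thesis's method: the thesis evaluates neural language models, whereas this toy uses a simple character n-gram counter with backoff as a stand-in. The class name `CharNGramCompleter`, its parameters, and the sample training sentences are all hypothetical and chosen only for illustration.

```python
from collections import Counter, defaultdict
from typing import List, Optional


class CharNGramCompleter:
    """Toy character-level n-gram model (illustrative stand-in, not the thesis's models)."""

    def __init__(self, order: int = 3):
        self.order = order  # maximum number of preceding characters used as context
        self.counts = defaultdict(Counter)  # context string -> counts of the next character

    def train(self, texts: List[str]) -> None:
        for text in texts:
            text = text.lower()
            for i in range(len(text)):
                # Store counts for every context length up to `order`,
                # so that shorter contexts can be used as a backoff.
                for k in range(1, self.order + 1):
                    if i - k < 0:
                        break
                    self.counts[text[i - k:i]][text[i]] += 1

    def next_char(self, prefix: str) -> Optional[str]:
        # Try the longest available context first, then back off to shorter ones.
        for k in range(min(self.order, len(prefix)), 0, -1):
            options = self.counts.get(prefix[-k:])
            if options:
                return options.most_common(1)[0][0]
        return None

    def complete_word(self, prefix: str, max_chars: int = 20) -> str:
        """Greedily extend `prefix` until a space, an unseen context, or `max_chars`."""
        prefix = prefix.lower()
        suggestion = ""
        for _ in range(max_chars):
            ch = self.next_char(prefix + suggestion)
            if ch is None or ch == " ":
                break
            suggestion += ch
        return suggestion


if __name__ == "__main__":
    # Hypothetical training notes, for illustration only.
    model = CharNGramCompleter(order=3)
    model.train(["the gradient descent algorithm", "gradient of the loss"])
    print(model.complete_word("the grad"))
```

With these two sample sentences, the prefix `the grad` is completed with `ient`. Because the model works on characters rather than words, it can in principle extend abbreviations or misspelled fragments it has seen before, which is one plausible reason a character-level approach could suit informal student notes.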


File(s)

Document(s)

File
Access s175043Fery2024.pdf
Description:
Size: 4.81 MB
Format: Adobe PDF

Annexe(s)

File
Access s175043Fery2024_Abstract.pdf
Description:
Size: 1.26 MB
Format: Adobe PDF

Author

  • Fery, Loïs ULiège Université de Liège > Master ing. civ. inf. fin. spéc.int. sys.

Promotor(s)

  • Ittoo, Ashwin ULiège

Committee's member(s)

  • Louppe, Gilles ULiège Université de Liège - ULiège > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Big Data
  • Debruyne, Christophe ULiège Université de Liège - ULiège > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Représentation et ingénierie des données

All documents available on MatheO are protected by copyright and subject to the usual rules for fair use.
The University of Liège does not guarantee the scientific quality of these students' works or the accuracy of all the information they contain.