Faculty of Applied Sciences
Master's thesis
Views: 19 | Downloads: 2

Master thesis: Text Autocomplete System with Language Models for Amanote

Fery, Loïs (ULiège)
Supervisor(s): Ittoo, Ashwin (ULiège)
Defense date: 26 Jan 2024 • Permanent URL: http://hdl.handle.net/2268.2/19566
Details
Jury member(s): Louppe, Gilles (ULiège); Debruyne, Christophe (ULiège)
Language: English
Number of pages: 75
Discipline(s): Engineering, computing & technology > Computer science
Institution(s): Université de Liège, Liège, Belgium
Degree: Master in Computer Science and Engineering (ingénieur civil en informatique), specialized focus in "intelligent systems"
Faculty: Master's theses of the Faculty of Applied Sciences

Abstract

This thesis explores the design of a language-model-based autocomplete system intended for integration into Amanote, a note-taking application for slides and syllabi whose primary audience is students. The system generates real-time suggestions that help students take notes by reducing repetitive typing. The thesis also discusses whether the system can be deployed locally on the user's PC, eliminating the need for a server.

We discuss each stage of the system design: corpus gathering, candidate model selection, model training and fine-tuning, model evaluation, suggestion generation, and deployment. In particular, we describe the gathering and analysis of two datasets used to train and evaluate the system: one composed of Amanote user notes and the other of articles from several academic disciplines. We also run several experiments on the candidate models to identify those most suitable for deployment in the application.

The results show that student notes tend to be less formal than conventional texts and that a large portion of them contain many abbreviations and spelling mistakes. Our experiments also suggest that large-scale pre-training is effective for autocompletion in the context of note-taking. Another noteworthy finding is that character-level tokenization may be effective for this task. Overall, we find the results promising and are confident that the system could be useful to Amanote users. Our findings further indicate that local deployment of the system may be achievable, although it raises some challenges.

In essence, this thesis contributes to the advancement of autocomplete systems and to the broader goal of enhancing accessibility to neural language models by focusing on their local deployment, thereby reducing reliance on external servers.
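To make the idea of character-level autocompletion concrete, here is a minimal sketch in Python. It is not the system described in the thesis, which relies on neural language models; it swaps in a simple character n-gram counter so the example stays small and runnable. The function names `train_char_ngram` and `suggest`, the `^` padding symbol, and the three sample notes are all assumptions made up for illustration.

```python
from collections import Counter, defaultdict


def train_char_ngram(corpus, order=3):
    """Count which character follows each context of `order` characters."""
    counts = defaultdict(Counter)
    for text in corpus:
        # Pad the start so the first characters also have a full context.
        padded = "^" * order + text
        for i in range(order, len(padded)):
            counts[padded[i - order:i]][padded[i]] += 1
    return counts


def suggest(counts, typed, order=3, max_chars=20):
    """Greedily extend `typed` one character at a time until the word ends."""
    context = ("^" * order + typed)[-order:]
    completion = ""
    for _ in range(max_chars):
        followers = counts.get(context)
        if not followers:
            break  # unseen context: return what we have rather than guess
        next_char = followers.most_common(1)[0][0]
        if next_char == " ":
            break  # stop at the end of the current word
        completion += next_char
        context = (context + next_char)[-order:]
    return completion


# Hypothetical notes, invented for this example (not taken from the thesis data).
notes = [
    "the gradient of the loss function",
    "gradient descent updates the weights",
    "the loss function is convex",
]
model = train_char_ngram(notes, order=3)
print(suggest(model, "the gra"))  # prints "dient"
```

Because the model works on characters rather than whole words, it can continue a partially typed word and has no fixed word vocabulary, which is one plausible reason a character-level approach could suit notes full of abbreviations and misspellings.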


File(s)

Document(s)

File
Access s175043Fery2024.pdf
Size: 4.81 MB
Format: Adobe PDF

Appendix(es)

File
Access s175043Fery2024_Abstract.pdf
Size: 1.26 MB
Format: Adobe PDF

Author

  • Fery, Loïs (ULiège), Université de Liège > Master ing. civ. inf. fin. spéc.int. sys.

Supervisor(s)

Jury member(s)

  • Louppe, Gilles (ULiège), Université de Liège - ULiège > Dept. of Electrical Engineering, Electronics and Computer Science (Montefiore Institute) > Big Data
  • Debruyne, Christophe (ULiège), Université de Liège - ULiège > Dept. of Electrical Engineering, Electronics and Computer Science (Montefiore Institute) > Data representation and engineering

All documents available on MatheO are protected by copyright and subject to the usual rules of fair use.
The University of Liège does not guarantee the scientific quality of these student works or the accuracy of all the information they contain.