Faculté des Sciences appliquées

Master's Thesis : Development of server side document processing and OCR services

Maréchal, Grégory ULiège
Promotor(s) : Leduc, Guy ULiège
Date of defense : 7-Sep-2020/9-Sep-2020
Title : Master's Thesis : Development of server side document processing and OCR services
Translated title : [fr] Développement de services de traitement de documents et de services de reconnaissance optique des caractères côté serveur
Author : Maréchal, Grégory ULiège
Date of defense  : 7-Sep-2020/9-Sep-2020
Advisor(s) : Leduc, Guy ULiège
Committee's member(s) : Boigelot, Bernard ULiège
Donnet, Benoît ULiège
Hannay, Sébastien 
Language : English
Number of pages : 55 (65 with appendices)
Keywords : [en] Android mobile
[en] Spring java server
[en] deep learning
[en] classification
[en] online training
[en] image processing
Discipline(s) : Engineering, computing & technology > Civil engineering
Name of the research project : Self training classification of medical documents for a distributed mobile application
Target audience : Professionals in the field
Institution(s) : Université de Liège, Liège, Belgique
Degree: Master in computer science and engineering, professional focus in "management"
Faculty: Master thesis of the Faculté des Sciences appliquées


[en] Andaman7 is both a company and a mobile app whose goal is to empower patients (in the medical sense of the term) by giving them easier access to, and more control over, their medical data. However, the processes currently in place to import this data into the application are long, tedious, or both. In this project, we begin exploring the use of machine learning algorithms to automate the data import process as much as possible.

To do so, we implement what we call the dataflow: a complete data processing scheme, including front-end and back-end services, that lets the user send data for automated metadata extraction and also review samples on which the machine learning algorithm is not confident. This review step allows Andaman7 to rely on online training to compensate for the lack of data.
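
The routing idea behind the dataflow can be illustrated with a minimal sketch. This is not the thesis implementation: the `Dataflow` class, the 0.8 threshold, the sample identifiers, and the class names `prescription` and `lab_report` are all hypothetical, chosen only to show how low-confidence samples might be deferred to user review and then collected as labeled data for a later training step.

```python
from dataclasses import dataclass, field

# Hypothetical threshold; the thesis record does not state a value.
CONFIDENCE_THRESHOLD = 0.8


@dataclass
class Dataflow:
    """Toy stand-in for the dataflow: routes samples and collects reviewed labels."""
    threshold: float = CONFIDENCE_THRESHOLD
    review_queue: list = field(default_factory=list)   # samples awaiting user review
    training_set: list = field(default_factory=list)   # (sample, label) pairs for training

    def submit(self, sample, class_probabilities):
        """Return the top class if it is confident enough, else queue the sample."""
        label, confidence = max(class_probabilities.items(), key=lambda kv: kv[1])
        if confidence >= self.threshold:
            return label
        self.review_queue.append(sample)
        return None

    def review(self, sample, user_label):
        """Store a user-provided label so it can feed a later training step."""
        self.review_queue.remove(sample)
        self.training_set.append((sample, user_label))


# Hypothetical samples and classifier outputs, for illustration only.
flow = Dataflow()
accepted = flow.submit("scan_001", {"prescription": 0.93, "lab_report": 0.07})
deferred = flow.submit("scan_002", {"prescription": 0.55, "lab_report": 0.45})
flow.review("scan_002", "lab_report")
```

In this sketch the first sample is classified automatically, while the second is held back until the user labels it; the resulting pair is what an online training step would consume.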

The dataflow is then completed with an actual machine learning algorithm that classifies the submitted samples. Finally, the conclusion briefly discusses what could be done to extract more metadata from the samples than just the class.



Access Master_Thesis.pdf
Description: Main document
Size: 3.53 MB
Format: Adobe PDF
Access Summary.pdf
Description: Summary
Size: 57.9 kB
Format: Adobe PDF


Access picture_1.png
Description: picture 1
Size: 217.72 kB
Format: image/png
Access picture_2.jpg
Description: picture 2
Size: 134.63 kB
Format: JPEG
Access picture_3.jpg
Description: picture 3
Size: 120.37 kB
Format: JPEG
Access picture_4.png
Description: picture 4
Size: 77.34 kB
Format: image/png
Access picture_5.jpg
Description: picture 5
Size: 272.84 kB
Format: JPEG
Access picture_6.jpg
Description: picture 6
Size: 70.81 kB
Format: JPEG


  • Maréchal, Grégory ULiège Université de Liège > Master ingé. civ. info., à fin.


Committee's member(s)

  • Boigelot, Bernard ULiège Université de Liège - ULiège > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Informatique
  • Donnet, Benoît ULiège Université de Liège - ULiège > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Algorithmique des grands systèmes
  • Hannay, Sébastien
