Master's Thesis : Development of server side document processing and OCR services

Maréchal, Grégory

Date of defense : 7-Sep-2020/9-Sep-2020 • Permalink : `http://hdl.handle.net/2268.2/10882`

Details

Title :	Master's Thesis : Development of server side document processing and OCR services
Translated title :	[fr] Développement de services de traitement de documents et de services de reconnaissance optique des caractères côté serveur
Author :	Maréchal, Grégory
Date of defense :	7-Sep-2020/9-Sep-2020
Advisor(s) :	Leduc, Guy
Committee's member(s) :	Boigelot, Bernard; Donnet, Benoît; Hannay, Sébastien
Language :	English
Number of pages :	55 (65 with appendices)
Keywords :	[en] Android mobile; Spring Java server; deep learning; classification; online training; image processing
Discipline(s) :	Engineering, computing & technology > Civil engineering
Name of the research project :	Self training classification of medical documents for a distributed mobile application
Target public :	Professionals of domain; Student
Complementary URL :	https://www.andaman7.com/fr
Institution(s) :	Université de Liège, Liège, Belgium
Degree:	Master in Computer Science and Engineering, professional focus in "management"
Faculty:	Master thesis of the Faculté des Sciences appliquées

Abstract

[en] Andaman7 is both a company and a mobile app whose goal is to empower patients (in the medical sense of the term) by giving them easier access to, and more control over, their medical data. However, the processes currently in place to import this data into the application are long, tedious, or both. This project begins exploring whether machine learning algorithms can automate as much of the data import process as possible.

To do so, we implement what we call the dataflow: a complete data processing scheme, including front-end and back-end services, that lets the user send data for automated metadata extraction and also review samples on which the machine learning algorithm is not confident. This review step will allow Andaman7 to rely on online training to compensate for the lack of data.

The dataflow is then completed with an actual machine learning algorithm that classifies the submitted samples. Finally, the conclusion includes a short discussion of what could be done to extract more metadata from the samples than just the class.

File(s)

Document(s)

Master_Thesis.pdf
Description: Main document
Size: 3.53 MB
Format: Adobe PDF

Summary.pdf
Description: Summary
Size: 57.9 kB
Format: Adobe PDF

Annexe(s)

picture_1.png
Description: picture 1
Size: 217.72 kB
Format: image/png

picture_2.jpg
Description: picture 2
Size: 134.63 kB
Format: JPEG

picture_3.jpg
Description: picture 3
Size: 120.37 kB
Format: JPEG

picture_4.png
Description: picture 4
Size: 77.34 kB
Format: image/png

picture_5.jpg
Description: picture 5
Size: 272.84 kB
Format: JPEG

picture_6.jpg
Description: picture 6
Size: 70.81 kB
Format: JPEG

All documents available on MatheO are protected by copyright and subject to the usual rules for fair use.
The University of Liège does not guarantee the scientific quality of these students' works or the accuracy of all the information they contain.
