Master's Thesis : Audio frame reconstruction from incomplete observations using Deep Learning techniques

Schils, Minh

Date of defense : 7-Sep-2020/9-Sep-2020 • Permalink : `http://hdl.handle.net/2268.2/10138`

Details

Title :	Master's Thesis : Audio frame reconstruction from incomplete observations using Deep Learning techniques
Author :	Schils, Minh
Date of defense :	7-Sep-2020/9-Sep-2020
Advisor(s) :	Embrechts, Jean-Jacques
Committee's member(s) :	Van Droogenbroeck, Marc; Louppe, Gilles; Sarti, Augusto
Language :	English
Keywords :	audio inpainting; deep learning
Discipline(s) :	Engineering, computing & technology > Computer science
Complementary URL :	https://ced211.github.io/
Institution(s) :	Université de Liège, Liège, Belgique
Degree:	Master en ingénieur civil en informatique, à finalité spécialisée en "intelligent systems"
Faculty:	Master thesis of the Faculté des Sciences appliquées

Abstract

In this thesis, we tackle the problem of restoring an audio frame given the
preceding and subsequent ones, i.e., audio inpainting, and extend our proposed
solution to the prediction of an audio frame given only the preceding one. We
consider frames of 64 and 128 milliseconds. The proposed solution combines
a signal processing pipeline with a generative adversarial network (GAN).
Taking as input the magnitude of the short-time Fourier transform (STFT) of
the surrounding frames, the network retrieves the STFT magnitude of the gap
frame. We then estimate the STFT phase with the Griffin-Lim algorithm and
reconstruct the missing audio frame through the inverse STFT. We compare our
method against a baseline that uses a linear predictive coding (LPC)
technique. The proposed solution shows encouraging results with respect to
the baseline for both inpainting and prediction. It outperforms the baseline
in terms of signal-to-noise ratio (SNR) on the magnitude spectrum and performs
equally well or better in terms of the Objective Difference Grade (ODG), a
measure used to assess perceived audio quality. Because the Griffin-Lim
algorithm can only approximately reconstruct the STFT phase, the baseline
achieves a better SNR on the audio signal itself. We further show the model's
generalization ability by training and testing on two different types of
music datasets.
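
The reconstruction step described in the abstract (STFT magnitude, then Griffin-Lim phase estimation, then inverse STFT) can be illustrated with a minimal sketch. This is not the thesis code: the window length (16 samples), hop size (4), zero-phase initialisation, iteration count, and two-tone test signal are all assumptions chosen to keep the example small, and the magnitude that the GAN would predict is replaced here by the true magnitude of the toy signal. The helper names (`hann`, `stft`, `istft`, `griffin_lim`, `snr_db`) are hypothetical. In practice a library routine such as `librosa.griffinlim` would likely be used instead of this pure-Python version.

```python
import cmath
import math


def hann(n):
    """Periodic Hann window of length n."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / n) for i in range(n)]


def dft(x):
    """Naive O(n^2) discrete Fourier transform (fine for toy sizes)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spec):
    """Inverse DFT; keeps the real part because the target signal is real."""
    n = len(spec)
    return [sum(spec[k] * cmath.exp(2j * math.pi * k * t / n)
                for k in range(n)).real / n for t in range(n)]


def stft(x, win, hop):
    """Windowed DFT of every full frame of x."""
    n = len(win)
    return [dft([x[s + i] * win[i] for i in range(n)])
            for s in range(0, len(x) - n + 1, hop)]


def istft(frames, win, hop, length):
    """Least-squares inverse STFT (window-weighted overlap-add)."""
    num, den = [0.0] * length, [0.0] * length
    for m, frame in enumerate(frames):
        y = idft(frame)
        for i, w in enumerate(win):
            num[m * hop + i] += w * y[i]
            den[m * hop + i] += w * w
    # Samples seen only through a zero window weight are left at 0.
    return [a / d if d > 1e-12 else 0.0 for a, d in zip(num, den)]


def magnitudes(spec):
    return [[abs(c) for c in frame] for frame in spec]


def griffin_lim(mag, win, hop, length, n_iter=30):
    """Estimate a signal whose STFT magnitude matches `mag`.

    Returns the last estimate and the squared magnitude error per iteration.
    """
    frames = [[complex(a, 0.0) for a in row] for row in mag]  # zero phase (assumption)
    errors = []
    for _ in range(n_iter):
        x = istft(frames, win, hop, length)
        spec = stft(x, win, hop)
        errors.append(sum((abs(c) - a) ** 2
                          for srow, mrow in zip(spec, mag)
                          for c, a in zip(srow, mrow)))
        # Keep the target magnitude, adopt the phase of the current estimate.
        frames = [[cmath.rect(a, cmath.phase(c)) for c, a in zip(srow, mrow)]
                  for srow, mrow in zip(spec, mag)]
    return x, errors


def snr_db(reference, estimate):
    """Signal-to-noise ratio in dB between two equally long sequences."""
    signal_power = sum(r * r for r in reference)
    noise_power = sum((r - e) ** 2 for r, e in zip(reference, estimate))
    return math.inf if noise_power == 0 else 10 * math.log10(signal_power / noise_power)


# Toy usage: recover a two-tone signal from its STFT magnitude only.
WIN, HOP, LENGTH = hann(16), 4, 64
signal = [math.sin(2 * math.pi * 5 * t / LENGTH)
          + 0.5 * math.sin(2 * math.pi * 11 * t / LENGTH) for t in range(LENGTH)]
target_mag = magnitudes(stft(signal, WIN, HOP))
estimate, errors = griffin_lim(target_mag, WIN, HOP, LENGTH)


def flatten(rows):
    return [v for row in rows for v in row]


est_mag = magnitudes(stft(estimate, WIN, HOP))
print("magnitude SNR (dB):", snr_db(flatten(target_mag), flatten(est_mag)))
print("waveform SNR (dB):", snr_db(signal, estimate))
```

The two printed values correspond to the two SNR variants mentioned in the abstract: one computed on the magnitude spectrum and one on the audio samples. Because the phase is only estimated, the waveform can differ from the original even when the magnitudes agree closely, which is the effect the abstract attributes to Griffin-Lim.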

File(s)

Document(s)

report.pdf
Description:
Size: 4.22 MB
Format: Adobe PDF

Annexe(s)

code.zip
Description:
Size: 84.47 MB
Format: Unknown

page_web.zip
Description:
Size: 9.52 MB
Format: Unknown

All documents available on MatheO are protected by copyright and subject to the usual rules for fair use.
The University of Liège does not guarantee the scientific quality of these students' works or the accuracy of all the information they contain.
