Faculté des Sciences appliquées
MASTER THESIS

Quantification of audio/video de-synchronization with reliable confidence scores

de Thibault, Adrien ULiège
Promotor(s) : Louppe, Gilles ULiège
Date of defense : 30-Jun-2025/1-Jul-2025 • Permalink : http://hdl.handle.net/2268.2/23370
Details
Title : Quantification of audio/video de-synchronization with reliable confidence scores
Translated title : [fr] Quantification de la désynchronisation audio/vidéo avec des indices de confiance fiables
Author : de Thibault, Adrien ULiège
Date of defense  : 30-Jun-2025/1-Jul-2025
Advisor(s) : Louppe, Gilles ULiège
Committee's member(s) : Massoz, Quentin 
Cioppa, Anthony ULiège
Leduc, Guy ULiège
Language : English
Number of pages : 66
Keywords : [en] AV Synchronization
[en] computer vision
[en] Synchformer
[en] confidence score
[en] deep learning
[en] fine-tuning
Discipline(s) : Engineering, computing & technology > Computer science
Target public : Researchers
Professionals of domain
Student
Institution(s) : Université de Liège, Liège, Belgique
Degree: Master en ingénieur civil en informatique, à finalité spécialisée en "intelligent systems"
Faculty: Master thesis of the Faculté des Sciences appliquées

Abstract

[en] Audio-video desynchronization remains a significant challenge in broadcast and multimedia applications, as even a slight offset between audio and video can degrade the viewer experience. This master's thesis evaluates and improves Synchformer, a state-of-the-art classification model that estimates temporal offsets in 'in the wild' video content. A second focus is assessing the uncertainty associated with the model's predictions.

To achieve finer and more reliable predictions, Synchformer was fine-tuned for temporal offset regression. These fine-tuned models offer consistent performance across the full range of temporal offsets, addressing a limitation of the original discrete-class approach. In parallel, various confidence measures were explored to quantify prediction certainty. Some measures are based on predictions from Synchformer, while others use a history of predictions to assess the consistency of the model over time. Based on these measures, classification techniques were developed to distinguish between reliable and unreliable predictions.
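As a rough illustration of a history-based confidence measure, the sketch below scores how consistent a short history of offset predictions is and then thresholds that score to flag a prediction as reliable. It is not the method used in the thesis: the function names, the use of the standard deviation as a consistency signal, and the threshold value are all assumptions made for the example.

```python
from statistics import pstdev


def history_confidence(recent_offsets_ms):
    """Hypothetical consistency score in (0, 1]: higher means the recent
    offset predictions (in milliseconds) agree more closely."""
    if len(recent_offsets_ms) < 2:
        return 0.0  # not enough history to judge consistency
    spread = pstdev(recent_offsets_ms)  # population standard deviation
    return 1.0 / (1.0 + spread)


def is_reliable(recent_offsets_ms, threshold=0.02):
    """Flag the latest prediction as reliable when the history is consistent.
    The default threshold is arbitrary and would need tuning on real data."""
    return history_confidence(recent_offsets_ms) >= threshold


# Made-up histories: one stable, one erratic.
print(is_reliable([100, 110, 90]))    # stable history
print(is_reliable([-400, 300, 50]))   # erratic history
```

In practice the threshold would be chosen on a validation set to trade off how many predictions are kept against how often the kept ones are correct.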

Experiments conducted on SyncST, a newly introduced dataset of over 1500 broadcast video clips, showed that, with discrete offsets, reliable predictions of Synchformer used as a classifier can be identified with a precision of 92% at an error tolerance of 200 ms.

In a regression setting, where offsets are continuous and the error tolerance is set to 170 ms to match the limit of what humans can detect, this reliability classification achieves a precision of 75% on fine-tuned versions of Synchformer. With an error tolerance of 275 ms, which viewers still find acceptable, the precision reaches 90%. Given the low performance of these models at predicting desynchronization below what is humanly perceivable or acceptable, these contributions increase the trust that can be placed in such a synchronization model, making it better suited to real-world applications where low error tolerance and interpretability are critical.
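To make the reported metric concrete, the sketch below computes the precision of a reliability classifier at a given error tolerance: among the predictions flagged as reliable, the fraction whose absolute offset error falls within the tolerance. The function name and the sample values are hypothetical and are not taken from the thesis or from SyncST.

```python
def reliability_precision(predicted_ms, true_ms, flagged_reliable, tolerance_ms):
    """Precision of the 'reliable' flag: share of flagged predictions whose
    absolute error (in milliseconds) is at most tolerance_ms.
    Returns None when nothing was flagged, since precision is undefined."""
    flagged = [
        (p, t)
        for p, t, keep in zip(predicted_ms, true_ms, flagged_reliable)
        if keep
    ]
    if not flagged:
        return None
    within = sum(1 for p, t in flagged if abs(p - t) <= tolerance_ms)
    return within / len(flagged)


# Made-up example: four predictions, three of them flagged as reliable.
predicted = [100, -200, 50, 400]
truth = [0, -150, 300, 380]
flags = [True, True, True, False]

print(reliability_precision(predicted, truth, flags, tolerance_ms=170))
print(reliability_precision(predicted, truth, flags, tolerance_ms=275))
```

A looser tolerance can only keep or raise this precision for a fixed set of flags, which is consistent with the higher figure reported at 275 ms than at 170 ms.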


File(s)

Document(s)

  • TFE_report_adth.pdf (Adobe PDF, 6.43 MB)

Annexe(s)

  • classification_impossible_offsets.png (image/png, 32.78 kB)
  • confidence_classification_diagram.png (image/png, 560.94 kB)
  • mae_vs_undetectable_accuracy.png (image/png, 258.47 kB)
  • performance_thresholding_regression.png (image/png, 75.08 kB)
  • synchformer_model.png (image/png, 72 kB)
  • Synchformer_without_git.zip (format unknown, 119.39 MB): the archive containing the code for this master's thesis
  • TFE_AV_synchronization_abstract.pdf (Adobe PDF, 273.88 kB)

Author

  • de Thibault, Adrien ULiège Université de Liège > Master ing. civ. inf. fin. spéc.int. sys.

Promotor(s)

  • Louppe, Gilles ULiège

Committee's member(s)

  • Massoz, Quentin
  • Cioppa, Anthony ULiège Université de Liège - ULiège > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Computer vision and data analysis
  • Leduc, Guy ULiège Université de Liège - ULiège > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Réseaux informatiques

All documents available on MatheO are protected by copyright and subject to the usual rules for fair use.
The University of Liège does not guarantee the scientific quality of these students' works or the accuracy of all the information they contain.