Quantification of audio/video de-synchronization with reliable confidence scores
de Thibault, Adrien
Supervisor(s):
Louppe, Gilles
Defense date: 30-jui-2025/1-jui-2025 • Permanent URL: http://hdl.handle.net/2268.2/23370
Details
| Title: | Quantification of audio/video de-synchronization with reliable confidence scores |
| Translated title: | [fr] Quantification de la désynchronisation audio/vidéo avec des indices de confiance fiables |
| Author: | de Thibault, Adrien |
| Defense date: | 30-jui-2025/1-jui-2025 |
| Supervisor(s): | Louppe, Gilles |
| Jury member(s): | Massoz, Quentin; Cioppa, Anthony; Leduc, Guy |
| Language: | English |
| Number of pages: | 66 |
| Keywords: | [en] AV Synchronization; computer vision; Synchformer; confidence score; deep learning; fine-tuning |
| Discipline(s): | Engineering, computing & technology > Computer science |
| Target audience: | Researchers; Professionals in the field; Students |
| Institution(s): | Université de Liège, Liège, Belgium |
| Degree: | Master en ingénieur civil en informatique, à finalité spécialisée en "intelligent systems" (Master's degree in computer science engineering, specialized focus in "intelligent systems") |
| Faculty: | Master's theses of the Faculty of Applied Sciences |
Abstract
[en] Audio-video desynchronization remains a significant challenge in broadcast and multimedia applications, as even slight desynchronization can degrade the viewer experience. This master's thesis focuses on the evaluation and improvement of Synchformer, a state-of-the-art classification model for estimating temporal offsets in "in the wild" video content. A second focus is the assessment of the uncertainty associated with the model's predictions.
To achieve finer and more reliable predictions, Synchformer was fine-tuned for temporal offset regression. These fine-tuned models offer consistent performance across the full range of temporal offsets, addressing a limitation of the original discrete-class approach. In parallel, various confidence measures were explored to quantify prediction certainty. Some measures are based on predictions from Synchformer, while others use a history of predictions to assess the consistency of the model over time. Based on these measures, classification techniques were developed to distinguish between reliable and unreliable predictions.
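As a rough illustration of the two families of confidence measures mentioned above, the sketch below computes one score from a single prediction (the largest softmax probability over offset classes) and one from a history of predictions (the spread of recent predicted offsets), then thresholds both. The specific measures, function names, and threshold values are assumptions chosen for illustration; the abstract does not state which measures or classifiers the thesis actually uses.

```python
import math
from statistics import pstdev


def softmax(logits):
    """Numerically stable softmax over a list of class scores."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def max_probability(logits):
    """Single-prediction confidence: the largest softmax probability."""
    return max(softmax(logits))


def history_spread(recent_offsets_s):
    """History-based measure: population standard deviation (in seconds)
    of the most recent predicted offsets. A small value means the model
    has been consistent over time."""
    return pstdev(recent_offsets_s)


def is_reliable(logits, recent_offsets_s, min_prob=0.6, max_spread_s=0.1):
    """Flag a prediction as reliable when both measures pass their
    thresholds. The default thresholds are hypothetical, not tuned."""
    return (
        max_probability(logits) >= min_prob
        and history_spread(recent_offsets_s) <= max_spread_s
    )
```

In practice the thresholds would be chosen on held-out data, trading off how many predictions are kept against how often the kept ones are correct.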
Experiments on SyncST, a newly introduced dataset of over 1,500 broadcast video clips, showed that with discrete offsets, reliable predictions of Synchformer in its classification setting can be identified with a precision of 92%, corresponding to an error tolerance of 200 ms.
In a regression setting, where offsets are continuous and the error tolerance is set to 170 ms to match the limit of what humans can detect, this reliability classification achieves a precision of 75% on fine-tuned versions of Synchformer. With an error tolerance of 275 ms, which is an acceptable error for viewers, the precision reaches 90%. Given the low performance of these models at predicting desynchronization below what is humanly perceivable or acceptable, these contributions improve the trust that can be placed in such a synchronization model, making it more applicable to real-world settings where low error tolerance and interpretability are critical.
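The precision figures above can be read as follows: among the predictions flagged as reliable, what fraction has an absolute offset error within the chosen tolerance. The sketch below is one plausible way to compute that quantity; the function name and the sample values are hypothetical and are not taken from the thesis or from SyncST.

```python
def reliability_precision(pred_s, true_s, flagged_reliable, tolerance_s):
    """Share of predictions flagged as reliable whose absolute error
    (in seconds) is within the tolerance. Returns None when nothing
    was flagged, since precision is undefined in that case."""
    flagged = [
        (p, t) for p, t, f in zip(pred_s, true_s, flagged_reliable) if f
    ]
    if not flagged:
        return None
    within = sum(1 for p, t in flagged if abs(p - t) <= tolerance_s)
    return within / len(flagged)


# Made-up offsets in seconds, for illustration only.
predicted = [0.10, -0.30, 0.40, 0.00]
actual = [0.00, -0.25, 0.20, 0.40]
flags = [True, True, True, False]

strict = reliability_precision(predicted, actual, flags, tolerance_s=0.170)
relaxed = reliability_precision(predicted, actual, flags, tolerance_s=0.275)
```

Loosening the tolerance can only keep or increase this precision for a fixed set of flags, which is consistent with the higher figure reported at 275 ms than at 170 ms.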
File(s)
Document(s)
TFE_report_adth.pdf
Size: 6.43 MB
Format: Adobe PDF
Appendix(es)
classification_impossible_offsets.png
Size: 32.78 kB
Format: image/png
confidence_classification_diagram.png
Size: 560.94 kB
Format: image/png
mae_vs_undetectable_accuracy.png
Size: 258.47 kB
Format: image/png
performance_thresholding_regression.png
Size: 75.08 kB
Format: image/png
synchformer_model.png
Size: 72 kB
Format: image/png
Synchformer_without_git.zip
Description: The archive containing the code for this master's thesis
Size: 119.39 MB
Format: Unknown
TFE_AV_synchronization_abstract.pdf
Size: 273.88 kB
Format: Adobe PDF
The University of Liège does not guarantee the scientific quality of these student works or the accuracy of all the information they contain.


