Leveraging Advanced Diffusion Model Techniques for Super-Resolution in Sports Broadcast Images
Alrahel, Badei
Supervisor(s): Louppe, Gilles
Defense date: 24-jui-2024/25-jui-2024 • Permanent URL: http://hdl.handle.net/2268.2/20475
Details
Title: | Leveraging Advanced Diffusion Model Techniques for Super-Resolution in Sports Broadcast Images |
Translated title: | [fr] Tirer parti des techniques avancées de modèle de diffusion pour la super-résolution des images de diffusion sportive |
Author: | Alrahel, Badei |
Defense date: | 24-jui-2024/25-jui-2024 |
Supervisor(s): | Louppe, Gilles |
Jury member(s): | Botta, Vincent; Geurts, Pierre; Huynh-Thu, Vân Anh |
Language: | English |
Number of pages: | 86 |
Keywords: | Diffusion models; Super-Resolution |
Discipline(s): | Engineering, computing & technology > Computer science |
Institution(s): | Université de Liège, Liège, Belgium |
Degree: | Master in Data Science, professional focus |
Faculty: | Master's theses of the Faculty of Applied Sciences |
Abstract
This thesis investigates the application of diffusion models to the task of image super-resolution,
a crucial process for enhancing the quality of low-resolution images. Diffusion models, known
for their success in generative tasks, are examined for their potential to outperform traditional
super-resolution methods. The study begins with an overview of generative models, including
Variational Autoencoders (VAEs) and their hierarchical variants, and introduces the core principles
of diffusion models, detailing key processes such as forward diffusion, reverse denoising, and
their three equivalent interpretations.
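For orientation, the forward diffusion and reverse denoising processes mentioned above can be written as follows. This is the standard DDPM parameterization, which is an assumption here: the abstract does not say which formulation or noise schedule the thesis uses, and the symbols $\beta_t$, $\bar\alpha_t$, $\mu_\theta$ and $\Sigma_\theta$ are the usual textbook notation rather than the author's.

```latex
% Forward diffusion: Gaussian noise is added step by step with schedule \beta_t
q(x_t \mid x_{t-1}) = \mathcal{N}\!\left(x_t;\ \sqrt{1-\beta_t}\,x_{t-1},\ \beta_t I\right)

% Closed-form marginal, with \bar\alpha_t = \prod_{s=1}^{t} (1-\beta_s)
q(x_t \mid x_0) = \mathcal{N}\!\left(x_t;\ \sqrt{\bar\alpha_t}\,x_0,\ (1-\bar\alpha_t)\,I\right)

% Reverse denoising: a learned Gaussian transition
p_\theta(x_{t-1} \mid x_t) = \mathcal{N}\!\left(x_{t-1};\ \mu_\theta(x_t, t),\ \Sigma_\theta(x_t, t)\right)
```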
The literature review covers both regression-based and generative-based super-resolution methods,
highlighting the strengths and limitations of techniques like Generative Adversarial Networks
(GANs) and various diffusion-based models. Notably, models such as SR3 (Super-Resolution via
Iterative Refinement) and Latent Diffusion Models (LDMs) are examined for their efficiency and
performance.
A proof-of-concept implementation of the SR3 model is conducted to demonstrate the practical
application and effectiveness of diffusion models in super-resolution tasks, particularly when
compared to the regression-based method A²N, with a focus on reconstructing facial images.
The thesis introduces and compares advanced models, Efficient Diffusion Model for Image
Super-resolution by Residual Shifting (ResShift) and Diffusion-Based Image Super-Resolution
in a Single Step (SinSR), which offer significant improvements in inference speed and image
quality. Specifically, an implementation of SinSR on top of ResShift is proposed.
An effective data pipeline utilizing blind super-resolution techniques, along with random cropping
and resizing, is proposed for training the implemented SinSR model. This pipeline is compared
to a simpler cropping-only approach, showing the importance of the scale of the training data
given to the model as input.
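The random cropping and resizing idea can be illustrated with a minimal sketch. Everything here is hypothetical and not taken from the thesis: the function names (`random_crop`, `downscale`, `make_pair`), the patch sizes, the scale factor, and the use of block averaging as a stand-in for a real blind super-resolution degradation (which would typically also apply blur, noise, and compression artifacts).

```python
import numpy as np

def random_crop(img, size, rng):
    """Cut a random size x size patch out of an H x W x C image."""
    h, w = img.shape[:2]
    top = int(rng.integers(0, h - size + 1))
    left = int(rng.integers(0, w - size + 1))
    return img[top:top + size, left:left + size]

def downscale(img, factor):
    """Shrink by an integer factor using block averaging (assumed degradation)."""
    h, w, c = img.shape
    blocks = img.reshape(h // factor, factor, w // factor, factor, c)
    return blocks.mean(axis=(1, 3))

def make_pair(img, rng, hr_size=256, factor=4, max_crop=512):
    """Return (low-res, high-res) training patches from one frame.

    The crop size is drawn at random, then the crop is resized to hr_size,
    so the same content appears at different scales across samples.
    Assumes both image sides are at least hr_size pixels.
    """
    limit = min(img.shape[0], img.shape[1], max_crop)
    # Crop sizes are multiples of hr_size so block averaging divides evenly.
    crop_size = hr_size * int(rng.integers(1, limit // hr_size + 1))
    patch = random_crop(img, crop_size, rng)
    hr = downscale(patch, crop_size // hr_size)  # resize crop to the HR size
    lr = downscale(hr, factor)                   # synthesize the LR input
    return lr, hr

rng = np.random.default_rng(0)
frame = rng.random((720, 1280, 3))  # stand-in for a broadcast frame in [0, 1)
lr, hr = make_pair(frame, rng)
```

A cropping-only variant would correspond to fixing `crop_size = hr_size`, so that the model only ever sees content at the native scale of the source frames.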
Extensive experiments were conducted to evaluate the performance of the proposed methods,
focusing particularly on sports broadcast and facial image data. The evaluation includes an
assessment of the performance of the VQGAN used within the SinSR model, an experiment
evaluating the model’s performance relative to varying input face sizes, and a study comparing
the training of the model on blurry versus non-blurry images.
File(s)
Document(s)
Size: 71.78 MB
Format: Adobe PDF
The University of Liège does not guarantee the scientific quality of these student works or the accuracy of all the information they contain.