Novel viewpoint synthesis of sport scenes using broadcast images
Date of defense :
Committee's member(s) :
Van Droogenbroeck, Marc
Number of pages :
[en] Deep learning
[en] Machine learning
[en] Computer Vision
[fr] Apprentissage profond
[fr] Apprentissage automatique
[fr] Vision par ordinateur
Engineering, computing & technology > Computer science
Université de Liège, Liège, Belgique
Master en ingénieur civil en informatique, à finalité spécialisée en "intelligent systems"
Master thesis of the Faculté des Sciences appliquées
[en] NeRF is a recent method for novel view synthesis that has proved its capabilities by rendering truly photorealistic novel views of a scene from calibrated images alone. The method renders high-quality images when trained with many views densely distributed in translation and rotation around the scene. However, its performance degrades in sport broadcasting conditions, where the cameras are few and can only rotate. Furthermore, the original NeRF is designed for static scenes and is very slow in both training and inference. Both of these factors limit the application of NeRF in sport broadcasting conditions, where moving elements are abundant and live delivery is required. In this work, we only touch upon the problem of time constraints and leave aside the problem of moving elements. Instead, we focus on the problem of sparse input views of a static scene: we analyse and quantify how the performance of NeRF is limited by the number of viewpoints in the training set. We show that using depth information combined with a depth loss greatly improves results, even when only partial depth information is available. We integrate this extension with the nerfacto model, an off-the-shelf NeRF model that is several orders of magnitude faster to train and to render images than the original NeRF. Furthermore, we implement and integrate with nerfacto a patch-based regularization technique, also meant to alleviate the problem of sparse input views. While the latter extension does not bring the expected performance improvement, the resulting model is overall much faster than the original NeRF while providing greatly improved results in the sparse input view setup characteristic of sport broadcasting conditions.
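
The abstract does not state the exact form of the depth loss, so the following is only an illustrative sketch of how a depth term can be used when depth is known for some rays but not others. It assumes a mean squared error restricted to the rays that have a reference depth, added to the usual colour loss with a weight. The names `masked_depth_loss`, `total_loss` and `depth_weight` are hypothetical and do not come from the thesis or from nerfacto.

```python
from typing import Optional, Sequence


def masked_depth_loss(
    rendered: Sequence[float],
    reference: Sequence[Optional[float]],
) -> float:
    """Mean squared depth error over the rays that have a reference depth.

    reference[i] is None where no depth is available for ray i, which is
    the "partial depth information" case. Hypothetical helper, assumed
    loss form (MSE); not the thesis implementation.
    """
    if len(rendered) != len(reference):
        raise ValueError("rendered and reference must have the same length")
    # Keep only the rays for which a reference depth exists.
    pairs = [(d, r) for d, r in zip(rendered, reference) if r is not None]
    if not pairs:
        return 0.0  # no depth supervision in this batch: the term vanishes
    return sum((d - r) ** 2 for d, r in pairs) / len(pairs)


def total_loss(color_loss: float, depth_loss: float, depth_weight: float = 0.1) -> float:
    """Colour loss plus a weighted depth term (the default weight is an assumption)."""
    return color_loss + depth_weight * depth_loss
```

For example, with rendered depths `[1.0, 2.0, 3.0]` and reference depths `[1.0, None, 5.0]`, the second ray is ignored and only the first and third contribute to the depth term. Rays without depth are still supervised by the colour loss, which is why partial depth coverage can still help.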
Size: 89.38 MB
Format: Adobe PDF