Towards GPU-accelerated Direct and Iterative Solvers for Computational Wave Scattering in HELM
Distrée, Florent
Promotor(s) : Arnst, Maarten
Date of defense : 8-Sep-2025/9-Sep-2025 • Permalink : http://hdl.handle.net/2268.2/24701
Details
| Title : | Towards GPU-accelerated Direct and Iterative Solvers for Computational Wave Scattering in HELM |
| Translated title : | [fr] Accélération par le GPU de la simulation d'éléments finis avec application à la nano-optique informatique 3D |
| Author : | Distrée, Florent |
| Date of defense : | 8-Sep-2025/9-Sep-2025 |
| Advisor(s) : | Arnst, Maarten |
| Committee's member(s) : | Geuzaine, Christophe; Bogdanowicz, Janusz; Tomasetti, Romin |
| Language : | English |
| Number of pages : | 104 |
| Discipline(s) : | Engineering, computing & technology > Multidisciplinary, general & others |
| Institution(s) : | Université de Liège, Liège, Belgique |
| Degree: | Master en ingénieur civil physicien, à finalité approfondie |
| Faculty: | Master thesis of the Faculté des Sciences appliquées |
Abstract
[en] This thesis contributes towards the development of GPU-accelerated direct and iterative solvers for computational wave scattering.
This research was carried out in the context of HELM, an experimental finite element code developed in a computational stochastic research group.
The research extends the physical modeling capabilities of the HELM C++ framework by formalizing a scattered-field weak formulation with Perfectly Matched Layers to accurately model complex, heterogeneous media.
The core of the work investigates two primary solution strategies on modern GPU architectures.
First, a numerical performance evaluation of GPU-accelerated direct solvers is presented. It demonstrates that the NVIDIA cuDSS library outperforms both its predecessor, cuSOLVER, and CPU-based solvers such as KLU2, achieving speedups of up to an order of magnitude on a single GPU.
Second, an iterative solver based on a Multigrid Shifted Laplacian preconditioner is explored. In particular, an implementation of this preconditioner in the MueLu package from Trilinos is used. Several contributions were made to the MueLu implementation to enable its use with general ordinal types, as well as to permit a nearly complete GPU-resident multigrid V-cycle for complex arithmetic.
A comparative scaling analysis reveals that while the GPU direct solver is faster for the tested 2D problems, the iterative multigrid method exhibits superior asymptotic complexity, suggesting it is a viable path for large-scale 3D and parallel problems. Key limitations, including the parallel scalability of the Restricted Additive Schwarz smoother and remaining host-bound computations, are identified. Future work could focus on integrating Optimized Schwarz methods and achieving a fully GPU-resident workflow by offloading the coarse-grid solve and matrix reordering.
File(s)
Document(s)
The University of Liège does not guarantee the scientific quality of these students' works or the accuracy of all the information they contain.

Master_Thesis_Final.pdf