Master thesis : Performance evaluation and optimization of a GPU-enabled Discontinuous Galerkin code

D'Antonio, Marco

Date of defense : 5-Sep-2022/6-Sep-2022 • Permalink : `http://hdl.handle.net/2268.2/15924`

Details

Title :	Master thesis : Performance evaluation and optimization of a GPU-enabled Discontinuous Galerkin code
Author :	D'Antonio, Marco
Date of defense :	5-Sep-2022/6-Sep-2022
Advisor(s) :	Geuzaine, Christophe
Committee's member(s) :	Cicuttin, Matteo; Hillewaert, Koen; Arnst, Maarten
Language :	English
Discipline(s) :	Engineering, computing & technology > Computer science
Target public :	Researchers; Professionals of domain; Student
Institution(s) :	Université de Liège, Liège, Belgique; Università degli Studi di Salerno, Fisciano, Italia
Degree:	Additional courses for exchange students (Erasmus, ...)
Faculty:	Master thesis of the Faculté des Sciences appliquées

Abstract

[en] Modern supercomputers use GPUs to achieve better performance on many problems, but developing parallel applications that run at high performance requires a thorough understanding of the hardware and software platforms.
Numerical electromagnetics, for example, is one of the fields that benefit from modern HPC machines, with various numerical methods showing improved performance when implemented on GPUs.
In particular, Discontinuous Galerkin Time Domain methods are usually implemented on GPUs for their scalability.

Gmsh DG, developed at the Applied and Computational Electromagnetics research group of the University of Liège, is a solver for Maxwell's equations using the Discontinuous Galerkin method, targeting high-performance parallel systems.
This thesis aimed to implement performance optimizations, carry out a thorough performance analysis, and add support for multi-GPU systems.

During the work, two optimizations were implemented, improving the overall application performance by reducing memory traffic, increasing locality, and enabling the use of compiler optimizations.

The performance of the application was evaluated on real-world problems through a scaling analysis on a multiprocessor system, which showed perfect scaling up to the bandwidth saturation of the NUMA domains of the AMD processors used for testing.
Furthermore, the results show that about 64 dedicated cores are required to outperform single-GPU execution.
The evaluation was also carried out for the individual computational kernels, highlighting that all of them, on both CPU and GPU, fully exploit the available bandwidth and that, especially for high orders of approximation, some kernels perform very close to the peak achievable by the hardware.

Finally, the work focused on implementing multi-GPU support for the application and testing its performance on the available platform. Our measurements show that the solver achieves good performance, which becomes optimal as the problem size increases.

File(s)

Document(s)

thesis.pdf
Description: Thesis
Size: 3.67 MB
Format: Adobe PDF

abstract.pdf
Description: Summary
Size: 58.87 kB
Format: Adobe PDF

illustration.pdf
Description: Illustration of results
Size: 243.53 kB
Format: Adobe PDF

All documents available on MatheO are protected by copyright and subject to the usual rules for fair use.
The University of Liège does not guarantee the scientific quality of these students' works or the accuracy of all the information they contain.
