Autonomous Drone Control: A Reinforcement Learning Approach
Hansen, Julien
Promotor(s) : Ernst, Damien
Date of defense : 30-Jun-2025/1-Jul-2025 • Permalink : http://hdl.handle.net/2268.2/23358
Details
| Title : | Autonomous Drone Control: A Reinforcement Learning Approach |
| Translated title : | [fr] Contrôle autonome de drones : Une approche par apprentissage par renforcement |
| Author : | Hansen, Julien |
| Date of defense : | 30-Jun-2025/1-Jul-2025 |
| Advisor(s) : | Ernst, Damien |
| Committee's member(s) : | Geurts, Pierre; Leroy, Pascal |
| Language : | English |
| Number of pages : | 47 |
| Keywords : | [en] Reinforcement Learning; [en] Drone; [en] IsaacLab; [en] policy gradient method; [en] QuadCopter |
| Discipline(s) : | Engineering, computing & technology > Computer science |
| Target public : | Researchers; Professionals of the domain; Students |
| Institution(s) : | Université de Liège, Liège, Belgique |
| Degree : | Master en ingénieur civil en informatique, à finalité spécialisée en "intelligent systems" |
| Faculty : | Master thesis of the Faculté des Sciences appliquées |
Abstract
[en] Drones have become essential tools across a wide range of industries, from agriculture to surveillance, and are increasingly deployed in military contexts for detection, recognition, identification, exploration, and combat purposes. While most systems remain controlled by humans, the shift toward autonomy is intensifying, driven by breakthroughs in artificial intelligence, notably in reinforcement learning and scalable simulation techniques.
This Master's thesis explores the potential of reinforcement learning for drone control within both single-agent and multi-agent frameworks. Two tasks are addressed: navigation in unknown terrains and adversarial drone combat. Our work focuses on designing simulation environments that model the learning process of agents as they interact with these tasks. Our navigation environment consists of multiple randomly spaced obstacles (spikes), with a target and a drone placed on opposite sides of the terrain. The drone is equipped with a sensor, either a LiDAR or a camera, which it uses to explore the environment and reach the target. In the adversarial scenario, the environment includes two drones: an attacker and a defender. The attacker attempts to reach a designated target, while the defender tries to intercept it by colliding with it.
Reinforcement learning is particularly well suited to these tasks due to its ability to learn complex, sequential decision-making policies from interaction with the environment. In scenarios such as drone navigation or combat, where the environment is often partially observable, highly dynamic, and difficult to model analytically, RL offers a flexible and data-driven approach to learning effective control strategies. Furthermore, reinforcement learning naturally supports learning in multi-agent settings, where agents must coordinate or compete in real time.
To tackle these tasks, policy gradient methods were explored: Proximal Policy Optimization (PPO), its multi-agent extension Independent Proximal Policy Optimization (IPPO), and a variant inspired by self-play methods. To train and evaluate our agents, IsaacLab environments were designed following the formalism of partially observable Markov decision processes and stochastic games. Our work highlights the performance of the trained agents on these tasks and shows promising potential for future improvements in autonomous drone control.
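The core policy gradient idea behind methods like PPO can be illustrated with a minimal REINFORCE-style sketch. Everything below is an illustrative assumption, not the thesis implementation: a toy 1-D "reach the target" task stands in for the IsaacLab navigation environment, and a linear softmax policy stands in for the neural network policies used with PPO/IPPO.

```python
import numpy as np

# Minimal REINFORCE-style policy gradient sketch (NOT the thesis code).
# Toy 1-D navigation task: the agent starts at x = 0 and must reach x = +5
# within 20 steps by choosing to move left or right each step.

rng = np.random.default_rng(0)

def softmax(z):
    z = z - z.max()                                # numerical stability
    e = np.exp(z)
    return e / e.sum()

def run_episode(theta, max_steps=20):
    """Roll out one episode; return visited features, actions, rewards."""
    x, states, actions, rewards = 0, [], [], []
    for _ in range(max_steps):
        feats = np.array([1.0, x / 5.0])           # bias + normalized position
        probs = softmax(theta @ feats)             # 2 actions: left, right
        a = rng.choice(2, p=probs)
        states.append(feats)
        actions.append(a)
        x += -1 if a == 0 else 1
        done = x >= 5
        rewards.append(1.0 if done else -0.05)     # goal bonus, step penalty
        if done:
            break
    return states, actions, rewards

def train(episodes=500, lr=0.1, gamma=0.99):
    theta = np.zeros((2, 2))                       # (action, feature) weights
    for _ in range(episodes):
        states, actions, rewards = run_episode(theta)
        G, returns = 0.0, []
        for r in reversed(rewards):                # discounted returns-to-go
            G = r + gamma * G
            returns.append(G)
        returns.reverse()
        for s, a, G in zip(states, actions, returns):
            probs = softmax(theta @ s)
            grad_logp = -np.outer(probs, s)        # d log pi(a|s) / d theta
            grad_logp[a] += s
            theta += lr * G * grad_logp            # REINFORCE update
    return theta

theta = train()
# After training, the policy should clearly prefer "right" from the start.
probs = softmax(theta @ np.array([1.0, 0.0]))
print(probs)
```

PPO refines this basic update with a clipped surrogate objective and a learned value baseline, which is what makes it stable enough for high-dimensional drone control; the gradient of the log-policy, however, is the same ingredient shown here.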
File(s)
The University of Liège does not guarantee the scientific quality of these students' works or the accuracy of all the information they contain.

TFE_HANSEN_Julien.pdf