Gene regulatory network inference from observational and interventional expression data
Promotor(s) : Geurts, Pierre
Date of defense : 26-Jun-2017/27-Jun-2017
|Committee's member(s) :
Huynh-Thu, Vân Anh
|[en] gene regulatory network
[en] machine learning
[en] random forest
[en] enriched random forest
[en] GRN inference
|Engineering, computing & technology > Computer science
|Université de Liège, Liège, Belgium
|Master's degree in computer science and engineering, with a specialized focus in "intelligent systems"
|Master thesis of the Faculté des Sciences appliquées
[en] The problem of reverse-engineering biological networks has attracted considerable attention in recent decades. Studying the interactions that occur inside a living organism is essential to understanding the behavior of biological systems. Advances in computer science and the abundance of new genetic data have raised the question of whether gene regulatory networks (GRNs) can be predicted. These networks describe how some genes regulate the expression of others.
Many methods have already been developed to infer these networks from gene expression data. Among them, GENIE3, a method based on Random Forests, achieved state-of-the-art performance. However, one drawback of GENIE3 is that it cannot exploit the specific properties of certain types of gene expression measurements, and so may miss useful information. In particular, datasets often include knockouts, which are measurements taken after a gene has been deleted.
This thesis proposes new variants of GENIE3, based on the idea of enriched random forests, that integrate knockout-specific information as weights guiding GENIE3 toward better predictions. The methods are first tested on ideal cases where a knockout of every gene is available. Predictions do improve in this setting, and several ways of achieving the best results are highlighted. Realistic cases are then tested; the results there are less convincing, although interesting phenomena are observed.
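As a rough illustration of how knockout information could enter as weights, the sketch below turns knockout measurements into per-target sampling probabilities and then draws a weighted subset of candidate regulators, in the spirit of enriched random forests, where features tested at a split are sampled in proportion to their weights. The deviation-based weighting, the neutral `baseline` of 1.0, the toy data, and the function names `knockout_weights` and `sample_split_candidates` are all assumptions made for this example; they are not the variants proposed in the thesis.

```python
import random


def knockout_weights(wild_type, knockouts, baseline=1.0):
    """Return probs[i][j]: chance of proposing gene j as a regulator of gene i.

    wild_type: list of unperturbed expression levels, one per gene (>= 2 genes).
    knockouts: dict mapping a knocked-out gene index to the expression
               profile (list, one value per gene) measured after its deletion.
    """
    n = len(wild_type)
    probs = []
    for target in range(n):
        row = []
        for regulator in range(n):
            if regulator == target:
                row.append(0.0)  # a gene is never its own candidate regulator
            elif regulator in knockouts:
                # Assumed heuristic: a larger shift of the target after the
                # regulator's knockout gives that regulator a larger weight.
                shift = abs(knockouts[regulator][target] - wild_type[target])
                row.append(baseline + shift)
            else:
                row.append(baseline)  # no knockout available: neutral weight
        total = sum(row)
        probs.append([w / total for w in row])
    return probs


def sample_split_candidates(probs_row, k, rng):
    """Draw up to k distinct candidate regulators, weighted by probs_row."""
    remaining = [j for j, p in enumerate(probs_row) if p > 0]
    chosen = []
    for _ in range(min(k, len(remaining))):
        weights = [probs_row[j] for j in remaining]
        pick = rng.choices(remaining, weights=weights, k=1)[0]
        chosen.append(pick)
        remaining.remove(pick)  # sample without replacement
    return chosen


# Toy data (hypothetical): deleting gene 0 lowers the expression of gene 1.
wild_type = [1.0, 1.0, 1.0, 1.0]
knockouts = {0: [0.0, 0.2, 1.0, 1.0]}
probs = knockout_weights(wild_type, knockouts)
candidates = sample_split_candidates(probs[1], k=2, rng=random.Random(0))
print(probs[1], candidates)
```

In a tree-growing procedure, a function like `sample_split_candidates` would replace the uniform feature sampling done at each node, so regulators supported by knockout evidence are tested more often without being forced into the model.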
The second part of the thesis studies the possibility of predicting the effect of knockouts. Differences from and similarities to the GRN prediction problem are analyzed, and an evaluation method, although imperfect, is proposed. Several methods are then evaluated, with relatively encouraging results. Some preliminary reflections point to directions for future work.
The possibility of using the proposed weighted GENIE3 methods in other situations is also briefly discussed: substantial improvements are achieved on several datasets without the use of knockouts.