Faculté des Sciences appliquées

Gene regulatory network inference from observational and interventional expression data

Smetz, Colin ULiège
Promoter(s) : Geurts, Pierre ULiège
Date of defense : 26-Jun-2017/27-Jun-2017
Committee's member(s) : Wehenkel, Louis ULiège
Huynh-Thu, Vân Anh ULiège
Meyer, Patrick ULiège
Language : English
Number of pages : 84
Keywords : [en] gene regulatory network
[en] machine learning
[en] random forest
[en] enriched random forest
[en] knockout
[en] GRN inference
[en] Z-score
Discipline(s) : Engineering, computing & technology > Computer science
Target public : Researchers
Institution(s) : Université de Liège, Liège, Belgique
Degree: Master in Computer Science and Engineering, professional focus in "intelligent systems" (Master en ingénieur civil en informatique, à finalité spécialisée)
Faculty: Master thesis of the Faculté des Sciences appliquées


[en] The problem of reverse-engineering biological networks has attracted considerable attention in recent decades. Studying the interactions that occur inside a living organism is essential to understanding the behavior of biological systems. Advances in computer science and the abundance of new genetic data have raised the question of how to predict gene regulatory networks (GRNs), which describe how some genes regulate the expression of others.

Many methods have already been developed to infer these networks from gene expression data. Among them, GENIE3, a method based on Random Forests, achieved state-of-the-art performance. One drawback of GENIE3, however, is that it cannot exploit the specificities of certain types of gene expression measurements, and so may miss useful information. In particular, datasets often include knockouts, which are measurements taken after the deletion of a gene.

This thesis proposes new variants of GENIE3, based on the idea of enriched random forests, that integrate knockout-specific information as weights guiding GENIE3 toward better predictions. The methods are first tested on ideal cases in which a knockout of every gene is available; predictions do improve, and several ways of achieving the best results are highlighted. Realistic cases are then tested; the results are less convincing, although interesting phenomena are observed.
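The abstract does not say how the knockout-derived weights are computed, so the following is only an illustrative sketch under stated assumptions. It assumes a knockout effect is scored with a Z-score (a term listed among the keywords): the expression of gene j after knocking out gene i, standardized by the mean and standard deviation of gene j over the observational samples. The function names `knockout_zscores` and `regulator_weights`, the data layout, and the `floor` parameter are hypothetical and are not taken from the thesis.

```python
from statistics import mean, pstdev


def knockout_zscores(observational, knockouts):
    """Score the effect of each knockout on every other gene.

    observational: list of samples, each a list with one expression value per gene.
    knockouts: dict mapping a knocked-out gene index to the expression profile
               measured after that knockout (assumed layout, for illustration).
    Returns a dict mapping (knocked_out_gene, affected_gene) to a Z-score.
    """
    n_genes = len(observational[0])
    # Per-gene mean and standard deviation over the observational samples.
    mu = [mean([s[j] for s in observational]) for j in range(n_genes)]
    sigma = [pstdev([s[j] for s in observational]) for j in range(n_genes)]

    z = {}
    for i, profile in knockouts.items():
        for j in range(n_genes):
            # Skip the knocked-out gene itself and genes with no variation.
            if j == i or sigma[j] == 0:
                continue
            z[(i, j)] = (profile[j] - mu[j]) / sigma[j]
    return z


def regulator_weights(z, target, n_genes, floor=1e-3):
    """Turn |Z-scores| into normalized weights over candidate regulators.

    The small `floor` (an assumption) keeps regulators without an observed
    knockout effect selectable; the target never regulates itself here.
    """
    raw = [
        0.0 if i == target else abs(z.get((i, target), 0.0)) + floor
        for i in range(n_genes)
    ]
    total = sum(raw)
    return [w / total for w in raw]


# Toy data: 3 genes, 4 observational samples, one knockout of gene 0.
samples = [[1, 2, 5], [3, 2, 5], [1, 4, 5], [3, 4, 5]]
z_scores = knockout_zscores(samples, {0: [0.0, 6.0, 5.0]})
weights = regulator_weights(z_scores, target=1, n_genes=3)
```

In an enriched random forest, weights of this kind would bias which candidate regulators are considered when growing the trees for a given target gene, so that regulators with a strong knockout effect are examined more often.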

The second part of the thesis studies the possibility of predicting the effect of knockouts. Differences from and similarities to the GRN prediction problem are analyzed, and an evaluation method, although imperfect, is proposed. Several methods are then evaluated, with relatively encouraging results. Some preliminary reflections call for future developments.

The possibility of using the proposed weighted GENIE3 methods in other situations is also briefly discussed. Substantial improvements are achieved on several datasets without the use of knockouts.



Access MasterThesis_ColinSmetz.pdf
Size: 2.79 MB
Format: Adobe PDF


  • Smetz, Colin ULiège Université de Liège > Master ingé. civ. info., à fin.


Committee's member(s)

  • Wehenkel, Louis ULiège Université de Liège - ULg > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Systèmes et modélisation
  • Huynh-Thu, Vân Anh ULiège Université de Liège - ULg > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Algorith. des syst. en interaction avec le monde physique
  • Meyer, Patrick ULiège Université de Liège - ULg > Département des sciences de la vie > Biologie des systèmes et bioinformatique
  • Total number of views 207
  • Total number of downloads 554

All documents available on MatheO are protected by copyright and subject to the usual rules for fair use.
The University of Liège does not guarantee the scientific quality of these students' works or the accuracy of all the information they contain.