Faculté des Sciences appliquées

Large-scale gene regulatory network inference from single-cell RNA seq data

Paquot, Sarah ULiège
Promotor(s) : Geurts, Pierre ULiège
Date of defense : 25-Jun-2018/26-Jun-2018
Committee's member(s) : Wehenkel, Louis ULiège
Meyer, Patrick ULiège
Huynh-Thu, Vân Anh ULiège
Language : English
Number of pages : 92
Keywords : machine learning
XGBoost
GRN inference
clustering
single-cell
Discipline(s) : Engineering, computing & technology > Computer science
Target public : Researchers
Institution(s) : Université de Liège, Liège, Belgique
Degree: Master's degree in computer science and engineering (Master en ingénieur civil en informatique), with a specialized focus in "intelligent systems"
Faculty: Master thesis of the Faculté des Sciences appliquées


Uncovering and modeling gene regulatory networks (GRNs) is one of the long-standing
challenges in systems biology. It involves computationally predicting, from
gene expression data, the direct regulatory interactions between transcription factors
and their target genes; together, these predicted interactions form a GRN.
Several techniques have been tested to address this problem. Among them, GENIE3 is one
of the top-performing methods, but its main drawback is that it is slow.

With traditional sequencing methods, only the mean gene expression over
a mix of millions of cells could be measured. New techniques now produce
single-cell RNA-seq data, which give the expression
level in each individual cell. This raises two main challenges. The first is computational:
single-cell experiments produce much larger expression matrices than traditional methods.
The second is that the data now reveal distinct cell types, which were previously hidden
because only expression values averaged over different cells were available. One strategy
is to cluster the data so that each cluster corresponds to one of the cell types present.

Our first contribution is a variant of GENIE3 that uses boosting
to make it faster and applicable to single-cell datasets. The results are
very promising: the variant turns GENIE3 from a very slow method into a very fast one
while achieving the same, and sometimes better, performance. The boosting method does,
however, have the drawback of depending on many parameters. Our second contribution is
a set of three regulatory network-based methods for clustering cells from single-cell data.
The results were not as good as expected but call for further investigation in this direction;
better results could probably be obtained by analyzing some of the parameters further.
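
As a rough illustration of the idea behind a boosted GENIE3 variant, the sketch below follows the general GENIE3 scheme: each gene is treated in turn as a regression target, a model is fitted on the expression of the candidate regulators, and each regulator's importance in that model is used as the weight of the edge from regulator to target. It is not the thesis implementation. It assumes a small pure-Python gradient boosting of one-split regression stumps in place of XGBoost, and the function names (`fit_stump`, `boosted_importances`, `infer_network`), the importance measure (fraction of the target's squared error removed by splits on a regulator), the parameter values, and the synthetic toy data are all hypothetical choices made for this example.

```python
import random


def fit_stump(X, r, features):
    """Find the single split that most reduces the squared error of residuals r.

    Returns (gain, feature, threshold, left_mean, right_mean); feature is None
    if no split improves on predicting the mean.
    """
    n, total = len(r), sum(r)
    best = (0.0, None, None, 0.0, 0.0)
    for j in features:
        order = sorted(range(n), key=lambda i: X[i][j])
        left_sum = 0.0
        for k in range(1, n):
            left_sum += r[order[k - 1]]
            if X[order[k - 1]][j] == X[order[k]][j]:
                continue  # cannot split between equal values
            right_sum = total - left_sum
            gain = left_sum ** 2 / k + right_sum ** 2 / (n - k) - total ** 2 / n
            if gain > best[0]:
                best = (gain, j, X[order[k - 1]][j], left_sum / k, right_sum / (n - k))
    return best


def boosted_importances(X, y, features, n_rounds=20, learning_rate=0.3):
    """Boost stumps on y; return, per feature, the fraction of the initial
    squared error removed by splits on that feature."""
    n = len(y)
    mean_y = sum(y) / n
    pred = [mean_y] * n
    sse0 = sum((v - mean_y) ** 2 for v in y)
    importance = {j: 0.0 for j in features}
    if sse0 == 0.0:
        return importance  # constant target: nothing to explain
    for _ in range(n_rounds):
        residuals = [y[i] - pred[i] for i in range(n)]
        _, j, thr, left, right = fit_stump(X, residuals, features)
        if j is None:
            break
        sse_before = sum(v * v for v in residuals)
        for i in range(n):
            pred[i] += learning_rate * (left if X[i][j] <= thr else right)
        sse_after = sum((y[i] - pred[i]) ** 2 for i in range(n))
        importance[j] += (sse_before - sse_after) / sse0
    return importance


def infer_network(expr, gene_names, regulators, **kwargs):
    """One regression per target gene; returns (regulator, target, weight)
    triples sorted by decreasing weight."""
    index = {g: i for i, g in enumerate(gene_names)}
    edges = []
    for target in gene_names:
        feats = [index[g] for g in regulators if g != target]
        y = [row[index[target]] for row in expr]
        for j, w in boosted_importances(expr, y, feats, **kwargs).items():
            edges.append((gene_names[j], target, w))
    return sorted(edges, key=lambda e: -e[2])


# Synthetic toy data (an assumption for this example): 200 cells, 3 genes.
# Gene B is driven by gene A; gene C is independent noise.
random.seed(0)
expr = []
for _ in range(200):
    a = random.gauss(0.0, 1.0)
    expr.append([a, 2.0 * a + random.gauss(0.0, 0.1), random.gauss(0.0, 1.0)])

edges = infer_network(expr, ["A", "B", "C"], regulators=["A", "C"])
for regulator, target, weight in edges:
    print(f"{regulator} -> {target}: {weight:.3f}")
```

On this toy data the edge from A to B should rank first, since B is constructed from A. The number of boosting rounds and the learning rate are examples of the many parameters on which a boosting-based variant depends.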



Access thesis.pdf
Size: 2 MB
Format: Adobe PDF
Access abstract.pdf
Size: 175.85 kB
Format: Adobe PDF


Author

  • Paquot, Sarah ULiège Université de Liège > Master ingé. civ. info., à fin.


Committee's member(s)

  • Wehenkel, Louis ULiège Université de Liège - ULiège > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Systèmes et modélisation
  • Meyer, Patrick ULiège Université de Liège - ULiège > Département des sciences de la vie > Biologie des systèmes et bioinformatique
  • Huynh-Thu, Vân Anh ULiège Université de Liège - ULiège > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Algorith. des syst. en interaction avec le monde physique

All documents available on MatheO are protected by copyright and subject to the usual rules for fair use.
The University of Liège does not guarantee the scientific quality of these students' works or the accuracy of all the information they contain.