Master thesis : Optimization and machine learning techniques in flow cytometry
Waltregny-Dengis, Karl
Supervisor(s): Louveaux, Quentin
Defense date: 27-jui-2022/28-jui-2022 • Permanent URL: http://hdl.handle.net/2268.2/14579
Details
Title: | Master thesis : Optimization and machine learning techniques in flow cytometry |
Author: | Waltregny-Dengis, Karl |
Defense date: | 27-jui-2022/28-jui-2022 |
Supervisor(s): | Louveaux, Quentin |
Jury member(s): | Wehenkel, Louis; Cornélusse, Bertrand |
Language: | English |
Number of pages: | 81 |
Discipline(s): | Engineering, computing & technology > Electrical & electronics engineering |
Institution(s): | Université de Liège, Liège, Belgium |
Degree: | Master in electrical engineering (ingénieur civil électricien), specialised focus in "electronic systems and devices" |
Faculty: | Master's theses of the Faculty of Applied Sciences |
Abstract
[en] This study builds on two previous studies: "Clustering and Kernel Density Estimation for Assessment of Measurable
Residual Disease by Flow Cytometry" and "Comparison of Three Contour Gating Methods in Flow Cytometry". Both
studies aimed to identify suspicious clusters of cells from AML (Acute Myeloid Leukemia) patients using different
clustering techniques in which the contour lines of the clusters are ellipsoids. A cluster is suspicious if it has a large
abnormality ratio (AR). The cell data are obtained by Multiparameter Flow Cytometry (MFC) and
require a logicle transformation (LT) before cytometrists can use them. After the LT, the
data distribution has been observed to be close to a multivariate normal distribution (MND). The level sets of the
probability density function (PDF) of an MND are ellipsoids. Hence, the closer the transformed data are to an MND,
the more appropriate it is to use ellipsoids as contour lines to delimit the clusters.
The first part of the study therefore optimises the parameters of
the LT to improve the multivariate normality of the data. It shows that such an optimisation is possible. A
normality test, the D'Agostino test, quantifies the normality of the data set and determines which combination of
parameters brings the data set closest to an MND.
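The parameter search described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the thesis code: `arcsinh` with a hypothetical `cofactor` parameter stands in for the logicle transformation, the data are synthetic, and the D'Agostino-Pearson K² test (`scipy.stats.normaltest`) is applied to a single channel rather than to the multivariate data set.

```python
import numpy as np
from scipy import stats

def transform(x, cofactor):
    # Stand-in for the logicle transformation (assumption): arcsinh is
    # roughly linear near zero and logarithmic for large |x|.
    return np.arcsinh(x / cofactor)

def best_cofactor(x, candidates):
    # A smaller K^2 statistic means skewness and kurtosis closer to
    # those of a normal distribution.
    scores = {c: stats.normaltest(transform(x, c)).statistic for c in candidates}
    return min(scores, key=scores.get), scores

rng = np.random.default_rng(0)
# Synthetic channel that is exactly normal after arcsinh(x / 150).
raw = 150.0 * np.sinh(rng.normal(loc=2.0, scale=0.7, size=5000))

candidates = [1.0, 10.0, 50.0, 150.0, 500.0, 2000.0]
best, scores = best_cofactor(raw, candidates)
print("selected cofactor:", best)
```

A grid search is used here only because it is the simplest way to show the idea; the abstract does not say which search strategy the thesis uses.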
The second part of this study identifies new candidate clustering techniques. Two common clustering methods are
proposed: the Gaussian mixture method and k-means. In addition, two methods that determine into how many clusters
the data set should be divided are introduced: the gap statistic and the silhouette coefficient. These methods estimate
the optimal number of clusters and could help the cytometrist tag the suspicious clusters. The techniques have
been compared in terms of AR and computation time. The best-performing techniques are k-means and the silhouette coefficient.
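As a minimal sketch of how k-means and the silhouette coefficient can be combined, the example below runs k-means for several candidate numbers of clusters and keeps the one with the highest mean silhouette score. It assumes scikit-learn is available and uses synthetic two-dimensional data; the helper name `pick_k_by_silhouette` is hypothetical and the gap statistic is not shown.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

def pick_k_by_silhouette(X, k_values):
    # Cluster with each candidate k and score the partition; the
    # silhouette score lies in [-1, 1], higher meaning better separation.
    scores = {}
    for k in k_values:
        labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
        scores[k] = silhouette_score(X, labels)
    return max(scores, key=scores.get), scores

rng = np.random.default_rng(1)
# Synthetic stand-in for transformed cytometry data: three separated blobs.
centers = np.array([[0.0, 0.0], [6.0, 0.0], [3.0, 6.0]])
X = np.vstack([c + rng.normal(scale=0.5, size=(200, 2)) for c in centers])

best_k, scores = pick_k_by_silhouette(X, range(2, 7))
print("estimated number of clusters:", best_k)
```

The silhouette coefficient is undefined for a single cluster, which is why the candidate range starts at 2.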
However, none of the clustering techniques proposed in the previous studies or in the preceding paragraphs optimises the clusters'
contour lines to maximise the AR. The problem is therefore written as an optimisation problem (OP) whose objective
is to maximise the AR by updating the contour lines of the clusters. As formulated, the problem is non-convex
and discrete, so it cannot be solved in a reasonable time. After the discrete problem is approximated by a continuous
one, gradient descent is used to solve the continuous OP. The results show that this clustering technique
optimises the clusters' contour lines to maximise the AR.
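The continuous approximation can be illustrated with a small sketch. Everything specific in it is an assumption, since the abstract gives no formulas: the AR is taken to be the fraction of patient cells inside the contour divided by the fraction of reference cells inside it, the contour is a circle of fixed radius rather than a general ellipsoid, the 0/1 "inside" indicator is replaced by a sigmoid, and the gradient is estimated by finite differences. Gradient ascent on the log of the smoothed AR is equivalent to gradient descent on its negative.

```python
import numpy as np

def soft_inside(X, center, radius, tau):
    # Smooth stand-in for the indicator "cell lies inside the contour";
    # written with tanh, which equals a sigmoid but cannot overflow.
    d2 = np.sum((X - center) ** 2, axis=1) / radius**2
    return 0.5 * (1.0 - np.tanh(0.5 * (d2 - 1.0) / tau))

def log_ar(center, patient, reference, radius=1.0, tau=0.5, eps=1e-3):
    # Assumed AR definition: patient fraction inside / reference fraction
    # inside; eps keeps the ratio bounded when no reference cell is inside.
    p = soft_inside(patient, center, radius, tau).mean()
    r = soft_inside(reference, center, radius, tau).mean()
    return np.log(p + 1e-12) - np.log(r + eps)

def num_grad(f, x, h=1e-5):
    # Central finite-difference estimate of the gradient.
    g = np.zeros_like(x)
    for i in range(x.size):
        step = np.zeros_like(x)
        step[i] = h
        g[i] = (f(x + step) - f(x - step)) / (2.0 * h)
    return g

def gradient_ascent(f, x0, lr=0.2, iters=100):
    x, fx = x0.copy(), f(x0)
    for _ in range(iters):
        g = num_grad(f, x)
        step = lr
        while step > 1e-8:  # backtracking: halve the step until f improves
            cand = x + step * g
            fc = f(cand)
            if fc > fx:
                x, fx = cand, fc
                break
            step *= 0.5
        else:
            break  # no improving step found: treat as converged
    return x, fx

rng = np.random.default_rng(2)
reference = rng.normal(size=(2000, 2))  # synthetic normal cells
patient = np.vstack([rng.normal(size=(1600, 2)),
                     np.array([3.0, 3.0]) + 0.5 * rng.normal(size=(400, 2))])

f = lambda c: log_ar(c, patient, reference)
start = np.array([1.5, 1.5])
center, best = gradient_ascent(f, start)
print("contour centre:", center, "log AR:", best)
```

Only the centre of the contour is optimised here to keep the sketch short; the same scheme extends to the shape parameters of an ellipsoid at the cost of more variables.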
Finally, the initial OP is reformulated and solved. This gives very good results in terms of AR. However,
the computation time of this formulation is too long for practical applications.
The two initial objectives of this study are fulfilled. First, it has been shown that the LT
parameters can be optimised to improve the multivariate normality of the data set. Second, the problem has been formulated as an optimisation problem
that maximises the AR, and gradient descent manages to solve this OP.
The Université de Liège does not guarantee the scientific quality of these student works or the accuracy of all the information they contain.