Conditions for Microbial Metabolite Biosynthesis Activated Transcription: development and assessment of the COMMBAT methodology
Ribeiro Monteiro, Silvia
Promotor(s) : Rigali, Sébastien ; Tocquin, Pierre
Date of defense : 4-Sep-2023 • Permalink : http://hdl.handle.net/2268.2/18604
Details
Title : | Conditions for Microbial Metabolite Biosynthesis Activated Transcription: development and assessment of the COMMBAT methodology |
Author : | Ribeiro Monteiro, Silvia |
Date of defense : | 4-Sep-2023 |
Advisor(s) : | Rigali, Sébastien
Tocquin, Pierre |
Committee's member(s) : | Baurain, Denis
Bouché, Frédéric
Goffin, Philippe |
Language : | English |
Number of pages : | 75 |
Discipline(s) : | Life sciences > Microbiology
Life sciences > Biochemistry, biophysics & molecular biology |
Research unit : | Center of Protein Engineering (CIP) - Uliege |
Institution(s) : | Université de Liège, Liège, Belgique |
Degree: | Master en bioinformatique et modélisation, à finalité approfondie (Master in Bioinformatics and Modelling, research focus) |
Faculty: | Master thesis of the Faculté des Sciences |
Abstract
[en] The biosynthetic gene clusters (BGCs) of Actinobacteria encode secondary metabolites that often have medically relevant properties, such as antiviral, antifungal or antibacterial activity. However, these BGCs are often not expressed under laboratory conditions, where bacteria are cultivated in nutrient-rich media. The objective of this Master thesis is to contribute to setting up an automated methodology for the fast, reliable, and exhaustive identification of BGCs (either cryptic or associated with known natural products) whose expression responds to a specific environmental cue. The methodology, named COMMBAT (COnditions for Microbial Metabolite Biosynthesis Activated Transcription), is based on the detection of cis-acting elements bound by a well-studied transcription factor. It is divided into four main steps: 1) creation of a position weight matrix (PWM) from a transcription factor’s cis-acting elements; 2) identification of all BGCs in downloaded genome sequences; 3) scanning of the BGCs with the PWM created in step 1; 4) analysis of the output of step 3 to identify BGCs (either known or cryptic) that would reliably respond to a specific environmental signal. The methodology assigns two scores to each predicted BGC: i) a ‘novelty’ score, which quantifies how similar the BGC is to known BGCs, and ii) an ‘expression’ score, which evaluates the probability that expression of the BGC is controlled by the environmental signal of interest. The regulator chosen to test the methodology is the CebR repressor, which binds a 14-nt sequence and whose DNA-binding ability is inhibited upon binding of cellobiose or cellotriose. The COMMBAT methodology is tested on available genomes of pathogenic Streptomyces strains associated with common scab disease of root and tuber crops.
The training set was chosen to guarantee the presence of a ‘positive control’, namely a strain (Streptomyces scabiei 87-22) that contains the thaxtomin cluster, which produces the phytotoxins and is known to be controlled by CebR. The COMMBAT methodology is first used to predict the extent to which the cello-oligosaccharide-mediated pathway for thaxtomin production is conserved among pathogenic Streptomyces species. The analysis of the output reveals that most pathogenic Streptomyces strains have conserved cellobiose/cellotriose-mediated regulation of thaxtomin; the most remarkable exception is three Streptomyces ipomoeae strains, which specifically colonize sweet potatoes. Secondly, a visual representation of the COMMBAT results is proposed and discussed. The representation combines the two scores (novelty and expression) and facilitates the identification of BGCs (either known or cryptic) that would reliably respond to a specific environmental signal.
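As an illustration of steps 1 and 3 of the methodology summarised in the abstract (building a PWM from known binding sites, then scanning sequences with it), here is a minimal sketch. It is not the COMMBAT implementation. The function names, the pseudocount, the uniform background, the 95% threshold and the three 14-nt toy sites are all assumptions made for this example; the toy sites are not real CebR binding sites.

```python
import math

BASES = "ACGT"

def build_pwm(sites, pseudocount=0.5, background=0.25):
    """Log-odds PWM from aligned, equal-length sites (uniform background assumed)."""
    length = len(sites[0])
    if any(len(s) != length for s in sites):
        raise ValueError("all sites must have the same length")
    total = len(sites) + pseudocount * len(BASES)
    pwm = []
    for i in range(length):
        column = [s[i].upper() for s in sites]
        pwm.append({
            b: math.log2(((column.count(b) + pseudocount) / total) / background)
            for b in BASES
        })
    return pwm

def reverse_complement(seq):
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]

def score_window(pwm, window):
    # Ambiguous bases (e.g. N) receive the lowest score of their column.
    return sum(col.get(b, min(col.values())) for col, b in zip(pwm, window))

def scan(pwm, sequence, threshold):
    """Return (start, strand, score) for each window scoring >= threshold."""
    sequence = sequence.upper()
    width = len(pwm)
    hits = []
    for pos in range(len(sequence) - width + 1):
        window = sequence[pos:pos + width]
        for strand, w in (("+", window), ("-", reverse_complement(window))):
            score = score_window(pwm, w)
            if score >= threshold:
                hits.append((pos, strand, score))
    return hits

# Hypothetical 14-nt toy sites, for illustration only.
toy_sites = ["ACGTACGTACGTAC", "ACGTACGTACGTAA", "ACGAACGTACGTAC"]
pwm = build_pwm(toy_sites)

# Assumed cut-off: 95% of the best achievable score.
max_score = sum(max(col.values()) for col in pwm)
hits = scan(pwm, "GGGGG" + "ACGTACGTACGTAC" + "CCCCC", 0.95 * max_score)
for start, strand, score in hits:
    print(start, strand, round(score, 2))
```

In a real workflow, the windows would come from the regions of the BGCs predicted in step 2, and the threshold would be calibrated on experimentally validated sites rather than set to a fixed fraction of the maximum score.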
File(s)
Document(s)
Size: 6.47 MB
Format: Adobe PDF
The University of Liège does not guarantee the scientific quality of these students' works or the accuracy of all the information they contain.