An integrative structural bioinformatics approach for the study of human genetic variation underlying conformational disease

Monday, 23 June, 2008 - 17:00
Campus: Brussels Humanities, Sciences & Engineering campus
Faculty: Science and Bio-engineering Sciences
Joke Reumers
PhD defence

With the completion of the sequencing of the human genome, much attention has been centered
on the study of human genome variability. Single nucleotide polymorphisms (SNPs) are the most common source of human genetic variation, and are undoubtedly a valuable resource for investigating the genetic basis of diseases. SNPs, together with DNA copy number variations (CNVs), have become one of the most actively researched areas of genomics in recent
years (Feuk et al., 2006; Stranger et al., 2007).

In this dissertation we have focused on the smaller of these two types of variation, SNPs.
SNPs are associated with diversity in the population and human individuality, and although
the majority of these variations probably result in neutral phenotypic outcomes, certain polymorphisms can predispose individuals to disease, or influence its severity, its progression, or the individual response to medication.

Many studies have focused on annotating the phenotypic effects of the different variant
types, with an emphasis on the analysis of coding non-synonymous SNPs (nsSNPs). This type
of variation changes the primary sequence of the corresponding protein, and therefore has the
potential to alter its structural and functional properties. The term molecular phenotype of
a single nucleotide polymorphism was first introduced by Bork and co-workers to describe
the ensemble of structural and functional properties that characterise the behaviour of a
SNP (Sunyaev et al., 2001a).

In this dissertation we use state-of-the-art tools in structural bioinformatics to annotate
the effects of single-site mutations on the molecular phenotype of a protein, and more specifically
to identify variations that are responsible for conformational disease. The evaluation
of this integrative structural bioinformatics approach on specific case studies such as carcinogenic
mutations in p53, VHL and BRCA1 and mutations responsible for neurodegenerative
diseases (Parkinson’s disease and Charcot-Marie-Tooth disease) showed that this is an effective
strategy to identify the molecular mechanism of disease.

Critical analyses of the reliability of public domain variation data, the quality of training
and test sets, and the predictability of the molecular phenotype showed that it is not yet
feasible to use this approach to classify disease-associated mutations versus neutral variation.

From the combination of these small- and large-scale studies we concluded that although
strict classification of deleterious and neutral variation is not possible with the data and tools currently available, the methodology can be used to provide functional and structural annotations
on human variation. These annotations can be helpful for the identification of the
molecular mechanisms underlying disease, or can be used to prioritise polymorphisms for additional study. Therefore we applied computational tools, covering as many functional and
structural aspects of proteins as possible, to the full set of non-synonymous SNPs available in
the Ensembl human variation database. Our predictions are made available to the community
through an online database, SNPeffect.

The quality of SNP annotation can be improved by enhancing data quality, or by extending
the computational coverage of the molecular phenotype. We have contributed to the latter
in two ways. First, we investigated a specific mechanism underlying conformational disease:
the mutation of so-called gatekeeper residues in relation to increased aggregation tendency.
Previous studies have suggested that the disruption of a gatekeeper motif will result in a
strong aggregation increase and might therefore represent a new category of disease-inducing
mutations. This assumption was confirmed by the enrichment of gatekeeper mutations among
known disease mutations in comparison with neutral SNPs. The analysis also revealed that
redundancy of gatekeeper usage is introduced to cap regions with strong aggregation propensity:
regions with high aggregation tendency are capped by more gatekeepers than regions
with low aggregation scores.
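
As an illustration of what an enrichment comparison of this kind can look like, the sketch below runs a one-sided Fisher exact test on a 2x2 table of gatekeeper versus non-gatekeeper mutations in disease and neutral sets. This is only a minimal sketch under stated assumptions: the abstract does not say which statistical test was used in the dissertation, the function name `gatekeeper_enrichment` is hypothetical, and the counts in the usage lines are invented placeholders, not results from the study.

```python
from math import comb


def gatekeeper_enrichment(disease_gate, disease_other, neutral_gate, neutral_other):
    """Odds ratio and one-sided Fisher exact p-value for a 2x2 table.

    Rows: disease mutations, neutral SNPs.
    Columns: hits a gatekeeper residue, hits any other residue.
    The test choice is an assumption made for illustration.
    """
    n_disease = disease_gate + disease_other
    n_gate = disease_gate + neutral_gate
    total = n_disease + neutral_gate + neutral_other

    # Hypergeometric tail: probability of seeing at least `disease_gate`
    # gatekeeper mutations in the disease set if there were no association.
    tail = sum(
        comb(n_gate, k) * comb(total - n_gate, n_disease - k)
        for k in range(disease_gate, min(n_gate, n_disease) + 1)
    )
    p_value = tail / comb(total, n_disease)

    # Sample odds ratio; infinite when an off-diagonal cell is empty.
    if disease_other == 0 or neutral_gate == 0:
        odds_ratio = float("inf")
    else:
        odds_ratio = (disease_gate * neutral_other) / (disease_other * neutral_gate)
    return odds_ratio, p_value


# Placeholder counts for illustration only (not data from the dissertation).
odds, p = gatekeeper_enrichment(30, 70, 10, 90)
print(f"odds ratio = {odds:.2f}, one-sided p = {p:.2g}")
```

An odds ratio above 1 together with a small p-value would indicate that gatekeeper mutations are over-represented in the disease set relative to the neutral set.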

The second addition to the predictable molecular phenotype involves the development of a
tool that can assess stability effects of mutations on membrane proteins without using structural
information. Building on a statistical thermodynamics approach that has proven to be
successful in predicting stability effects on soluble helices (the Agadir algorithm) and in assessing the aggregation propensities of protein sequences (the Tango algorithm), we developed
Casablanca, a sequence-based stability predictor of integral membrane helices. When applied
to a set of known disease-associated mutations and neutral polymorphisms, Casablanca identified
more extreme stability changes among disease mutations in transmembrane helices,
showing that the algorithm is a valuable addition to the set of structural bioinformatics tools
we use to annotate effects of amino acid variations.