This is the mirror of the original
site maintained by LIT.
Workshop on Bioinformatics and Protein
Interaction
Jointly organized
by Institute for Mathematical Sciences (IMS)
and Laboratories
for Information Technology (LIT)
Program Information · Workshop Schedule · Speakers · Abstracts
Terry Lybrand
Molecular
modeling of protein-ligand interactions: Detailed
simulations of a biotin-streptavidin complex
|
The complex formed between
biotin and streptavidin is the strongest noncovalent
complex known in nature, and serves as a paradigm
for ultra-high affinity ligand binding. We have used
high-resolution x-ray crystallography, site-directed
mutagenesis, calorimetry, and computer simulation
methods to determine how streptavidin forms such an
extraordinarily tight complex with biotin. This
multi-disciplinary approach has yielded some
exciting and unexpected results. The basic
experimental and computational strategies used in
this project should be of general utility for
analysis of other high-affinity protein-ligand and
protein-protein complexes. Knowledge gained from
study of this system should also be useful in design
of high-affinity ligands for protein targets in the
future.
|
Terry Lybrand
Molecular
Modeling of Protein Structure and Function
|
Molecular modeling tools
can be used to supplement information about protein
structure and function obtained from experimental
techniques. In cases where it is impractical or even
impossible to obtain direct structural information
from experimental studies, molecular modeling tools
can often be used to obtain useful information about
protein structure and function. I will describe
general modeling strategies that may be used to
explore protein structure and structure-function
relationships for several different categories of
problems: 1) cases where there is little direct
experimental data available, such as G
protein-coupled receptors, 2) cases where there is
extensive experimental data, but no direct
structural information, such as bacterial chemotaxis
receptors, and 3) cases where there is experimental
structural information for important protein family
members, but no direct information for a specific
protein of interest (e.g., class II Major
Histocompatibility Complex proteins).
|
Peter
Kuhn
Structural
genomics of Thermotoga maritima - data mining
and target prioritization using protein-protein
interaction |
Structural
and functional genomics approaches require an
industrialization of traditional methods in the
determination of the three dimensional structures of
biological macromolecules. This requires the
development, implementation and operation of
high-throughput methodology for a structure
determination pipeline that addresses both shortening
the time from target identification to structure
solution and increasing the number of targets processed in parallel,
while maintaining high data integrity and quality of
analysis. The Joint Center for Structural Genomics
and its partners have implemented a complete
high-throughput system that enables target
processing from selection to sample preparation to
structure determination. The completion of the
development phase of the high-throughput subsystems
and their full integration into a structure
determination pipeline has enabled the initial
processing of the Thermotoga maritima genome.
Both a description of the technology and early
results from the proteome analysis of T. maritima will be discussed. Data gathering throughout the
process allows for detailed analysis of experimental
approaches and optimization of individual process
steps.
|
R.
Manjunatha Kini
Analysis of
Protein sequences and Identification of
Protein-Protein Interaction Sites |
Proteins
are biological “foot soldiers” in living cells,
and protein-protein interactions are crucial for the
existence of life itself. Despite their elaborate structures, proteins
interact with their complementary partners through a
rather small but specific portion of their surface. In the post-genomic era, sequence analysis is
one of the crucial and daunting tasks. Identification of these interaction sites and
understanding the “communication” between
protein partners in a specific physiologic system is
one of the final goals in structural biology,
proteomics and chemical biology. Recently, we systematically examined the
flanking segments of over 1600 protein-protein
interaction sites. This survey indicated that
proline residues are commonly found in these
flanking segments and the probability of occurrence
in flanking segments is 2.5 times greater than
elsewhere in the molecule. Based on these observations, we proposed a
structural role for proline residues in protecting
the conformation and integrity of the interaction
site by blocking the “invasion” of neighboring
secondary structures. They also help in presenting the sites to
their complementary proteins. As a corollary, we have developed methods (a)
to design potent bioactive peptides and (b) to
identify protein-protein interaction sites directly
from the amino acid sequence. These studies provide strong experimental
evidence for the structural role of proline residues
in the flanking segments of protein-protein
interaction sites.
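As an illustration only, the kind of enrichment ratio quoted above could be computed for a single annotated site as sketched below in Python. The function name, the default flank width of five residues, and the choice to exclude the interaction site itself from both counts are assumptions made for this sketch, not details taken from the survey.

```python
def proline_enrichment(sequence, site_start, site_end, flank=5):
    """Ratio of proline frequency in the flanks of a site to that elsewhere.

    site_start and site_end are 0-based, end-exclusive indices of the site.
    The flank width and the exclusion of the site itself are assumptions.
    Returns None when the ratio is undefined.
    """
    left_start = max(0, site_start - flank)
    right_end = min(len(sequence), site_end + flank)
    # Residues immediately before and after the interaction site.
    flanks = sequence[left_start:site_start] + sequence[site_end:right_end]
    # Everything outside both the site and its flanks.
    elsewhere = sequence[:left_start] + sequence[right_end:]
    if not flanks or not elsewhere:
        return None
    flank_freq = flanks.count("P") / len(flanks)
    other_freq = elsewhere.count("P") / len(elsewhere)
    if other_freq == 0:
        return None
    return flank_freq / other_freq
```

A survey across many sites would more plausibly pool the counts over all sites before taking the ratio, rather than averaging per-site values.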
|
Bryan Grieg Fry
The Three
Finger Toxin Toolkit |
Many
venom components are invaluable in molecular,
biochemical and biomedical research due to their
specificity and potency. Facilitating this is the tremendous natural
variation in venom components between species or
even within a species itself. This is because the
variation in snake venom components apparently
results from frequent gene duplication of
toxin-encoding genes, which is sometimes followed by
functional and structural diversification and
accelerated rates of sequence evolution. The three
finger toxin family of snake venom peptides is
a particularly good source that shows significant
functional diversity through a seemingly rapid rate of
mutation. In this study, we carried out a phylogenetic
analysis of 276 sequences from this family. The results revealed a diversity in the
toxins far greater than has been previously realised.
A substantial number of the toxins did not
fall within any of the toxin clades with
characterised activity, and further lacked the
functional motifs of those groups. We identified
twenty such "orphan groups". The phylogenetic pattern revealed in the case
of the three finger toxins is consistent with that
expected under the birth-and-death model of gene
evolution. The 'three finger toxin toolkit'
constructed by this study will be useful in
providing a better picture of the diversity of
investigational ligands available within this
important class.
|
Jeremiah Stanson
Joseph
Conserved
Codon Composition of Ribosomal Protein-coding Genes in Escherichia coli, M. tuberculosis and Saccharomyces cerevisiae:
Lessons from Supervised Machine Learning in
Functional Genomics |
Genomics
projects have resulted in a flood of sequence data. Functional annotation currently relies almost
exclusively on inter-species sequence comparison,
and is restricted in cases of limited data from
related species, and widely divergent sequences with
no known homologues. Here, we demonstrate that codon
composition - a fusion of codon usage bias and amino
acid composition signals - can accurately
discriminate, in the absence of sequence homology
information, cytoplasmic ribosomal protein genes
from all other genes of known function in Saccharomyces
cerevisiae, Escherichia
coli and Mycobacterium
tuberculosis using an implementation of
support vector machines, SVMlight.
Analysis of these codon composition signals is
instructive in determining features that confer
individuality to ribosomal protein genes. Each of
the sets of positively charged, negatively charged and small hydrophobic residues,
as well as codon bias, contributes to their
distinctive codon composition profile. The
representation of all these signals is sensitively
detected, combined and augmented by the SVMs to perform an accurate classification. Of special
mention is an obvious outlier - yeast gene RPL22B -
highly homologous to RPL22A, but employing very
different codon usage, perhaps indicating
non-ribosomal function. Finally, we propose that codon composition be used
in combination with other attributes in gene/protein
classification by supervised machine learning
algorithms.
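A minimal sketch of how a codon composition vector might be computed from a coding sequence is given below in Python. It assumes an in-frame DNA sequence and uses plain relative codon frequencies; the function name and the normalization are illustrative assumptions and may differ from the encoding used in the study.

```python
from itertools import product

# All 64 codons in a fixed order, so every gene maps to a comparable vector.
CODONS = ["".join(p) for p in product("ACGT", repeat=3)]

def codon_composition(cds):
    """Return the relative frequency of each of the 64 codons in an in-frame CDS."""
    cds = cds.upper()
    counts = dict.fromkeys(CODONS, 0)
    total = 0
    # Step through complete codons only; a trailing partial codon is ignored.
    for i in range(0, len(cds) - len(cds) % 3, 3):
        codon = cds[i:i + 3]
        if codon in counts:  # skip codons containing ambiguous bases such as N
            counts[codon] += 1
            total += 1
    if total == 0:
        return [0.0] * len(CODONS)
    return [counts[c] / total for c in CODONS]
```

Because each codon implies an amino acid, such a vector carries both codon usage bias and amino acid composition. Vectors of this kind, labeled as ribosomal or non-ribosomal, are the sort of input a support vector machine could be trained on.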
|
Judice Koh
BioWare -
the data warehousing system for molecular biology |
Biological
databases keep growing exponentially. This growth is
reflected both in the rapid growth of existing
databases and in proliferation of new databases. A
major concern for molecular biologists is the access
to and the selection of data relevant to their
research from the vast pool of biological
information. The recently introduced technologies
for Knowledge Discovery from Databases (KDD) and
Data Mining enable the extraction of new knowledge
(concepts, patterns, or explanations, among others)
from the data stored in databases. An infrastructure
for KDD typically requires: a) a mechanism to
facilitate the construction of a subject-specific
data warehouse (SSDW) from the diverse data sources,
b) integration of tools to enable data mining from
the SSDW, c) automated updating of the SSDW, and d)
the ability to easily integrate both new data
sources and new analysis tools. We have designed
BioWare - a system that provides a framework for
biological data mining. The BioWare prototype
provides three separate subsystems. First, BioWare-Retrieve
extracts data from multiple public database sources
and integrates them into a single dataset. Second,
BioWare-Prep provides a semi-automated mechanism to
biologists for rapid annotation of database entries,
including the addition of new data fields. Third,
the TEMPLAR subsystem facilitates a rapid creation
of a new searchable subject-specific data warehouse
integrated with the search tools. We used the BioWare
system to create a database of snake toxins.
|
Kelathur Nadathur
Srinivasan
Identification
of Functional Residues in Scorpion Toxins: A
Bioinformatics Approach |
An
important and exciting challenge in the post-genomic
era is to understand the functions of newly
discovered proteins based on their structures. The
main thrust is to find the common structural motifs
that contribute to specific functions. Using this
premise, we have identified a novel class of weak
potassium channel toxins from the venom of the
scorpion Heterometrus fulvipes. These toxins, κ-hefutoxin1
and κ-hefutoxin2,
exhibit no homology to any known toxins. NMR studies
indicate that κ-hefutoxin1 adopts
a unique three-dimensional fold of two parallel
helices linked by two disulfide bridges without any β-sheets.
Based on the presence of the functional diad (Y5/K19)
at a distance (6.0 ± 1.0 Å) comparable to other
potassium channel toxins, we hypothesized its
function as a potassium channel toxin. κ-Hefutoxin1
not only blocks the voltage-gated K+ channels Kv1.3 and
Kv1.2, but also slows the activation kinetics of Kv1.3 currents - a novel feature of κ-hefutoxin1
unlike other scorpion toxins, which are considered
solely pore-blockers. Alanine mutants (Y5A, K19A and
Y5A/K19A) failed to block the channels, indicating
the importance of the functional diad.
|
Paul Tan Tiam Joo
Bioinformatics
approach to structure-function study of scorpion
toxins |
Scorpion
toxins have been used as research tools for
characterisation of various ion channels,
preparation of vaccines and antitoxins, drugs,
insecticides and in phylogenetic studies. However,
multiple research groups have focused on isolating,
purifying and characterising individual toxins, or
small groups of toxins. Consequently, there is an
increasing number of characterised scorpion toxins
reported in literature and molecular databases.
Current scorpion toxin data are scattered across
multiple databases, or reported as long lists of
aligned sequences in literature reviews. Thus, it is
becoming more difficult for researchers to get an
overall view of the structure-function relationship
and classification of scorpion toxins. The SCORPION
database contains at least 300 scorpion sequences
classified into defined categories based on primary
sequence homology. This talk focuses on a systematic
bioinformatics-based approach to the study of
structure-function relationships of scorpion toxins.
Sequences in the defined categories were further
classified into basic
structure-function units so that each unit
contains peptides that share the same functional and
structural properties. By comparing peptides that
have similar structures and different functional
properties, in the context of their units, putative
functional sites and sequence motifs related to
specific functions could be identified. These motifs
will be used to build prediction tools for
annotation of newly identified scorpion toxins.
|
Shoba Ranganathan
Locating
the polyanion-binding residues in the 'sushi' domain 7
of human complement factor H |
Factor
H, a secretory protein comprising 20 short consensus
repeat (SCR) or 'sushi' domains of about 60 amino
acids each, is a regulator of the complement system,
an alternate pathway in immune response. The
complement-regulatory functions of factor H are
targeted by its binding to polyanions such as
heparin and sialic acid, involving SCRs 7 and 20.
The SCR 7 heparin-binding site was shown to be
co-localized with the Streptococcus Group A M
protein-binding site. We have used a combination of
sequence analysis of all heparin-binding domains of
factor H and its closest homologues, molecular
modeling of SCRs 6 and 7, and surface electrostatic
potential studies. The residues implicated in
heparin/sialic acid binding to SCR 7 have been
localized to four regions of sequence space,
containing stretches of basic as well as
histidine residues. The heparin-binding site is
spatially compact and lies near the interface
between SCRs 6 and 7, with residues in the
interdomain linker playing an important part.
Experimental mutation results of the proposed
heparin-binding residues will be presented.
|
Kelathur Nadathur
Srinivasan
κ-Hefutoxin1,
a Novel Toxin from the Scorpion Heterometrus
fulvipes with Unique Structure and Function:
Importance of the
Functional Diad in Potassium Channel Selectivity |
An
important and exciting challenge in the post-genomic
era is to understand the functions of newly
discovered proteins based on their structures. The
main thrust is to find the common structural motifs
that contribute to specific functions. Using this
premise, we have identified a novel class of weak
potassium channel toxins from the venom of the
scorpion Heterometrus
fulvipes. These toxins, κ-hefutoxin1
and κ-hefutoxin2,
exhibit no homology to any known toxins. NMR studies
indicate that κ-hefutoxin1 adopts
a unique three-dimensional fold of two parallel
helices linked by two disulfide bridges without any β-sheets.
Based on the presence of the functional diad (Y5/K19)
at a distance (6.0 ± 1.0 Å) comparable to other
potassium channel toxins, we hypothesized its
function as a potassium channel toxin. κ-Hefutoxin1
not only blocks the voltage-gated K+ channels Kv1.3 and
Kv1.2, but also slows the activation kinetics of Kv1.3 currents - a novel feature of κ-hefutoxin1
unlike other scorpion toxins, which are considered
solely pore-blockers. Alanine mutants (Y5A, K19A and
Y5A/K19A) failed to block the channels, indicating
the importance of the functional diad.
|
Eastwood Leung
Strategies
for large scale protein-protein interaction studies |
The experimental
approach to large-scale protein:protein interaction
studies remains challenging. With the advent
of high-throughput protein production, genome-wide
protein:protein interaction studies are becoming feasible.
This lecture will cover present strategies for
high-throughput protein production and platforms for
genome-wide protein:protein interaction studies, such
as protein microarrays and mass spectrometry. Future
challenges in this field will also be discussed.
|
Prasanna Kolatkar
Prediction
of Peptide Binding to Families of Related Receptors |
The
Protein-Protein Interactions Database (PpiDB) has
been created to help understand several functional
and evolutionary relationships existing in
biological knowledge. Protein-protein interactions
have been predicted using the "Rosetta
stone" approach described by Ed Marcotte
(Eisenberg Lab, UCLA). The basic principle of this
idea is that certain proteins exist as large
multifunctional proteins within one species, while
their corresponding functions in another species are
carried out by smaller individual proteins. The
larger multifunctional protein thus serves as a
"Rosetta Stone" for predicting
protein-protein interactions within the latter
species. We have taken this approach and augmented
it by basing the interaction prediction on domain
information (Pfam) rather than sequence. This does
result in a large number of false positives but that
is where the next set of steps is implemented.
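The domain-based variant of the "Rosetta stone" idea described above can be sketched roughly as follows in Python. This is an illustrative reading of the abstract, not the PpiDB implementation: the function name, the dictionary inputs (protein name mapped to a set of domain identifiers) and the placeholder identifiers in the usage note are all assumptions.

```python
from itertools import combinations

def rosetta_stone_pairs(reference_domains, target_domains):
    """Predict interacting protein pairs in a target species.

    reference_domains: protein -> set of domain IDs in the species that
        holds the multifunctional (fused) proteins.
    target_domains: protein -> set of domain IDs in the species of interest.
    Two target proteins are paired when each carries a different domain and
    both domains co-occur in a single reference protein.
    """
    # Every pair of domains found together within one reference protein.
    fused_pairs = set()
    for domains in reference_domains.values():
        for pair in combinations(sorted(domains), 2):
            fused_pairs.add(pair)

    predictions = set()
    for a, b in combinations(sorted(target_domains), 2):
        for x in target_domains[a]:
            for y in target_domains[b]:
                if x != y and tuple(sorted((x, y))) in fused_pairs:
                    predictions.add((a, b))
    return predictions
```

With a reference protein carrying the placeholder domains D1 and D2, target proteins pA (D1 only) and pB (D2 only) would be paired. Matching on shared domains rather than full sequences is permissive, which is in line with the large number of false positives the abstract mentions.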
|
Vladimir Brusic
Prediction
of Peptide Binding to Families of Related Receptors |
Major
histocompatibility complex (MHC) molecules bind
peptides and present them on the cell surface for
recognition by T cells of the immune system. MHC
molecules are encoded by genes that show significant
polymorphism. In humans, there are more than 500
characterised allelic variants of MHC for each MHC
class I and class II. We developed a prediction
system called MULTIPRED that was trained using
virtual sequences that represent peptide-MHC
interactions. The virtual sequences were constructed
by combining the interaction sites from both peptide
(ligand) and the receptor (MHC) inferred from the
three-dimensional structure of MHC molecules. We
applied the MULTIPRED system to a selection of human
MHC class II molecules HLA-DR, and to the
human MHC class I superfamily HLA-A2. MULTIPRED
showed high accuracy in predicting HLA-binding
peptides. In addition, we have shown that MULTIPRED
can accurately predict peptide binding to HLA
molecules for which no binding data are available.
|