Figuring Out Life:
NUS - Karolinska Joint
Symposium on Application of Mathematics in Biomedicine
(28 - 29 Nov 2005)
Bioinformatics workflow integration for biomanufacturing and biosurveillance
Tin Wee Tan, National University of Singapore
In 2004, A*STAR funded a pilot programme on Integrated Manufacturing Services and Systems (IMSS). Our project in this programme designed and demonstrated a prototype workflow integration system for the biomanufacture and design of diagnostic systems. The system integrates automated DNA sequencers, oligonucleotide synthesis (Novusgene) and DNA chip systems (Attogenix) using workflow systems such as KOOPrime, Goalnet and Taverna/MyGrid. Since then we have embarked on expanding this concept to other aspects of workflow integration, including biosurveillance.
The concept of workflow integration for biomanufacture stems from the ability of bioinformatics workflow and pipelining systems to integrate information and data flow with machine/device control. We extend this idea of integration to supply chain management systems that interface with the biomanufacturing and design process, specifically for diagnostic kits. The approach could be extended to linear polymeric therapeutics (siRNA, miRNA, peptides) and prophylactics (peptide and DNA vaccines) when massive scale-up is needed in times of biodisasters such as bird flu, SARS and other emerging diseases, or to biosurveillance in the face of biological warfare and terrorist acts.
Solvation in proteins: insights from atomistic simulations
Chandra Verma, Bioinformatics Institute of Singapore
The importance of solvation/hydration in proteins is relatively poorly understood, in particular its role in stability and function.
Computer simulations offer advantages over experimental techniques for exploring this complex feature in great detail.
A set of examples from our work and that of others will be presented to highlight advances and shortcomings of the processes involved.
Protein function prediction from protein interactions
Lim Soon Wong, National University of Singapore
The elucidation of protein function is one of the key problems in computational biology. A recent trend in protein function prediction is
based on the use of protein interaction data. The intuition is that if protein A and B belong to the same functional pathway, A is likely to
interact with B; therefore, when A and B are observed to interact, they are likely to share functions. However, in many cases, the direct partners of a
protein share few or no functions with it; instead, the partners of these partners show functional similarity with the protein. We discuss, in this
talk, the plausibility of such indirect functional associations and their use in improving protein function prediction.
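The intuition behind indirect functional association can be illustrated with a minimal sketch. It is not the speaker's method: the voting scheme, the `indirect_weight` parameter and the toy interaction graph are all assumptions made for illustration. Each candidate function receives one vote per annotated direct partner and a down-weighted vote per annotated partner-of-partner.

```python
from collections import Counter


def predict_functions(graph, annotations, protein, indirect_weight=0.5):
    """Rank candidate functions for `protein` by weighted neighbour votes.

    graph: dict mapping each protein to the set of its interaction partners.
    annotations: dict mapping proteins to sets of known functions.
    indirect_weight: hypothetical weight given to partners-of-partners.
    """
    # Level-1 neighbours: direct interaction partners.
    level1 = graph.get(protein, set()) - {protein}

    # Level-2 neighbours: partners of partners that are not direct partners.
    level2 = set()
    for partner in level1:
        level2 |= graph.get(partner, set())
    level2 -= level1 | {protein}

    scores = Counter()
    for partner in level1:
        for function in annotations.get(partner, ()):
            scores[function] += 1.0
    for partner in level2:
        for function in annotations.get(partner, ()):
            scores[function] += indirect_weight
    return scores.most_common()


# Toy network (made up): A's direct partners say little about A,
# but the three partners-of-partners agree on one function.
graph = {
    "A": {"B", "C"},
    "B": {"A", "D", "E"},
    "C": {"A", "E", "F"},
    "D": {"B"},
    "E": {"B", "C"},
    "F": {"C"},
}
annotations = {
    "B": {"transport"},
    "D": {"kinase"},
    "E": {"kinase"},
    "F": {"kinase"},
}

ranked = predict_functions(graph, annotations, "A")
print(ranked)
```

In this toy case the three indirect partners outvote the single annotated direct partner, which is the situation the abstract describes. A real predictor would also have to weigh the reliability of each interaction.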
Understanding function using in
silico and experimental structural information
Prasanna Kolatkar, Genome Institute of Singapore
High throughput methods in genomics and proteomics are
giving us a picture of the overall landscape of many
biological systems through sequence as well as interaction
data. The data sets often yield valuable information about
biological function on a grand scale but sometimes don’t
offer specific information about the mechanism of various
biological events. Structural information can be used in
conjunction with these high throughput technologies to yield
structure function relationships which can help to shed
light on various mechanisms in biology. Improvements in in
silico structure methods as well as high(er) throughput
experimental structure determination can allow us to stay
closer to the other methods in terms of volume and help
us to understand mechanisms using structure-function
relationships. We will show some examples of projects at the
Genome Institute of Singapore for which in silico structure
methods and high(er) throughput experimental data are being
used to help us understand biological function.
Comparative genomics of
two cyprinid species
Alan Christoffels, Temasek Life Sciences Laboratory,
Singapore
The Cyprinidae family, with more than 2,000
species, is the most abundant and widespread of all
freshwater fish families across Europe, Asia, Africa and
North America. The evolution of cyprinid teleosts has been
impacted significantly by gene and genome duplications, which
very likely contribute to the paucity of genome sequence
data for cyprinids, barring the nearly completely sequenced
zebrafish genome. In the absence of genomic data for other
cyprinid species, we embarked on a partial transcriptome
analysis of cyprinids using publicly available common carp
and zebrafish EST data and the genome assembly for zebrafish.
We have generated over 6,000 ESTs from the
differentiating testis of common carp and clustered them
with 10,395 non-gonadal ESTs from CarpBase as well as 660
common carp mRNAs from GenBank. The resulting unique
sequences were subjected to detailed analysis and compared
against zebrafish sequences at the cDNA, protein and genome
level.
We present data to show that there is sufficient homology
between the transcribed sequences of common carp and
zebrafish to warrant a cyprinid transcriptome comparison. We
show that common carp transcripts map to un-annotated
regions and to ab-initio gene predictions on the
zebrafish genome assembly. A substantial portion of our
unique transcripts from common carp seems to be
tissue-specific.
With about 24,000 species, teleosts represent the most
diverse group of vertebrates and are unlikely to be sampled by
many completely sequenced genomes. Our analyses therefore
illustrate the value in utilizing partially sequenced
genomes and suggest the need for integrated resources to
leverage the wealth of fragmented genomic data.
Cryo-electron tomography of
individual protein molecules
Sara Sandin, Karolinska Institutet, Sweden
Averaging methods of determining structure, such as X-ray
diffraction, do not preserve information about the
flexibility of molecules. Cryo-electron tomography allows us
to reconstruct individual hydrated objects. The method is
limited to low resolution, but it can be used to study
dynamic structures, such as very large macromolecular
complexes, and to perform in situ analysis of cellular
organelles. These studies explore the expansion of the cryo-electron
tomography method to individual protein molecules.
Tomographic structures of four proteins, ranging in size
from 90 to 150 kDa, are presented.
We have analysed the structure and flexibility of the
antibody immunoglobulin G (IgG). The tomograms reveal
Y-shaped IgG molecules with three protruding subunits. We
show that the tomographic structures are consistent with
X-ray crystallographic structures of IgG and that the three
50 kDa subunits were resolved with accuracy. Each subunit
has a similar structure in the tomograms and in the X-ray
map. Notably, the positions of the subunits differed greatly
from one molecule to another. The large flexibility of IgG
in solution is most likely of functional significance in
antigen recognition. We have investigated a larger number of
individual IgG molecules, measured the equilibrium distribution
of the molecule in terms of the relevant angular coordinates
and built a model of the dynamics of IgG in solution.
The hepatocyte growth factor/scatter factor (HGF/SF)
controls the growth, morphogenesis or migration of
epithelial, endothelial and muscle progenitor cells. We have
defined the main conformations of inactive single-chain HGF/SF
and active two-chain HGF/SF. Furthermore we present
structures of the receptor tyrosine kinase MET and of MET
bound to two-chain HGF/SF. These structures reveal the
mechanism of HGF/SF activation and clarify the mode of
binding to MET.
Nuclear receptors play important roles in development and
tissue homeostasis, and have been implicated in many disease
states. We present the structure of the full-length
Glucocorticoid Receptor (GR) protein, activated by a
synthetic hormone agonist. Three asymmetric domains are
clearly defined in the structure of the GR monomer, and two
low-density regions, interpreted as hinge regions, connect
the domains. The three domains were further characterized by
multi-resolution docking procedures and by visualizing GR in
complex with a monoclonal antibody.
These studies show that cryo-electron tomography can be
used to visualize individual protein molecules with a
molecular weight below 200 kDa. Thus, the method can be
applied to flexible multi-domain proteins that have not been
amenable to high-resolution methods of determining
structure.
Inhibition of bacterial genes
using expressed RNA and cell-permeable antisense agents
Liam Good, Karolinska Institutet, Sweden
We develop antisense technologies to inhibit gene
expression in pathogenic bacteria without a need for genome
manipulation. In one strategy we use short synthetic
antisense peptide nucleic acids (PNA). Cell uptake is
enhanced using attached cell penetrating peptides (CPPs).
When added directly to growing cultures, antisense peptide-PNAs
limit reporter gene expression with gene and sequence
specificity in several species. Also, the antisense effects
are sufficient to kill bacteria when targeted to
stringently-required essential genes in Escherichia coli,
Staphylococcus aureus and Mycobacterium smegmatis. Bacterial
cell death occurs in the absence of cell lysis and levels of
persistence and resistance following exposure are low
relative to conventional antibiotics. In a second strategy,
we have developed stabilised expressed antisense RNAs.
Antisense inhibition at the mRNA level can reveal new
information about gene function and drug mechanism of
action. Expressed RNA and cell-permeable antisense agents
provide complementary new tools for antimicrobial drug
target validation and drug mechanism of action studies.
Transcriptional targets of
STAT5b in liver
Amilcar Flores, Karolinska Institutet, Sweden
STAT5b is a transcription factor that is activated by
tyrosine phosphorylation in multiple tissues. The hepatic
actions of STAT5b are essential for the regulation of
somatic growth. In order to identify novel direct hepatic
targets of STAT5b, we have used microarrays to analyze the
acute transcriptional response to GH treatment of rat livers
infected with replication-defective adenoviruses encoding
either a constitutively active (CA) or dominant-negative (DN)
version of rat Stat5b. A set of candidate genes was
selected based on their ability to be induced by GH in
livers expressing STAT5b-CA but not in those expressing STAT5b-DN. To
validate our selection criteria, the promoters of the
candidate genes were analyzed for the presence of
phylogenetically conserved STAT5b DNA binding sites. This
analysis was performed using Prometheus, a system that uses
GRID computation to predict promoter architecture of lists
of co-regulated genes. Chromatin immunoprecipitation and
promoter-reporter analysis were used to confirm the
validity of our predictions.
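The selection logic described above can be sketched as follows. This is only an illustration, not the Prometheus system or the authors' pipeline: the fold-change cut-off, the gene names, the toy promoter sequence and the use of the commonly reported STAT5 GAS-like consensus TTCNNNGAA are assumptions, and the sketch does not test phylogenetic conservation.

```python
import re

# Assumed consensus for a STAT5 binding site (GAS-like element TTCNNNGAA).
# The lookahead lets overlapping matches be reported.
STAT5_MOTIF = re.compile(r"(?=(TTC[ACGT]{3}GAA))")


def select_candidates(fold_change_ca, fold_change_dn, induced=2.0):
    """Return genes induced by GH with STAT5b-CA but not with STAT5b-DN.

    fold_change_ca / fold_change_dn: dicts of gene -> GH-induced fold change
    in livers expressing the CA or DN construct. `induced` is a hypothetical
    cut-off; the abstract does not state the one actually used.
    """
    return sorted(
        gene
        for gene, fold_change in fold_change_ca.items()
        if fold_change >= induced and fold_change_dn.get(gene, 0.0) < induced
    )


def motif_positions(promoter):
    """Return 0-based start positions of candidate STAT5 sites in a promoter."""
    return [match.start() for match in STAT5_MOTIF.finditer(promoter.upper())]


# Toy data (made up for illustration).
fold_change_ca = {"geneA": 4.1, "geneB": 3.0, "geneC": 1.1}
fold_change_dn = {"geneA": 1.2, "geneB": 2.9, "geneC": 1.0}
promoters = {"geneA": "ACGTTTCCTGGAATTAC"}

candidates = select_candidates(fold_change_ca, fold_change_dn)
sites = {gene: motif_positions(promoters[gene]) for gene in candidates}
print(candidates, sites)
```

A conservation check would repeat the motif scan on aligned orthologous promoters and keep only sites present in each species.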
Enhancing the recognition of gene
function transfer from model organisms by considering
different levels of conservation of co-regulation
Carsten Daub, Karolinska Institutet, Sweden
Inferring the co-regulation of genes from various data
sources (e.g. gene expression data, protein-protein
interaction data, or functional annotations) builds a basis
for the prediction of gene function. The transfer of
knowledge about gene function gained from model organisms
towards e.g. human gene analysis is a powerful approach for
experimentally supported gene function predictions. The
predictive power of merging information strongly depends on
the correct prediction of orthologous genes in the model
organisms under consideration. The combination of both, the
prediction of gene function from co-regulation within
species as well as from the conservation of gene function
across species, enables an enhanced function prediction.
Furthermore, considering different levels of co-regulation
conservation from the model organism to the target organism
allows the transfer of functional annotations at different
levels of confidence.
We investigate the conservation of co-regulatory links in
a target organism to the corresponding links in model
organisms. To accomplish this, we employ the principle of
orthology to assign relationships between genes of different
organisms. By varying the thresholds for significant
co-regulations within two species under consideration, we
find a systematic change in the degree of co-regulation
conservation. A similar observation is made when
systematically evaluating the relationship of co-regulation
in a target organism to functional annotations in model
organisms.
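A minimal sketch of the threshold experiment described above, under stated assumptions: co-regulation is represented by a pairwise correlation score, orthology is a one-to-one mapping, and the same threshold is applied in both species. The function names and toy values are hypothetical and are not taken from the talk.

```python
def coregulated_pairs(correlations, threshold):
    """Return the gene pairs whose correlation reaches the threshold."""
    return {pair for pair, r in correlations.items() if r >= threshold}


def conservation(target_corr, model_corr, orthologs, threshold):
    """Fraction of target co-regulation links conserved in the model organism.

    target_corr / model_corr: dicts mapping frozenset({gene1, gene2}) -> score.
    orthologs: dict mapping each target gene to its model-organism ortholog.
    Only links whose two genes both have an ortholog are counted.
    """
    target_links = coregulated_pairs(target_corr, threshold)
    model_links = coregulated_pairs(model_corr, threshold)
    mapped = 0
    conserved = 0
    for pair in target_links:
        gene_a, gene_b = tuple(pair)
        if gene_a in orthologs and gene_b in orthologs:
            mapped += 1
            if frozenset((orthologs[gene_a], orthologs[gene_b])) in model_links:
                conserved += 1
    return conserved / mapped if mapped else 0.0


# Toy data (made up): three human genes and their model-organism orthologs.
target_corr = {
    frozenset(("h1", "h2")): 0.9,
    frozenset(("h1", "h3")): 0.6,
    frozenset(("h2", "h3")): 0.3,
}
model_corr = {
    frozenset(("m1", "m2")): 0.8,
    frozenset(("m1", "m3")): 0.2,
    frozenset(("m2", "m3")): 0.7,
}
orthologs = {"h1": "m1", "h2": "m2", "h3": "m3"}

for threshold in (0.25, 0.5, 0.75):
    print(threshold, conservation(target_corr, model_corr, orthologs, threshold))
```

In this toy example the conserved fraction rises as the threshold becomes stricter, which mirrors the kind of systematic change the abstract reports. Real data need not behave the same way.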
Our aim is to exploit this method to transfer functional
annotations from model organisms to target organisms like
human that are experimentally inaccessible.
Eukaryotic gene expression: the
function of actin and actin-associated proteins in transcription
Piergiorgio Percipalle, Karolinska Institutet, Sweden
In the cell nucleus, actin is an important regulator of
gene expression, found as a component of ATP-dependent
chromatin remodelling complexes, ribonucleoprotein particles
(RNP) and, more recently, associated with all three
eukaryotic RNA polymerases. A lot of effort is currently
placed on deciphering the molecular mechanisms underlying
the function of actin in gene activation, through the
identification of nuclear actin-associated proteins. This
lecture will focus on two ongoing studies in my laboratory,
to clarify how actin specifically functions in
transcription.
Actin in transcription of protein-coding genes
We have recently discovered that actin is required to turn
on transcription of protein-coding genes associated with the
active RNA polymerase II, in complex with the
ribonucleoprotein hnRNP U (Kukalev et al., 2005). In
this study, we also found that actin binds hnRNP U through a
short and conserved (from insect to mammals) amino acid
sequence motif. Given that hnRNP U and actin are
respectively coupled to histone acetyl transferase
activities (p300/CBP) and chromatin remodelling complexes,
we propose a model in which the actin-hnRNP U complex
activates transcription by recruiting chromatin modifying
components.
Actin in ribosomal DNA (rDNA) transcription
After the discovery of a nuclear form of myosin 1 (NM1) and
its direct involvement in transcription initiation of
protein-coding genes, it seemed possible that actin and
myosin could perform a concerted general role in
transcription. In support of this possibility, we found that
actin and NM1 are on actively transcribing ribosomal genes
bound to the largest RNA polymerase I subunit (Fomproix and
Percipalle, 2004). We recently discovered that NM1 is a
component of a multiprotein assembly, containing the
chromatin remodelling complex WSTF-ISWI, which activates
rDNA transcription (Percipalle et al., 2005).
Considering the very dynamic interaction between actin and
NM1, we suggest that they activate and maintain productive
rDNA transcription as molecular switches, recruiting RNA
polymerase I co-activators on ribosomal genes (Percipalle
et al., 2005).
In conclusion, our data suggest a key role for actin in
transcription. An interesting scenario is that transcription
of all RNA polymerases is facilitated by actin-based
molecular switches in which the polymerase-associated actin
binds to specific adaptors (such as hnRNP U and NM1) to
recruit transcription co-activators.
Size-dependent Pareto-like
distributions in genomics, proteomics and molecular
evolution
Vladimir A. Kuznetsov, Genome Institute of Singapore
I will describe a family of skewed probability
distributions that appear in many genomics, proteomics
and molecular evolution data sets. The observed probability
distributions have the following characteristic in common:
there are few frequent and many rare events in the evolved
multi-class large-scale system. Importantly, the form of
the distribution can depend systematically on the size of the
sample (number of transcripts, number of proteins etc.). I
will present several random process models of population
growth that lead to a size-dependent Pareto-like
probability distribution of the frequency of occurrences of
events in a multi-class finite population. I will show how the
models help to improve the gene expression profiles observed
in SAGE and microarray experiments. Our modeling provides a
theoretical basis for accurately counting the expression
level and the number of expressed genes, the total number of
genes in a given cell type and for better understanding the
probabilistic mechanism(s) governing the evolution of
complexity of transcriptome and proteome.
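The "few frequent, many rare" pattern can be reproduced with a simple simulation. The sketch below uses a Simon-type preferential growth process as a stand-in; it is not one of the speaker's models, and the function names, the `p_new` parameter and the sample size are assumptions chosen for illustration.

```python
import random
from collections import Counter


def simulate_growth(n_events, p_new=0.3, seed=0):
    """Simulate a Simon-type growth process and return per-class counts.

    Each new event either founds a new class (probability p_new) or joins an
    existing class chosen with probability proportional to its current size.
    """
    rng = random.Random(seed)
    history = [0]  # class label of every event observed so far
    n_classes = 1
    for _ in range(n_events - 1):
        if rng.random() < p_new:
            history.append(n_classes)
            n_classes += 1
        else:
            # Picking a past event uniformly picks its class
            # in proportion to that class's current count.
            history.append(rng.choice(history))
    return Counter(history)


def occurrence_spectrum(class_counts):
    """Map m -> number of classes observed exactly m times."""
    return Counter(class_counts.values())


counts = simulate_growth(5000)
spectrum = occurrence_spectrum(counts)
for m in sorted(spectrum)[:5]:
    print(m, spectrum[m])
```

Re-running `simulate_growth` with different `n_events` values gives a rough feel for how the observed spectrum shifts with sample size, although this toy process does not capture the specific size dependence discussed in the talk.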
Filling in the GAPs for cell
dynamics control
Boon-Chuan Low, National University of Singapore
Abstract: Cells undergo dynamic changes in morphology or
motility during cellular division and proliferation,
differentiation, neuronal pathfinding, wound healing,
apoptosis, host defense and organ development. These
processes are controlled by signaling events relayed via
cascades of protein interaction leading to the establishment
and maintenance of cytoskeletal networks of microtubules and
actin. Various checkpoints, including the Rho small GTPases,
serve as master switches to fine-tune the amplitude,
duration as well as the integration of such circuitry
response. Rho GTPases are activated by guanine nucleotide exchange
factors and inactivated by GTPase-Activating Proteins (GAPs).
We have identified two novel classes of regulators for small
GTPases, the BNIP-2 and BPGAP families, all of which harbor
the conserved BNIP-2 and Cdc42GAP Homology (BCH) domain.
Some properties of the BCH domains and cellular functions of
BNIPs and BPGAP will be discussed in the context of their
novel interacting partners and cell dynamics roles.
Are complex methods helpful in
mapping complex traits
David Siegmund, Stanford University and National University
of Singapore
I discuss a systematic large sample theory for genetic
mapping of quantitative trait loci (QTL), which (i) deals
with problems of multiple comparisons, (ii) clarifies
similarities and differences between experimental and human
genetics, and (iii) treats issues of study design of recent
interest, e.g., the value of large pedigrees in human
genetics and models for gene X gene and gene X environment
interaction. One tentative conclusion is that models that
deal with gene X gene interaction appear to have the
potential to play a more important role in experimental
genetics, while models for gene X environment interaction
appear to play a more important role in human genetics.
References
- Tang and Siegmund (2001) Biostatistics 2, 147-162.
- Tang and Siegmund (2002) Genetic Epidemiology 22, 313-327.
- Peng and Siegmund (2004) PNAS 101, 7845-7850.
- Peng, Tang and Siegmund (2005) Genome scans with gene-covariate interaction. Genetic Epidemiology (in press).