Figuring Out Life: NUS - Karolinska Joint Symposium on Application of Mathematics in Biomedicine

(28 - 29 Nov 2005)

Bioinformatics workflow integration for biomanufacturing and biosurveillance
Tin Wee Tan, National University of Singapore

In 2004, A*STAR funded a pilot programme on Integrated Manufacturing Services and Systems (IMSS). Our project in this programme designed and demonstrated a prototype workflow integration system for the design and biomanufacture of diagnostic systems. The system integrates automated DNA sequencers, oligonucleotide synthesis (Novusgene) and DNA chip systems (Attogenix) using workflow systems such as KOOPrime, Goalnet and Taverna/MyGrid. Since then we have begun extending this concept to other aspects of workflow integration, including biosurveillance. The concept of workflow integration for biomanufacture stems from the ability of bioinformatics workflow and pipelining systems to integrate information and data flow with machine/device control. We extend this idea of integration to supply chain management systems that interface with the biomanufacturing and design process, specifically for diagnostic kits, with the possibility of extending it to linear polymeric therapeutics (siRNA, miRNA, peptides) and prophylactics (peptide and DNA vaccines) to support massive scale-up in times of biodisasters such as bird flu, SARS and other emerging diseases, or biosurveillance in the face of biological warfare and terrorist acts.

Solvation in proteins: insights from atomistic simulations
Chandra Verma, Bioinformatics Institute of Singapore

The role of solvation/hydration in proteins, in particular in stability and function, remains relatively poorly understood. Computer simulations offer advantages over experimental techniques for exploring this complex feature in great detail. A set of examples from our work and that of others will be presented to highlight advances and shortcomings of the processes involved.

Protein function prediction from protein interactions
Lim Soon Wong, National University of Singapore

The elucidation of protein function is one of the key problems in computational biology. A recent trend in protein function prediction is based on the use of protein interaction data. The intuition is that if proteins A and B belong to the same functional pathway, A is likely to interact with B; therefore, when A and B are observed to interact, they are likely to share functions. However, in many cases, the direct partners of a protein share few or no functions with it; instead, the partners of these partners show functional similarity with the protein. In this talk, we discuss the plausibility of such indirect functional associations and their use in improving protein function prediction.
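The idea of indirect functional association can be illustrated with a minimal sketch. It assumes an unweighted majority vote over direct (level-1) and indirect (level-2) interaction partners; this is a deliberate simplification and not necessarily the scoring scheme presented in the talk. The function names and the toy network are hypothetical.

```python
from collections import Counter


def neighbours(graph, protein):
    """Return the set of direct interaction partners of a protein."""
    return set(graph.get(protein, ()))


def predict_functions(graph, annotations, protein, top_n=1, use_indirect=True):
    """Predict functions by unweighted voting among interaction partners.

    level1: direct partners; level2: partners of partners that are not
    themselves direct partners. The voting scheme is an illustrative assumption.
    """
    level1 = neighbours(graph, protein) - {protein}
    voters = set(level1)
    if use_indirect:
        level2 = set()
        for partner in level1:
            level2 |= neighbours(graph, partner)
        level2 -= level1 | {protein}
        voters |= level2
    votes = Counter()
    for voter in voters:
        votes.update(annotations.get(voter, ()))
    return [function for function, _ in votes.most_common(top_n)]


# Toy network (hypothetical): P interacts with A and B, which in turn
# interact with C, D and E.
graph = {
    "P": ["A", "B"],
    "A": ["P", "C", "D"],
    "B": ["P", "D", "E"],
    "C": ["A"],
    "D": ["A", "B"],
    "E": ["B"],
}
annotations = {
    "A": {"transport"},
    "B": {"signalling"},
    "C": {"kinase"},
    "D": {"kinase"},
    "E": {"kinase"},
}

print(predict_functions(graph, annotations, "P"))
```

In this toy network the direct partners of P disagree with each other, while the three indirect partners share one annotation, so including level-2 neighbours changes the prediction.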

Understanding function using in silico and experimental structural information
Prasanna Kolatkar, Genome Institute of Singapore

High throughput methods in genomics and proteomics are giving us a picture of the overall landscape of many biological systems through sequence as well as interaction data. These data sets often yield valuable information about biological function on a grand scale but sometimes don’t offer specific information about the mechanism of various biological events. Structural information can be used in conjunction with these high throughput technologies to yield structure-function relationships, which can help to shed light on various mechanisms in biology. Improvements in in silico structure methods, as well as high(er) throughput experimental structure determination, can allow us to stay closer to the other methods in terms of volume and help us to understand mechanisms using structure-function relationships. We will show some examples of projects at the Genome Institute of Singapore for which in silico structure methods and high(er) throughput experimental data are being used to help us understand biological function.

Comparative genomics of two cyprinid species
Alan Christoffels, Temasek Life Sciences Laboratory, Singapore

The Cyprinidae family, with more than 2,000 species, is the most abundant and widespread of all freshwater fish families across Europe, Asia, Africa and North America. The evolution of cyprinid teleosts has been impacted significantly by gene and genome duplications, which very likely contribute to the paucity of genome sequence data for cyprinids, barring the nearly completely sequenced zebrafish genome. In the absence of genomic data for other cyprinid species, we embarked on a partial transcriptome analysis of cyprinids using publicly available common carp and zebrafish EST data and the genome assembly for zebrafish.

We have generated over 6,000 ESTs from the differentiating testis of common carp and clustered them with 10,395 non-gonadal ESTs from CarpBase as well as 660 common carp mRNAs from GenBank. The resulting unique sequences were subjected to detailed analysis and compared against zebrafish sequences at the cDNA, protein and genome level.

We present data to show that there is sufficient homology between the transcribed sequences of common carp and zebrafish to warrant a cyprinid transcriptome comparison. We show that common carp transcripts map to un-annotated regions and to ab-initio gene predictions on the zebrafish genome assembly. A substantial portion of our unique transcripts from common carp seems to be tissue-specific.

With about 24,000 species, teleosts represent the most diverse group of vertebrates and are unlikely to be sampled by many completely sequenced genomes. Our analyses therefore illustrate the value of utilizing partially sequenced genomes and suggest the need for integrated resources to leverage the wealth of fragmented genomic data.

Cryo-electron tomography of individual protein molecules
Sara Sandin, Karolinska Institutet, Sweden

Averaging methods of determining structure, such as X-ray diffraction, do not preserve information about the flexibility of molecules. Cryo-electron tomography allows us to reconstruct individual hydrated objects. The method is limited to low-resolution, but it can be used to study dynamic structures, such as very large macromolecular complexes, and to perform in situ analysis of cellular organelles. These studies explore the expansion of the cryo-electron tomography method to individual protein molecules. Tomographic structures of four proteins, ranging in size from 90 to 150 kDa, are presented.

We have analysed the structure and flexibility of the antibody immunoglobulin G (IgG). The tomograms reveal Y-shaped IgG molecules with three protruding subunits. We show that the tomographic structures are consistent with X-ray crystallographic structures of IgG and that the three 50 kDa subunits were resolved with accuracy. Each subunit has a similar structure in the tomograms and in the X-ray map. Notably, the positions of the subunits differed greatly from one molecule to another. The large flexibility of IgG in solution is most likely of functional significance in antigen recognition. We have investigated a larger number of individual IgG molecules, measured the equilibrium distribution of the molecule in terms of the relevant angular coordinates and built a model of the dynamics of IgG in solution.

The hepatocyte growth factor/scatter factor (HGF/SF) controls the growth, morphogenesis or migration of epithelial, endothelial and muscle progenitor cells. We have defined the main conformations of inactive single-chain HGF/SF and active two-chain HGF/SF. Furthermore we present structures of the receptor tyrosine kinase MET and of MET bound to two-chain HGF/SF. These structures reveal the mechanism of HGF/SF activation and clarify the mode of binding to MET.

Nuclear receptors play important roles in development and tissue homeostasis, and have been implicated in many disease states. We present the structure of the full-length Glucocorticoid Receptor (GR) protein, activated by a synthetic hormone agonist. Three asymmetric domains are clearly defined in the structure of the GR monomer, and two low-density regions, interpreted as hinge regions, connect the domains. The three domains were further characterized by multi-resolution docking procedures and by visualizing GR in complex with a monoclonal antibody.

These studies show that cryo-electron tomography can be used to visualize individual protein molecules with a molecular weight below 200 kDa. Thus, the method can be applied to flexible multi-domain proteins that have not been amenable to high-resolution methods of determining structure.

Inhibition of bacterial genes using expressed RNA and cell-permeable antisense agents
Liam Good, Karolinska Institutet, Sweden

We develop antisense technologies to inhibit gene expression in pathogenic bacteria without a need for genome manipulation. In one strategy we use short synthetic antisense peptide nucleic acids (PNA). Cell uptake is enhanced using attached cell penetrating peptides (CPPs). When added directly to growing cultures, antisense peptide-PNAs limit reporter gene expression with gene and sequence specificity in several species. Also, the antisense effects are sufficient to kill bacteria when targeted to stringently-required essential genes in Escherichia coli, Staphylococcus aureus and Mycobacterium smegmatis. Bacterial cell death occurs in the absence of cell lysis, and levels of persistence and resistance following exposure are low relative to conventional antibiotics. In a second strategy, we have developed stabilised expressed antisense RNAs. Antisense inhibition at the mRNA level can reveal new information about gene function and drug mechanism of action. Expressed RNA and cell-permeable antisense agents provide complementary new tools for antimicrobial drug target validation and drug mechanism of action studies.

Transcriptional targets of STAT5b in liver
Amilcar Flores, Karolinska Institutet, Sweden

STAT5b is a transcription factor that is activated by tyrosine phosphorylation in multiple tissues. The hepatic actions of STAT5b are essential for the regulation of somatic growth. In order to identify novel direct hepatic targets of STAT5b, we have used microarrays to analyze the acute transcriptional response to growth hormone (GH) treatment of rat livers infected with replication-defective adenoviruses encoding either a constitutively active (CA) or dominant-negative (DN) version of rat Stat5b. A set of candidate genes was selected based on their ability to be induced by GH in livers expressing STAT5b-CA but not in those expressing STAT5b-DN. To validate our selection criteria, the promoters of the candidate genes were analyzed for the presence of phylogenetically conserved STAT5b DNA binding sites. This analysis was performed using Prometheus, a system that uses GRID computation to predict promoter architecture of lists of co-regulated genes. Chromatin immunoprecipitation and promoter-reporter analysis were used to confirm the validity of our predictions.
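As a rough illustration of what scanning promoters for phylogenetically conserved STAT5b binding sites can look like, the sketch below searches two aligned promoter sequences for the STAT5 consensus TTCNNNGAA and keeps the sites found at the same alignment column in both. It is not the Prometheus system: the function names and toy sequences are hypothetical, and the sketch assumes a gap-free pairwise alignment and exact consensus matching.

```python
import re

# STAT5 consensus (GAS-like motif) TTCNNNGAA; the lookahead lets
# overlapping candidate sites be reported as well.
STAT5_SITE = re.compile(r"(?=(TTC[ACGT]{3}GAA))")


def find_sites(sequence):
    """Return (position, site) tuples for every consensus match."""
    return [(m.start(), m.group(1)) for m in STAT5_SITE.finditer(sequence.upper())]


def conserved_sites(aligned_a, aligned_b):
    """Keep sites of sequence A that also match at the same column in B.

    Assumes both sequences come from the same gap-free alignment, so equal
    positions correspond to each other (an illustrative simplification).
    """
    positions_b = {position for position, _ in find_sites(aligned_b)}
    return [(p, site) for p, site in find_sites(aligned_a) if p in positions_b]


# Hypothetical aligned promoter fragments from two species.
rat_promoter = "ACGTTTCCTGGAAGGCA"
mouse_promoter = "ACGATTCTAAGAATGCA"

print(conserved_sites(rat_promoter, mouse_promoter))
```

A production analysis would typically use position weight matrices and multi-species alignments rather than an exact regular expression.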

Enhancing the recognition of gene function transfer from model organisms by considering different levels of conservation of co-regulation
Carsten Daub, Karolinska Institutet, Sweden

Inferring the co-regulation of genes from various data sources (e.g. gene expression data, protein-protein interaction data, or functional annotations) provides a basis for the prediction of gene function. Transferring knowledge about gene function gained from model organisms to, for example, human gene analysis is a powerful approach for experimentally supported gene function prediction. The predictive power of merging this information depends strongly on the correct prediction of orthologous genes in the model organisms under consideration. Combining the prediction of gene function from co-regulation within species with that from the conservation of gene function across species enables enhanced function prediction. Furthermore, considering different levels of co-regulation conservation from the model organism to the target organism allows the transfer of functional annotations at different levels of confidence.

We investigate the conservation of co-regulatory links in a target organism to the corresponding links in model organisms. To accomplish this, we employ the principle of orthology to assign relationships between genes of different organisms. By varying the thresholds for significant co-regulations within two species under consideration, we find a systematic change in the degree of co-regulation conservation. A similar observation is made when systematically evaluating the relationship of co-regulation in a target organism to functional annotations in model organisms.

Our aim is to exploit this method to transfer functional annotations from model organisms to target organisms, such as human, that are experimentally inaccessible.
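The threshold-dependent conservation of co-regulatory links described above can be sketched as follows. The sketch assumes that co-regulation is scored by a pairwise correlation value and that orthology is a one-to-one mapping; both are illustrative assumptions, and the function names and toy values are hypothetical.

```python
def coregulated_pairs(correlations, threshold):
    """Return gene pairs whose absolute co-regulation score passes the threshold."""
    return {frozenset(pair) for pair, r in correlations.items() if abs(r) >= threshold}


def conservation(target_corr, model_corr, orthologs, t_target, t_model):
    """Fraction of mappable target links that are also links in the model organism.

    A target link is mappable if both genes have an ortholog. One-to-one
    orthology is assumed here for simplicity.
    """
    target_links = coregulated_pairs(target_corr, t_target)
    model_links = coregulated_pairs(model_corr, t_model)
    mappable = conserved = 0
    for link in target_links:
        gene_a, gene_b = tuple(link)
        if gene_a in orthologs and gene_b in orthologs:
            mappable += 1
            if frozenset((orthologs[gene_a], orthologs[gene_b])) in model_links:
                conserved += 1
    return conserved / mappable if mappable else float("nan")


# Toy co-regulation scores (hypothetical) for a target and a model organism.
target_corr = {("h1", "h2"): 0.9, ("h1", "h3"): 0.6, ("h2", "h3"): 0.3}
model_corr = {("m1", "m2"): 0.8, ("m1", "m3"): 0.2, ("m2", "m3"): 0.1}
orthologs = {"h1": "m1", "h2": "m2", "h3": "m3"}

for t_target in (0.8, 0.5):
    print(t_target, conservation(target_corr, model_corr, orthologs, t_target, 0.5))
```

In this toy example, lowering the target threshold admits a weaker link that has no counterpart in the model organism, so the conserved fraction drops.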

Eukaryotic gene expression: the function of actin and actin-associated proteins in transcription
Piergiorgio Percipalle, Karolinska Institutet, Sweden

In the cell nucleus, actin is an important regulator of gene expression, found as a component of ATP-dependent chromatin remodelling complexes and ribonucleoprotein particles (RNP) and, more recently, associated with all three eukaryotic RNA polymerases. Considerable effort is currently devoted to deciphering the molecular mechanisms underlying the function of actin in gene activation, through the identification of nuclear actin-associated proteins. This lecture will focus on two ongoing studies in my laboratory that aim to clarify how actin specifically functions in transcription.

Actin in transcription of protein-coding genes
We have recently discovered that actin is required to turn on transcription of protein-coding genes associated with the active RNA polymerase II, in complex with the ribonucleoprotein hnRNP U (Kukalev et al., 2005). In this study, we also found that actin binds hnRNP U through a short and conserved (from insect to mammals) amino acid sequence motif. Given that hnRNP U and actin are respectively coupled to histone acetyl transferase activities (p300/CBP) and chromatin remodelling complexes, we propose a model in which the actin-hnRNP U complex activates transcription by recruiting chromatin modifying components.

Actin in ribosomal DNA (rDNA) transcription
After the discovery of a nuclear form of myosin 1 (NM1) and its direct involvement in transcription initiation of protein-coding genes, it seemed possible that actin and myosin could perform a concerted general role in transcription. In support of this possibility, we found that actin and NM1 are on actively transcribing ribosomal genes bound to the largest RNA polymerase I subunit (Fomproix and Percipalle, 2004). We recently discovered that NM1 is a component of a multiprotein assembly, containing the chromatin remodelling complex WSTF-ISWI, which activates rDNA transcription (Percipalle et al., 2005). Considering the very dynamic interaction between actin and NM1, we suggest that they activate and maintain productive rDNA transcription as molecular switches, recruiting RNA polymerase I co-activators on ribosomal genes (Percipalle et al., 2005).

In conclusion, our data suggest a key role for actin in transcription. An interesting scenario is that transcription of all RNA polymerases is facilitated by actin-based molecular switches in which the polymerase-associated actin binds to specific adaptors (such as hnRNP U and NM1) to recruit transcription co-activators.

Size-dependent Pareto-like distributions in genomics, proteomics and molecular evolution
Vladimir A. Kuznetsov, Genome Institute of Singapore

I will describe a family of skewed probability distributions that appear in many genomics, proteomics and molecular evolution data sets. The observed probability distributions have the following characteristic in common: there are few frequent and many rare events in the evolved multi-class large-scale system. Importantly, the form of the distribution can depend systematically on the size of the sample (number of transcripts, number of proteins, etc.). I will present several random process models of population growth that lead to a size-dependent Pareto-like probability distribution of the frequency of occurrences of events in a multi-class finite population. I will show how the models help to improve the gene expression profiles observed in SAGE and microarray experiments. Our modeling provides a theoretical basis for accurately counting the expression level and the number of expressed genes and the total number of genes in a given cell type, and for better understanding the probabilistic mechanism(s) governing the evolution of complexity of the transcriptome and proteome.
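The dependence of an observed skewed distribution on sample size can be illustrated with a small simulation. This is not one of the population growth models from the talk: it simply draws samples of different sizes from a fixed, Zipf-like population of classes (a hypothetical stand-in for genes with very unequal expression levels) and tabulates how many classes are observed once, twice, and so on.

```python
import random
from collections import Counter


def sample_spectrum(weights, n, rng):
    """Draw n events and return (spectrum, number of observed classes).

    spectrum maps an occurrence count m to the number of classes seen
    exactly m times in the sample.
    """
    draws = rng.choices(range(len(weights)), weights=weights, k=n)
    class_counts = Counter(draws)
    spectrum = Counter(class_counts.values())
    return spectrum, len(class_counts)


def singleton_fraction(spectrum, observed):
    """Fraction of observed classes that occur exactly once."""
    return spectrum.get(1, 0) / observed


# Hypothetical population: 5000 classes with Zipf-like relative abundances.
weights = [1.0 / (rank + 1) for rank in range(5000)]
rng = random.Random(0)

small_spectrum, small_observed = sample_spectrum(weights, 500, rng)
large_spectrum, large_observed = sample_spectrum(weights, 50000, rng)

print(small_observed, singleton_fraction(small_spectrum, small_observed))
print(large_observed, singleton_fraction(large_spectrum, large_observed))
```

The larger sample reveals more classes, and a smaller share of them are seen only once, so the shape of the empirical frequency distribution shifts with sample size even though the underlying population is unchanged.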

Filling in the GAPs for cell dynamics control
Boon-Chuan Low, National University of Singapore

Cells undergo dynamic changes in morphology or motility during cellular division and proliferation, differentiation, neuronal pathfinding, wound healing, apoptosis, host defense and organ development. These processes are controlled by signaling events relayed via cascades of protein interaction leading to the establishment and maintenance of cytoskeletal networks of microtubules and actin. Various checkpoints, including the Rho small GTPases, serve as master switches to fine-tune the amplitude, duration and integration of such circuitry responses. Rho GTPases are activated by guanine nucleotide exchange factors and inactivated by GTPase-Activating Proteins (GAPs). We have identified two novel classes of regulators for small GTPases, the BNIP-2 and BPGAP families, all of which harbor the conserved BNIP-2 and Cdc42GAP Homology (BCH) domain. Some properties of the BCH domains and cellular functions of BNIPs and BPGAP will be discussed in the context of their novel interacting partners and cell dynamics roles.

Are complex methods helpful in mapping complex traits?
David Siegmund, Stanford University and National University of Singapore

I discuss a systematic large sample theory for genetic mapping of quantitative trait loci (QTL), which (i) deals with problems of multiple comparisons, (ii) clarifies similarities and differences between experimental and human genetics, and (iii) treats issues of study design of recent interest, e.g., the value of large pedigrees in human genetics and models for gene X gene and gene X environment interaction. One tentative conclusion is that models that deal with gene X gene interaction appear to have the potential to play a more important role in experimental genetics, while models for gene X environment interaction appear to play a more important role in human genetics.

References
- Tang and Siegmund (2001) Biostatistics 2, 147-162.
- Tang and Siegmund (2002) Genetic Epidemiology 22, 313-327.
- Peng and Siegmund (2004) PNAS 101, 7845-7850.
- Peng, J., Tang, H.-K., and Siegmund, D. (2005) Genome scans with gene-covariate interaction. Genetic Epidemiology (in press).
