Figuring Out Life: 
					NUS - Karolinska Joint 
					Symposium on Application of Mathematics in Biomedicine 
                  (28 - 29 Nov 2005)
                  
                  
                Bioinformatics workflow integration for biomanufacturing and biosurveillance 
                Tin Wee Tan, National University of Singapore 
                In 2004, A*STAR funded a pilot programme on Integrated Manufacturing Services and Systems (IMSS). Our project in this programme designed and demonstrated a prototype workflow integration system for the design and biomanufacture of diagnostic systems. The system integrates automated DNA sequencers, oligonucleotide synthesis (Novusgene) and DNA chip systems (Attogenix) using workflow systems such as KOOPrime, Goalnet and Taverna/MyGrid. Since then we have embarked on expanding this concept to other aspects of workflow integration, including biosurveillance.
The concept of workflow integration for biomanufacture stems from the ability of bioinformatics workflow and pipelining systems to integrate information and data flow with machine/device control. We extend this idea of integration to supply chain management systems that interface with the biomanufacturing and design process, specifically for diagnostic kits. The approach could be extended to linear polymeric therapeutics (siRNA, miRNA, peptides) and prophylactics (peptide and DNA vaccines) when massive scale-up is needed in times of biodisasters such as bird flu, SARS and other emerging diseases, or to biosurveillance in the face of biological warfare and terrorist acts.
                Solvation in proteins: insights from atomistic simulations 
				Chandra Verma, Bioinformatics Institute of Singapore 
					The role of solvation/hydration in proteins, in particular in stability and function, remains relatively poorly understood.
Computer simulations offer advantages over experimental techniques for exploring this complex feature in great detail.
A set of examples from our work and that of others will be presented to highlight advances and shortcomings of the methods involved.
                Protein function prediction from protein interactions 
				Lim Soon Wong, National University of Singapore 
					The elucidation of protein function is one of the key problems in computational biology. A recent trend in protein function prediction is
based on the use of protein interaction data. The intuition is that if proteins A and B belong to the same functional pathway, A is likely to
interact with B; therefore, when A and B are observed to interact, they are likely to share functions. However, in many cases, the direct partners of a
protein share few or no functions with it; instead, the partners of these partners show functional similarity with the protein. We discuss, in this
talk, the plausibility of such indirect functional associations and their use in improving protein function prediction. 
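The idea of indirect functional association can be illustrated with a minimal sketch. This is not the method presented in the talk: it is a plain neighbour-counting scheme on a toy network, and the names `GRAPH`, `ANNOTATIONS`, `vote_functions` and the weights `w_direct` and `w_indirect` are all hypothetical, introduced only for illustration.

```python
from collections import Counter

# Toy undirected interaction network (hypothetical data), stored symmetrically.
GRAPH = {
    "P": {"A", "B"},
    "A": {"P", "C"},
    "B": {"P", "C", "D"},
    "C": {"A", "B"},
    "D": {"B"},
}

# Known function labels for annotated proteins (hypothetical data).
ANNOTATIONS = {
    "A": {"transport"},
    "B": {"binding"},
    "C": {"kinase"},
    "D": {"kinase"},
}


def neighbours(graph, protein):
    """Direct (level-1) interaction partners of a protein."""
    return set(graph.get(protein, ())) - {protein}


def indirect_neighbours(graph, protein):
    """Level-2 partners: partners of partners that are not direct partners."""
    level1 = neighbours(graph, protein)
    level2 = set()
    for partner in level1:
        level2 |= neighbours(graph, partner)
    return level2 - level1 - {protein}


def vote_functions(graph, annotations, protein, w_direct=1.0, w_indirect=1.0):
    """Rank candidate functions by weighted votes from level-1 and level-2 partners."""
    scores = Counter()
    for partner in neighbours(graph, protein):
        for function in annotations.get(partner, ()):
            scores[function] += w_direct
    for partner in indirect_neighbours(graph, protein):
        for function in annotations.get(partner, ()):
            scores[function] += w_indirect
    return scores.most_common()


ranking = vote_functions(GRAPH, ANNOTATIONS, "P")
print(ranking)
```

In this toy network the direct partners of P disagree with each other, while both of its indirect partners carry the same label, so the indirect votes decide the top-ranked function. Setting `w_indirect=0` reduces the sketch to direct-neighbour voting only.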
                	Understanding function using in 
					silico and experimental structural information 
					Prasanna Kolatkar, Genome Institute of Singapore 
					High throughput methods in genomics and proteomics are 
					giving us a picture of the overall landscape of many 
					biological systems through sequence as well as interaction 
					data. The data sets often yield valuable information about 
					biological function on a grand scale but sometimes don’t 
					offer specific information about the mechanism of various 
					biological events. Structural information can be used in 
					conjunction with these high throughput technologies to yield 
					structure function relationships which can help to shed 
					light on various mechanisms in biology. Improvements in in 
					silico structure methods, as well as high(er) throughput 
					experimental structure determination, can allow us to keep 
					pace with the other methods in terms of volume and help 
					us to understand mechanisms using structure-function 
					relationships. We will show some examples of projects at the 
					Genome Institute of Singapore for which in silico structure 
					methods and high(er) throughput experimental data are being 
					used to help us understand biological function. 
                	Comparative genomics of 
					two cyprinid species 
					Alan Christoffels, Temasek Life Sciences Laboratory, 
					Singapore 
					The Cyprinidae family, with more than 2,000 
					species, is the most abundant and widespread of all 
					freshwater fish families across Europe, Asia, Africa and 
					North America. The evolution of cyprinid teleosts has been 
					impacted significantly by gene and genome duplications, which 
					very likely contribute to the paucity of genome sequence 
					data for cyprinids, barring the nearly completely sequenced 
					zebrafish genome. In the absence of genomic data for other 
					cyprinid species, we embarked on a partial transcriptome 
					analysis of cyprinids using publicly available common carp 
					and zebrafish EST data and the genome assembly for zebrafish.
					 
					We have generated over 6,000 ESTs from the 
					differentiating testis of common carp and clustered them 
					with 10,395 non-gonadal ESTs from CarpBase as well as 660 
					common carp mRNAs from GenBank. The resulting unique 
					sequences were subjected to detailed analysis and compared 
					against zebrafish sequences at the cDNA, protein and genome 
					level.  
					We present data to show that there is sufficient homology 
					between the transcribed sequences of common carp and 
					zebrafish to warrant a cyprinid transcriptome comparison. We 
					show that common carp transcripts map to un-annotated 
					regions and to ab-initio gene predictions on the 
					zebrafish genome assembly. A substantial portion of our 
					unique transcripts from common carp seems to be 
					tissue-specific.  
					With about 24,000 species, teleosts represent the most 
					diverse group of vertebrates and are unlikely to be sampled by 
					many completely sequenced genomes. Our analyses therefore 
					illustrate the value in utilizing partially sequenced 
					genomes and suggest the need for integrated resources to 
					leverage the wealth of fragmented genomic data. 
                	Cryo-electron tomography of 
					individual protein molecules 
					Sara Sandin, Karolinska Institutet, Sweden 
					Averaging methods of determining structure, such as X-ray 
					diffraction, do not preserve information about the 
					flexibility of molecules. Cryo-electron tomography allows us 
					to reconstruct individual hydrated objects. The method is 
					limited to low resolution, but it can be used to study 
					dynamic structures, such as very large macromolecular 
					complexes, and to perform in situ analysis of cellular 
					organelles. These studies explore the expansion of the cryo-electron 
					tomography method to individual protein molecules. 
					Tomographic structures of four proteins, ranging in size 
					from 90 to 150 kDa, are presented. 
					We have analysed the structure and flexibility of the 
					antibody immunoglobulin G (IgG). The tomograms reveal 
					Y-shaped IgG molecules with three protruding subunits. We 
					show that the tomographic structures are consistent with 
					X-ray crystallographic structures of IgG and that the three 
					50 kDa subunits were resolved with accuracy. Each subunit 
					has a similar structure in the tomograms and in the X-ray 
					map. Notably, the positions of the subunits differed greatly 
					from one molecule to another. The large flexibility of IgG 
					in solution is most likely of functional significance in 
					antigen recognition. We have investigated a larger number of 
					individual IgG molecules, measured the equilibrium distribution 
					of the molecule in terms of the relevant angular coordinates 
					and built a model of the dynamics of IgG in solution. 
					The hepatocyte growth factor/scatter factor (HGF/SF) 
					controls the growth, morphogenesis or migration of 
					epithelial, endothelial and muscle progenitor cells. We have 
					defined the main conformations of inactive single-chain HGF/SF 
					and active two-chain HGF/SF. Furthermore we present 
					structures of the receptor tyrosine kinase MET and of MET 
					bound to two-chain HGF/SF. These structures reveal the 
					mechanism of HGF/SF activation and clarify the mode of 
					binding to MET. 
					Nuclear receptors play important roles in development and 
					tissue homeostasis, and have been implicated in many disease 
					states. We present the structure of the full-length 
					Glucocorticoid Receptor (GR) protein, activated by a 
					synthetic hormone agonist. Three asymmetric domains are 
					clearly defined in the structure of the GR monomer, and two 
					low-density regions, interpreted as hinge regions, connect 
					the domains. The three domains were further characterized by 
					multi-resolution docking procedures and by visualizing GR in 
					complex with a monoclonal antibody.  
					These studies show that cryo-electron tomography can be 
					used to visualize individual protein molecules with a 
					molecular weight below 200 kDa. Thus, the method can be 
					applied to flexible multi-domain proteins that have not been 
					amenable to high-resolution methods of determining 
					structure. 
                	Inhibition of bacterial genes 
					using expressed RNA and cell-permeable antisense agents 
					Liam Good, Karolinska Institutet, Sweden 
					We develop antisense technologies to inhibit gene 
					expression in pathogenic bacteria without a need for genome 
					manipulation. In one strategy we use short synthetic 
					antisense peptide nucleic acids (PNA). Cell uptake is 
					enhanced using attached cell penetrating peptides (CPPs). 
					When added directly to growing cultures, antisense peptide-PNAs 
					limit reporter gene expression with gene and sequence 
					specificity in several species. Also, the antisense effects 
					are sufficient to kill bacteria when targeted to 
					stringently required essential genes in Escherichia coli, 
					Staphylococcus aureus and Mycobacterium smegmatis. Bacterial 
					cell death occurs in the absence of cell lysis and levels of 
					persistence and resistance following exposure are low 
					relative to conventional antibiotics. In a second strategy, 
					we have developed stabilised expressed antisense RNAs. 
					Antisense inhibition at the mRNA level can reveal new 
					information about gene function and drug mechanism of 
					action. Expressed RNA and cell-permeable antisense agents 
					provide complementary new tools for antimicrobial drug 
					target validation and drug mechanism of action studies. 
                	Transcriptional targets of 
					STAT5b in liver 
					Amilcar Flores, Karolinska Institutet, Sweden 
					STAT5b is a transcription factor that is activated by 
					tyrosine phosphorylation in multiple tissues. The hepatic 
					actions of STAT5b are essential for the regulation of 
					somatic growth. In order to identify novel direct hepatic 
					targets of STAT5b, we have used microarrays to analyze the 
					acute transcriptional response to GH treatment of rat livers 
					infected with replication-defective adenoviruses encoding 
					either a constitutively active (CA) or dominant-negative (DN) 
					version of rat Stat5b. A set of candidate genes was 
					selected based on their ability to be induced by GH in 
					livers expressing STAT5b-CA but not in those expressing 
					STAT5b-DN. To validate our selection criteria, the promoters 
					of the candidate genes were analyzed for the presence of 
					phylogenetically conserved STAT5b DNA binding sites. This 
					analysis was performed using Prometheus, a system that uses 
					GRID computation to predict promoter architecture of lists 
					of co-regulated genes. Chromatin immunoprecipitation and 
					promoter-reporter analysis were used to confirm the 
					validity of our predictions.  
                	Enhancing the recognition of gene 
					function transfer from model organisms by considering 
					different levels of conservation of co-regulation 
					Carsten Daub, Karolinska Institutet, Sweden 
					Inferring the co-regulation of genes from various data 
					sources (e.g. gene expression data, protein-protein 
					interaction data, or functional annotations) builds a basis 
					for the prediction of gene function. The transfer of 
					knowledge about gene function gained from model organisms 
					to, for example, human gene analysis is a powerful approach for 
					experimentally supported gene function predictions. The 
					predictive power of merging information strongly depends on 
					the correct prediction of orthologous genes in the model 
					organisms under consideration. Combining the 
					prediction of gene function from co-regulation within 
					species with the conservation of gene function 
					across species enables an enhanced function prediction. 
					Furthermore, considering different levels of co-regulation 
					conservation from the model organism to the target organism 
					allows the transfer of functional annotations at different 
					levels of confidence.  
					We investigate the conservation of co-regulatory links in 
					a target organism to the corresponding links in model 
					organisms. To accomplish this, we employ the principle of 
					orthology to assign relationships between genes of different 
					organisms. By varying the thresholds for significant 
					co-regulations within two species under consideration, we 
					find a systematic change in the degree of co-regulation 
					conservation. A similar observation is made when 
					systematically evaluating the relationship of co-regulation 
					in a target organism to functional annotations in model 
					organisms.  
					Our aim is to exploit this method to transfer functional 
					annotations from model organisms to target organisms, such as 
					human, that are experimentally inaccessible. 
                Eukaryotic gene expression: the 
				function of actin and actin-associated proteins in transcription 
				Piergiorgio Percipalle, Karolinska Institutet, Sweden 
					In the cell nucleus, actin is an important regulator of 
					gene expression, found as a component of ATP-dependent 
					chromatin remodelling complexes and ribonucleoprotein particles 
					(RNPs) and, more recently, found associated with all three 
					eukaryotic RNA polymerases. Considerable effort is currently 
					devoted to deciphering the molecular mechanisms underlying 
					the function of actin in gene activation, through the 
					identification of nuclear actin-associated proteins. This 
					lecture will focus on two ongoing studies in my laboratory 
					that aim to clarify how actin specifically functions in 
					transcription.  
					Actin in transcription of protein-coding genes 
					We have recently discovered that actin, associated with 
					active RNA polymerase II and in complex with the 
					ribonucleoprotein hnRNP U, is required to turn on 
					transcription of protein-coding genes (Kukalev et al., 2005). In 
					this study, we also found that actin binds hnRNP U through a 
					short amino acid sequence motif that is conserved from insects 
					to mammals. Given that hnRNP U and actin are 
					respectively coupled to histone acetyltransferase 
					activities (p300/CBP) and chromatin remodelling complexes, 
					we propose a model in which the actin-hnRNP U complex 
					activates transcription by recruiting chromatin modifying 
					components.  
					Actin in ribosomal DNA (rDNA) transcription 
					After the discovery of a nuclear form of myosin 1 (NM1) and 
					its direct involvement in transcription initiation of 
					protein-coding genes, it seemed possible that actin and 
					myosin could perform a concerted general role in 
					transcription. In support of this possibility, we found that 
					actin and NM1 are present on actively transcribed ribosomal 
					genes, bound to the largest RNA polymerase I subunit (Fomproix and 
					Percipalle, 2004). We recently discovered that NM1 is a 
					component of a multiprotein assembly, containing the 
					chromatin remodelling complex WSTF-ISWI, which activates 
					rDNA transcription (Percipalle et al., 2005). 
					Considering the very dynamic interaction between actin and 
					NM1, we suggest that they activate and maintain productive 
					rDNA transcription as molecular switches, recruiting RNA 
					polymerase I co-activators to ribosomal genes (Percipalle 
					et al., 2005).  
					In conclusion, our data suggest a key role for actin in 
					transcription. An interesting scenario is that transcription 
					by all RNA polymerases is facilitated by actin-based 
					molecular switches in which the polymerase-associated actin 
					binds to specific adaptors (such as hnRNP U and NM1) to 
					recruit transcription co-activators. 
                	Size-dependent Pareto-like 
					distributions in genomics, proteomics and molecular 
					evolution 
					Vladimir A. Kuznetsov, Genome Institute of Singapore 
					I will describe a family of skewed probability 
					distributions that appear in many genomics, proteomics 
					and molecular evolution data sets. The observed probability 
					distributions have the following characteristic in common: 
					there are few frequent and many rare events in the evolved 
					multi-class large-scale system. Importantly, the form of 
					the distribution can depend systematically on the size of the 
					sample (number of transcripts, number of proteins, etc.). I 
					will present several random process models of population 
					growth that lead to a size-dependent Pareto-like 
					probability distribution of the frequency of occurrences of 
					events in a multi-class finite population. I will show how the 
					models help to improve the analysis of gene expression profiles 
					observed in SAGE and microarray experiments. Our modeling provides a 
					theoretical basis for accurately estimating expression 
					levels, the number of expressed genes and the total number of 
					genes in a given cell type, and for better understanding the 
					probabilistic mechanism(s) governing the evolution of 
					the complexity of the transcriptome and proteome. 
                	Filling in the GAPs for cell 
					dynamics control 
					Boon-Chuan Low, National University of Singapore 
					Cells undergo dynamic changes in morphology or 
					motility during cellular division and proliferation, 
					differentiation, neuronal pathfinding, wound healing, 
					apoptosis, host defense and organ development. These 
					processes are controlled by signaling events relayed via 
					cascades of protein interaction leading to the establishment 
					and maintenance of cytoskeletal networks of microtubules and 
					actin. Various checkpoints, including the Rho small GTPases, 
					serve as master switches to fine-tune the amplitude, 
					duration and integration of such circuitry 
					responses. Rho GTPases are activated by guanine nucleotide exchange 
					factors and inactivated by GTPase-Activating Proteins (GAPs). 
					We have identified two novel classes of regulators for small 
					GTPases, the BNIP-2 and BPGAP families, all of which harbor 
					the conserved BNIP-2 and Cdc42GAP Homology (BCH) domain. 
					Some properties of the BCH domains and cellular functions of 
					BNIPs and BPGAP will be discussed in the context of their 
					novel interacting partners and cell dynamics roles.  
                Are complex methods helpful in 
				mapping complex traits? 
				David Siegmund, Stanford University and National University 
				of Singapore 
					I discuss a systematic large sample theory for genetic 
					mapping of quantitative trait loci (QTL), which (i) deals 
					with problems of multiple comparisons, (ii) clarifies 
					similarities and differences between experimental and human 
					genetics, (iii) treats issues of study design of recent 
					interest, e.g., the value of large pedigrees in human 
					genetics and models for gene X gene and gene X environment 
					interaction. One tentative conclusion is that models that 
					deal with gene X gene interaction appear to have the 
					potential to play a more important role in experimental 
					genetics, while models for gene X environment interaction 
					appear to play a more important role in human genetics. 
					References 
					- Tang and Siegmund (2001) Biostatistics 2, 147-162. 
					- Tang and Siegmund (2002) Genetic Epidemiology 22, 313-327. 
					- Peng and Siegmund (2004) PNAS 101, 7845-7850. 
					- Peng, Tang and Siegmund (2005) Genome scans with gene-covariate interaction, Genetic Epidemiology (in press). 