Protein tertiary structure provides indispensable information for elucidating protein function and evolution. We are developing computational methods for predicting protein tertiary structure from sequence [35, 23, 19] and methods for estimating the errors of computational models. We have developed EcoliPredict, a database of predicted structure models for E. coli, which is part of the EcoliHub Project.
The protein surface is where the function of a protein is realized. In particular, interactions with other proteins and with chemical compounds occur at specific sites on the protein surface. Hence, finding sites characteristic of protein function, e.g. active sites of enzymes and protein interaction interfaces, is a promising way to predict the function of a protein. The aims of this project include developing methods for protein surface shape comparison for fast database search and characterizing the geometrical properties of protein surfaces [30, 28]. We have developed 3D-Surfer, web-based software for fast protein comparison and surface analysis.
Function annotation of genes is a foundation of almost all molecular biology studies. Conventional methods for function annotation are homology search methods, such as BLAST and FASTA. These methods perform well when obvious homologs exist for a query protein, but otherwise provide no functional information. As a consequence, typically only about half of the genes in a newly sequenced genome are annotated. For large-scale omics analysis, it is helpful to have greater function annotation coverage, even if the assigned functions are less specific or low-resolution [32, 26]. The goal of this project is to develop methods that can predict function for a larger number of genes than conventional homology search, providing low-resolution function when necessary without losing accuracy. Our method, PFP, was recognized as the best prediction method in CASP7 and at the Automatic Function Prediction Meeting (AFP-SIG, ISMB 2005). Please try our PFP website.
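The idea of falling back to a low-resolution function when a specific one cannot be supported can be illustrated with a small sketch. This is not the PFP algorithm; it is a toy example under stated assumptions: the term hierarchy, the scores, the threshold, and the function names (`ancestors_and_self`, `depth`, `predict`) are all hypothetical. Each sequence hit contributes its score to its annotated term and to every ancestor of that term, and the most specific term whose accumulated score reaches the threshold is reported.

```python
from collections import defaultdict

# Hypothetical toy hierarchy: child term -> parent term (None marks the root).
PARENT = {
    "protein kinase activity": "kinase activity",
    "lipid kinase activity": "kinase activity",
    "kinase activity": "catalytic activity",
    "catalytic activity": None,
}


def ancestors_and_self(term):
    """Return the term followed by all of its ancestors up to the root."""
    chain = []
    while term is not None:
        chain.append(term)
        term = PARENT[term]
    return chain


def depth(term):
    """Number of steps from the term to the root (root has depth 0)."""
    return len(ancestors_and_self(term)) - 1


def predict(hit_scores, threshold):
    """Pick the most specific term whose accumulated score meets the threshold.

    hit_scores maps a term to the (assumed) score of the evidence for it.
    Returns None when no term, not even the root, is sufficiently supported.
    """
    totals = defaultdict(float)
    for term, score in hit_scores.items():
        # Evidence for a specific term also supports every more general term.
        for t in ancestors_and_self(term):
            totals[t] += score
    confident = [t for t, s in totals.items() if s >= threshold]
    if not confident:
        return None
    # Prefer deeper (more specific) terms; break ties by accumulated score.
    return max(confident, key=lambda t: (depth(t), totals[t]))


if __name__ == "__main__":
    weak_hits = {"protein kinase activity": 0.4, "lipid kinase activity": 0.4}
    print(predict(weak_hits, threshold=0.6))
```

In this toy case neither specific kinase term reaches the threshold alone, but their shared parent does, so the sketch reports the less specific term rather than no annotation at all.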
Intergenic regions contain important information for gene regulation. In recent years, various families of small non-coding RNAs (sRNAs) have been discovered in both bacterial and eukaryotic genomes. We have developed an ensemble approach to DNA motif discovery, which outperforms standalone programs [24, 20]. We have also computationally identified sRNAs in 30 bacterial genomes and conducted a comparative study.
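The text does not describe how the ensemble combines individual motif finders, so the following is only a minimal sketch of one common way to do it, position-level voting, and should not be read as the method of [24, 20]. The function name `ensemble_sites`, the input format (half-open `(seq_id, start, end)` intervals per program), and the `min_votes` parameter are assumptions made for illustration.

```python
from collections import Counter


def ensemble_sites(predictions, min_votes):
    """Keep sequence positions predicted as motif sites by enough programs.

    predictions: one list per program, each containing (seq_id, start, end)
                 tuples with half-open coordinates [start, end).
    min_votes:   minimum number of programs that must cover a position.
    Returns a sorted list of (seq_id, position) pairs.
    """
    votes = Counter()
    for program_sites in predictions:
        covered = set()
        for seq_id, start, end in program_sites:
            covered.update((seq_id, pos) for pos in range(start, end))
        # Using a set means each program votes at most once per position.
        votes.update(covered)
    return sorted(p for p, v in votes.items() if v >= min_votes)


if __name__ == "__main__":
    # Hypothetical output of three motif finders on one sequence.
    program_a = [("s1", 10, 14)]
    program_b = [("s1", 12, 16)]
    program_c = [("s1", 30, 34)]
    print(ensemble_sites([program_a, program_b, program_c], min_votes=2))
```

Here only the positions where two programs agree survive; the site reported by a single program is discarded, which is the basic intuition behind preferring a consensus over any standalone prediction.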