Ongoing and related projects

Automated construction of FEATURE models
Since creating training sets by hand is time-consuming, we are exploring different ways to generate models automatically. One way is to use sequence motifs as seeds: structural examples containing the motif form the positive training set, and random or knowledge-guided methods generate the negative training set. Other sources of functional information, such as PDB HETATMs or known ligand-protein pairs, could also be used to seed training sets.
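The motif-seeded idea can be sketched roughly as follows. This is a minimal illustration, not the SeqFEATURE pipeline itself: the function name `build_training_sets`, its parameters, and the use of a regular expression as the motif are assumptions, and only the random variant of negative selection is shown. In practice each selected residue position would be mapped to coordinates in a structure before a model is trained.

```python
import random
import re


def build_training_sets(sequences, motif_regex, seed_offset=0,
                        negatives_per_positive=2, rng_seed=0):
    """Split residue positions into motif-seeded positives and random negatives.

    sequences: dict mapping a (hypothetical) structure ID to its one-letter sequence.
    motif_regex: the sequence motif, written as a regular expression.
    seed_offset: index within each match of the residue that anchors the site.
    """
    # A lookahead lets overlapping motif occurrences all be reported.
    pattern = re.compile(f"(?=(?:{motif_regex}))")

    positives = []
    for struct_id, seq in sequences.items():
        for match in pattern.finditer(seq):
            positives.append((struct_id, match.start() + seed_offset))

    # Every position that is not a positive is a candidate negative.
    positive_set = set(positives)
    candidates = [
        (struct_id, i)
        for struct_id, seq in sequences.items()
        for i in range(len(seq))
        if (struct_id, i) not in positive_set
    ]

    # Random negative selection; a knowledge-guided filter would go here instead.
    rng = random.Random(rng_seed)
    n_negatives = min(len(candidates), negatives_per_positive * len(positives))
    negatives = rng.sample(candidates, n_negatives)
    return positives, negatives


if __name__ == "__main__":
    toy = {"s1": "AACGHCAA", "s2": "GGGG"}  # toy sequences, not real proteins
    pos, neg = build_training_sets(toy, "C..C")
    print("positives:", pos)
    print("negatives:", neg)
```

Fixing `rng_seed` keeps the negative set reproducible between runs, which makes it easier to compare models built from the same seed motif.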
Relevant papers:
Wu S, Liang MP, Altman RB. (2008) "A comprehensive scan of the Protein Data Bank with the SeqFEATURE library of functional site models and prediction of function for structural genomics targets." Genome Biology 9:R8. PMID: 18197987
Liang MP, Brutlag DL, Altman RB. Automated construction of structural motifs for predicting functional sites on protein structures. Pac Symp Biocomput. 2003;:204-15. PMID: 12603029

Modeling functional sites in 4D: coupling FEATURE with molecular dynamics
Proteins are dynamic molecules that undergo many small local changes as well as larger conformational ones. We are interested in studying how the structure of functional sites changes as a result of natural protein movement, with the aim of improving the performance of static function prediction methods.
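One way to picture the coupling is to score the same site in every frame of a trajectory and compare the result with the score of the starting structure alone. The sketch below assumes a hypothetical `score_site` callable that returns a site score for one frame, a caller-chosen `cutoff`, and that the first frame stands in for the static structure; none of these names come from FEATURE itself.

```python
def scan_trajectory(frames, score_site, cutoff):
    """Score a site in each trajectory frame and summarize the results.

    frames: sequence of frame objects (any type accepted by score_site).
    score_site: hypothetical callable returning a numeric score for one frame.
    cutoff: score at or above which a frame counts as a hit.
    """
    if not frames:
        raise ValueError("at least one frame is required")

    scores = [score_site(frame) for frame in frames]
    hits = [i for i, score in enumerate(scores) if score >= cutoff]

    return {
        # Assumption: frame 0 is the starting (static) structure.
        "static_hit": scores[0] >= cutoff,
        "frames_above_cutoff": hits,
        "fraction_above_cutoff": len(hits) / len(scores),
        "max_score": max(scores),
    }


if __name__ == "__main__":
    # Toy example: the "frames" are already scores, so score_site is the identity.
    summary = scan_trajectory([0.2, 0.9, 0.4, 0.8], lambda f: f, cutoff=0.5)
    print(summary)
```

A site that misses the cutoff in the static structure but exceeds it in some frames is the kind of case where sampling protein movement could help a static predictor.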
Relevant papers:
Glazer DS, Radmer RJ, Altman RB. (2008). Combining molecular dynamics and machine learning to improve protein function recognition. Pacific Symposium on Biocomputing 2008, 332-43. PMID: 18229697

Predicting metal binding sites
Use the Metal binding site prediction server
FEATURE is especially suited for prediction of metal binding sites, which are small and can be represented naturally by a single point. We have done an in-depth study of zinc binding sites to produce a robust zinc binding site model.
Relevant papers:
Ebert JC, Altman RB. Robust recognition of zinc binding sites in proteins. Protein Sci. 2007, 17(1):54-65. PMID: 18042678

Clustering functional environments
We have traditionally used FEATURE as part of a supervised learning system for modeling functional sites, but it can also be used in an unsupervised fashion to discover potentially novel functional sites. We apply different clustering algorithms to microenvironment vectors and use external knowledge to filter the results. Many known sites are rediscovered, but we also identify intriguing clusters which may represent new annotations for proteins or functional sites. Most recently, we applied these techniques to a data set of cysteine-based microenvironments.
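As a rough illustration of the unsupervised workflow, the sketch below clusters numeric vectors and then uses a set of known site indices as the external knowledge filter. Plain k-means is used only as a stand-in for the different clustering algorithms mentioned above; the function names, the toy vectors, and the representation of annotations as a set of indices are all assumptions.

```python
import math
import random


def kmeans(vectors, k, iterations=20, rng_seed=0):
    """Plain k-means on equal-length numeric vectors; returns one label per vector."""
    rng = random.Random(rng_seed)
    centroids = [list(v) for v in rng.sample(vectors, k)]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assign each vector to its nearest centroid (Euclidean distance).
        labels = [
            min(range(k), key=lambda c: math.dist(v, centroids[c]))
            for v in vectors
        ]
        # Move each centroid to the mean of its current members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def split_by_annotation(labels, known):
    """Separate clusters containing a known site from clusters containing none.

    known: set of vector indices that already carry an annotation.
    """
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    rediscovered = {
        lab: members for lab, members in clusters.items()
        if any(i in known for i in members)
    }
    unannotated = {
        lab: members for lab, members in clusters.items()
        if lab not in rediscovered
    }
    return rediscovered, unannotated


if __name__ == "__main__":
    toy_vectors = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]]
    toy_labels = kmeans(toy_vectors, k=2)
    print(split_by_annotation(toy_labels, known={0}))
```

Clusters in the `unannotated` group are the ones worth inspecting by hand, since they contain no previously known site.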
Relevant papers:
Wu S, Liu T, Altman RB. Analysis of cysteine-based protein microenvironments reveals novel functional sites and recurring structural motifs. (manuscript in preparation)
Yoon S, Ebert JC, Chung EY, De Micheli G, Altman RB. Clustering protein environments for function predictions: finding PROSITE motifs in 3D. BMC Bioinform. 2007 May 22;8(Suppl 4):S10. PMID: 17570144

Improving FEATURE through homology modeling
(Description coming soon)
© 2007 Helix Lab, Stanford University. Please email questions, comments, and issues to shwu19 at stanford.edu