Contents:
ONLINE ANALYSIS
- Using the WebFEATURE Interface
- Viewing results interactively in Chime
OFFLINE ANALYSIS
- Downloading results for further analysis
- Viewing results in RasMol
- Viewing results in PyMOL
INTERPRETING SEQFEATURE FILES
- Points files
- Feature files
- Score files
- Interpreting results
Using the WebFEATURE Interface
WebFEATURE allows a user to scan a molecular structure for a
particular functional site. The WebFEATURE interface is fairly simple
to use.
Step 1. Pick a structure to scan
- If you know the PDB (Protein Data Bank) identification number of
the molecule, enter it into the textfield labeled "PDB
ID".
- If you do not know the PDB id, click on the link to
"PDB". The PDB will allow you to look up a structure using
the molecule's name. Then enter its id number into the PDB ID text
field in WebFEATURE.
- If you have a structure (in PDB format) on your local machine, you
can upload it by clicking the "Browse" button next to
"Upload a structure".
Step 2. Choose a type of site to scan
- Pre-made models are available in the drop-down menu. Choose from hand-curated models (RNA_binding, ATP,
Calcium, and Chloride), or SeqFEATURE models (all upper-case, built from
PROSITE patterns).
Step 3. Choose how you want to receive your results.
You can run WebFEATURE either in:
- "Interactive" mode, where you wait for your results to
return to the web browser. Choose this mode if your structure is not
too large.
- or "E-mail" mode, where WebFEATURE sends you a URL link to your
results via e-mail once the scan of your structure is complete.
Choose this mode if your structure is large (e.g., a ribosomal
structure), as the scan may take around 10-15 minutes.
- Choose either mode by clicking the respective option button. If
you choose "E-mail" mode, you must enter your e-mail address in the
"email" text field.
Step 4. Click the "Submit" button.
Viewing results interactively in Chime
Step 1. Make sure the Chime plug-in is installed properly in your browser.
You can download Chime from http://www.mdlchime.com/chime/.
Follow their instructions for Chime installation.
Step 2. Turn on JavaScript support in your browser.
Step 3. Open a browser window.
- If "interactive" mode was chosen when the WebFEATURE
scan was started, the browser will already be open and the results
will load into it automatically.
- If e-mail mode was chosen during the scan, open the results page
from the URL link provided in the e-mail message sent to you by
WebFEATURE.
Step 4. Play with your results!
Navigating the WebFEATURE Results Window
Below are screenshots of the results page. Each panel is labeled
with its function.
Chime Viewer Window
- The structure, het atoms, and hits above cutoff are automatically
opened in the Chime Viewer Window of the WebFEATURE results page. The
model is displayed in "cartoon" representation, the het atoms in "dot"
representation, and hits above cutoff as red spheres. Use the Hits
Panel and Representation buttons to change their representation and
cutoff.
Hits Panel
- Change the hits (red spheres) visualized in the
Chime viewer by adjusting the cutoff in the Hits Panel, either by
entering a new score cutoff in the "Cutoff" textfield or by clicking
on a bar in the "Score Distribution" histogram.
- Color visualized hits by
score. Click on the "By Score" color button in the Hits
Panel. The lowest-scoring hits are colored toward the blue end of the
color spectrum, while the highest-scoring hits are colored toward the
red end.
Model Info Panel
- The Model Info Panel provides background information about the
model used for scanning.
- Click on the "More Info" button to view the 2-D plot of
the statistical model used in scanning. Red squares represent
property-volume pairs that are abundant in the sites relative to the
nonsites. Green squares represent property-volume pairs that are
deficient in the sites relative to the nonsites.
Manipulating Representations
(Novice Chime Users)
- The buttons below the Chime Viewer provide basic manipulations on
the molecule, hetero atoms, and WebFEATURE hits.
- Click on a selection first, e.g., molecule, water,
or het.
- Then click on a "representation"
button, i.e., spacefill, dot, cartoon, backbone, or sticks.
- To make hits appear bigger, select the desired
cutoff by either clicking on a bar in the Score Distribution
Histogram, or by entering a score cutoff in the "Cutoff"
textfield. Then click the "Spacefill" button under the Chime
Viewer.
(Advanced Chime Users)
- Go to http://www.umass.edu/microbio/chime
for a tutorial on Chime
- WebFEATURE hits are represented as the residue type "HIT" when
loaded into Chime. Use the right-button mouse functions to change the
representations of the hits and molecule. Hit scores are stored as
the B-factor (temperature), so coloring by temperature colors the
hits according to score. Lower-scoring hits are closer to blue in the
color spectrum, while higher-scoring hits are closer to red.
Running Another WebFEATURE Scan
- You can run another scan from the WebFEATURE results page by
filling out the appropriate information in the "WebFEATURE Scan"
Panel. WebFEATURE will return the results in whichever mode was
used for the previous scan, "Interactive" or "E-mail".
Downloading results for further analysis
The results of a WebFEATURE scan can be downloaded for further
analysis using the molecular modeling tools RasMol and PyMOL.
These tools let you integrate WebFEATURE results with other
structural and bioinformatic analyses, generate publication-quality
images, and work through a more powerful command-line interface.
Installation instructions, usage information, and tutorials for these
tools can be found at: http://www.openrasmol.org/ and http://www.pymol.org.
- Before you begin offline analysis, either RasMol or PyMOL
must be installed on your local machine, along with the Python scripts
and modules we provide from this site.
Step 1. Download results
- For RasMol analysis, click on the link to
"pdb-hit" in the "Files:" area of the Hits Panel (see screen shot)
- Name and save your file.
- If you are using Internet Explorer, you will be prompted to
save the pdb-hit file to disk.
- If you are using Netscape, right-click on the link
"pdb-hit". Choose "Save Link As..." and save the
pdb-hit file with the .pdb extension on its filename. If you simply
click on the link instead, the pdb-hit file will open in a new Chime
viewer window.
- For analysis in PyMOL, click on the
link to "hits" in the "Files:" area of the Hits Panel (see screen shot)
- A list of X,Y,Z coordinates and scores will be shown as text in
the browser.
- You can save this as a file or copy and paste the text into a new
file using any text editing tool.
- Be sure to give the file the extension
".hit", e.g., "yourfilename.hit".
Step 2. Download Scripts for Visualization Tools
Step 3. Launch either RasMol or PyMOL
and follow the directions in the next section.
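The hits file saved in Step 1 can also be read with a short script before it is visualized. The following is a minimal Python sketch, not part of the WebFEATURE distribution. It assumes each line holds four whitespace-separated numbers in the order X, Y, Z, score; the exact column layout is not specified here, so check it against your own file. The function name parse_hits and the sample values are made up for illustration.

```python
def parse_hits(text):
    """Parse hit lines, ASSUMING the layout 'x y z score' separated by whitespace."""
    hits = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or short lines
        try:
            x, y, z, score = (float(v) for v in fields[:4])
        except ValueError:
            continue  # skip header or comment lines that are not numeric
        hits.append((x, y, z, score))
    return hits


# Made-up sample data, not real WebFEATURE output.
sample = "10.5 20.0 -3.25 61.2\n11.0 21.5 -2.75 48.9\n"
hits = parse_hits(sample)

# Keep only the hits scoring above a chosen cutoff.
above = [h for h in hits if h[3] > 50.0]
print(len(hits), len(above))
```

If your file uses a different column order or delimiter, only the unpacking line needs to change.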
Viewing results in RasMol
Step 0. Install RasMol
- Download and install RasMol from http://www.openrasmol.org/
Step 1. Download results
- Download the pdb-hit file from the Files section of the Hits Panel (see screenshot)
- Right click on pdb-hit and select "Save Target As..." (Internet Explorer) or "Save Link As..."
(Netscape) to save the pdb-hit file
Step 2. Load structure into RasMol
- Start up RasMol
- Go to File | Open and select the pdb-hit file downloaded from Step 1.
- Initialize the representation by typing the following RasMol commands:
wireframe off; cartoon;
select hetero and not water and not hit; dots;
select hit; color atoms red;
select hit; spacefill off;
select hit and temperature > 5000; spacefill 100;
This will turn off the default wireframe representation of the structure and display it as a cartoon.
It will then display the hetero atoms as dot spheres. Finally it will render the hits above cutoff 50.0
from WebFEATURE as red spheres of radius 100 RasMol units (1/250th Angstroms). The hits are encoded as
HETATMs of atom name HIT and residue name HIT with the hit scores stored in the temperature field.
Cutoff scores must be scaled by 100 before being used as cutoff values for the temperature. For
instance, to have a cutoff of 50.0, use 5000 as the threshold for the temperature.
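Because the hits are ordinary HETATM records, they can also be pulled out of the pdb-hit file with a few lines of Python. The sketch below relies on the standard fixed-column PDB layout (residue name in columns 18-20, coordinates in columns 31-54, temperature factor in columns 61-66). Whether the file stores the score as-is or scaled is not stated here, so the function returns the temperature field unchanged. The function name, the format string, and the sample records are illustrative assumptions.

```python
def read_hit_records(pdb_text):
    """Return (x, y, z, temperature) for every HETATM record with residue name HIT."""
    hits = []
    for line in pdb_text.splitlines():
        if line.startswith("HETATM") and line[17:20] == "HIT":
            x = float(line[30:38])
            y = float(line[38:46])
            z = float(line[46:54])
            temperature = float(line[60:66])  # the hit score lives in this field
            hits.append((x, y, z, temperature))
    return hits


# Format string that follows the fixed-column PDB HETATM layout.
PDB_FMT = "HETATM{:5d} {:<4s} {:>3s} {:1s}{:4d}    {:8.3f}{:8.3f}{:8.3f}{:6.2f}{:6.2f}"

# Made-up records: one hit and one water molecule (which should be ignored).
sample = "\n".join([
    PDB_FMT.format(1, "HIT", "HIT", "X", 1, 10.5, 20.0, -3.25, 1.0, 61.2),
    PDB_FMT.format(2, "O", "HOH", "A", 101, 1.0, 2.0, 3.0, 1.0, 20.0),
])
records = read_hit_records(sample)
print(records)
```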
Step 3. Adjust cutoff and hit representation
- The cutoff can be adjusted by typing the following RasMol commands:
select hit; spacefill off;
select hit and temperature > cutoff_times_100;
spacefill 100;
This will render hits above the specified cutoff as spheres of radius 100 RasMol units (1/250th Angstroms).
The cutoff score must be scaled by 100 before being used as the cutoff value for the temperature.
For instance, to use a cutoff of 50.0, use a threshold of 5000 for the temperature.
- The color of the hits can be changed by:
select hit; color atoms color_name;
- The radius of the hits can be changed by replacing:
spacefill 100;
with:
spacefill hit_radius;
when setting the cutoff. The radius can be specified as an integer for RasMol units (1/250th Angstroms)
or as a value containing a decimal point for Angstroms.
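The commands in Step 3 differ only in the cutoff, color, and radius, so they can be generated rather than retyped. The following Python sketch builds the command text for a given cutoff and applies the scaling by 100 described above. The function name rasmol_hit_script and its defaults are illustrative, not part of RasMol or WebFEATURE; paste the printed lines into the RasMol command line.

```python
def rasmol_hit_script(cutoff, radius=100, color="red"):
    """Build RasMol commands that show hits scoring above `cutoff`.

    radius: an integer means RasMol units, a value with a decimal
    point means Angstroms (as described in Step 3).
    """
    threshold = int(round(cutoff * 100))  # cutoff must be scaled by 100
    return "\n".join([
        "select hit; spacefill off;",
        f"select hit; color atoms {color};",
        f"select hit and temperature > {threshold};",
        f"spacefill {radius};",
    ])


script = rasmol_hit_script(50.0)
print(script)
```

For example, rasmol_hit_script(62.5, radius=0.4, color="blue") selects hits with temperature above 6250 and draws them as blue spheres of radius 0.4 Angstroms.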
Step 4. Manipulate and interact with model
- Information on selecting, changing representation, and interacting with the model is available at
http://www.openrasmol.org/ and
http://www.umass.edu/microbio/RasMol/
Viewing results in PyMOL
Step 0. Install PyMOL and viewhits.py
- Download and install PyMOL from http://www.pymol.org/
- Download the viewhits module for PyMOL
- Save viewhits.py in a location you can easily find again
Step 1. Download results
- Download the hits and pdb file from the Files section of the Hits Panel (see screenshot).
- Right click on pdb and select "Save Target As..." (Internet Explorer) or "Save Link As..." (Netscape) to save the pdb file
- Right click on hits and select "Save Target As..." (Internet Explorer) or "Save Link As..." (Netscape) to save the hits file
Step 2. Load structure into PyMOL
- Start PyMOL
- Load the pdb file previously downloaded in Step 1 by typing:
load pdb_filename
Step 3. Load viewhits module
- Load the viewhits.py module by typing:
run viewhits.py
or
run viewhits_sf.py
Step 4. Adjust cutoff and hit representation
- The hits can be viewed by typing (substitute "viewhits_sf" if you are using viewhits_sf.py):
viewhits hits_filename
- The cutoff can be adjusted by adding the cutoff parameter:
viewhits hits_filename, cutoff
- The color of the hits can be changed by:
viewhits hits_filename, color=color_name
- The radius of the hits can be changed by:
viewhits hits_filename, radius=radius
- The full usage for viewhits is:
viewhits hits_filename [, cutoff=cutoff]
[, radius=radius] [, color=color_name]
- More information is available by typing:
help viewhits
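When the same hits file is viewed repeatedly with different settings, the command line can be assembled programmatically. The sketch below is a small Python helper that follows the full usage shown above; it only builds the text to type into PyMOL and does not call PyMOL itself. The helper name viewhits_command and the file name "myscan.hit" are hypothetical.

```python
def viewhits_command(hits_filename, cutoff=None, radius=None, color=None,
                     command="viewhits"):
    """Build a viewhits command line; pass command="viewhits_sf" for viewhits_sf.py."""
    parts = [f"{command} {hits_filename}"]
    if cutoff is not None:
        parts.append(f"cutoff={cutoff}")
    if radius is not None:
        parts.append(f"radius={radius}")
    if color is not None:
        parts.append(f"color={color}")
    return ", ".join(parts)


line = viewhits_command("myscan.hit", cutoff=50.0, color="red")
print(line)
```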
Step 5. Manipulate and interact with model
- Information on selecting, changing representation, and interacting with the model is available at http://www.pymol.org/
Interpreting SeqFEATURE Files
SeqFEATURE accepts input for calculating feature vectors in the form of "points files" (extension .points or .ptf). Points files contain a label for the protein, the X, Y, and Z coordinates, as well as a label for the residue ID, chain ID, and atom ID for each point. An example is shown below (the first column sometimes contains a unique identifier for the site, similar to "Env_1bqy_0"):

You might encounter these files if you run a full SeqFEATURE library scan on a structure, or if you download, install, and run Feature from your own machine.
SeqFEATURE calculates feature vectors for each point it is given and outputs them into feature files (extension .features or .ff), one feature file for each points file. Each line consists of one feature vector; each feature vector contains the values calculated for 480 features (see publications for details), tab-delimited, with the site identifier in the first column and the site description (residue ID, chain, and atom) in the last column. You might encounter these files if you run a full SeqFEATURE library scan on a structure, or if you download, install, and run Feature from your own machine.
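A feature file laid out as described above can be read line by line. The following Python sketch splits a tab-delimited line into the site identifier, the 480 feature values, and the site description, and rejects lines with the wrong number of values. The function name is illustrative, and the sample description string is made up, since the exact format of the description column is not specified here.

```python
N_FEATURES = 480  # number of feature values per vector, as stated above


def parse_feature_line(line):
    """Split one tab-delimited line into (site_id, values, description)."""
    fields = line.rstrip("\n").split("\t")
    site_id, description = fields[0], fields[-1]
    values = [float(v) for v in fields[1:-1]]
    if len(values) != N_FEATURES:
        raise ValueError(f"expected {N_FEATURES} values, found {len(values)}")
    return site_id, values, description


# Made-up line: real feature values would not all be zero.
sample = "\t".join(["Env_1bqy_0"] + ["0.0"] * N_FEATURES + ["ALA1:A@CB"])
site_id, values, description = parse_feature_line(sample)
print(site_id, len(values), description)
```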
SeqFEATURE outputs results of its scans into score files (extension .scores or .zscores), usually one score file per feature file. Score files contain the site identifier in the first column, the score or z-score in the second column, the X, Y, and Z coordinates, and the site description (residue ID, chain, and atom). See example below:

In the case of data from the PDB scan, which can be retrieved using either PDB ID or SeqFEATURE model, the second column contains the name of the model, and the site description is split into its three constituent parts. See below:

Statistics for each model can be used to evaluate score files and are available here. The model statistics file contains the AUC, partial AUC (calculated using only the top-scoring 100 negative sites as the background), the 100% specificity z-score cutoff (at which 100% of the negatives from the training set are predicted correctly), the corresponding training set sensitivity at that cutoff, the 99% specificity z-score cutoff, the sensitivity at that cutoff, and the 95% specificity z-score cutoff and sensitivity at that cutoff.
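One way to use these statistics is to report, for each scored site, the strictest specificity cutoff that its z-score meets. The Python sketch below parses a line in the first score-file layout described above (site identifier, score, X, Y, Z, site description) and compares the score with a set of cutoffs. It assumes whitespace-delimited columns and treats a score equal to the cutoff as passing; neither detail is specified here. All names, cutoff values, and the sample line are hypothetical.

```python
from collections import namedtuple

ScoreRecord = namedtuple("ScoreRecord", "site_id score x y z description")


def parse_score_line(line):
    """Parse one line: id, score, x, y, z, then the site description."""
    fields = line.split()
    if len(fields) < 6:
        raise ValueError("expected at least 6 columns")
    score, x, y, z = (float(v) for v in fields[1:5])
    return ScoreRecord(fields[0], score, x, y, z, " ".join(fields[5:]))


def specificity_level(zscore, cutoffs):
    """Return the highest specificity (percent) whose cutoff is met, or None."""
    for level in sorted(cutoffs, reverse=True):
        if zscore >= cutoffs[level]:
            return level
    return None


# Made-up cutoffs; take the real ones from the model statistics file.
cutoffs = {100: 4.0, 99: 2.5, 95: 1.5}
record = parse_score_line("Env_1bqy_0 3.1 12.0 8.5 -4.25 ALA1:A@CB")
level = specificity_level(record.score, cutoffs)
print(record.site_id, record.score, level)
```

A result of None means the score falls below even the 95% specificity cutoff.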
When evaluating the strength of a prediction, one should consider a number of factors:
- Model AUC and ROC curve (How well does the model predict true positives as opposed to false positives?)
- Score distribution of training sites (How well does the model distinguish between real sites and background?)
- Model cutoff (How much higher than the cutoff is the score?)
- Multiple hits (Does the model predict multiple hits in the same region, or do other similar models predict hits in the same region?)
- Visual inspection (Does the local region contain features characteristic of the predicted function?)
- Corroboration with other methods
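The "multiple hits" check can be made concrete by grouping hits that lie close together in space. The Python sketch below links any two hits closer than a chosen distance and returns the resulting groups as lists of indices. This is a simple single-linkage grouping chosen for illustration, not a method prescribed by WebFEATURE, and the distance threshold and coordinates are made-up values. It uses math.dist, which requires Python 3.8 or later.

```python
import math


def cluster_hits(points, max_dist):
    """Group point indices so that linked points are within max_dist of each other."""
    clusters = []
    unvisited = set(range(len(points)))
    while unvisited:
        seed = unvisited.pop()
        group, frontier = [seed], [seed]
        while frontier:
            i = frontier.pop()
            # Collect every unvisited point close enough to point i.
            near = [j for j in unvisited
                    if math.dist(points[i], points[j]) <= max_dist]
            for j in near:
                unvisited.remove(j)
            group.extend(near)
            frontier.extend(near)
        clusters.append(sorted(group))
    return sorted(clusters)


# Made-up coordinates: three hits in a row and one far away.
points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0), (10.0, 10.0, 10.0)]
clusters = cluster_hits(points, max_dist=1.5)
print(clusters)
```

Groups with several members indicate regions where hits reinforce each other; isolated hits deserve closer visual inspection.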
ROC and other performance plots can be found on each model's Info page, accessible through the model drop-down menu on the main WebFEATURE page.