Loopy

Loopy is a program which tries to find likely loops to connect fragments of a partial protein structure based on the expected structure and the density map.

Building loops using structural and density information

Loopy builds the loops in three phases. First, a tree of possible CAs between the fragments is built; next, the unlikely ones are removed and the rest of the main chain atoms are determined; finally, the best loops are selected. The tree can be built towards the C-terminus of the protein, towards the N-terminus, or both.

Protocol

Job title
Title for the current experiment
Experimental data
Select whether to use a map or an mtz file. In the case of an mtz file, the program will use fft to compute the corresponding map.

Files

Input map
Input map to use
MTZ
Mtz file to use. F and PHI are used to compute the corresponding map using fft. We need to save this file, since we need to reread the map more than once.
Input pdb
Input pdb for your protein. Please remove the residues you would like to rebuild from this file. This frontend of Loopy will not rebuild any residues that are still present.
Name for first loop pdb
The name of this file is used as a format to determine the names of the other loops to save
Number of loops
Select the number of loops you'd like the program to save. The number of loops left after pruning may well be less than this number. In that case, fewer loops are saved than you asked for. If no loops are found at all, adjust the parameters, specifically those in the folder "Selecting best CAs".

Crystal Parameters

The spacegroup name and cell dimensions are extracted from the map/mtz file.

Definition of loop

In this folder you select which loop to build.
N-term anchor
Anchor residue of a fragment on the N terminus side of the protein. Note that if you want to rebuild some residues, you need to remove them from the pdb file
C-term anchor
Anchor residue of a fragment on the C terminus side of the protein. Note that if you want to rebuild some residues, you need to remove them from the pdb file
Loop length
Number of residues in the loop including the two anchor points
Loop sequence
Sequence of amino acids (one letter code) of the residues in the loop including the two anchor points.
Build both ways
If selected (default), trees of possible loops are generated starting from the N terminus anchor and from the C terminus anchor. The best loops to save are selected from the combined set. Since the tetrapeptides at either end of the fragments will in general differ, as will the map around them, starting from a different anchor influences the loops generated.
Build towards C-terminus
If you didn't select to build both ways, you can indicate whether you want to build the tree towards the C terminus of the protein, or towards the N terminus
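
The settings in this folder have to agree with each other: the loop length and the loop sequence both include the two anchor points. A minimal sketch of such a consistency check is given below. It is not part of Loopy; the function name is hypothetical, and it assumes that residues are numbered consecutively between the anchors (no gaps or insertion codes), which may not hold for your pdb file.

```python
from typing import List

# The 20 standard amino acids in one letter code.
ONE_LETTER_CODES = set("ACDEFGHIKLMNPQRSTVWY")


def check_loop_definition(n_anchor: int, c_anchor: int,
                          loop_length: int, sequence: str) -> List[str]:
    """Return a list of problems; an empty list means the settings agree.

    Hypothetical helper, assuming consecutive residue numbering.
    """
    problems = []
    if c_anchor <= n_anchor:
        problems.append("C-term anchor must come after N-term anchor")
    # Both anchors are counted in the loop length.
    expected = c_anchor - n_anchor + 1
    if loop_length != expected:
        problems.append(
            f"loop length {loop_length} does not match anchors ({expected})")
    if len(sequence) != loop_length:
        problems.append(
            f"sequence has {len(sequence)} residues, expected {loop_length}")
    unknown = sorted(set(sequence.upper()) - ONE_LETTER_CODES)
    if unknown:
        problems.append("unknown one letter codes: " + "".join(unknown))
    return problems
```

For anchors 10 and 15, a loop length of 6 and a six letter sequence pass the check; a loop length of 5 is reported as inconsistent with the anchors.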

Selecting best loops

In this folder you can set the thresholds used to prune incorrect loops from the tree and the weights used to select the best loops.
Deviation distance loop connection
The distance between the end CA of the loop and the connecting CA of the structure should be approximately equal to the CA-CA distance. Set the allowed error in this distance.
Threshold density correlation CAs
After pruning on the distance, the next step is to select the best trees based on the density correlation of the CAs. This number sets the number of best loops kept based on the density correlation of the CAs only.
Structural threshold
You can prune on the structure of the end CA of the loop and the connecting tetrapeptide. Set the minimal value for the log likelihood of this structure.
Minimum for this stage
Set this value if you want to keep at least a certain number of loops after pruning on the structure, overruling the structural threshold if necessary.
Maximum for this stage
Set this value if you want to ensure that the number of loops does not exceed a certain amount after structural pruning. Only the loops with the highest structural likelihood are kept.
Main chain density correlation
After pruning on the structure, the peptide planes for all residues in the selected loops are determined. The loops are sorted by the density correlation of the main chain atoms (including Cb for non-GLY). This threshold sets the number of best loops kept.
Weight main chain
Finally the best loops are selected by determining the density correlation of the main chain atoms (including Cb if present) and the correlation of the side chains. You can use this weight to give the main chain correlation more or less impact.
Weight side chain
Finally the best loops are selected by determining the density correlation of the main chain atoms (including Cb if present) and the correlation of the side chains. You can use this weight to give the side chain correlation more or less impact.
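
The interplay between the structural threshold, the minimum and maximum for that stage, and the final weights can be sketched as follows. This is an illustration only and not Loopy's code: the Loop record and the function names are hypothetical, and combining the two correlations as a weighted sum is an assumption, since this page does not state how the weights are applied.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Loop:
    # Hypothetical record holding the scores of one candidate loop.
    name: str
    structure_loglik: float
    main_chain_cc: float
    side_chain_cc: float


def prune_on_structure(loops: List[Loop], threshold: float,
                       minimum: int, maximum: int) -> List[Loop]:
    """Keep loops above the structural threshold, within [minimum, maximum]."""
    ranked = sorted(loops, key=lambda l: l.structure_loglik, reverse=True)
    kept = [l for l in ranked if l.structure_loglik >= threshold]
    if len(kept) < minimum:
        # Overrule the threshold to keep at least `minimum` loops.
        kept = ranked[:minimum]
    # Keep only the loops with the highest structural likelihood.
    return kept[:maximum]


def select_best(loops: List[Loop], weight_main: float,
                weight_side: float, number_of_loops: int) -> List[Loop]:
    """Rank on a weighted sum of the correlations (assumed combination)."""
    def score(l: Loop) -> float:
        return weight_main * l.main_chain_cc + weight_side * l.side_chain_cc
    return sorted(loops, key=score, reverse=True)[:number_of_loops]
```

If fewer loops survive the pruning than the number asked for, the final list is simply shorter, which matches the behaviour described under "Number of loops".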

Selecting best CAs

During the building of the tree of possible paths, shells of CAs are generated (see top). In this folder you can set the thresholds and weights that determine how the best CAs are selected from all the CAs in one such shell. Note: generated CAs with a negative density correlation are removed immediately.
Likelihood threshold
This is the threshold for the log likelihood of a CA to represent the fifth CA of a pentapeptide, based on density correlation, CA-CA distance, and structure.
Weight distance
Weight for the distance likelihood
Weight density
Weight for the likelihood of the density correlation
Weight structure
Weight for the structural likelihood
Structure table to C
Filename for the probability table for the angles and dihedral angles of a pentapeptide in the direction of the C terminus
Structure table to N
Filename for the probability table for the angles and dihedral angles of a pentapeptide in the direction of the N terminus
Minimum distance CA
Measure for the minimal distance between CAs from the same shell. The CA with the best likelihood is kept.
Maximum number of CAs
Maximum number of CAs from each shell to keep. Note: The CAs kept will all be used as a new suggestion for the current residue in the loop, and thus as a new node in the tree. The number of possible loops generated will expand exponentially with this number.
Force minimum number of CAs
Force a minimum number of CAs in a shell to be kept, even if the likelihood is less than the threshold set. This makes the loop building a bit more flexible in low density areas, or for pentapeptide structures which occur less often.
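
How the settings in this folder might work together is sketched below. It is a hedged illustration, not Loopy's implementation: the Candidate record and function names are hypothetical, the three log likelihoods are assumed to be combined as a weighted sum, and the minimum distance is assumed to be applied greedily from the best candidate downwards.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Candidate:
    # Hypothetical record for one generated CA in a shell.
    xyz: Tuple[float, float, float]
    ll_distance: float
    ll_density: float
    ll_structure: float


def combined_loglik(c: Candidate, weights: Tuple[float, float, float]) -> float:
    """Weighted sum of the three log likelihoods (assumed combination)."""
    w_dist, w_dens, w_struct = weights
    return (w_dist * c.ll_distance + w_dens * c.ll_density
            + w_struct * c.ll_structure)


def select_cas(shell: List[Candidate], weights: Tuple[float, float, float],
               threshold: float, min_distance: float,
               max_cas: int, force_min: int) -> List[Candidate]:
    """Select the CAs of one shell that become new nodes in the tree."""
    ranked = sorted(shell, key=lambda c: combined_loglik(c, weights),
                    reverse=True)
    kept: List[Candidate] = []
    for c in ranked:
        if len(kept) >= max_cas:
            break
        # Of CAs closer together than min_distance, only the best is kept.
        if any(math.dist(c.xyz, k.xyz) < min_distance for k in kept):
            continue
        # Below the threshold a CA is only kept to reach the forced minimum.
        if combined_loglik(c, weights) >= threshold or len(kept) < force_min:
            kept.append(c)
    return kept
```

Every CA kept becomes a new node, so the tree grows quickly: with up to 5 CAs kept per shell and 6 residues to place, up to 5**6 = 15625 paths can be generated.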

Generating CAs

This folder describes how the shells of CAs are generated.
Select generation CA shell
By default, a shell with a uniform and regular distribution of CAs at exactly the CA-CA distance is generated. You can also choose a uniform and random distribution of the CAs. In that case the shell is generated with a given thickness.
Number of CAs
Number of CAs generated within a shell. In the case of a regular distribution this number is rounded downwards to the closest Fibonacci number.
CA-CA distance
Distance to use between successive CAs.
Shell thickness
(random shell only) Thickness of the generated shell of CAs.
SD CA-CA distance
(random shell only) We assume that the probability for the CA-CA distance is described by a Gaussian. With this value you can set the standard deviation of the Gaussian function.
Keep CAs with negative density halfway
Due to the structure of a peptide, we expect the density correlation halfway between successive CAs to be positive. A quick first selection of CAs from the shell is thus based not only on the density correlation at the generated CA, but also on the density correlation at the midpoint. The default for this option is false.
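
The rounding of the number of CAs to a Fibonacci number, and one possible regular distribution on a shell, can be sketched as follows. The rounding follows the description above. The golden-angle spiral used to place the points is an assumption: it is a common way to spread points evenly over a sphere, but this page does not say which construction Loopy uses. The function names and the value of 3.8 for the CA-CA distance in the test are illustrative only.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float, float]


def floor_fibonacci(n: int) -> int:
    """Round n downwards to the closest Fibonacci number."""
    if n < 1:
        raise ValueError("number of CAs must be at least 1")
    a, b = 1, 1
    while b <= n:
        a, b = b, a + b
    return a


def regular_shell(center: Point, ca_ca_distance: float, n: int) -> List[Point]:
    """Place n points evenly on a sphere around center (golden-angle spiral).

    Assumed construction; every point lies at exactly ca_ca_distance.
    """
    golden_angle = math.pi * (3.0 - math.sqrt(5.0))
    points = []
    for i in range(n):
        z = 1.0 - (2.0 * i + 1.0) / n        # evenly spaced heights in (-1, 1)
        r = math.sqrt(1.0 - z * z)           # radius of the circle at height z
        theta = golden_angle * i
        points.append((center[0] + ca_ca_distance * r * math.cos(theta),
                       center[1] + ca_ca_distance * r * math.sin(theta),
                       center[2] + ca_ca_distance * z))
    return points
```

Asking for 100 CAs in a regular shell thus yields 89 points, since 89 is the largest Fibonacci number not exceeding 100.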

Density Handling

In this folder you can set the details of the map handling.
Interpolation method
Choose the interpolation method used to determine the density correlation from the map. The quick but less accurate option is Cubic Interpolation. More accurate, but significantly slower, is Best, which is based on a Gaussian density correlation function that simulates the shape of an atom. Note: the density correlation is determined many times, so the difference in speed between the options can be huge. The side chain correlation at the end is always determined using the Gaussian version.
Atom radius
Radius used to determine the density correlation
B factor
At the moment the values for the B factor in the pdb are ignored. The value set here is used for all atoms.
Remove atoms by factor
To avoid overlaps between the generated loop and the protein structure in the pdb, atoms in the pdb (apart from dummies, or residues in chains consisting only of main chain atoms) are removed from the map. This is done by flipping the density in the map to negative values at the position of these atoms. This value sets the factor by which the density is changed.
Density threshold residues
Threshold for the density correlation of residues after loop building. This is used to check overlap between the loop and possible fragments of main chain atoms in the pdb.
Density threshold dummies
Threshold for the density correlation of dummies after loop building. This is used to check overlap between the loop and possible dummy atoms in the pdb.
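
The removal of atoms from the map described under "Remove atoms by factor" can be illustrated with a small sketch. This is a strongly simplified picture and not Loopy's code: the function name is hypothetical, the grid is assumed to be orthogonal with the origin at grid point (0, 0, 0), crystal symmetry and periodicity are ignored, and setting the density to minus the factor times its absolute value is only one plausible reading of "flipping the density to negative values".

```python
import math
from typing import List, Sequence, Tuple

Grid = List[List[List[float]]]
Point = Tuple[float, float, float]


def remove_atoms_from_map(grid: Grid, atoms: Sequence[Point],
                          atom_radius: float, factor: float,
                          spacing: float = 1.0) -> Grid:
    """Return a copy of the map with negative density around each atom.

    Assumed behaviour: within atom_radius of an atom the density becomes
    -factor * |density|; everywhere else it is left unchanged.
    """
    masked = [[row[:] for row in plane] for plane in grid]
    for i, plane in enumerate(grid):
        for j, row in enumerate(plane):
            for k, density in enumerate(row):
                position = (i * spacing, j * spacing, k * spacing)
                if any(math.dist(position, atom) <= atom_radius
                       for atom in atoms):
                    masked[i][j][k] = -factor * abs(density)
    return masked
```

A candidate CA placed on top of an existing atom then sees negative density, so it is discarded by the rule that CAs with a negative density correlation are removed immediately.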

Log files of Loopy

Loopy writes its own logs to file. The extent of the messages depends on the levels you set in this folder.
Message level
Level of the messages to be written to file (value from 0 to 9).
Abort level
If a message of this level is encountered, terminate the program. Standard values are 7 or 8
Message file
Name for the message file (plain text) of Loopy.
XML output file
Name for the XML message file (xml format) of Loopy.

Krista Joosten
Last modified: Tue Aug 15 13:34:54 CEST 2006