Automated Solvent Building

 

The solvent building procedure within ARP/wARP Version 7.0 has changed only marginally since the previous release. Within this module, restrained reciprocal space refinement is carried out with REFMAC while ARP/wARP automatically adjusts the solvent structure. The resolution of the data should be 2.5 Å or higher. The output is the protein model with the solvent molecules transformed by symmetry operations to lie around the protein.

 

The ARP/wARP solvent building module requires the X-ray data (in MTZ format) and the protein model (in PDB format) without solvent or with a partial solvent model.

 

 

o               MTZ in: X-ray data in the MTZ format containing structure factor amplitudes and their standard deviations.

o               Fobs Sigma: If the MTZ column labels for structure factor amplitudes and their standard deviations have obvious names, they will be recognised automatically. Otherwise, use the scrolling button, navigate to List All Labels and choose the appropriate ones.

o               Starting model in: Provide the PDB file with the coordinates of the protein. If the file already contains some solvent sites, these will be updated during the iterative solvent building.

o               Output model: Provide the name of the file to which the output PDB of the protein with the built solvent will be written.

 

 

There are a number of options that can be added either in the main GUI panel or under the Parameters section. You normally should not need to worry about these. A brief description is given below.

 

o               ARP/REFMAC refinement cycles: By default 20 cycles will be carried out. Please monitor R factor / R free for convergence.

o               Free R flag: It is advantageous to use the R free flag for solvent building. If you choose to use R free, additional options appear within the section Refmac parameters. The default is not to use R free.

 

o               Add atoms: This is followed by two numbers defining the thresholds (in density sigmas above the mean) for addition and removal of solvent atoms. The defaults are 3.4 and 1.0, respectively, which should work for most cases.

o               Disable Wilson plot statistics check: The current Wilson plot checking routine is probably too stringent. You may disable the check and the warnings if you are sure that the X-ray data are of high quality.

 

o               Cycles of refinement in each Refmac run: Refmac is invoked to refine the hybrid model before the density maps are computed. The default is 1 cycle and there is usually no need to change this.

o               Matrix weight for Xray / Geometry: The default is automatic weighting. This has proved to work well, and there is probably no need to change this parameter.

o               Scaling model: The default is to use solvent correction for scaling the low angle part of the X-ray data. You can turn this off (choose simple solvent correct) if your low angle data are missing (e.g. your data have a low resolution cutoff of about 8 Å) or if they suffer from missing overloaded reflections.

o               Scaling B factor: The default is to use an anisotropic B factor for scaling the X-ray data. You can turn this off (choose isotropic scaling B factor) if your data are systematically incomplete (e.g. a cone is missing in reciprocal space).

o               Data with free R label: This parameter appears if the free R flag is chosen for refinement of the protein part of the model. Here you can provide a column label for the free R flag.

o               Scaling and sigmaa calculations: This parameter appears if the free R flag is chosen for refinement of the protein part of the model. The scaling and the calculation of sigmaA coefficients by Refmac can be based on the free reflections (this is the default) or on all reflections.

o               TLS refinement: The default is not to do a TLS refinement of the hybrid model.

o               Input a user-defined library file: If you already have a Refmac-style cif library, e.g. for a ligand already present in the model, you can input it here.

 

o               Space group, Cell, ARP/wARP asymmetric unit, Wilson B factor and Solvent content are derived automatically from the MTZ and the PDB files, displayed for information only and cannot be changed. However, you may want to check whether their values conform to your expectations.

o               Resolution: By default all reflections present in the MTZ file will be used. You can check the box (Use reflections between) and then narrow the range if you are aware of certain deficiencies in your data.
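The role of the two Add atoms thresholds can be illustrated with a small shell sketch. This is only a conceptual illustration of a sigma-based cutoff using the default values 3.4 and 1.0 quoted above; it is not the actual ARP/wARP algorithm, which may apply further criteria. The function name classify_site and the labels "peak", "atom", "add", "remove" and "unchanged" are assumptions made for this example.

```shell
#!/bin/sh
# classify_site KIND DENSITY [ADD_THRESHOLD] [REMOVE_THRESHOLD]
# Hypothetical helper, not part of ARP/wARP.
# KIND is "peak" (no atom at the site yet) or "atom" (existing solvent atom).
# DENSITY is the map value at the site in sigmas above the mean.
classify_site() {
    awk -v kind="$1" -v rho="$2" -v add="${3:-3.4}" -v rem="${4:-1.0}" 'BEGIN {
        if (kind == "peak" && rho >= add)      print "add"
        else if (kind == "atom" && rho < rem)  print "remove"
        else                                   print "unchanged"
    }'
}

classify_site peak 4.1   # prints "add"
classify_site atom 0.6   # prints "remove"
classify_site atom 2.0   # prints "unchanged"
```

Raising the first number makes the addition of new solvent sites more conservative; raising the second removes weakly supported sites more aggressively.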

 

 

o               Refinement with refmac: The R factor (and R free, if requested) are printed after refinement of the protein with Refmac. Check that the R factor decreases upon solvent building.

o               Job termination: The statement Task completed successfully indicates that the job has finished with no error. An error statement

QUITTING ARP/wARP module stopped with an error message: name_of_the_programme

indicates that one of the modules of the task has terminated with an error message. Please refer to the specified log file.
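The two termination statements described above can also be checked from a script. The following is a minimal sketch that assumes the statements appear verbatim in a log file; the function name check_arp_log is hypothetical, and the sample log is a stand-in created only for the demonstration, not real ARP/wARP output.

```shell
#!/bin/sh
# check_arp_log LOGFILE
# Hypothetical helper: prints "error", "success" or "unknown" depending on
# which of the termination statements quoted above occurs in LOGFILE.
check_arp_log() {
    if grep -q "QUITTING" "$1"; then
        echo "error"
    elif grep -q "Task completed successfully" "$1"; then
        echo "success"
    else
        echo "unknown"
    fi
}

# Demonstration on a stand-in log file (not real ARP/wARP output).
sample_log=$(mktemp)
echo "Task completed successfully" > "$sample_log"
status=$(check_arp_log "$sample_log")
echo "$status"
rm -f "$sample_log"
```

The error statement is tested first so that a log containing both statements is reported as an error.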

 

 

Running solvent building from command line (auto_solvent.sh)

The script file auto_solvent.sh in the $warpbin directory allows you to run the solvent building as a single-line command without the use of the GUI. The use of auto_solvent.sh is fairly simple. The script prints out help information if it is invoked without arguments.

Required keywords are: datafile (followed by the mtz-file name with the full path) and protein (followed by the pdb-file name of the protein model with the full path).

Optional keywords include: workdir (followed by the full path to the working directory), solventfileout (followed by the name of the PDB file where the output will be written), fp (followed by the fp label), sigfp (followed by the sigfp label) and freer (followed by the Rfree label). The defaults for fp and sigfp are FP and SIGFP, respectively. Alternatively, if the mtz file contains only one column for structure factor amplitudes and only one column for their standard deviations, these will be taken. The number of cycles (default is 20) can be changed with the keyword restrcyc. A user-defined library and the TLS tensors for Refmac can be supplied with the keywords extralibrary and tlsin.
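Because datafile, protein and workdir expect full paths, it can help to resolve relative file names before calling the script. The following is a minimal sketch in POSIX shell; the helper abs_path and the file name mydata.mtz are assumptions made for this example and are not part of ARP/wARP.

```shell
#!/bin/sh
# abs_path FILE
# Hypothetical helper: prints FILE with its directory resolved to a full path.
# The directory must exist; the file itself is not checked.
abs_path() {
    dir=$(cd "$(dirname "$1")" && pwd) || return 1
    echo "$dir/$(basename "$1")"
}

# Hypothetical relative file name in the current directory.
mtz_full=$(abs_path "./mydata.mtz")
echo "datafile $mtz_full"
```

The same helper can be applied to the protein model and to the working directory.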

Example call (assumed to be started from workdir where test data should reside):

 

$warpbin/auto_solvent.sh                                                  \
              datafile {mtzfile}                                          \
              protein {starting_PDB_file}                                 \
              [workdir {FULLPATH_WORKING_DIRECTORY}]                      \
              [solventfileout {output_PDB_file}]                          \
              [fp {fp_label}] [sigfp {sigfp_label}] [freer {freer_label}] \
              [restrcyc {number_of_cycles}]                               \
              [extralibrary {user_defined_library_for_Refmac5}]           \
              [tlsin {fixed pre-refined TLS tensors from Refmac5}]

 

The script will then create a directory in the workdir whose name will be printed and where a parameter file will be created. The log files and additional output files as well as the building results can be found in the directory created by auto_solvent.sh.
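A concrete call might look like the sketch below. It uses only the keywords described above; the file names mydata.mtz, protein.pdb and solvent_out.pdb are hypothetical and should be replaced with your own. The command is printed first and executed only if $warpbin is set and auto_solvent.sh is found there.

```shell
#!/bin/sh
# Hypothetical input files in the current directory; replace with your own.
workdir=$PWD
mtzfile="$workdir/mydata.mtz"
pdbfile="$workdir/protein.pdb"

# Argument list built only from the keywords documented above.
set -- datafile "$mtzfile" protein "$pdbfile" workdir "$workdir" \
       solventfileout solvent_out.pdb fp FP sigfp SIGFP restrcyc 20

# Show what would be run.
dry_run="auto_solvent.sh $*"
echo "$dry_run"

# Run only when ARP/wARP is set up in this shell.
if [ -n "${warpbin:-}" ] && [ -x "$warpbin/auto_solvent.sh" ]; then
    "$warpbin/auto_solvent.sh" "$@"
fi
```

The fp and sigfp keywords are shown with their default values and could be omitted; restrcyc 20 likewise repeats the default number of cycles.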