Running DOCK

Setting up Directories

Before starting DOCK, it is a good idea to confirm that there is enough free disk space where you plan to put the output. Each DOCK run requires a file called INDOCK, which contains the input parameters. Since some of the parameters will be different for each run, create a separate directory for each run to hold its INDOCK file and, in most cases, the corresponding output. Using a separate directory is a good idea even for just one run.
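
The preparation described above can be scripted. The following is a minimal sketch, not part of DOCK: the function name prepare_run_dir and the min_free_bytes threshold are assumptions chosen for illustration, and how much space a run needs depends on your own system.

```python
import shutil
from pathlib import Path


def prepare_run_dir(run_dir, min_free_bytes):
    """Create a per-run directory and check free space on its filesystem.

    min_free_bytes is a threshold you choose yourself (an assumption here);
    DOCK does not prescribe a value.
    """
    path = Path(run_dir)
    path.mkdir(parents=True, exist_ok=True)  # one directory per run
    free = shutil.disk_usage(path).free      # bytes available to the user
    if free < min_free_bytes:
        raise RuntimeError(
            f"only {free} bytes free in {path}; wanted at least {min_free_bytes}"
        )
    return path
```

A call such as prepare_run_dir("run1", 500 * 1024**2) would create run1 if it does not exist and raise an error if less than about 500 MB is free there.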


Creating INDOCK

Parameters in the INDOCK file are specified by keywords, which are listed one to a line. The desired value of each parameter follows the keyword on the same line. You may create the INDOCK file with a text editor or copy one of the examples supplied with DOCK and modify it to suit your needs. The reference manual includes several sample INDOCK files. You only need to include in your INDOCK file those parameters relevant to your calculation or variables whose values you want to change from their defaults. Any line beginning with # will be considered a comment and ignored; comments can be quite useful in making the file more understandable.
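
To illustrate the layout just described (one keyword per line, value on the same line, # for comments), here is a small sketch of a reader for INDOCK-style text. It is not DOCK's own parser. It assumes that the keyword and its value are separated by whitespace and that leading whitespace before # is tolerated; the values in the sample are hypothetical.

```python
def parse_indock(text):
    """Return a {keyword: value} dict from INDOCK-style text.

    Assumptions (not confirmed by the DOCK manual): keyword and value are
    separated by whitespace, and blank lines are ignored.
    """
    params = {}
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # skip blank lines and comments
        parts = stripped.split(None, 1)
        keyword = parts[0]
        value = parts[1].strip() if len(parts) > 1 else ""
        params[keyword] = value
    return params


# Hypothetical file contents, for illustration only.
SAMPLE = """\
# contact scoring run
distmap_file  example_distmap
restart       no
"""

if __name__ == "__main__":
    for keyword, value in parse_indock(SAMPLE).items():
        print(keyword, "=", value)
```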

Keywords available for INDOCK are listed here. The values suggested are reasonable initial guesses; they may not be the best values for your particular system. We suggest that you experiment with them to see what works best for you.

The input and output parameters are relevant to all DOCK runs.

The matching parameters determine how many different ligand orientations DOCK will examine; they are relevant to all DOCK runs. Increasing or decreasing the number of orientations increases or decreases the amount of computer time used by DOCK and the disk space used by single-mode runs. It may take some experimentation with these parameters to discover what works best for a particular system. Be cautious when increasing bin sizes and bin overlaps; small changes can produce large increases in the number of orientations generated.

Are there standard bin sizes?

One set of parameters applies when running SINGLE mode to generate many orientations for one molecule; another set applies when running SEARCH mode to obtain the best orientation for each of many molecules in a database.

Which scoring parameters to use depends on the scoring option chosen. distmap_file should be included for any option involving contact scoring (contact, contact+delphi, contact+forcefield). delphi_file is used only for contact+delphi. The remaining scoring parameters pertain to force field scoring and are used with the contact+forcefield or forcefield options.
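
The dependence of the file parameters on the scoring option can be written down as a small check. This is a sketch only: the function names are hypothetical, the params dict is assumed to hold keywords already read from INDOCK, and the force field parameters are omitted because their keywords are not named in this section.

```python
# Options that involve contact scoring, as listed in the text above.
CONTACT_OPTIONS = {"contact", "contact+delphi", "contact+forcefield"}


def required_scoring_files(option):
    """List the file keywords the text above ties to a scoring option."""
    required = []
    if option in CONTACT_OPTIONS:
        required.append("distmap_file")
    if option == "contact+delphi":
        required.append("delphi_file")
    return required


def missing_scoring_files(option, params):
    """Return required keywords that are absent from a parsed INDOCK dict."""
    return [key for key in required_scoring_files(option) if key not in params]
```

For example, missing_scoring_files("contact+delphi", {"distmap_file": "x"}) reports that delphi_file still needs to be supplied.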

The chemical matching parameters are used to specify how labeled spheres and atoms (if any are used) are to be matched. Leave them out if you do not use chemical matching.

Parameters for force field score optimization (minimization) of orientations are listed here. Minimization varies the position of ligands in order to find orientations with improved force field scores. Consult the reference manual for a description of how to use minimization. It lists the relevant INDOCK keywords and gives guidelines for choosing parameter values.


Starting a DOCK Run

Before running DOCK, it is a good idea to check whether there are other jobs running on the same machine. DOCK runs use substantial amounts of CPU time; consider any other users sharing your computers when deciding whether to start more than one run at a time. Be aware of any policies your site has regarding CPU time used.

Start DOCK from the directory you created for INDOCK. Check a few minutes after you start the run to be sure that it is still going; if it has stopped, look for mistakes in the input. Beginners should check disk usage occasionally while the job is running, just in case the program is creating incredibly large files which might overflow the available space.

During a SEARCH run (which can take anywhere from hours to weeks to finish), you can follow DOCK's progress through the database by looking at the last few lines of OUTDOCK. The number preceding the last nathvy tells approximately how many ligands have been examined.
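
Looking at the last few lines of OUTDOCK can be done with any file viewer; as one possibility, the sketch below reads them from a script. The helper name tail_lines is hypothetical, and no assumption is made about what the lines contain.

```python
from collections import deque


def tail_lines(path, n=10):
    """Return the last n lines of a text file, without trailing newlines."""
    with open(path, "r", errors="replace") as handle:
        # A bounded deque keeps only the most recent n lines in memory.
        return [line.rstrip("\n") for line in deque(handle, maxlen=n)]


if __name__ == "__main__":
    # Run from the directory in which DOCK was started.
    for line in tail_lines("OUTDOCK", 5):
        print(line)
```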


Restarting a Search Run

In SEARCH mode, DOCK periodically saves in the output file the information necessary to restart the search from its current location in the database. If there is a power failure or the system crashes, you can set up a new run to start where the last one was stopped. First, you must rename OUTDOCK, since DOCK will try to create a new OUTDOCK file, and it cannot do so if one already exists. Then set the restart parameter in INDOCK to yes and start the job again. (Do not change the remaining files, since DOCK needs them to restart successfully.) When the restarted run finishes, the sorted list of ligands in the output file will include the top scorers from the entire database. However, some of the statistics in OUTDOCK will refer to just those ligands examined in the restarted run - see the reference manual for details.
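
The two restart steps (rename OUTDOCK, then set restart to yes) could be automated roughly as follows. This is a sketch under stated assumptions: the function name and the backup file name OUTDOCK.previous are hypothetical, and the INDOCK line is assumed to have the form "restart" followed by whitespace and the value. No other file in the run directory is touched.

```python
from pathlib import Path


def prepare_restart(run_dir, backup_name="OUTDOCK.previous"):
    """Rename OUTDOCK and set 'restart yes' in INDOCK inside run_dir."""
    run = Path(run_dir)
    outdock = run / "OUTDOCK"
    backup = run / backup_name  # hypothetical name; any unused name works
    if backup.exists():
        raise FileExistsError(f"{backup} already exists; choose another name")
    if outdock.exists():
        outdock.rename(backup)  # DOCK cannot create OUTDOCK if one exists

    indock = run / "INDOCK"
    new_lines = []
    found = False
    for line in indock.read_text().splitlines():
        parts = line.split()
        if parts and parts[0] == "restart":
            new_lines.append("restart yes")  # assumed keyword/value layout
            found = True
        else:
            new_lines.append(line)  # leave every other line untouched
    if not found:
        new_lines.append("restart yes")
    indock.write_text("\n".join(new_lines) + "\n")
    return backup
```

After calling prepare_restart on the run directory, start the job again as before.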


Looking at the Results

DOCK puts its output in the directory it was started from, that is, where the INDOCK file is. For SINGLE runs, there is one file of orientations per sphere center; the names of these files are outfil+the cluster number. For SEARCH runs, there are files containing top-scoring ligands for each type of scoring chosen. Ligands with the highest contact scores are in a file named outfil+the cluster number, top electrostatic ligands are in a file named outfil+eel+the cluster number, and the file of ligands with the best force field scores is called outfil+ff+the cluster number. The ligand files are in extended PDB format, which differs from PDB format in the columns to the right of the coordinates in the ATOM records. Each orientation or ligand in the file has a separate residue number. The scores are given in the REMARK records at the beginning of each residue and are also listed near the end of OUTDOCK.

Extended PDB format allows more information to be included in the atom records. Scores for options not used originally may be quickly evaluated for ligand files with this format using scoreopt or scoreopt2. However, some molecular display programs may not accept this format. If you find that you cannot display your ligands, you can convert them to PDB format using the program x2pdb supplied with DOCK. A useful way to view ligands is to display the surface of the protein active site along with a few important residues, then examine ligands one at a time. showesp may be used to visualize the electrostatic potential due to the protein, and showprobe can display the interaction energy of a probe with the force field grid. splitmol can be used to separate ligand orientations into individual files if necessary.


Curator: Daniel Gschwend, gschwend@cgl.ucsf.edu (rev. 1 September 1995)