Chemical matching, also called "labeling" or "coloring," was introduced in DOCK 2.3 (Shoichet and Kuntz, 1993). In DOCK 2.3, which was never distributed, five labels were hardwired into the program: neutral, positive, negative, electron-donor, and electron-acceptor. The matching rules were also hardwired: neutral-neutral, positive-negative, donor-acceptor. Integrating the DOCK 2.3 labeling code into DOCK 3.0 presented some difficulties, because the bin arrays, focusing, and ligand database format had diverged between 2.3 and 3.0. In addition, receptor spheres had been labeled in an ad hoc way in DOCK 2.3. Labeling in DOCK 3.5 has therefore been reimplemented more flexibly: the user chooses the number of labels, their names, and the matching rules.
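As an illustration only, the difference between hardwired and user-chosen rules can be sketched in Python. Nothing here is DOCK code: the names HARDWIRED_RULES, hardwired_match, and RuleTable are hypothetical, the two-way reading of the DOCK 2.3 rules is an assumption, and storing user rules as ordered (ligand, receptor) pairs assumes that rules are directional.

```python
# Hypothetical sketch of label ("color") matching; not DOCK source code.

# DOCK 2.3 style: fixed rules over a fixed label set.
# Assumption: each hardwired rule applies in both directions,
# so a rule is stored as an unordered set of labels.
HARDWIRED_RULES = {
    frozenset({"neutral"}),               # neutral-neutral
    frozenset({"positive", "negative"}),  # positive-negative
    frozenset({"donor", "acceptor"}),     # donor-acceptor
}


def hardwired_match(ligand_label, receptor_label):
    """Return True if the two labels satisfy one of the fixed rules."""
    return frozenset({ligand_label, receptor_label}) in HARDWIRED_RULES


class RuleTable:
    """DOCK 3.5 style: the user supplies label names and matching rules."""

    def __init__(self):
        # Assumption: a rule is an ordered (ligand_color, receptor_color) pair.
        self.rules = set()

    def add_rule(self, ligand_color, receptor_color):
        self.rules.add((ligand_color, receptor_color))

    def allows(self, ligand_color, receptor_color):
        return (ligand_color, receptor_color) in self.rules


table = RuleTable()
table.add_rule("polar", "polar")  # "polar" is just an example label name
```

With the rule set held as data rather than code, adding a label or a rule changes only the table contents, not the matching routine.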

header line                  file type
DOCK 3.5 ligand_atoms        ligand database
DOCK 3.5 receptor_spheres    receptor sphere clusters/site points
DOCK 3.5 parameter           INDOCK
DOCK 3.5 option              DOCKOPT

This first line is of variable length and free format, with spaces separating the fields.
The sphere cluster file now contains integer labels assigned to each sphere. The new format is described under sphgen. The new format file contains a list of all of the spheres generated at the end of the file as cluster 0. This obviates the need for the tosph program, which is now no longer distributed.
The ligand database file also contains atom labels; see the mol2db section for a description of the new format. Both sphere and ligand files now contain color tables, which are simply lists of the colors (labels), near the beginning of the file, and in the same format for both spheres and ligands. If the user decides not to do atom or sphere coloring, the color table is empty, and the atom and sphere colors are set to 0, which means uncolored. No provision has been made for coloring ligand spheres, since there is no obvious way to do it.
New formats for the INDOCK and DOCKOPT files are described in the dock3.5 section. The new format uses keywords, rather than position in the file, to identify input variables.
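To make the keyword idea concrete, here is a minimal Python sketch of keyword-driven parameter reading. It is not the DOCK parser. The function name parse_keyword_input is hypothetical, and the syntax is assumed: one command per line, the keyword first, values separated by whitespace, and "#" as the comment marker. Only the keyword names chemical_matching, match, and case_sensitive come from this document, and treating the mere presence of chemical_matching as "on" is likewise an assumption.

```python
def parse_keyword_input(text, comment_char="#"):
    """Parse keyword-style parameter text into a settings dict.

    Assumed (hypothetical) syntax: one command per line, keyword first,
    whitespace-separated values after it; comment lines are skipped.
    """
    settings = {"chemical_matching": False, "case_sensitive": True, "match": []}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(comment_char):
            continue  # blank or commented-out line
        keyword, *values = line.split()
        if keyword == "chemical_matching":
            # Assumption: the presence of the line switches matching on.
            settings["chemical_matching"] = True
        elif keyword == "case_sensitive":
            # "yes" is the documented default.
            settings["case_sensitive"] = (values[0].lower() == "yes") if values else True
        elif keyword == "match":
            ligand_color, receptor_color = values
            settings["match"].append((ligand_color, receptor_color))
        else:
            settings[keyword] = values  # keep unrecognised keywords as-is
    if not settings["case_sensitive"]:
        # Fold case once all lines are read, so line order does not matter.
        settings["match"] = [(l.lower(), r.lower()) for l, r in settings["match"]]
    return settings


example = """\
chemical_matching
case_sensitive no
match polar POLAR
"""
settings = parse_keyword_input(example)
```

Because each value is found by its keyword, the three lines of the example could appear in any order and yield the same settings, which is the practical advantage over a position-based format.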
Usage
sphgen
This program is run in the same manner as in DOCK 3.0, since it does no
labeling. The output file contains an empty color table. The output file will
be larger than before, because of the all-sphere cluster "0" at the end.
chemgrid
The format of the INCHEM file has changed somewhat. The lines previously
defining the grid center, grid dimensions, and output grid box file name have
been replaced by the name of the input grid box. The box file is in the format
created by showbox. The center and dimensions are read from the remark
lines of the box file; the rest of the box file is ignored. In the OUTPARM
file the program now reports all residues calculated to have a net charge. A
non-integer charge probably indicates a parameterization error for that
residue. The lines in OUTPARM reporting charged residues each contain the
string "CHARGED RESIDUE", so they can be found with grep.
dock
The rationale behind a special chemical_matching command, rather than just
allowing the presence of one or more match commands to implicitly turn on
chemical (colored) matching, is that if one desires to toggle back and forth
between steric and chemical matching, it is easier to comment out or uncomment
just one line, chemical_matching, than to do so for several match lines. In
addition to the chemical_matching command, one must have one or more
match ligand_color receptor_color commands to specify the labeled matching
rules. The case_sensitive command determines whether color names must have the
same case in order to be regarded as being the same color. A yes (default)
means that polar and POLAR are different colors; a no means that they are
regarded as being the same color.
mol2db
Each entry in a SYBYL MOL2 file may contain not only a primary molecule, the
ligand, but also other molecules, such as counter ions. The program has always
endeavored to identify which atoms belong to the primary molecule so that only
it, and not the counter ions, is written to the DOCK database output file. The
program now uses the bond records rather than the substructure number to
determine which atoms make up the primary molecule. This avoids possible
problems such as a protein which has a different substructure number for each
amino acid residue.
Database Processing
We are now distributing a conversion scheme for creating SYBYL MOL2 files and
DOCK databases out of Molecular Design Limited SD files (such as those used
with MACCS or ISIS). Two methods are provided; one is the scheme traditionally
used here at UCSF and the other is a recently developed alternative. Although
the latter is still under active development, it is robust and easy to use and
therefore likely to supplant the traditional method. Please see the
Database Preparation section for further details.
File Format Interchange
We have located a program, available free of charge via anonymous ftp, which
interchanges just about any molecular file format. This program is called
babel. Versions supporting Unix, Mac, and MSDOS platforms are available. We
are not vouching for its accuracy, only highlighting its availability.
Accessory Programs
Several new utilities have been included with the DOCK 3.5 release. These are
programs we have found to be useful with various aspects of the docking
process. Please note that some older programs have been superseded. What
follows is an overview of utilities present in the DOCK 3.5 distribution.
Programs new to the 3.5 release are in italics. For further documentation on
each of these programs, follow the appropriate link.
INDOCK files
INDOCK files to the new keyword format
Delphi
DOCK and Delphi (Klapper et al., 1986; Gilson, Sharp, and Honig, 1987)
have diverged in recent years with respect to input/output formats. We are
no longer actively supporting an interface to Delphi. Version 3.0 is the
last version of Delphi known to function correctly with DOCK, although more
recent releases have not been examined.
Backward Compatibility
A large effort has been made to make DOCK as backward compatible as possible.
DOCK 3.5 can read any combination of old (3.0) and new (3.5) format ligand
database, sphere cluster, INDOCK, and DOCKOPT files. Of course, labeled matching
can take place only if the first three of these files are in the new format.
The new versions of sphgen, cluster, showsphere, and showbox read and write
both new and old format sphere cluster files. mkdb and convsyb write only
earlier format ligand database files (i.e., without labeled atoms).
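One way a reader could tell the formats apart is by the header lines listed earlier in this document. The Python sketch below is only an illustration: the names HEADER_TO_FILE_TYPE and classify_first_line are hypothetical, and treating every file without a recognised "DOCK 3.5" header as old (3.0) format is an assumption, not a description of how DOCK itself decides.

```python
# Header lines and file types as listed in this document.
HEADER_TO_FILE_TYPE = {
    "DOCK 3.5 ligand_atoms": "ligand database",
    "DOCK 3.5 receptor_spheres": "receptor sphere clusters/site points",
    "DOCK 3.5 parameter": "INDOCK",
    "DOCK 3.5 option": "DOCKOPT",
}


def classify_first_line(first_line):
    """Return (format, file_type) for the first line of a file.

    Assumption: a file whose first line has no recognised header is
    treated as old (3.0) format, with an unknown file type (None).
    """
    # The header is free format with spaces separating the fields,
    # so collapse runs of whitespace before comparing.
    normalized = " ".join(first_line.split())
    for header, file_type in HEADER_TO_FILE_TYPE.items():
        # Allow extra fields after the header, since the line is of
        # variable length.
        if normalized == header or normalized.startswith(header + " "):
            return ("new", file_type)
    return ("old", None)


result = classify_first_line("DOCK 3.5   ligand_atoms")
```

Checking each input file independently in this way is what would allow old and new format files to be mixed in a single run.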
Internal Changes Invisible to the User
Many DOCK arrays and variables that were previously passed as subroutine
arguments are now located in new common blocks in new include (.h) files.
Matching code that was located in four source files (single.f, search.f,
fractm.f, and sfract.f) is now coalesced into one source file: match.f. Scoring
code from these four files has similarly been coalesced into a single file:
mscore.f. The keyword parameter reading code is in keyword.f.
The .com compilation shell scripts have been replaced by a Makefile for the
main directory and each subdirectory. The Unix make command will recompile only
those source files that have changed since the last compilation. Each Makefile
contains rules regarding what source files an object file depends upon, and
which object files an executable file depends upon. The programmer must keep
the Makefile up to date, taking into account new files and new dependencies as
they occur. If there is an error in the Makefile, the compiled program will
have a bug in it. For example, if two source files share a common block in an
include file, and one of the source files does not have this include file
listed in its dependency list, it will not be recompiled when the include file
changes, and the two .o object files will have incompatible common blocks.
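The Makefile dependency pitfall described above can be illustrated with a small Python model of the timestamp comparison that make performs. This is a simplified sketch, not make itself: the function stale_objects and the file names a.f, b.f, and common.h are hypothetical, and modification times are plain integers.

```python
def stale_objects(dependencies, mtimes):
    """Return the objects that would be rebuilt.

    An object is stale when any prerequisite listed for it has a newer
    modification time than the object itself. Prerequisites that are
    not listed are never consulted, which is the source of the bug.
    """
    return sorted(
        obj
        for obj, prereqs in dependencies.items()
        if any(mtimes[p] > mtimes[obj] for p in prereqs)
    )


# Both a.f and b.f include common.h, but the rule for b.o omits it.
broken_rules = {
    "a.o": ["a.f", "common.h"],
    "b.o": ["b.f"],  # error: common.h is missing from the list
}
fixed_rules = {
    "a.o": ["a.f", "common.h"],
    "b.o": ["b.f", "common.h"],
}

# common.h was edited (time 3) after both objects were built (time 2).
mtimes = {"a.f": 1, "b.f": 1, "a.o": 2, "b.o": 2, "common.h": 3}

rebuilt_with_broken_rules = stale_objects(broken_rules, mtimes)
rebuilt_with_fixed_rules = stale_objects(fixed_rules, mtimes)
```

With the broken rules only a.o is rebuilt, so b.o keeps the old common block layout; with the fixed rules both objects are rebuilt and stay consistent.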
Curator: Daniel Gschwend, gschwend@cgl.ucsf.edu (rev. 1 September 1995)