Database Preparation
mkdb
George Seibel
The interactive program mkdb converts Cambridge Structural Database
(CSD) formatted coordinate files to a database format used by DOCK (SEARCH
mode; contact scoring only). Only the largest molecule in each CSD record is
written out, thus removing counterions and solvent, and handling instances of
multiple molecules per unit cell. The CSD-formatted coordinates can be taken
from the formatted file distributed by Cambridge, or obtained by running the
Cambridge program RETRIEVE. The user is asked for the input and output file
names, and whether or not hydrogen atoms should be included in the output if
present in the input. While hydrogens are not presently used in generating
orientations or contact scores, it is often convenient to include them anyway so
that they will appear in the final ligand output from DOCK.
The database format produced by mkdb contains the following
information: the first line contains the CSD reference code, the number of
heavy (nonhydrogen) atoms, and the number of hydrogens (format A8,2I3); the
next lines list the atomic numbers of the heavy atoms (format 40I2, reused for
as many lines as needed); finally, the coordinates of the heavy atoms are given
(format 16I5, likewise reused).
The stored coordinates are the original coordinates translated into the
positive octant (so that no value is negative), multiplied by 1000, and rounded
to the nearest integer. The first three numbers are the
coordinates for the first heavy atom, the next three are for the second heavy
atom, and so on. The hydrogen coordinates, if any, are listed after the heavy
atom coordinates.
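As an illustration of the record layout described above, the following Python
sketch writes and reads a single record. It is not part of mkdb or DOCK, and
the function names (encode_coords, write_record, parse_record) are
hypothetical. It assumes that the reference code is left-justified in its
8-character field, that the translation subtracts the per-axis minimum taken
over all atoms, and that hydrogen coordinates continue in the same 16I5 stream
as the heavy atom coordinates. None of these details is stated in the text, so
each should be checked against a real mkdb output file.

```python
def fixed_width_lines(values, width, per_line):
    """Format integers in fixed-width fields, starting a new line every per_line fields."""
    return [
        "".join(f"{v:{width}d}" for v in values[i:i + per_line])
        for i in range(0, len(values), per_line)
    ]


def encode_coords(coords):
    """Translate, multiply by 1000, and round to integers.

    Assumption: the translation subtracts the per-axis minimum.
    """
    mins = [min(c[k] for c in coords) for k in range(3)]
    return [round((c[k] - mins[k]) * 1000) for c in coords for k in range(3)]


def write_record(refcode, heavy_z, heavy_xyz, hydrogen_xyz=()):
    """Return the lines of one record: header, atomic numbers, coordinates."""
    # Header: A8,2I3 (assumption: refcode left-justified, blank padded).
    header = f"{refcode:<8.8}{len(heavy_z):3d}{len(hydrogen_xyz):3d}"
    # Assumption: heavy atoms and hydrogens share one translation and one stream.
    ints = encode_coords(list(heavy_xyz) + list(hydrogen_xyz))
    return [header] + fixed_width_lines(heavy_z, 2, 40) + fixed_width_lines(ints, 5, 16)


def read_fields(lines, start, count, width, per_line):
    """Read count fixed-width integers from lines[start:]; return (values, next index)."""
    values, i = [], start
    while len(values) < count:
        n = min(per_line, count - len(values))
        values.extend(int(lines[i][k * width:(k + 1) * width]) for k in range(n))
        i += 1
    return values, i


def parse_record(lines):
    """Return (refcode, atomic numbers, heavy coordinates, hydrogen coordinates)."""
    refcode = lines[0][:8].strip()
    n_heavy = int(lines[0][8:11])
    n_hyd = int(lines[0][11:14])
    heavy_z, i = read_fields(lines, 1, n_heavy, 2, 40)
    ints, _ = read_fields(lines, i, 3 * (n_heavy + n_hyd), 5, 16)
    # Undo the multiplication by 1000; the translation offset is not stored.
    xyz = [tuple(v / 1000 for v in ints[j:j + 3]) for j in range(0, len(ints), 3)]
    return refcode, heavy_z, xyz[:n_heavy], xyz[n_heavy:]


# Made-up example data: two heavy atoms (C, O) and one hydrogen.
record = write_record(
    "ABCDEF",
    [6, 8],
    [(0.0, 1.5, -0.25), (1.234, 0.0, 0.5)],
    [(-0.5, 0.75, 0.0)],
)
for line in record:
    print(line)
```

Because the translation offset is not part of the record, parse_record can only
return coordinates in the translated frame, not the original CSD frame.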
Curator: Daniel Gschwend, gschwend@cgl.ucsf.edu (rev. 1 September 1995)