
mkdb

George Seibel

The interactive program mkdb converts Cambridge Structural Database (CSD) formatted coordinate files to the database format used by DOCK (SEARCH mode; contact scoring only). Only the largest molecule in each CSD record is written out, which removes counterions and solvent and handles records with multiple molecules per unit cell. The CSD-formatted coordinates can be taken from the formatted file distributed by Cambridge or obtained by running the Cambridge program RETRIEVE. The user is asked for the input and output file names and whether hydrogen atoms, if present in the input, should be included in the output. Although hydrogens are not presently used in generating orientations or contact scores, it is often convenient to include them anyway so that they appear in the final ligand output from DOCK.
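The idea of keeping only the largest molecule can be illustrated with a small Python sketch. This is not mkdb's code: it assumes that "largest" means the bonded fragment with the most atoms and that a bond list is available for the record. The function name and the way bonds are supplied are hypothetical.

```python
from collections import defaultdict


def largest_fragment(n_atoms, bonds):
    """Return the sorted atom indices of the largest bonded fragment.

    Assumption (not from the mkdb documentation): "largest" is taken to
    mean the connected fragment with the most atoms. On a tie, the
    fragment encountered first is kept.
    """
    # Build an undirected adjacency map from the bond list.
    neighbours = defaultdict(set)
    for a, b in bonds:
        neighbours[a].add(b)
        neighbours[b].add(a)

    seen = set()
    best = []
    for start in range(n_atoms):
        if start in seen:
            continue
        # Depth-first walk over everything bonded to `start`.
        stack = [start]
        seen.add(start)
        fragment = []
        while stack:
            atom = stack.pop()
            fragment.append(atom)
            for nxt in neighbours[atom]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        if len(fragment) > len(best):
            best = fragment
    return sorted(best)
```

For a record with a four-atom molecule, a two-atom solvent, and a lone counterion, only the indices of the four-atom molecule are returned.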

The database format produced by mkdb contains the following information. The first line contains the CSD reference code, the number of heavy (nonhydrogen) atoms, and the number of hydrogens (format A8,2I3). The next lines list the atomic numbers of the heavy atoms (reusable format 40I2, that is, up to 40 two-character integers per line). Finally, the coordinates of the heavy atoms are given (reusable format 16I5, that is, up to 16 five-character integers per line). The stored coordinates are the original coordinates translated to the positive quadrant so that no value is negative, multiplied by 1000, and rounded to the nearest integer. The first three numbers are the x, y, and z coordinates of the first heavy atom, the next three are those of the second heavy atom, and so on. The hydrogen coordinates, if any, are listed after the heavy atom coordinates.
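The record layout described above can be sketched in Python as a writer and a matching reader. This is an illustration under stated assumptions, not mkdb itself: the function names are hypothetical, the translation is assumed to shift the minimum of each axis to zero, and hydrogen coordinates are assumed to start on a new line after the heavy atom coordinates. Because the translation is not stored, the reader recovers shifted coordinates only.

```python
def encode_coords(coords):
    """Shift, scale by 1000, and round a list of (x, y, z) tuples.

    Assumption: the shift moves the minimum of each axis to zero.
    """
    if not coords:
        return []
    mins = [min(c[i] for c in coords) for i in range(3)]
    return [int(round((c[i] - mins[i]) * 1000)) for c in coords for i in range(3)]


def pack(values, per_line, width):
    """Format integers as fixed-width fields, `per_line` fields per line."""
    return ["".join(f"{v:{width}d}" for v in values[i:i + per_line])
            for i in range(0, len(values), per_line)]


def format_record(refcode, atomic_numbers, heavy_xyz, hydrogen_xyz=()):
    """Return the lines of one record: A8,2I3 header, 40I2, then 16I5."""
    n_heavy = len(atomic_numbers)
    # Heavy atoms and hydrogens share one translation.
    ints = encode_coords(list(heavy_xyz) + list(hydrogen_xyz))
    lines = [f"{refcode:<8.8}{n_heavy:3d}{len(hydrogen_xyz):3d}"]
    lines += pack(list(atomic_numbers), 40, 2)
    lines += pack(ints[:3 * n_heavy], 16, 5)
    # Assumption: hydrogen coordinates begin on a fresh line.
    lines += pack(ints[3 * n_heavy:], 16, 5)
    return lines


def unpack(line, width):
    """Split a line into fixed-width integer fields, skipping blank fields."""
    line = line.rstrip("\n")
    fields = (line[i:i + width] for i in range(0, len(line), width))
    return [int(f) for f in fields if f.strip()]


def triples(values):
    """Group a flat integer list into (x, y, z) tuples, undoing the x1000 scale."""
    return [tuple(v / 1000 for v in values[i:i + 3])
            for i in range(0, len(values), 3)]


def parse_record(lines):
    """Parse one record; returns (refcode, atomic_numbers, heavy_xyz, hydrogen_xyz)."""
    header = lines[0]
    refcode = header[:8].strip()
    n_heavy, n_hyd = int(header[8:11]), int(header[11:14])
    pos = 1
    numbers, heavy, hyd = [], [], []
    while len(numbers) < n_heavy:
        numbers += unpack(lines[pos], 2)
        pos += 1
    while len(heavy) < 3 * n_heavy:
        heavy += unpack(lines[pos], 5)
        pos += 1
    while len(hyd) < 3 * n_hyd:
        hyd += unpack(lines[pos], 5)
        pos += 1
    return refcode, numbers, triples(heavy), triples(hyd)
```

Note that an I5 field holds at most 99999, so with this scaling a translated coordinate of 100 or more would not fit in its field; the sketch does not check for that.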



Curator: Daniel Gschwend, gschwend@cgl.ucsf.edu (rev. 1 September 1995)