You need some version of MS if you plan on using SPHGEN to generate site points for docking (the standard approach; however, see below). MS was written by Mike Connolly (Science 221: 709, 1983 and J. Appl. Cryst. 16: 548, 1983) while at UCSF. There are now two versions of MS, a UCSF version and a QCPE version, which differ primarily in file formats. Both versions are supported by auxiliary utilities provided with DOCK. If you obtain the MidasPlus display software from the UCSF Computer Graphics Lab, you will be provided with the UCSF version of MS (dms). Alternatively, you can obtain the QCPE version of MS for a nominal charge from QCPE, item #429.
Tel. (812) 855-5539
Fax. (812) 855-4784
Eml. qcpe@ucs.indiana.edu
It is important to realize that DOCK uses only the centers of the
spheres in the docking process; no other information about the spheres
is used (e.g. radius). These sphere centers are more generally referred
to as site points, as they are in fact only 3D coordinates of points
within the target site. Site points need not be generated by SPHGEN,
although we recommend this method as it does an excellent job of capturing
shape information about the site of interest. Any method which results
in points throughout the site could be used. For example, you could use
the coordinates of a known ligand; you could use points along a solvent
accessible surface of the receptor; you could use a grid of points in
the site if you had years of CPU time... Alternatives that take into
account the chemistry of the site are also possible, for example,
Goodford's GRID program. Interaction "hotspots" generated by this
program can be used as surrogate sphere centers (we in fact provide a
few tools to help interface with the GRID program). Be imaginative -
you needn't use SPHGEN.
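As an illustration of the "coordinates of a known ligand" route, here is a minimal Python sketch that pulls x, y, z out of the ATOM/HETATM records of a PDB file using the standard fixed columns. The function name site_points_from_pdb is made up for this example and is not part of DOCK, and the sketch stops at a plain list of coordinates: writing them out in the sphere file format DOCK reads is not shown.

```python
def site_points_from_pdb(pdb_text):
    """Return a list of (x, y, z) tuples, one per ATOM/HETATM record.

    Hypothetical helper for illustration only; not a DOCK utility.
    """
    points = []
    for line in pdb_text.splitlines():
        if not line.startswith(("ATOM", "HETATM")):
            continue
        if len(line) < 54:
            # Too short to hold all three coordinate fields; skip it.
            continue
        # Standard PDB fixed columns (1-based): x 31-38, y 39-46, z 47-54.
        x = float(line[30:38])
        y = float(line[38:46])
        z = float(line[46:54])
        points.append((x, y, z))
    return points
```

Each returned tuple is one candidate site point; since DOCK uses only the sphere centers, no radius is carried along.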
We have utilities to smoothly automate the process of getting from
standard PDB files to surfaces of desired regions. These are
distributed with DOCK 3.5 (the autoMS suite), but we will be glad to
distribute them to DOCK users who have more dated versions of the
software.
Since our resident expert on Delphi (Brian Shoichet) left the group
some years ago, we have no longer been able to actively support Delphi
forcefield scoring with DOCK. The interface is still in the DOCK code,
so in theory Delphi scoring should work. However, Biosym has changed
their Delphi output format in recent years, and we do not provide a
reformatting tool. It is my understanding that upon contacting Biosym,
they can provide you with a binary-to-ASCII reformatting program so
that Delphi output from Biosym will be usable with DOCK.
We do not use receptors with hydrogens in generating molecular surfaces.
This is for several reasons, among them 1) the MS parameters for heavy atoms
are already rather generous, and 2) the surface often ends up being too
bumpy when hydrogens are included. Hydrogens are not used in docking, but are required for force-field scoring.
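If your receptor file already carries hydrogens, one way to get a heavy-atom-only copy for surface generation is sketched below in Python. This is not one of the DOCK utilities; the function names are invented for the example. It assumes standard PDB columns, trusts the element field (columns 77-78) when it is filled in, and otherwise falls back on a crude atom-name test that can misjudge unusual names, so check the result by eye.

```python
def is_hydrogen(line):
    """Guess whether an ATOM/HETATM record describes a hydrogen."""
    # Prefer the element symbol in columns 77-78 when it is present.
    element = line[76:78].strip().upper()
    if element:
        return element == "H"
    # Fallback (assumption): hydrogen names look like "HA", "1HB", "HD11".
    name = line[12:16].strip().upper()
    return name.lstrip("0123456789").startswith("H")


def strip_hydrogens(pdb_text):
    """Return the PDB text with hydrogen ATOM/HETATM records removed."""
    kept = []
    for line in pdb_text.splitlines():
        if line.startswith(("ATOM", "HETATM")) and is_hydrogen(line):
            continue
        kept.append(line)
    return "\n".join(kept)
```

Keep the original file with hydrogens around as well, since they are needed again for force-field scoring.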
The following are a few thoughts on philosophies behind using the
Cambridge Structural Database (CSD) for docking. First, it can be
misleading to believe that, just because these structures have been
crystallographically determined, they are more likely to represent
bioactive conformations. Often, the biologically observed bound
conformation is indeed not the global minimum energy conformation.
It remains to be proven that using the CSD as opposed to a database
of rule-based conformations (e.g. by CONCORD, such as the ACD, CMC
or MDDR) is any more effective. Second, we have found that getting
hold of CSD compounds can be tremendously difficult, with a very
high attrition rate at the stage of obtaining compounds for assay.
This should be a concern when using databases of compounds for purposes
of "easy" access to biological characterization. We don't mean to
deter you from using the CSD, only to highlight two common issues.
Yeah, yeah, I know. Because of the many and varied output types and formats of DOCK,
detecting exactly how the output was generated is difficult. So...this
is a complex way of saying that this program is likely not to be
infallible... If you find problems, please check with the Kuntz group
for the latest version.
There are three fundamental operations in using this feature:
These points are addressed in some detail here:
Some general points: labels can be anything and mean anything you want
them to (you might have a set of labels "purple, green, red, yellow,
blue", or a set "hot, warm, cold", whatever these might mean to you).
The rules you set up define which labels can match with which other
labels. The matching rules can use the same label more than once (for
example, cold-hot and cold-warm). However, each ligand atom or receptor
site point can only have one label assigned to it. By "matching
labels", we mean that come docking time when one atom is supposed to be
placed upon a particular site point, the program assesses whether the
corresponding labels can match together depending upon the rules you
have set forth. Not all atoms or site points need be assigned a label:
unlabeled points match everything.
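The matching logic described above can be pictured with a small Python sketch. This is only a model of the idea, not DOCK's implementation or input format: the names make_rules and labels_match are invented here, None stands in for "unlabeled", and two assumptions are made that you should check against your own setup: rules are treated as symmetric (cold-hot also allows hot-cold), and a label matches itself only if a rule says so.

```python
def make_rules(pairs):
    """Build a rule set from (label_a, label_b) pairs.

    Assumption: rules are symmetric, so each pair is stored unordered.
    """
    return {frozenset(pair) for pair in pairs}


def labels_match(atom_label, site_label, rules):
    """Return True if an atom with atom_label may sit on a site point
    with site_label under the given rules."""
    # Unlabeled atoms or site points (None) match everything.
    if atom_label is None or site_label is None:
        return True
    return frozenset((atom_label, site_label)) in rules


# The example from the text: "cold" is used in two rules.
rules = make_rules([("cold", "hot"), ("cold", "warm")])
```

With these rules, labels_match("cold", "hot", rules) is True, labels_match("hot", "warm", rules) is False, and anything paired with None is True.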
Final comments: chemical matching is a very flexible tool and can
address a lot of needs, but may be daunting to implement. Keep with it
and you will probably be pleased with its operation. Also, do not think
that chemical matching must be related to chemical complementarity. One
advanced implementation might be towards finding mechanism-based
inhibitors. For example, one could label certain reactive functional
groups in ligands appropriately, and certain receptive centers in the
target site for a reaction to occur similarly. By setting up matching
of these labels, one could attempt to locate ligands which placed
reactive groups adjacent to, say, nucleophiles on the receptor. Be
creative!
No! Every system is different, and you should experiment to see what
works best. In general, larger bin sizes and/or larger overlaps increase
sampling, effectively having DOCK "try harder". There are no standardized
parameters which work for everything, sorry. Be careful when increasing
the parameters, as small changes can have exponential effects on run-times.
For cases when you have an experimentally determined binding mode for
a ligand (e.g. substrate or inhibitor), make sure you can reproduce this
with your chosen parameters before attempting a database screen.
As a general rule of thumb, generate around 5,000-10,000 matches per ligand
when using force-field score minimization, and at least 20,000 matches without
optimization.
WARNING--parameters not found for
ATOM 11 HA THR 1
negative -100 -3
hydrophobic -3 3
positive 3 100
where the numbers represent ranges for the electrostatic potential that
would correspond to the listed color, i.e. hydrophobic regions might be
in areas of low absolute ESP. Assigning labels according to
electrostatic potential can be difficult and may require some
experimentation to get the desired results.
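To make the range table above concrete, here is a small Python sketch of the lookup it implies. The names ESP_RANGES and label_for_esp are invented for this example and are not DOCK's. How the boundary values (-3 and 3) are treated is an assumption: each range is taken as including its lower bound and excluding its upper bound, and a value outside every range gets no label at all.

```python
# (label, lower bound, upper bound), copied from the ranges listed above.
ESP_RANGES = [
    ("negative", -100.0, -3.0),
    ("hydrophobic", -3.0, 3.0),
    ("positive", 3.0, 100.0),
]


def label_for_esp(esp, ranges=ESP_RANGES):
    """Return the label whose range contains esp, or None if none does.

    Assumption: ranges are half-open, lower <= esp < upper.
    """
    for label, low, high in ranges:
        if low <= esp < high:
            return label
    return None
```

For instance, label_for_esp(-10.0) gives "negative" and label_for_esp(0.5) gives "hydrophobic"; a point that comes back as None would simply be left unlabeled.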
Curator: Malin Young,
mmyoung@polonius.ucsf.edu