DOCK FAQ

A DOCKumentation supplement

rev. 31 August 1995


Current DOCK Clients

  1. Do I really need QCPE's version of MS? Where can I get it?

  2. You need some version of MS if you plan on using SPHGEN to generate site points for docking (the standard approach; however, see below). MS was written by Mike Connolly (Science 221: 709, 1983 and J. Appl. Cryst. 16: 548, 1983) while at UCSF. There are now two versions of MS, a UCSF version and a QCPE version, which differ primarily in file formats. Both versions are supported by auxiliary utilities provided with DOCK. If you obtain the MidasPlus display software from the UCSF Computer Graphics Lab, you will be provided with the UCSF version of MS (dms). Alternatively, you can obtain the QCPE version of MS for a nominal charge from QCPE, item #429.

  3. Do I need to use SPHGEN? What are some alternative methods of generating site points?

  4. It is important to realize that DOCK uses only the centers of the spheres in the docking process; no other information about the spheres is used (e.g. radius). These sphere centers are more generally referred to as site points, as they are in fact only 3D coordinates of points within the target site. Site points need not be generated by SPHGEN, although we recommend this method as it does an excellent job of capturing shape information about the site of interest. Any method which results in points throughout the site could be used. For example, you could use the coordinates of a known ligand; you could use points along a solvent accessible surface of the receptor; you could use a grid of points in the site if you had years of CPU time... Alternatives that take into account the chemistry of the site are also possible, for example, Goodford's GRID program. Interaction "hotspots" generated by this program can be used as surrogate sphere centers (we in fact provide a few tools to help interface with the GRID program). Be imaginative - you needn't use SPHGEN.
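    The known-ligand alternative mentioned above can be sketched in a few lines of Python. This is only an illustration, not a tool distributed with DOCK: the function name site_points_from_pdb and the residue name "MTX" are hypothetical, and the code assumes standard fixed-column PDB HETATM records. It returns bare coordinates; writing them in whatever sphere file format your DOCK version expects is left out.

```python
def site_points_from_pdb(pdb_text, residue_name):
    """Collect (x, y, z) for every HETATM atom of the named residue.

    Hypothetical helper: assumes standard fixed-column PDB records.
    """
    points = []
    for line in pdb_text.splitlines():
        if not line.startswith("HETATM"):
            continue  # only hetero-atom (ligand) records are of interest
        if line[17:20].strip() != residue_name:
            continue  # residue name occupies columns 18-20
        # Coordinates occupy columns 31-38, 39-46 and 47-54 (1-based).
        x = float(line[30:38])
        y = float(line[38:46])
        z = float(line[46:54])
        points.append((x, y, z))
    return points


if __name__ == "__main__":
    # "complex.pdb" and "MTX" are placeholders for your own file and ligand.
    with open("complex.pdb") as handle:
        for point in site_points_from_pdb(handle.read(), "MTX"):
            print("%8.3f %8.3f %8.3f" % point)
```

    Since only the coordinates matter to DOCK's matching, no radius is carried along here.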

  5. How do I get QCPE MS to recognize standard PDB files?

  6. We have utilities to smoothly automate the process of getting from standard PDB files to surfaces of desired regions. These are distributed with DOCK 3.5 (the autoMS suite), but we will be glad to distribute them to DOCK users who have more dated versions of the software.

  7. What is the status on using Delphi electrostatic scoring with DOCK?

  8. Since our resident expert on Delphi (Brian Shoichet) left the group some years ago, we have not been able to actively support Delphi force-field scoring with DOCK. The interface is still in the DOCK code, so in theory Delphi scoring should work. However, Biosym has changed its Delphi output format in recent years, and we do not provide a reformatting tool. It is my understanding that, if you contact Biosym, they can provide a binary-to-ASCII reformatting program so that Delphi output from Biosym will be usable with DOCK.

  9. What about hydrogens in molecular surfaces, docking, and scoring?

  10. We do not use receptors with hydrogens in generating molecular surfaces. This is for several reasons, among them 1) the MS parameters for heavy atoms are already rather generous, and 2) the surface often ends up being too bumpy when hydrogens are included. Hydrogens are not used in docking, but are required for force-field scoring.

  11. What's all this stuff in my CHEMGRID OUTPARM file?

  12. What about using the Cambridge Structural Database with DOCK?

  13. The following are a few thoughts on the philosophy behind using the Cambridge Structural Database (CSD) for docking. First, it can be misleading to assume that these structures are more likely to represent bioactive conformations just because they have been determined crystallographically. Often, the biologically observed bound conformation is not the global minimum energy conformation. It remains to be proven that using the CSD is any more effective than using a database of rule-based conformations (e.g. generated by CONCORD, as for the ACD, CMC or MDDR). Second, we have found that getting hold of CSD compounds can be tremendously difficult, with a very high attrition rate at the stage of obtaining compounds for assay. This should be a concern if you are using compound databases for "easy" access to biological characterization. We don't mean to deter you from using the CSD, only to highlight two common issues.

  14. Bugs in sortDOCKout??

  15. Yeah, yeah, I know. Because of the many and varied output types and formats of DOCK, detecting exactly how the output was generated is difficult. So...this is a complex way of saying that this program is likely not to be infallible... If you find problems, please check with the Kuntz group for the latest version.

  16. How in the world do I use chemical matching (a.k.a. coloring)?!

  17. There are three fundamental operations in using this feature:
    1. coming up with labels and deriving matching rules,
    2. assigning labels to ligand atoms, and
    3. assigning labels to receptor site points.

    These points are addressed in some detail here:

    1. Coming up with labels and matching rules. Labels can be anything you want them to be. The standard procedure is to have them embody ideas of chemical complementarity. For example, a set of labels might be hydrophobic, acceptor, donor, polar, plus, and minus (representing respectively, hydrophobic points, hydrogen bond acceptors, hydrogen bond donors, points which can both accept and donate hydrogen bonds, positively charged points, and negatively charged points). Simple matching rules might then be hydrophobic-hydrophobic, acceptor-donor, acceptor-polar, donor-polar, plus-minus. These would be "allowed" combinations amongst ligand atoms and receptor site points.

      Some general points: labels can be anything and mean anything you want them to (you might have a set of labels "purple, green, red, yellow, blue", or a set "hot, warm, cold", whatever these might mean to you). The rules you set up define which labels can match with which other labels. The matching rules can use the same label more than once (for example, cold-hot and cold-warm). However, each ligand atom or receptor site point can only have one label assigned to it. By "matching labels", we mean that come docking time when one atom is supposed to be placed upon a particular site point, the program assesses whether the corresponding labels can match together depending upon the rules you have set forth. Not all atoms or site points need be assigned a label: unlabeled points match everything.

    2. Assigning labels to ligand atoms. This is done with the mol2db program, which takes a MOL2 ligand database and a set of label definitions (see the ./examples/3dfr/colcrit/keymtx file for a fairly complex example). The output is a DOCK 3.5 database file containing ligands whose atoms are labeled according to the definitions you provided. If our programs do not suit your set-up, feel free to write your own labeling algorithms and programs, as long as the output is in DOCK 3.5 database format. If you are only working with one ligand, it may be most flexible to do this by hand!

    3. Assigning labels to receptor site points. This is somewhat more complex, and consequently there are several alternatives.

      Final comments: chemical matching is a very flexible tool and can address a lot of needs, but may be daunting to implement. Keep with it and you will probably be pleased with its operation. Also, do not think that chemical matching must be related to chemical complementarity. One advanced application might be finding mechanism-based inhibitors. For example, one could label certain reactive functional groups in ligands, and similarly label the centers in the target site where a reaction could occur. By setting up matching of these labels, one could attempt to locate ligands which place reactive groups adjacent to, say, nucleophiles on the receptor. Be creative!
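      The matching logic described above (one label per atom or site point, rules listing allowed label pairs, unlabeled points matching everything) can be sketched in Python. This is an illustration of the idea only, not DOCK's code or input format: the names make_rules and labels_match are hypothetical, and treating a rule such as acceptor-donor as symmetric (ligand acceptor on site donor, or the reverse) is an assumption of this sketch.

```python
def make_rules(pairs):
    """Store each allowed pair as a frozenset so that a-b and b-a are equal.

    Assumption of this sketch: rules are symmetric between ligand and site.
    """
    return {frozenset(pair) for pair in pairs}


def labels_match(ligand_label, site_label, rules):
    """Return True if a ligand atom may be placed on a site point."""
    # None stands for "unlabeled"; unlabeled points match everything.
    if ligand_label is None or site_label is None:
        return True
    return frozenset((ligand_label, site_label)) in rules


# The simple rule set used as an example in the text.
RULES = make_rules([
    ("hydrophobic", "hydrophobic"),
    ("acceptor", "donor"),
    ("acceptor", "polar"),
    ("donor", "polar"),
    ("plus", "minus"),
])

if __name__ == "__main__":
    for ligand, site in [("donor", "acceptor"), ("plus", "plus"), (None, "minus")]:
        print(ligand, site, labels_match(ligand, site, RULES))
```

      The same label appears in several rules here (acceptor-donor and acceptor-polar), while each atom or site point still carries only a single label.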

    4. Are there "standard" bin sizes?

    5. No! Every system is different, and you should experiment to see what works best. In general, larger bin sizes and/or larger overlaps increase sampling, effectively making DOCK "try harder". There are no standardized parameters which work for everything, sorry. Be careful when increasing the parameters, as small changes can have exponential effects on run-times. If you have an experimentally determined binding mode for a ligand (e.g. a substrate or inhibitor), make sure you can reproduce it with your chosen parameters before attempting a database screen. As a general rule of thumb, generate around 5,000-10,000 matches per ligand when using force-field score minimization, and at least 20,000 matches without optimization.
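      One way to check whether a known binding mode has been reproduced is to compute the root-mean-square deviation between the docked and experimental ligand coordinates. The sketch below is a hypothetical helper, not part of DOCK. It assumes both coordinate lists hold the same atoms in the same order and are already in the receptor's frame, so no superposition is performed. What RMSD counts as "reproduced" is left to your judgment.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally ordered atom lists.

    Each argument is a list of (x, y, z) tuples. No fitting is done: the
    docked pose and the reference are assumed to share the receptor frame.
    """
    if not coords_a or len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must be non-empty and equal in length")
    total = 0.0
    for (ax, ay, az), (bx, by, bz) in zip(coords_a, coords_b):
        total += (ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2
    return math.sqrt(total / len(coords_a))


if __name__ == "__main__":
    # Toy coordinates for illustration only.
    crystal = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    docked = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
    print("RMSD: %.3f" % rmsd(crystal, docked))
```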
      
      

      Curator: Malin Young, mmyoung@polonius.ucsf.edu