Version 3.2

CLUSTALW
Multiple Sequence Alignment

Selected Sequence(s)
  • Human DNA mismatch repair protein homolog (hMLH1) mRNA, complete
  • Mus musculus MutL homolog 1 protein (MLH1) mRNA, complete cds.
  • S.cerevisiae chromosome XIII cosmid 8520_
  • Rattus norvegicus mismatch repair protein (MLH1) mRNA, complete
  • Drosophila melanogaster mutL homolog (Mlh1) gene, complete cds_


    Fasta label (*)    Workbench label
    GENPEPT:463989     Human DNA mismatch repair protein homolog (hMLH1) mRNA, complete
    GENPEPT:7595954    Mus musculus MutL homolog 1 protein (MLH1) mRNA, complete cds.
    GENPEPT:825572     S.cerevisiae chromosome XIII cosmid 8520_
    GENPEPT:1724118    Rattus norvegicus mismatch repair protein (MLH1) mRNA, complete
    GENPEPT:3192877    Drosophila melanogaster mutL homolog (Mlh1) gene, complete cds_

    (*) Clustal W truncates Fasta labels at the first space (e.g. ">abc def" becomes ">abc").
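
The labelling rule in the note above can be sketched in Python. The function name `clustal_label` is hypothetical. The replacement of ":" by "_" is an assumption inferred only from this report (GENPEPT:463989 appears as GENPEPT_463989 in the alignment rows); it is not taken from the Clustal W documentation.

```python
def clustal_label(fasta_header: str) -> str:
    """Approximate the label shown in the alignment for a Fasta header line."""
    # Drop the leading '>' and surrounding whitespace.
    text = fasta_header.lstrip(">").strip()
    # Keep only the text before the first whitespace character.
    first_word = text.split(None, 1)[0] if text else ""
    # Assumption (seen in this report only): ':' is displayed as '_'.
    return first_word.replace(":", "_")
```

For example, a header that starts with ">GENPEPT:463989" followed by a description would yield the row label GENPEPT_463989 under these assumptions.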


    Sequence alignment

    Consensus key (see documentation for details)
    * - single, fully conserved residue
    : - conservation of strong groups
    . - conservation of weak groups
      - no consensus
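
As a rough illustration of the key above, the following Python sketch assigns a consensus symbol to one alignment column. The strong and weak residue groups are the ones listed in the Clustal W documentation and are assumed here to apply to version 1.81. The function names are hypothetical, and treating any column that contains a gap as "no consensus" is an assumption that is consistent with the alignment in this report.

```python
# Residue groups as listed in the Clustal W documentation
# (assumed to apply to the version used for this report).
STRONG_GROUPS = ["STA", "NEQK", "NHQK", "NDEQ", "QHRK",
                 "MILV", "MILF", "HY", "FYW"]
WEAK_GROUPS = ["CSA", "ATV", "SAG", "STNK", "STPA", "SGND",
               "SNDEQK", "NDEQHK", "NEQHRK", "FVLIM", "HFY"]


def consensus_symbol(column: str) -> str:
    """Return '*', ':', '.' or ' ' for one alignment column."""
    residues = set(column.upper())
    # Assumption: an empty column or a column with a gap has no consensus.
    if not residues or "-" in residues:
        return " "
    if len(residues) == 1:
        return "*"  # single, fully conserved residue
    if any(residues <= set(group) for group in STRONG_GROUPS):
        return ":"  # all residues fall within one strong group
    if any(residues <= set(group) for group in WEAK_GROUPS):
        return "."  # all residues fall within one weak group
    return " "


def consensus_line(rows: list) -> str:
    """Build the consensus line for equally long aligned rows."""
    return "".join(consensus_symbol("".join(col)) for col in zip(*rows))
```

Applied to the first twelve columns of the second alignment block in this report (IQIQDNGTGIRK for three sequences, LQIQDNGTGIRR and LQITDNGSGINK for the other two), this sketch reproduces the printed consensus fragment `:** ***:**.:`.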
    
    
    CLUSTAL W (1.81) multiple sequence alignment
    
    
    GENPEPT_1724118      --MSFVAGVIRRLDETVVNRIAAGEVIQRPANAIKEMTENCLDAKSTNIQVIVREGGLKL
    GENPEPT_7595954      --MAFVAGVIRRLDETVVNRIAAGEVIQRPANAIKEMIENCLDAKSTNIQVVVKEGGLKL
    GENPEPT_463989       --MSFVAGVIRRLDETVVNRIAAGEVIQRPANAIKEMIENCLDAKSTSIQVIVKEGGLKL
    GENPEPT_3192877      MAEYLQPGVIRKLDEVVVNRIAAGEIIQRPANALKELLENSLDAQSTHIQVQVKAGGLKL
    GENPEPT_825572       -----MSLRIKALDASVVNKIAAGEIIISPVNALKEMMENSIDANATMIDILVKEGGIKV
                               .  *: **  ***:*****:*  *.**:**: **.:**::* *:: *: **:*:
    
    GENPEPT_1724118      IQIQDNGTGIRKEDLDIVCERFTTSKLQTFEDLAMISTYGFRGEALASISHVAHVTITTK
    GENPEPT_7595954      IQIQDNGTGIRKEDLDIVCERFTTSKLQTFEDLASISTYGFRGEALASISHVAHVTITTK
    GENPEPT_463989       IQIQDNGTGIRKEDLDIVCERFTTSKLQSFEDLASISTYGFRGEALASISHVAHVTITTK
    GENPEPT_3192877      LQIQDNGTGIRREDLAIVCERFTTSKLTRFEDLSQIATFGFRGEALASISHVAHLSIQTK
    GENPEPT_825572       LQITDNGSGINKADLPILCERFTTSKLQKFEDLSQIQTYGFRGEALASISHVARVTVTTK
                         :** ***:**.: ** *:*********  ****: * *:**************:::: **
    
    GENPEPT_1724118      TADGKCAYRASYSDGKLQAPPKPCAGNQGTLITVEDLFYNIITRKKALKNPSEEYGKILE
    GENPEPT_7595954      TADGKCAYRASYSDGKLQAPPKPCAGNQGTLITVEDLFYNIITRRKALKNPSEEYGKILE
    GENPEPT_463989       TADGKCAYRASYSDGKLKAPPKPCAGNQGTQITVEDLFYNIATRRKALKNPSEEYGKILE
    GENPEPT_3192877      TAKEKCGYKATYADGKLQGQPKPCAGNQGTIICIEDLFYNMPQRRQALRSPAEEFQRLSE
    GENPEPT_825572       VKEDRCAWRVSYAEGKMLESPKPVAGKDGTTILVEDLFFNIPSRLRALRSHNDEYSKILD
                         . . :*.::.:*::**:   *** **::** * :****:*:  * :**:.  :*: :: :
    
    GENPEPT_1724118      VVGRYSIHNSGISFSVKKQGETVSDVRTLPNATTVDNIRSIFGNAVSRELIEVG-CEDKT
    GENPEPT_7595954      VVGRYSIHNSGISFSVKKQGETVSDVRTLPNATTVDNIRSIFGNAVSRELIEVG-CEDKT
    GENPEPT_463989       VVGRYSVHNAGISFSVKKQGETVADVRTLPNASTVDNIRSIFGNAVSRELIEIG-CEDKT
    GENPEPT_3192877      VLARYAVHNPRVGFTLRKQGDAQPALRTPVASSRSENIRIIYGAAISKELLEFS-HRDEV
    GENPEPT_825572       VVGRYAIHSKDIGFSCKKFGDSNYSLSVKPSYTVQDRIRTVFNKSVASNLITFHISKVED
                         *:.**::*.  :.*: :* *::   : .    :  :.** ::. ::: :*: .   . : 
    
    GENPEPT_1724118      LAFK-MNGYISNANYSVKKCIF-LLFINHRLVESAALKKAIEAVYAAYLPKNTHPFLYLI
    GENPEPT_7595954      LAFK-MNGYISNANYSVKKCIF-LLFINHRLVESAALRKAIETVYAAYLPKNTHPFLYLS
    GENPEPT_463989       LAFK-MNGYISNANYSVKKCIF-LLFINHRLVESTSLRKAIETVYAAYLPKNTHPFLYLS
    GENPEPT_3192877      YKFE-AECLITQVNYSAKKCQM-LLFINQRLVESTALRTSVDSIYATYLPRGHHPFVYMS
    GENPEPT_825572       LNLESVDGKVCNLNFISKKSISPIFFINNRLVTCDLLRRALNSVYSNYLPKGNRPFIYLG
                           ::  :  : : *:  **.   ::***:*** .  *: :::::*: ***:. :**:*: 
    
    GENPEPT_1724118      LEISPQNVDVNVHPTKHEVHFLHEESILERVQQHIESKLLGSNSSRMYFTQTLLPGLAG-
    GENPEPT_7595954      LEISPQNVDVNVHPTKHEVHFLHEESILQRVQQHIESKLLGSNSSRMYFTQTLLPGLAG-
    GENPEPT_463989       LEISPQNVDVNVHPTKHEVHFLHEESILERVQQHIESKLLGSNSSRMYFTQTLLPGLAG-
    GENPEPT_3192877      LTLPPQNLDVNVHPTKHEVHFLYQEEIVDSIKQQVEARLLGSNATRTFYKQLRLPGAP--
    GENPEPT_825572       IVIDPAAVDVNVHPTKREVRFLSQDEIIEKIANQLHAELSAIDTSRTFKASSISTNKPES
                         : : *  :********:**:** ::.*:: : :::.:.* . :::* :  .   .. .  
    
    GENPEPT_1724118      -----PSGEAVKSTTGIASSSTSGSGDKVHAYQMVRTDSRDQKLDAFMQPVSRRLPSQPQ
    GENPEPT_7595954      -----PSGEAARPTTGVASSSTSGSGDKVYAYQMVRTDSRDQKLDAFLQPVSSLVPSQPQ
    GENPEPT_463989       -----PSGEMVKSTTSLTSSSTSGSSDKVYAHQMVRTDSREQKLDAFLQPLSKPLSSQPQ
    GENPEPT_3192877      ---------------DLDETQLADKTQRIYPKEMVRTDSTEQKLDKFLAPLVK-------
    GENPEPT_825572       LIPFNDTIESDRNRKSLRQAQVVENSYTTANSQLRKAKRQENKLVRIDASQAKITSFLSS
                                        .: .:.   .       :: ::.  ::**  :  .          
    
    GENPEPT_1724118      D--PVPGNRTEGSPEKAMQKDQEISELPAPMEAAADSASLERESVIGASEVVAPQRHPSS
    GENPEPT_7595954      DPAPVRGARTEGSPERATREDEEMLALPAPAEAAAESENLERESLMETSDAAQKAAPTSS
    GENPEPT_463989       --AIVTEDKTDISSGRARQQDEEMLELPAPAEVAAKNQSLEGDTTKGTSEMSEKRGPTSS
    GENPEPT_3192877      ----------------------------------------------------SDSGVSSS
    GENPEPT_825572       S--QQFNFEGSSTKRQLSEPKVTNVSHSQEAEKLTLNESEQPRDANTINDNDLKDQPKKK
                                                                                   ..
    
    GENPEPT_1724118      PGSSRKRHPEDSDVEMMENDSRKEMTAACYPRRRIINLTSVLSLQEEINDRGHETLREML
    GENPEPT_7595954      PGSSRKRHREDSDVEMVENASGKEMTAACYPRRRIINLTSVLSLQEEISERCHETLREIL
    GENPEPT_463989       --NPRKRHREDSDVEMVEDDSRKEMTAACTPRRRIINLTSVLSLQEEINEQGHEVLREML
    GENPEPT_3192877      SSQEASRLPEES------------FRVTAAKKSREVRLSSVLDMRKRVERQCSVQLRSTL
    GENPEPT_825572       QKLGDYKVPSIADDEKNALPISKDGYIRVPKERVNVNLTSIKKLREKVDDSIHRELTDIF
                               :  . :                   .   :.*:*: .:::.:.      * . :
    
    GENPEPT_1724118      RNHTFVGCVNPQW--ALAQHQTKLYLLNTTKLSEELFYQILIYDFANFGVLRLPEPAPLF
    GENPEPT_7595954      RNHSFVGCVNPQW--ALAQHQTKLYLLNTTKLSEELFYQILIYDFANFGVLRLSEPAPLF
    GENPEPT_463989       HNHSFVGCVNPQW--ALAQHQTKLYLLNTTKLSEELFYQILIYDFANFGVLRLSEPAPLF
    GENPEPT_3192877      KNLVYVGCVDERR--ALFQHETRLYMCNTRSFSEELFYQRMIYEFQNCSEITICPPLPLK
    GENPEPT_825572       ANLNYVGVVDEERRLAAIQHDLKLFLIDYGSVCYELFYQIGLTDFANFGKINLQSTNVSD
                          *  :** *: .   *  **: :*:: :  ... *****  : :* * . : :  .    
    
    GENPEPT_1724118      DFAMLALDSPESGWTEEDGPKEGLAEYIVEFLKKKAKMLADYFSVEIDEEGN--------
    GENPEPT_7595954      DLAMLALDSPESGWTEDDGPKEGLAEYIVEFLKKKAEMLADYFSVEIDEEGN--------
    GENPEPT_463989       DLAMLALDSPESGWTEEDGPKEGLAEYIVEFLKKKAEMLADYFSLEIDEEGN--------
    GENPEPT_3192877      ELLILSLESRAAGWTPEDEDKAELADGAADILLKKAPIMREYFGLRISEDGM--------
    GENPEPT_825572       DIVLYNLLSEFDELN-DDASK----EKIISKIWDMSSMLNEYYSIELVNDGLDNDLKSVK
                         :: :  * *     . :*  *    :   . : . : :: :*:.:.: ::*         
    
    GENPEPT_1724118      LIGLPLLIDSYVPPLEGLPIFILRLATEVNWDEE-ECFESLSKECAVFYSIRKQYILEES
    GENPEPT_7595954      LIGLPLLIDSYVPPLEGLPIFILRLATEVNWDEEKECFESLSKECAMFYSIRKQYILEES
    GENPEPT_463989       LIGLPLLIDNYVPPLEGLPIFILRLATEVNWDEEKECFESLSKECAMFYSIRKQYISEES
    GENPEPT_3192877      LESLPSLLHQHRPCVAHLPVYLLRLATEVDWEQETRCFETFCRETARFY-----------
    GENPEPT_825572       LKSLPLLLKGYIPSLVKLPFFIYRLGKEVDWEDEQECLDGILREIALLYIPDMVPKVDTS
                         * .** *:. : * :  **.:: **..**:*::* .*:: : :* * :*           
    
    GENPEPT_1724118      ALSGQQSDMPGSPSKPWKWT--VEHIIYKAFRSHLLPPKHFTEDGNVLQLANLPDLCKVF
    GENPEPT_7595954      TLSGQQSDMPGSTSKPWKWT--VEHIIYKAFRSHLLPPKHFTEDGNVLQLANLPDLYKVF
    GENPEPT_463989       TLSGQQSEVPGSIPNSWKWT--VEHIVYKALRSHILPPKHFTEDGNILQLANLPDLYKVF
    GENPEPT_3192877      ---AQLDWREGATAVFSRWT--MEHVLFPAFKKYLLPPR---IKDQIYELTNLPTLYKVF
    GENPEPT_825572       DASLSEDEKAQFINRKEHISSLLEHVLFPCIKRRFLAPRHILKD--VVEIANLPDLYKVF
                             . .          : :  :**::: .::  :*.*:    .  : :::*** * ***
    
    GENPEPT_1724118      ERC
    GENPEPT_7595954      ERC
    GENPEPT_463989       ERC
    GENPEPT_3192877      ERC
    GENPEPT_825572       ERC
                         ***
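
The interleaved blocks above can be joined back into one gapped string per sequence. The Python sketch below is a minimal reader that assumes the layout shown in this report: a label, whitespace, then a chunk made only of letters and "-". The names `ROW` and `parse_clustal` are hypothetical, and this is not a full parser for the CLUSTAL format.

```python
import re

# A sequence row: label, whitespace, then residues and gap characters only.
# Consensus lines contain '*', ':' or '.' and therefore never match.
ROW = re.compile(r"^\s*(\S+)\s+([A-Za-z-]+)\s*$")


def parse_clustal(text: str) -> dict:
    """Concatenate the chunks of each sequence across all alignment blocks."""
    sequences = {}
    for line in text.splitlines():
        match = ROW.match(line)
        if match:
            name, chunk = match.groups()
            sequences[name] = sequences.get(name, "") + chunk
    return sequences
```

Removing the "-" characters from a joined string gives the ungapped residues of that sequence as they appear in the alignment.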
    
    
    

    Clustal W dendrogram



    Unrooted tree (generated by Phylip's Drawtree)




    Phylip-format dendrogram

    (
    (
    (
    GENPEPT_3192877:0.24219,
    GENPEPT_825572:0.37772)
    :0.19411,
    GENPEPT_463989:0.05567)
    :0.02574,
    GENPEPT_1724118:0.04825,
    GENPEPT_7595954:0.03497);
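
To read the branch lengths out of the dendrogram above, a small Python sketch such as the following may help. It only extracts the terminal (leaf) branches with a regular expression and does not rebuild the tree topology. The names `TREE` and `leaf_branch_lengths` are hypothetical; the tree text is copied from this report.

```python
import re

# The Phylip-format tree from this report, line breaks included.
TREE = """
(
(
(
GENPEPT_3192877:0.24219,
GENPEPT_825572:0.37772)
:0.19411,
GENPEPT_463989:0.05567)
:0.02574,
GENPEPT_1724118:0.04825,
GENPEPT_7595954:0.03497);
"""

# A leaf is a label that directly follows '(' or ',' and is followed by
# ':length'. Internal branch lengths follow ')' and are skipped.
LEAF = re.compile(r"[(,]([^(),:;]+):([0-9.eE+-]+)")


def leaf_branch_lengths(newick: str) -> dict:
    """Map each leaf label to the length of its terminal branch."""
    compact = "".join(newick.split())  # remove all whitespace
    return {name: float(length) for name, length in LEAF.findall(compact)}


lengths = leaf_branch_lengths(TREE)
longest = max(lengths, key=lengths.get)  # leaf with the longest branch
```

With the tree above, `longest` is GENPEPT_825572, whose terminal branch length is 0.37772.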
    
    

    Clustal W options and diagnostic messages

    Alignment type: Protein                 Alignment order: aligned                
    
                        Pairwise alignment parameters
    
    Method: accurate                        
    Matrix: Gonnet                          
    Gap open penalty: 10.00                 Gap extension penalty: 0.10             
    
                        Multiple alignment parameters
    
    Matrix: Gonnet                          Negative matrix?: no                    
    Gap open penalty: 10.00                 Gap extension penalty: 0.20             
    % identity for delay: 30                Residue-specific gap penalties: on      
    Penalize end gaps: on                   Hydrophilic gap penalties: on           
    Gap separation distance: 0              Hydrophilic residues: GPSNDQEKR         
    
    
    
    
     CLUSTAL W (1.81) Multiple Sequence Alignments
    
    
    
    Sequence type explicitly set to Protein
    Sequence format is Pearson
    Sequence 1: GENPEPT_3192877      663 aa
    Sequence 2: GENPEPT_1724118      757 aa
    Sequence 3: GENPEPT_825572       769 aa
    Sequence 4: GENPEPT_7595954      760 aa
    Sequence 5: GENPEPT_463989       756 aa
    Start of Pairwise alignments
    Aligning...
    Sequences (1:2) Aligned. Score:  48
    Sequences (1:3) Aligned. Score:  38
    Sequences (1:4) Aligned. Score:  50
    Sequences (1:5) Aligned. Score:  51
    Sequences (2:3) Aligned. Score:  36
    Sequences (2:4) Aligned. Score:  91
    Sequences (2:5) Aligned. Score:  86
    Sequences (3:4) Aligned. Score:  36
    Sequences (3:5) Aligned. Score:  36
    Sequences (4:5) Aligned. Score:  88
    Time for pairwise alignment: 2.186492
    
    Guide tree        file created:   [../tmp-dir/17104.CLUSTALW.dnd]
    Start of Multiple Alignment
    There are 4 groups
    Aligning...
    Group 1: Sequences:   2      Score:15727
    Group 2: Sequences:   3      Score:15398
    Group 3: Sequences:   4      Score:10921
    Group 4: Sequences:   5      Score:10151
    Time for multiple alignment: 4.390420
    
    Alignment Score 24505
    CLUSTAL-Alignment file created  [../tmp-dir/17104.CLUSTALW.aln]
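
The pairwise scores in the log above are keyed by sequence number, which makes them awkward to read next to the alignment. The Python sketch below maps them back to sequence labels. It assumes the exact line formats shown in this log; the names `SEQUENCE_RE`, `SCORE_RE` and `pairwise_scores` are hypothetical.

```python
import re

# "Sequence 2: GENPEPT_1724118      757 aa"
SEQUENCE_RE = re.compile(r"Sequence (\d+): (\S+)\s+(\d+) aa")
# "Sequences (2:4) Aligned. Score:  91"
SCORE_RE = re.compile(r"Sequences \((\d+):(\d+)\) Aligned\. Score:\s+(\d+)")


def pairwise_scores(log: str) -> dict:
    """Map (label_a, label_b) to the pairwise score reported in the log."""
    names = {int(num): name for num, name, _ in SEQUENCE_RE.findall(log)}
    return {
        (names[int(a)], names[int(b)]): int(score)
        for a, b, score in SCORE_RE.findall(log)
    }
```

Calling `max(scores, key=scores.get)` on the returned dictionary gives the highest-scoring pair. In this log that is sequences 2 and 4 (GENPEPT_1724118 and GENPEPT_7595954) with a score of 91.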
    
    

    Citation

      Algorithm Citation:

      Higgins, D.G., Bleasby, A.J. and Fuchs, R. (1992) CLUSTAL V: improved software for multiple sequence alignment. Computer Applications in the Biosciences (CABIOS), 8(2):189-191.

      Thompson, J.D., Higgins, D.G. and Gibson, T.J. (1994) CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res., 22:4673-4680.

      Felsenstein, J. (1989) PHYLIP -- Phylogeny Inference Package (Version 3.2). Cladistics, 5:164-166.

      Program Citation:

      CLUSTAL W: Julie D. Thompson, Desmond G. Higgins and Toby J. Gibson, modified; any errors are due to the modifications.

      PHYLIP: Felsenstein, J. 1993. PHYLIP (Phylogeny Inference Package) version 3.5c. Distributed by the author. Department of Genetics, University of Washington, Seattle.


    Copyright (C) 1999, Board of Trustees of the University of Illinois.