Version 3.2

CLUSTALW
Multiple Sequence Alignment

Selected Sequence(s)
  • DNA MISMATCH REPAIR PROTEIN MLH1 (MUTL PROTEIN HOMOLOG 1) [Homo sapiens (Human)]
  • S.cerevisiae PMS1 gene encoding DNA mismatch repair protein,
  • Caenorhabditis elegans cosmid T28A8, complete sequence_
  • Drosophila melanogaster genomic scaffold 142000013386047 section 5
  • Rattus norvegicus mismatch repair protein (MLH1) mRNA, complete
  • Mus musculus MutL homolog 1 protein (MLH1) mRNA, complete cds.


    Fasta label (*)     Workbench label
    MLH1_HUMAN          DNA MISMATCH REPAIR PROTEIN MLH1 (MUTL PROTEIN HOMOLOG 1) [Homo sapiens (Human)]
    GENPEPT:172203      S.cerevisiae PMS1 gene encoding DNA mismatch repair protein,
    GENPEPT:3880333     Caenorhabditis elegans cosmid T28A8, complete sequence_
    GENPEPT:7304079     Drosophila melanogaster genomic scaffold 142000013386047 section 5
    GENPEPT:1724118     Rattus norvegicus mismatch repair protein (MLH1) mRNA, complete
    GENPEPT:7595954     Mus musculus MutL homolog 1 protein (MLH1) mRNA, complete cds.

    (*) Clustal W truncates Fasta labels at the first space (e.g. ">abc def" becomes ">abc").
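The truncation rule in the footnote can be sketched in a few lines of Python. The function name `clustal_label` is hypothetical. The replacement of ':' by '_' is not stated in the footnote; it is only an observation from this page, where the Fasta label GENPEPT:172203 appears as GENPEPT_172203 in the alignment, so treat it as an assumption.

```python
def clustal_label(fasta_header: str) -> str:
    """Approximate the label Clustal W shows for a FASTA header line."""
    text = fasta_header.lstrip(">").strip()
    # Keep only the text before the first whitespace, as the footnote describes.
    label = text.split()[0] if text else ""
    # Assumption based on this page only: ':' appears as '_' in the alignment.
    return label.replace(":", "_")


print(clustal_label(">abc def"))
print(clustal_label(">GENPEPT:172203 S.cerevisiae PMS1 gene"))
```

Labels that share the same first word would collide after truncation, so it may be worth checking for duplicates before submitting a sequence set.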


    Sequence alignment

    Consensus key (see documentation for details)
    * - single, fully conserved residue
    : - conservation of strong groups
    . - conservation of weak groups
      - no consensus
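As a rough illustration of the key above, the following Python sketch assigns a consensus symbol to each alignment column. The strong and weak residue groups are the ones usually quoted from the Clustal W documentation; they are reproduced here from memory and should be checked against the documentation for the version in use. The function names are hypothetical, and treating any column containing a gap as "no consensus" is an assumption.

```python
# Residue groups as commonly listed for Clustal W (verify for your version).
STRONG_GROUPS = ["STA", "NEQK", "NHQK", "NDEQ", "QHRK", "MILV", "MILF", "HY", "FYW"]
WEAK_GROUPS = ["CSA", "ATV", "SAG", "STNK", "STPA", "SGND",
               "SNDEQK", "NDEQHK", "NEQHRK", "FVLIM", "HFY"]


def column_symbol(column: str) -> str:
    """Return '*', ':', '.' or ' ' for one column of aligned residues."""
    residues = set(column.upper())
    if "-" in residues:
        return " "  # assumption: a gap anywhere means no consensus
    if len(residues) == 1:
        return "*"  # single, fully conserved residue
    if any(residues <= set(group) for group in STRONG_GROUPS):
        return ":"  # all residues fall inside one strong group
    if any(residues <= set(group) for group in WEAK_GROUPS):
        return "."  # all residues fall inside one weak group
    return " "


def consensus_line(rows):
    """Build the consensus string for equally long aligned rows."""
    return "".join(column_symbol("".join(col)) for col in zip(*rows))


# Last four columns of the first alignment block on this page.
rows = ["NAIK", "NAIK", "NAIK", "NALK", "NAIK", "TAVK"]
print(consensus_line(rows))  # .*:*
```

The printed symbols match the last four characters of the consensus line under the first block of the alignment.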
    
    
    CLUSTAL W (1.81) multiple sequence alignment
    
    
    GENPEPT_7595954      ---------------------------MAFVAGVIRRLDETVVNRIAAGEVIQRPANAIK
    GENPEPT_1724118      ---------------------------MSFVAGVIRRLDETVVNRIAAGEVIQRPANAIK
    MLH1_HUMAN           ---------------------------MSFVAGVIRRLDETVVNRIAAGEVIQRPANAIK
    GENPEPT_7304079      -------------------------MAEYLQPGVIRKLDEVVVNRIAAGEIIQRPANALK
    GENPEPT_3880333      ----------MWHCGYRTRNCDEFSKIEFSLMGLIQRLPQDVVNRMAAGEVLARPCNAIK
    GENPEPT_172203       MFHHIENLLIETEKRCKQKEQRYIPVKYLFSMTQIHQINDIDVHRITSGQVITDLTTAVK
                                                           *::: :  *:*:::*:::    .*:*
    
    GENPEPT_7595954      EMIENCLDAKSTNIQVVVKEGGLKLIQIQDNGTGIRKEDLDIVCERFTTSKLQTFEDLAS
    GENPEPT_1724118      EMTENCLDAKSTNIQVIVREGGLKLIQIQDNGTGIRKEDLDIVCERFTTSKLQTFEDLAM
    MLH1_HUMAN           EMIENCLDAKSTSIQVIVKEGGLKLIQIQDNGTGIRKEDLDIVCERFTTSKLQSFEDLAS
    GENPEPT_7304079      ELLENSLDAQSTHIQVQVKAGGLKLLQIQDNGTGIRREDLAIVCERFTTSKLTRFEDLSQ
    GENPEPT_3880333      ELVENSLDAGATEIMVNMQNGGLKLLQVSDNGKGIEREDFALVCERFATSKLQKFEDLMH
    GENPEPT_172203       ELVDNSIDANANQIEIIFKDYGLESIECSDNGDGIDPSNYEFLALKHYTSKIAKFQDVAK
                         *: :*.:** :. * : .:  **: :: .*** **  .:  ::. :. ***:  *:*:  
    
    GENPEPT_7595954      ISTYGFRGEALASISHVAHVTITTKTADGKCAYRASYSDGKLQAPPKPCAGNQGTLITVE
    GENPEPT_1724118      ISTYGFRGEALASISHVAHVTITTKTADGKCAYRASYSDGKLQAPPKPCAGNQGTLITVE
    MLH1_HUMAN           ISTYGFRGEALASISHVAHVTITTKTADGKCAYRASYSDGKLKAPPKPCAGNQGTQITVE
    GENPEPT_7304079      IATFGFRGEALASISHVAHLSIQTKTAKEKCGYKATYADGKLQGQPKPCAGNQGTIICIE
    GENPEPT_3880333      MKTYGFRGEALASLSHVAKVNIVSKRADAKCAYQANFLDGKMTADTKPAAGKNGTCITAT
    GENPEPT_172203       VQTLGFRGEALSSLCGIAKLSVITTTSPPKADKLEYDMVGHITSKTTTSR-NKGTTVLVS
                         : * *******:*:. :*::.: :. :  *.        *:: . ....  ::** :   
    
    GENPEPT_7595954      DLFYNIITRRKALKNPSE-EYGKILEVVGRYSIHNSGISFSVK---KQGETVSDVRTLPN
    GENPEPT_1724118      DLFYNIITRKKALKNPSE-EYGKILEVVGRYSIHNSGISFSVK---KQGETVSDVRTLPN
    MLH1_HUMAN           DLFYNIATRRKALKNPSE-EYGKILEVVGRYSVHNAGISFSVK---KQGETVADVRTLPN
    GENPEPT_7304079      DLFYNMPQRRQALRSPAE-EFQRLSEVLARYAVHNPRVGFTLR---KQGDAQPALRTPVA
    GENPEPT_3880333      DLFYNLPTRRNKMTTHGE-EAKMVNDTLLRFAIHRPDVSFALR---Q--NQAGDFRTKGD
    GENPEPT_172203       QLFHNLPVRQKEFSKTFKRQFTKCLTVIQGYAIINAAIKFSVWNITPKGKKNLILSTMRN
                         :**:*:  *:: : .  : :      .:  ::: .. : *::       .    . *   
    
    GENPEPT_7595954      ATTVDNIRSIFGNAVSRELIEVGCEDKT---------------------LAFKMNGYISN
    GENPEPT_1724118      ATTVDNIRSIFGNAVSRELIEVGCEDKT---------------------LAFKMNGYISN
    MLH1_HUMAN           ASTVDNIRSIFGNAVSRELIEIGCEDKT---------------------LAFKMNGYISN
    GENPEPT_7304079      SSRSENIRIIYGAAISKELLEFSHRDEV---------------------YKFEAECLITQ
    GENPEPT_3880333      GNFRDVVCNLLGRDVADTILPLSLNSTR---------------------LKFTFTGHISK
    GENPEPT_172203       SSMRKNISSVFGAGGMRGLEEVDLVLDLNPFKNRMLGKYTDDPDFLDLDYKIRVKGYISQ
                         ..  . :  : *      :  ..                            :     *::
    
    GENPEPT_7595954      ----------ANYSVKKCIFLLFINHRLVESAALRKAIETVYAAYLPKNTHPFLYLSLEI
    GENPEPT_1724118      ----------ANYSVKKCIFLLFINHRLVESAALKKAIEAVYAAYLPKNTHPFLYLILEI
    MLH1_HUMAN           ----------ANYSVKKCIFLLFINHRLVESTSLRKAIETVYAAYLPKNTHPFLYLSLEI
    GENPEPT_7304079      ----------VNYSAKKCQMLLFINQRLVESTALRTSVDSIYATYLPRGHHPFVYMSLTL
    GENPEPT_3880333      PIASATAAIAQNRKTSRSFFSVFINGRSVRCDILKHPIDEVLGARQLH--AQFCALHLQI
    GENPEPT_172203       -------NSFGCGRNSKDRQFIYVNKRPVEYSTLLKCCNEVYKTFN-NVQFPAVFLNLEL
                                        .:    :::* * *.   *    : :  :   .       : * :
    
    GENPEPT_7595954      SPQNVDVNVHPTKHEVHFLHEESILQRVQQHIESKLLGS--NSSRMYFTQTLLPGLAGPS
    GENPEPT_1724118      SPQNVDVNVHPTKHEVHFLHEESILERVQQHIESKLLGS--NSSRMYFTQTLLPGLAGPS
    MLH1_HUMAN           SPQNVDVNVHPTKHEVHFLHEESILERVQQHIESKLLGS--NSSRMYFTQTLLPGLAGPS
    GENPEPT_7304079      PPQNLDVNVHPTKHEVHFLYQEEIVDSIKQQVEARLLGS--NATRTFYKQLRLPGAP---
    GENPEPT_3880333      DETRIDVNVHPTKNSVIFLEKEEIIEEIRAYFEKVIGEI--FGFEALDVEKPEEEQPDIE
    GENPEPT_172203       PMSLIDVNVTPDKRVILLHNERAVIDIFKTTLSDYYNRQELALPKRMCSQSEQQAQKRLK
                             :**** * *. : :  :. ::: .:  ..           .    :          
    
    GENPEPT_7595954      GEAARPTTG-----------------VASSSTSGSGDKVYAYQMVRTDSRDQKLDAFLQP
    GENPEPT_1724118      GEAVKSTTG-----------------IASSSTSGSGDKVHAYQMVRTDSRDQKLDAFMQP
    MLH1_HUMAN           GEMVKSTTS-----------------LTSSSTSGSSDKVYAHQMVRTDSREQKLDAFLQP
    GENPEPT_7304079      --------D-----------------LDETQLADKTQRIYPKEMVRTDSTEQKLDKFLAP
    GENPEPT_3880333      NLVMIPMSQSLKSIEAIRKPDTKPEFKSSPSAWKSDKKRVDYMEVRTDAKERKIDEFVTR
    GENPEPT_172203       TEVFDDRSTTHESDNE----------NYHTARSESNQSNHAHFNSTTGVIDKSNGTELTS
                                                      .    . .         *.  ::. .  :  
    
    GENPEPT_7595954      VSSLVPSQPQDPAPVRGARTEGSPERATREDEEMLALPAPAEAAAESENLERESLMETSD
    GENPEPT_1724118      VSRRLPSQPQD--PVPGNRTEGSPEKAMQKDQEISELPAPMEAAADSASLERESVIGASE
    MLH1_HUMAN           LSKPLSSQPQ--AIVTEDKTDISSGRARQQDEEMLELPAPAEVAAKNQSLEGDTTKGTSE
    GENPEPT_7304079      LVK---------------------------------------------------------
    GENPEPT_3880333      GGAVGPTTSND-------------------DIFGGSGILKRARTEDSTGGEKEPEDLNTD
    GENPEPT_172203       VMDGNYTNVTDVIGSECEVSVDSSVVLDEGNSSTPTKKLPSIKTDSQNLSDLNLNNFSNP
    
    
    GENPEPT_7595954      AAQKAAPTSSPGSSRKRHREDSDVEMVENASGKEMTAACYP-------RRRIINLTSVLS
    GENPEPT_1724118      VVAPQRHPSSPGSSRKRHPEDSDVEMMENDSRKEMTAACYP-------RRRIINLTSVLS
    MLH1_HUMAN           MSEKRGPTSS--NPRKRHREDSDVEMVEDDSRKEMTAACTP-------RRRIINLTSVLS
    GENPEPT_7304079      --SDSGVSSSSSQEASRLPEES------------FRVTAAK-------KSREVRLSSVLD
    GENPEPT_3880333      FDDVSMVSLVSTADGRRLNESQDLG-----EDDDVDFEYGK-------THREFHFESIEV
    GENPEPT_172203       EFQNITSPDKARSLEKVVEEPVYFDIDGEKFQEKAVLSQADGLVFVDNECHEHTNDCCHQ
                                .           *                              :     .   
    
    GENPEPT_7595954      LQEEISERCHETLREILRNHSFVGCVNPQWALAQHQTKLYLLNTTKLSEELFY-----QI
    GENPEPT_1724118      LQEEINDRGHETLREMLRNHTFVGCVNPQWALAQHQTKLYLLNTTKLSEELFY-----QI
    MLH1_HUMAN           LQEEINEQGHEVLREMLHNHSFVGCVNPQWALAQHQTKLYLLNTTKLSEELFY-----QI
    GENPEPT_7304079      MRKRVERQCSVQLRSTLKNLVYVGCVDERRALFQHETRLYMCNTRSFSEELFY-----QR
    GENPEPT_3880333      LRKEIIANSSQSLREMFKTSTFVGSINVKQVLIQFGTSLYHLDFSTVLREFFY-----QI
    GENPEPT_172203       ERRGSTDTEQDDEADSIYAEIEPVEINVRTPLKNSRKSISKDNYRSLSDGLTHRKFEDEI
                          :.           . :        :: :  * :  . :   :  ..   : :     : 
    
    GENPEPT_7595954      LIYDFANFGVLRLSEPAPLFDLAMLALDSPESG-------------------------WT
    GENPEPT_1724118      LIYDFANFGVLRLPEPAPLFDFAMLALDSPESG-------------------------WT
    MLH1_HUMAN           LIYDFANFGVLRLSEPAPLFDLAMLALDSPESG-------------------------WT
    GENPEPT_7304079      MIYEFQNCSEITISPPLPLKELLILSLESEAAG-------------------------WT
    GENPEPT_3880333      SVFSFGNYGSYRLDEEPPAIIEILELLGELSTREPNY---------------------AA
    GENPEPT_172203       LEYNLSTKNFKEISKNGKQMSSIISKRKSEAQENIIKNKDELEDFEQGEKYLTLTVSKND
                           :.: . .   :          :    .                               
    
    GENPEPT_7595954      EDDGPKEGLAEYIVEFLKKKAEMLADYFSVEIDEEGN--------------------LIG
    GENPEPT_1724118      EEDGPKEGLAEYIVEFLKKKAKMLADYFSVEIDEEGN--------------------LIG
    MLH1_HUMAN           EEDGPKEGLAEYIVEFLKKKAEMLADYFSLEIDEEGN--------------------LIG
    GENPEPT_7304079      PEDGDKAELADGAADILLKKAPIMREYFGLRISEDGM--------------------LES
    GENPEPT_3880333      FEVFANVENRFAAEKLLAEHADLLHDYFAIKLDQLENGR----------------LHITE
    GENPEPT_172203       FKKMEVVGQFNLGFIIVTRKVDNKSDLFIVDQHASDEKYNFETLQAVTVFKSQKLIIPQP
                          .             :: .:.    : * :                              
    
    GENPEPT_7595954      LPLLIDSYVPPLEGLPIFILR-LATEVNWDEEKECFESLSKECAMFYSIRKQYILEESTL
    GENPEPT_1724118      LPLLIDSYVPPLEGLPIFILR-LATEVNWDEE-ECFESLSKECAVFYSIRKQYILEESAL
    MLH1_HUMAN           LPLLIDNYVPPLEGLPIFILR-LATEVNWDEEKECFESLSKECAMFYSIRKQYISEESTL
    GENPEPT_7304079      LPSLLHQHRPCVAHLPVYLLR-LATEVDWEQETRCFETFCRETARFY-------------
    GENPEPT_3880333      IPSLVHYFVPQLEKLPFLIAT-LVLNVDYDDEQNTFRTICRAIGDLFTLDTN-------F
    GENPEPT_172203       VELSVIDELVVLDNLPVFEKNGFKLKIDEEEEFGSRVKLLSLPTSKQTLFDLGDFNELIH
                         :   :      :  **.     :  ::: ::*     .:                     
    
    GENPEPT_7595954      SGQQSDMPGSTSKPWKWTVEHIIYKAFRSHLLPPKHFTEDGNVLQLANLPDLYKVFERC-
    GENPEPT_1724118      SGQQSDMPGSPSKPWKWTVEHIIYKAFRSHLLPPKHFTEDGNVLQLANLPDLCKVFERC-
    MLH1_HUMAN           SGQQSEVPGSIPNSWKWTVEHIVYKALRSHILPPKHFTEDGNILQLANLPDLYKVFERC-
    GENPEPT_7304079      -AQLDWREGATAGFSRWTMEHVLFPAFKKYLLPPPRIKD--QIYELTNLPTLYKVFERC-
    GENPEPT_3880333      ITLDKKISAFSATPWKTLIKEVLMPLVKRKFIPPEHFKQAGVIRQLADSHDLYKVFERCG
    GENPEPT_172203       LIKEDGGLRRDNIRCSKIRSMFAMRACRSSIMIGKPLNKKTMTRVVHNLSELDKPWNCPH
                             .              . .     :  ::    :..      : :   * * ::   
    
    GENPEPT_7595954      -----------------------
    GENPEPT_1724118      -----------------------
    MLH1_HUMAN           -----------------------
    GENPEPT_7304079      -----------------------
    GENPEPT_3880333      T----------------------
    GENPEPT_172203       GRPTMRHLMEIRDWSSFSKDYEI
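For downstream use, the interleaved blocks above can be joined into one string per sequence. This is a minimal sketch, assuming the input is only the alignment section, beginning with the "CLUSTAL W" header line; the names `parse_clustal` and `percent_identity` are hypothetical. The identity measure shown (identical residues over columns where neither sequence has a gap) is one common convention and is not necessarily how Clustal W computes its pairwise scores.

```python
import re

# A sequence line is "<label> <residues-and-gaps>"; consensus lines start with
# a symbol or are blank after stripping, so they do not match.
SEQ_LINE = re.compile(r"^([A-Za-z0-9]\S*)\s+([A-Za-z-]+)$")


def parse_clustal(text):
    """Join interleaved Clustal blocks into {label: full aligned sequence}."""
    seqs = {}
    started = False
    for raw in text.splitlines():
        line = raw.strip()
        if not started:
            # Skip everything up to and including the "CLUSTAL ..." header.
            started = line.upper().startswith("CLUSTAL")
            continue
        match = SEQ_LINE.match(line)
        if match:
            name, chunk = match.groups()
            seqs[name] = seqs.get(name, "") + chunk
    return seqs


def percent_identity(a, b):
    """Percent identical residues over columns where neither row has a gap."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    return 100.0 * sum(x == y for x, y in pairs) / len(pairs)


sample = """
    CLUSTAL W (1.81) multiple sequence alignment

    seqA      MSFV
    seqB      MAFV
              * **

    seqA      -AG
    seqB      QAG
               **
"""
aligned = parse_clustal(sample)
print(aligned)
print(round(percent_identity(aligned["seqA"], aligned["seqB"]), 1))
```

Blocks whose consensus line is entirely blank, as in one block above, are handled the same way because only sequence lines are matched.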
    
    
    
    

    Clustal W dendrogram



    Unrooted tree (generated by Phylip's Drawtree)




    Phylip-format dendrogram

    (
    (
    GENPEPT_7595954:0.03522,
    GENPEPT_1724118:0.04800)
    :0.02570,
    (
    GENPEPT_7304079:0.24214,
    (
    GENPEPT_3880333:0.34503,
    GENPEPT_172203:0.50735)
    :0.07715)
    :0.19191,
    MLH1_HUMAN:0.05571);
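The dendrogram above is in the parenthesised (Newick-style) notation that PHYLIP reads. As a minimal sketch of how such a string can be read, the Python below parses it into nested dictionaries and sums branch lengths from the top-level node to each leaf. The function names are hypothetical, and the parser only covers the features used on this page (labels, branch lengths, nesting); it is not a complete Newick implementation.

```python
def parse_newick(text):
    """Parse a simple Newick string into nested dicts (name, length, children)."""
    s = "".join(text.split()).rstrip(";")  # drop whitespace and the final ';'
    pos = 0

    def node():
        nonlocal pos
        children = []
        if pos < len(s) and s[pos] == "(":
            pos += 1  # consume "("
            while True:
                children.append(node())
                if s[pos] == ",":
                    pos += 1
                elif s[pos] == ")":
                    pos += 1
                    break
                else:
                    raise ValueError(f"unexpected character at {pos}: {s[pos]!r}")
        start = pos
        while pos < len(s) and s[pos] not in ":,()":
            pos += 1
        name = s[start:pos]  # empty for internal nodes in this tree
        length = 0.0
        if pos < len(s) and s[pos] == ":":
            pos += 1
            start = pos
            while pos < len(s) and s[pos] not in ",()":
                pos += 1
            length = float(s[start:pos])
        return {"name": name, "length": length, "children": children}

    return node()


def leaf_depths(tree, above=0.0, out=None):
    """Sum branch lengths from the top-level node down to every leaf."""
    if out is None:
        out = {}
    total = above + tree["length"]
    if not tree["children"]:
        out[tree["name"]] = total
    for child in tree["children"]:
        leaf_depths(child, total, out)
    return out


newick = """
((GENPEPT_7595954:0.03522,GENPEPT_1724118:0.04800):0.02570,
(GENPEPT_7304079:0.24214,(GENPEPT_3880333:0.34503,GENPEPT_172203:0.50735)
:0.07715):0.19191,MLH1_HUMAN:0.05571);
"""
for name, depth in leaf_depths(parse_newick(newick)).items():
    print(f"{name:18s} {depth:.5f}")
```

Because the tree is unrooted, the top-level node is only a drawing convenience; the depths are useful mainly for adding up the path between two leaves that sit in different top-level branches.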
    
    

    Clustal W options and diagnostic messages

    Alignment type: Protein                 Alignment order: aligned                
    
                        Pairwise alignment parameters
    
    Method: accurate                        
    Matrix: Gonnet                          
    Gap open penalty: 10.00                 Gap extension penalty: 0.10             
    
                        Multiple alignment parameters
    
    Matrix: Gonnet                          Negative matrix?: no                    
    Gap open penalty: 10.00                 Gap extension penalty: 0.20             
    % identity for delay: 30                Residue-specific gap penalties: on      
    Penalize end gaps: on                   Hydrophilic gap penalties: on           
    Gap separation distance: 0              Hydrophilic residues: GPSNDQEKR         
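For readers who want to repeat the run locally, the settings above map roughly onto Clustal W command-line options. The sketch below only builds and prints an argument list; it does not run anything. The binary name `clustalw` and the input file name are assumptions, the option spellings follow the Clustal W command-line help as commonly documented and should be checked against the help output of the installed build, and the end-gap and residue-specific/hydrophilic penalty switches are left at their defaults because their mapping is not clear from this page.

```python
def clustalw_args(infile, binary="clustalw"):
    """Build an argument list mirroring the parameters reported on this page."""
    return [
        binary,                 # assumption: executable name on your system
        f"-INFILE={infile}",
        "-ALIGN",               # do a full multiple alignment
        "-TYPE=PROTEIN",        # Alignment type: Protein
        "-OUTORDER=ALIGNED",    # Alignment order: aligned
        "-PWMATRIX=GONNET",     # pairwise matrix
        "-PWGAPOPEN=10.0",      # pairwise gap open penalty
        "-PWGAPEXT=0.1",        # pairwise gap extension penalty
        "-MATRIX=GONNET",       # multiple alignment matrix
        "-GAPOPEN=10.0",        # multiple alignment gap open penalty
        "-GAPEXT=0.2",          # multiple alignment gap extension penalty
        "-MAXDIV=30",           # % identity for delay
        "-GAPDIST=0",           # gap separation distance
    ]


# "mlh1_homologs.fasta" is a hypothetical file name used for illustration.
args = clustalw_args("mlh1_homologs.fasta")
print(" ".join(args))
```

The list form can be passed to `subprocess.run` unchanged once the options have been verified for the local installation.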
    
    
    
    
     CLUSTAL W (1.81) Multiple Sequence Alignments
    
    
    
    Sequence type explicitly set to Protein
    Sequence format is Pearson
    Sequence 1: GENPEPT_7595954      760 aa
    Sequence 2: GENPEPT_1724118      757 aa
    Sequence 3: GENPEPT_7304079      664 aa
    Sequence 4: GENPEPT_3880333      779 aa
    Sequence 5: GENPEPT_172203       904 aa
    Sequence 6: MLH1_HUMAN           756 aa
    Start of Pairwise alignments
    Aligning...
    Sequences (1:2) Aligned. Score:  91
    Sequences (1:3) Aligned. Score:  50
    Sequences (1:4) Aligned. Score:  32
    Sequences (1:5) Aligned. Score:  15
    Sequences (1:6) Aligned. Score:  88
    Sequences (2:3) Aligned. Score:  48
    Sequences (2:4) Aligned. Score:  32
    Sequences (2:5) Aligned. Score:  15
    Sequences (2:6) Aligned. Score:  86
    Sequences (3:4) Aligned. Score:  33
    Sequences (3:5) Aligned. Score:  17
    Sequences (3:6) Aligned. Score:  51
    Sequences (4:5) Aligned. Score:  14
    Sequences (4:6) Aligned. Score:  32
    Sequences (5:6) Aligned. Score:  16
    Time for pairwise alignment: 1.099579
    
    Guide tree        file created:   [../tmp-dir/14440.CLUSTALW.dnd]
    Start of Multiple Alignment
    There are 5 groups
    Aligning...
    Group 1: Sequences:   2      Score:15727
    Group 2: Sequences:   3      Score:15399
    Group 3: Sequences:   4      Score:10965
    Group 4: Sequences:   5      Score:7769
    Group 5:                     Delayed
    Sequence:5     Score:5698
    Time for multiple alignment: 2.428712
    
    Alignment Score 24818
    CLUSTAL-Alignment file created  [../tmp-dir/14440.CLUSTALW.aln]
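The pairwise scores in the log above are easier to compare as a symmetric matrix. This is a minimal sketch, assuming the log lines keep the "Sequences (i:j) Aligned. Score: n" wording shown here; the function name `pairwise_matrix` is hypothetical, and the diagonal is left empty rather than assuming a self-score.

```python
import re

SCORE_LINE = re.compile(r"Sequences \((\d+):(\d+)\) Aligned\. Score:\s*(\d+)")


def pairwise_matrix(log_text):
    """Return a symmetric list-of-lists of pairwise scores (None where unknown)."""
    pairs = [(int(i), int(j), int(s)) for i, j, s in SCORE_LINE.findall(log_text)]
    if not pairs:
        return []
    n = max(max(i, j) for i, j, _ in pairs)
    matrix = [[None] * n for _ in range(n)]
    for i, j, score in pairs:
        # Sequence numbers in the log are 1-based.
        matrix[i - 1][j - 1] = score
        matrix[j - 1][i - 1] = score
    return matrix


# Three lines copied from the log above.
log = """
    Sequences (1:2) Aligned. Score:  91
    Sequences (1:3) Aligned. Score:  50
    Sequences (2:3) Aligned. Score:  48
"""
for row in pairwise_matrix(log):
    print(row)
```

Applied to the full log, the row for sequence 6 (MLH1_HUMAN) shows its highest scores against sequences 1 and 2, the mouse and rat entries.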
    
    

    Citation

      Algorithm Citation:

      Higgins, D.G., Bleasby, A.J. and Fuchs, R. (1992) CLUSTAL V: improved software for multiple sequence alignment. Computer Applications in the Biosciences (CABIOS), 8(2):189-191.

      Thompson, J.D., Higgins, D.G. and Gibson, T.J. (1994) CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Research, 22:4673-4680.

      Felsenstein, J. 1989. PHYLIP -- Phylogeny Inference Package (Version 3.2). Cladistics 5: 164-166.

      Program Citation:

      CLUSTAL W: Julie D. Thompson, Desmond G. Higgins and Toby J. Gibson, modified; any errors are due to the modifications.

      PHYLIP: Felsenstein, J. 1993. PHYLIP (Phylogeny Inference Package) version 3.5c. Distributed by the author. Department of Genetics, University of Washington, Seattle.


    Copyright (C) 1999, Board of Trustees of the University of Illinois.