Mus musculus MutL homolog 1 protein (MLH1) mRNA, complete cds.
Fasta label (*) | Workbench label |
---|---|
MLH1_HUMAN | DNA MISMATCH REPAIR PROTEIN MLH1 (MUTL PROTEIN HOMOLOG 1) [Homo sapiens (Human)] |
GENPEPT:172203 | S.cerevisiae PMS1 gene encoding DNA mismatch repair protein, |
GENPEPT:3880333 | Caenorhabditis elegans cosmid T28A8, complete sequence_ |
GENPEPT:7304079 | Drosophila melanogaster genomic scaffold 142000013386047 section 5 |
GENPEPT:1724118 | Rattus norvegicus mismatch repair protein (MLH1) mRNA, complete |
GENPEPT:7595954 | Mus musculus MutL homolog 1 protein (MLH1) mRNA, complete cds. |
(*) Clustal W cuts off Fasta labels after the first space (e.g. ">abc def" becomes ">abc"). A ":" in a label appears as "_" in the output below.
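The truncation rule in the note above can be mimicked in a few lines of Python. This is a minimal sketch, not Clustal W's own code; the function name `clustal_label` is hypothetical, and it assumes the label is everything after ">" up to the first whitespace.

```python
def clustal_label(fasta_header):
    """Return the part of a FASTA header that survives the truncation.

    Assumption: the label is everything after '>' up to the first whitespace.
    """
    text = fasta_header.lstrip(">").strip()
    return text.split(None, 1)[0] if text else ""


print(clustal_label(">abc def"))  # abc
```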
Sequence alignment
Consensus key (see documentation for details)
* - single, fully conserved residue
: - conservation of strong groups
. - conservation of weak groups
(blank) - no consensus
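The key above can be turned into a small function that assigns a symbol to one alignment column. This is a sketch under stated assumptions: the strong and weak residue groups are the ones listed in the Clustal W documentation as recalled here, so check them against the documentation for your version. The names `consensus_symbol` and `consensus_line` are hypothetical.

```python
# Residue groups assumed from the Clustal W documentation; verify before use.
STRONG_GROUPS = ["STA", "NEQK", "NHQK", "NDEQ", "QHRK", "MILV", "MILF", "HY", "FYW"]
WEAK_GROUPS = ["CSA", "ATV", "SAG", "STNK", "STPA", "SGND",
               "SNDEQK", "NDEQHK", "NEQHRK", "FVLIM", "HFY"]


def consensus_symbol(column):
    """Return '*', ':', '.' or ' ' for the residues in one alignment column."""
    residues = set(column.upper())
    if "-" in residues:
        return " "                      # a gap in the column: no consensus
    if len(residues) == 1:
        return "*"                      # single, fully conserved residue
    if any(residues <= set(group) for group in STRONG_GROUPS):
        return ":"                      # all residues fall in one strong group
    if any(residues <= set(group) for group in WEAK_GROUPS):
        return "."                      # all residues fall in one weak group
    return " "


def consensus_line(rows):
    """Build the consensus string for equally long aligned rows."""
    return "".join(consensus_symbol("".join(col)) for col in zip(*rows))


# Last four columns of the first alignment block on this page.
rows = ["NAIK", "NAIK", "NAIK", "NALK", "NAIK", "TAVK"]
print(consensus_line(rows))  # .*:*
```

The printed result matches the end of the first consensus line in the alignment.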
CLUSTAL W (1.81) multiple sequence alignment
GENPEPT_7595954 ---------------------------MAFVAGVIRRLDETVVNRIAAGEVIQRPANAIK
GENPEPT_1724118 ---------------------------MSFVAGVIRRLDETVVNRIAAGEVIQRPANAIK
MLH1_HUMAN ---------------------------MSFVAGVIRRLDETVVNRIAAGEVIQRPANAIK
GENPEPT_7304079 -------------------------MAEYLQPGVIRKLDEVVVNRIAAGEIIQRPANALK
GENPEPT_3880333 ----------MWHCGYRTRNCDEFSKIEFSLMGLIQRLPQDVVNRMAAGEVLARPCNAIK
GENPEPT_172203 MFHHIENLLIETEKRCKQKEQRYIPVKYLFSMTQIHQINDIDVHRITSGQVITDLTTAVK
*::: : *:*:::*::: .*:*
GENPEPT_7595954 EMIENCLDAKSTNIQVVVKEGGLKLIQIQDNGTGIRKEDLDIVCERFTTSKLQTFEDLAS
GENPEPT_1724118 EMTENCLDAKSTNIQVIVREGGLKLIQIQDNGTGIRKEDLDIVCERFTTSKLQTFEDLAM
MLH1_HUMAN EMIENCLDAKSTSIQVIVKEGGLKLIQIQDNGTGIRKEDLDIVCERFTTSKLQSFEDLAS
GENPEPT_7304079 ELLENSLDAQSTHIQVQVKAGGLKLLQIQDNGTGIRREDLAIVCERFTTSKLTRFEDLSQ
GENPEPT_3880333 ELVENSLDAGATEIMVNMQNGGLKLLQVSDNGKGIEREDFALVCERFATSKLQKFEDLMH
GENPEPT_172203 ELVDNSIDANANQIEIIFKDYGLESIECSDNGDGIDPSNYEFLALKHYTSKIAKFQDVAK
*: :*.:** :. * : .: **: :: .*** ** .: ::. :. ***: *:*:
GENPEPT_7595954 ISTYGFRGEALASISHVAHVTITTKTADGKCAYRASYSDGKLQAPPKPCAGNQGTLITVE
GENPEPT_1724118 ISTYGFRGEALASISHVAHVTITTKTADGKCAYRASYSDGKLQAPPKPCAGNQGTLITVE
MLH1_HUMAN ISTYGFRGEALASISHVAHVTITTKTADGKCAYRASYSDGKLKAPPKPCAGNQGTQITVE
GENPEPT_7304079 IATFGFRGEALASISHVAHLSIQTKTAKEKCGYKATYADGKLQGQPKPCAGNQGTIICIE
GENPEPT_3880333 MKTYGFRGEALASLSHVAKVNIVSKRADAKCAYQANFLDGKMTADTKPAAGKNGTCITAT
GENPEPT_172203 VQTLGFRGEALSSLCGIAKLSVITTTSPPKADKLEYDMVGHITSKTTTSR-NKGTTVLVS
: * *******:*:. :*::.: :. : *. *:: . .... ::** :
GENPEPT_7595954 DLFYNIITRRKALKNPSE-EYGKILEVVGRYSIHNSGISFSVK---KQGETVSDVRTLPN
GENPEPT_1724118 DLFYNIITRKKALKNPSE-EYGKILEVVGRYSIHNSGISFSVK---KQGETVSDVRTLPN
MLH1_HUMAN DLFYNIATRRKALKNPSE-EYGKILEVVGRYSVHNAGISFSVK---KQGETVADVRTLPN
GENPEPT_7304079 DLFYNMPQRRQALRSPAE-EFQRLSEVLARYAVHNPRVGFTLR---KQGDAQPALRTPVA
GENPEPT_3880333 DLFYNLPTRRNKMTTHGE-EAKMVNDTLLRFAIHRPDVSFALR---Q--NQAGDFRTKGD
GENPEPT_172203 QLFHNLPVRQKEFSKTFKRQFTKCLTVIQGYAIINAAIKFSVWNITPKGKKNLILSTMRN
:**:*: *:: : . : : .: ::: .. : *:: . . *
GENPEPT_7595954 ATTVDNIRSIFGNAVSRELIEVGCEDKT---------------------LAFKMNGYISN
GENPEPT_1724118 ATTVDNIRSIFGNAVSRELIEVGCEDKT---------------------LAFKMNGYISN
MLH1_HUMAN ASTVDNIRSIFGNAVSRELIEIGCEDKT---------------------LAFKMNGYISN
GENPEPT_7304079 SSRSENIRIIYGAAISKELLEFSHRDEV---------------------YKFEAECLITQ
GENPEPT_3880333 GNFRDVVCNLLGRDVADTILPLSLNSTR---------------------LKFTFTGHISK
GENPEPT_172203 SSMRKNISSVFGAGGMRGLEEVDLVLDLNPFKNRMLGKYTDDPDFLDLDYKIRVKGYISQ
.. . : : * : .. : *::
GENPEPT_7595954 ----------ANYSVKKCIFLLFINHRLVESAALRKAIETVYAAYLPKNTHPFLYLSLEI
GENPEPT_1724118 ----------ANYSVKKCIFLLFINHRLVESAALKKAIEAVYAAYLPKNTHPFLYLILEI
MLH1_HUMAN ----------ANYSVKKCIFLLFINHRLVESTSLRKAIETVYAAYLPKNTHPFLYLSLEI
GENPEPT_7304079 ----------VNYSAKKCQMLLFINQRLVESTALRTSVDSIYATYLPRGHHPFVYMSLTL
GENPEPT_3880333 PIASATAAIAQNRKTSRSFFSVFINGRSVRCDILKHPIDEVLGARQLH--AQFCALHLQI
GENPEPT_172203 -------NSFGCGRNSKDRQFIYVNKRPVEYSTLLKCCNEVYKTFN-NVQFPAVFLNLEL
.: :::* * *. * : : : . : * :
GENPEPT_7595954 SPQNVDVNVHPTKHEVHFLHEESILQRVQQHIESKLLGS--NSSRMYFTQTLLPGLAGPS
GENPEPT_1724118 SPQNVDVNVHPTKHEVHFLHEESILERVQQHIESKLLGS--NSSRMYFTQTLLPGLAGPS
MLH1_HUMAN SPQNVDVNVHPTKHEVHFLHEESILERVQQHIESKLLGS--NSSRMYFTQTLLPGLAGPS
GENPEPT_7304079 PPQNLDVNVHPTKHEVHFLYQEEIVDSIKQQVEARLLGS--NATRTFYKQLRLPGAP---
GENPEPT_3880333 DETRIDVNVHPTKNSVIFLEKEEIIEEIRAYFEKVIGEI--FGFEALDVEKPEEEQPDIE
GENPEPT_172203 PMSLIDVNVTPDKRVILLHNERAVIDIFKTTLSDYYNRQELALPKRMCSQSEQQAQKRLK
:**** * *. : : :. ::: .: .. . :
GENPEPT_7595954 GEAARPTTG-----------------VASSSTSGSGDKVYAYQMVRTDSRDQKLDAFLQP
GENPEPT_1724118 GEAVKSTTG-----------------IASSSTSGSGDKVHAYQMVRTDSRDQKLDAFMQP
MLH1_HUMAN GEMVKSTTS-----------------LTSSSTSGSSDKVYAHQMVRTDSREQKLDAFLQP
GENPEPT_7304079 --------D-----------------LDETQLADKTQRIYPKEMVRTDSTEQKLDKFLAP
GENPEPT_3880333 NLVMIPMSQSLKSIEAIRKPDTKPEFKSSPSAWKSDKKRVDYMEVRTDAKERKIDEFVTR
GENPEPT_172203 TEVFDDRSTTHESDNE----------NYHTARSESNQSNHAHFNSTTGVIDKSNGTELTS
. . . *. ::. . :
GENPEPT_7595954 VSSLVPSQPQDPAPVRGARTEGSPERATREDEEMLALPAPAEAAAESENLERESLMETSD
GENPEPT_1724118 VSRRLPSQPQD--PVPGNRTEGSPEKAMQKDQEISELPAPMEAAADSASLERESVIGASE
MLH1_HUMAN LSKPLSSQPQ--AIVTEDKTDISSGRARQQDEEMLELPAPAEVAAKNQSLEGDTTKGTSE
GENPEPT_7304079 LVK---------------------------------------------------------
GENPEPT_3880333 GGAVGPTTSND-------------------DIFGGSGILKRARTEDSTGGEKEPEDLNTD
GENPEPT_172203 VMDGNYTNVTDVIGSECEVSVDSSVVLDEGNSSTPTKKLPSIKTDSQNLSDLNLNNFSNP
GENPEPT_7595954 AAQKAAPTSSPGSSRKRHREDSDVEMVENASGKEMTAACYP-------RRRIINLTSVLS
GENPEPT_1724118 VVAPQRHPSSPGSSRKRHPEDSDVEMMENDSRKEMTAACYP-------RRRIINLTSVLS
MLH1_HUMAN MSEKRGPTSS--NPRKRHREDSDVEMVEDDSRKEMTAACTP-------RRRIINLTSVLS
GENPEPT_7304079 --SDSGVSSSSSQEASRLPEES------------FRVTAAK-------KSREVRLSSVLD
GENPEPT_3880333 FDDVSMVSLVSTADGRRLNESQDLG-----EDDDVDFEYGK-------THREFHFESIEV
GENPEPT_172203 EFQNITSPDKARSLEKVVEEPVYFDIDGEKFQEKAVLSQADGLVFVDNECHEHTNDCCHQ
. * : .
GENPEPT_7595954 LQEEISERCHETLREILRNHSFVGCVNPQWALAQHQTKLYLLNTTKLSEELFY-----QI
GENPEPT_1724118 LQEEINDRGHETLREMLRNHTFVGCVNPQWALAQHQTKLYLLNTTKLSEELFY-----QI
MLH1_HUMAN LQEEINEQGHEVLREMLHNHSFVGCVNPQWALAQHQTKLYLLNTTKLSEELFY-----QI
GENPEPT_7304079 MRKRVERQCSVQLRSTLKNLVYVGCVDERRALFQHETRLYMCNTRSFSEELFY-----QR
GENPEPT_3880333 LRKEIIANSSQSLREMFKTSTFVGSINVKQVLIQFGTSLYHLDFSTVLREFFY-----QI
GENPEPT_172203 ERRGSTDTEQDDEADSIYAEIEPVEINVRTPLKNSRKSISKDNYRSLSDGLTHRKFEDEI
:. . : :: : * : . : : .. : : :
GENPEPT_7595954 LIYDFANFGVLRLSEPAPLFDLAMLALDSPESG-------------------------WT
GENPEPT_1724118 LIYDFANFGVLRLPEPAPLFDFAMLALDSPESG-------------------------WT
MLH1_HUMAN LIYDFANFGVLRLSEPAPLFDLAMLALDSPESG-------------------------WT
GENPEPT_7304079 MIYEFQNCSEITISPPLPLKELLILSLESEAAG-------------------------WT
GENPEPT_3880333 SVFSFGNYGSYRLDEEPPAIIEILELLGELSTREPNY---------------------AA
GENPEPT_172203 LEYNLSTKNFKEISKNGKQMSSIISKRKSEAQENIIKNKDELEDFEQGEKYLTLTVSKND
:.: . . : : .
GENPEPT_7595954 EDDGPKEGLAEYIVEFLKKKAEMLADYFSVEIDEEGN--------------------LIG
GENPEPT_1724118 EEDGPKEGLAEYIVEFLKKKAKMLADYFSVEIDEEGN--------------------LIG
MLH1_HUMAN EEDGPKEGLAEYIVEFLKKKAEMLADYFSLEIDEEGN--------------------LIG
GENPEPT_7304079 PEDGDKAELADGAADILLKKAPIMREYFGLRISEDGM--------------------LES
GENPEPT_3880333 FEVFANVENRFAAEKLLAEHADLLHDYFAIKLDQLENGR----------------LHITE
GENPEPT_172203 FKKMEVVGQFNLGFIIVTRKVDNKSDLFIVDQHASDEKYNFETLQAVTVFKSQKLIIPQP
. :: .:. : * :
GENPEPT_7595954 LPLLIDSYVPPLEGLPIFILR-LATEVNWDEEKECFESLSKECAMFYSIRKQYILEESTL
GENPEPT_1724118 LPLLIDSYVPPLEGLPIFILR-LATEVNWDEE-ECFESLSKECAVFYSIRKQYILEESAL
MLH1_HUMAN LPLLIDNYVPPLEGLPIFILR-LATEVNWDEEKECFESLSKECAMFYSIRKQYISEESTL
GENPEPT_7304079 LPSLLHQHRPCVAHLPVYLLR-LATEVDWEQETRCFETFCRETARFY-------------
GENPEPT_3880333 IPSLVHYFVPQLEKLPFLIAT-LVLNVDYDDEQNTFRTICRAIGDLFTLDTN-------F
GENPEPT_172203 VELSVIDELVVLDNLPVFEKNGFKLKIDEEEEFGSRVKLLSLPTSKQTLFDLGDFNELIH
: : : **. : ::: ::* .:
GENPEPT_7595954 SGQQSDMPGSTSKPWKWTVEHIIYKAFRSHLLPPKHFTEDGNVLQLANLPDLYKVFERC-
GENPEPT_1724118 SGQQSDMPGSPSKPWKWTVEHIIYKAFRSHLLPPKHFTEDGNVLQLANLPDLCKVFERC-
MLH1_HUMAN SGQQSEVPGSIPNSWKWTVEHIVYKALRSHILPPKHFTEDGNILQLANLPDLYKVFERC-
GENPEPT_7304079 -AQLDWREGATAGFSRWTMEHVLFPAFKKYLLPPPRIKD--QIYELTNLPTLYKVFERC-
GENPEPT_3880333 ITLDKKISAFSATPWKTLIKEVLMPLVKRKFIPPEHFKQAGVIRQLADSHDLYKVFERCG
GENPEPT_172203 LIKEDGGLRRDNIRCSKIRSMFAMRACRSSIMIGKPLNKKTMTRVVHNLSELDKPWNCPH
. . . : :: :.. : : * * ::
GENPEPT_7595954 -----------------------
GENPEPT_1724118 -----------------------
MLH1_HUMAN -----------------------
GENPEPT_7304079 -----------------------
GENPEPT_3880333 T----------------------
GENPEPT_172203 GRPTMRHLMEIRDWSSFSKDYEI
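The blocked layout above can be stitched back into one row per sequence. The following is a minimal sketch that assumes a well-formed Clustal-format file, in which consensus lines start with whitespace; the function name `parse_clustal` is hypothetical, and the sample is a shortened excerpt of two sequences from this page.

```python
def parse_clustal(text):
    """Collect the aligned rows of each sequence from Clustal-format text."""
    rows = {}
    for line in text.splitlines():
        # Skip blank lines, the header, and consensus lines (which start
        # with whitespace in a well-formed file).
        if not line.strip() or line.startswith("CLUSTAL") or line[0].isspace():
            continue
        fields = line.split()
        if len(fields) < 2:
            continue
        name, chunk = fields[0], fields[1]
        rows[name] = rows.get(name, "") + chunk
    return rows


sample = """CLUSTAL W (1.81) multiple sequence alignment

GENPEPT_7595954 EMIENCLDAK
MLH1_HUMAN      EMIENCLDAK
                **********

GENPEPT_7595954 ISTYGFRGEA
MLH1_HUMAN      ISTYGFRGEA
                **********
"""

alignment = parse_clustal(sample)
for name, row in alignment.items():
    # The ungapped length is the number of residues, ignoring '-' gap characters.
    print(name, row, len(row.replace("-", "")))
```

Applied to the complete alignment, the ungapped length of each row could be compared with the sequence lengths reported in the diagnostic messages.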
Clustal W dendrogram
Unrooted tree (generated by Phylip's Drawtree)
Phylip-format dendrogram
(
(
GENPEPT_7595954:0.03522,
GENPEPT_1724118:0.04800)
:0.02570,
(
GENPEPT_7304079:0.24214,
(
GENPEPT_3880333:0.34503,
GENPEPT_172203:0.50735)
:0.07715)
:0.19191,
MLH1_HUMAN:0.05571);
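The dendrogram above is in Newick (Phylip) notation, where each label or closing parenthesis is followed by a branch length. As a rough illustration, the sketch below parses the tree and sums the branch lengths on the path between two leaves. The helper names `parse_newick` and `path_length` and the synthetic internal node ids are hypothetical; this is not how Phylip itself processes the tree.

```python
import re

TREE = """((GENPEPT_7595954:0.03522,GENPEPT_1724118:0.04800):0.02570,
(GENPEPT_7304079:0.24214,(GENPEPT_3880333:0.34503,GENPEPT_172203:0.50735)
:0.07715):0.19191,MLH1_HUMAN:0.05571);"""


def parse_newick(newick):
    """Return (parent, length) dicts describing the tree in a Newick string."""
    tokens = re.findall(r"[(),;]|:[^(),;]+|[^(),:;]+", "".join(newick.split()))
    parent, length, stack = {}, {}, []
    last, counter = None, 0
    for tok in tokens:
        if tok == "(":
            counter += 1
            node = f"#{counter}"          # synthetic id for an internal node
            parent[node] = stack[-1] if stack else None
            stack.append(node)
        elif tok == ")":
            last = stack.pop()            # a branch length may follow
        elif tok.startswith(":"):
            length[last] = float(tok[1:])
        elif tok not in (",", ";"):
            parent[tok] = stack[-1] if stack else None
            last = tok
    return parent, length


def path_length(parent, length, a, b):
    """Sum the branch lengths on the path between leaves a and b."""
    dist_from_a, node, total = {}, a, 0.0
    while node is not None:               # walk from a up to the root
        dist_from_a[node] = total
        total += length.get(node, 0.0)
        node = parent[node]
    node, total = b, 0.0
    while node not in dist_from_a:        # walk from b up to a shared node
        total += length.get(node, 0.0)
        node = parent[node]
    return total + dist_from_a[node]


parent, length = parse_newick(TREE)
mouse_rat = path_length(parent, length, "GENPEPT_7595954", "GENPEPT_1724118")
mouse_human = path_length(parent, length, "GENPEPT_7595954", "MLH1_HUMAN")
print(round(mouse_rat, 5), round(mouse_human, 5))  # 0.08322 0.11663
```

The mouse and rat sequences are separated by the shortest path in the tree, consistent with their position as sister leaves.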
Clustal W options and diagnostic messages
Alignment type: Protein
Alignment order: aligned
Pairwise alignment parameters
Method: accurate
Matrix: Gonnet
Gap open penalty: 10.00
Gap extension penalty: 0.10
Multiple alignment parameters
Matrix: Gonnet
Negative matrix?: no
Gap open penalty: 10.00
Gap extension penalty: 0.20
% identity for delay: 30
Residue-specific gap penalties: on
Penalize end gaps: on
Hydrophilic gap penalties: on
Gap separation distance: 0
Hydrophilic residues: GPSNDQEKR
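For readers who want to rerun a comparable alignment locally, the parameters above roughly correspond to a Clustal W command line. The sketch below only builds the argument list and does not run anything. The executable name `clustalw`, the input file name `mlh1.fasta`, and the helper `clustalw_args` are assumptions, and the option names are taken from the Clustal W command-line help as recalled here, so verify them with `clustalw -options` for your installed version. The accurate pairwise method, the end-gap setting, and the residue-specific and hydrophilic gap penalties are left out on the assumption that they match the program defaults.

```python
def clustalw_args(infile):
    """Build an argument list mirroring the reported options (assumed flags)."""
    return [
        "clustalw",                     # assumed executable name
        f"-INFILE={infile}",
        "-ALIGN",
        "-TYPE=PROTEIN",
        "-OUTORDER=ALIGNED",
        # Pairwise alignment parameters
        "-PWMATRIX=GONNET", "-PWGAPOPEN=10.00", "-PWGAPEXT=0.10",
        # Multiple alignment parameters
        "-MATRIX=GONNET", "-GAPOPEN=10.00", "-GAPEXT=0.20",
        "-MAXDIV=30",                   # % identity for delay
        "-GAPDIST=0",                   # gap separation distance
        "-HGAPRESIDUES=GPSNDQEKR",      # hydrophilic residues
    ]


args = clustalw_args("mlh1.fasta")      # hypothetical input file
print(" ".join(args))
```

The list could be passed to `subprocess.run` once the flags have been confirmed.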
CLUSTAL W (1.81) Multiple Sequence Alignments
Sequence type explicitly set to Protein
Sequence format is Pearson
Sequence 1: GENPEPT_7595954 760 aa
Sequence 2: GENPEPT_1724118 757 aa
Sequence 3: GENPEPT_7304079 664 aa
Sequence 4: GENPEPT_3880333 779 aa
Sequence 5: GENPEPT_172203 904 aa
Sequence 6: MLH1_HUMAN 756 aa
Start of Pairwise alignments
Aligning...
Sequences (1:2) Aligned. Score: 91
Sequences (1:3) Aligned. Score: 50
Sequences (1:4) Aligned. Score: 32
Sequences (1:5) Aligned. Score: 15
Sequences (1:6) Aligned. Score: 88
Sequences (2:3) Aligned. Score: 48
Sequences (2:4) Aligned. Score: 32
Sequences (2:5) Aligned. Score: 15
Sequences (2:6) Aligned. Score: 86
Sequences (3:4) Aligned. Score: 33
Sequences (3:5) Aligned. Score: 17
Sequences (3:6) Aligned. Score: 51
Sequences (4:5) Aligned. Score: 14
Sequences (4:6) Aligned. Score: 32
Sequences (5:6) Aligned. Score: 16
Time for pairwise alignment: 1.099579
Guide tree file created: [../tmp-dir/14440.CLUSTALW.dnd]
Start of Multiple Alignment
There are 5 groups
Aligning...
Group 1: Sequences: 2 Score:15727
Group 2: Sequences: 3 Score:15399
Group 3: Sequences: 4 Score:10965
Group 4: Sequences: 5 Score:7769
Group 5: Delayed
Sequence:5 Score:5698
Time for multiple alignment: 2.428712
Alignment Score 24818
CLUSTAL-Alignment file created [../tmp-dir/14440.CLUSTALW.aln]
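The pairwise scores reported in the log above are easier to compare once they are parsed into a table keyed by sequence pair. The following sketch does that with a regular expression; the name `parse_pairwise` is hypothetical, and the log text is copied from the messages above.

```python
import re

LOG = """\
Sequences (1:2) Aligned. Score: 91
Sequences (1:3) Aligned. Score: 50
Sequences (1:4) Aligned. Score: 32
Sequences (1:5) Aligned. Score: 15
Sequences (1:6) Aligned. Score: 88
Sequences (2:3) Aligned. Score: 48
Sequences (2:4) Aligned. Score: 32
Sequences (2:5) Aligned. Score: 15
Sequences (2:6) Aligned. Score: 86
Sequences (3:4) Aligned. Score: 33
Sequences (3:5) Aligned. Score: 17
Sequences (3:6) Aligned. Score: 51
Sequences (4:5) Aligned. Score: 14
Sequences (4:6) Aligned. Score: 32
Sequences (5:6) Aligned. Score: 16
"""


def parse_pairwise(log):
    """Map each (i, j) sequence pair to its reported pairwise score."""
    pattern = re.compile(r"Sequences \((\d+):(\d+)\) Aligned\. Score:\s*(\d+)")
    return {(int(i), int(j)): int(s) for i, j, s in pattern.findall(log)}


scores = parse_pairwise(LOG)
best = max(scores, key=scores.get)
worst = min(scores, key=scores.get)
print(best, scores[best])     # (1, 2) 91
print(worst, scores[worst])   # (4, 5) 14
```

Sequences 1 and 2 (GENPEPT_7595954 and GENPEPT_1724118) score highest, and every pair involving sequence 5 (GENPEPT_172203) scores below 20.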
Citation
Algorithm Citation:
Higgins, D.G., Bleasby, A.J. and Fuchs, R. (1992) CLUSTAL V: improved
software for multiple sequence alignment. Computer Applications in the
Biosciences (CABIOS), 8(2):189-191.
Thompson, J.D., Higgins, D.G. and Gibson, T.J. (1994) CLUSTAL W: improving
the sensitivity of progressive multiple sequence alignment through sequence
weighting, position-specific gap penalties and weight matrix choice.
Nucleic Acids Research, 22:4673-4680.
Felsenstein, J. 1989. PHYLIP -- Phylogeny Inference Package
(Version 3.2). Cladistics 5: 164-166.
Program Citation:
CLUSTAL W: Julie D. Thompson, Desmond G. Higgins and Toby J. Gibson,
modified; any errors are due to the modifications.
PHYLIP: Felsenstein, J. 1993. PHYLIP (Phylogeny Inference Package)
version 3.5c. Distributed by the author. Department of Genetics,
University of Washington, Seattle.
Copyright (C) 1999, Board of Trustees of the University of Illinois.