1. Add the MLH1_HUMAN protein to the Biology Workbench. Predict its secondary structure with GOR4.

>MLH1_HUMAN
MSFVAGVIRRLDETVVNRIAAGEVIQRPANAIKEMIENCLDAKSTSIQVI
VKEGGLKLIQIQDNGTGIRKEDLDIVCERFTTSKLQSFEDLASISTYGFR
GEALASISHVAHVTITTKTADGKCAYRASYSDGKLKAPPKPCAGNQGTQI
TVEDLFYNIATRRKALKNPSEEYGKILEVVGRYSVHNAGISFSVKKQGET
VADVRTLPNASTVDNIRSIFGNAVSRELIEIGCEDKTLAFKMNGYISNAN
YSVKKCIFLLFINHRLVESTSLRKAIETVYAAYLPKNTHPFLYLSLEISP
QNVDVNVHPTKHEVHFLHEESILERVQQHIESKLLGSNSSRMYFTQTLLP
GLAGPSGEMVKSTTSLTSSSTSGSSDKVYAHQMVRTDSREQKLDAFLQPL
SKPLSSQPQAIVTEDKTDISSGRARQQDEEMLELPAPAEVAAKNQSLEGD
TTKGTSEMSEKRGPTSSNPRKRHREDSDVEMVEDDSRKEMTAACTPRRRI
INLTSVLSLQEEINEQGHEVLREMLHNHSFVGCVNPQWALAQHQTKLYLL
NTTKLSEELFYQILIYDFANFGVLRLSEPAPLFDLAMLALDSPESGWTEE
DGPKEGLAEYIVEFLKKKAEMLADYFSLEIDEEGNLIGLPLLIDNYVPPL
EGLPIFILRLATEVNWDEEKECFESLSKECAMFYSIRKQYISEESTLSGQ
QSEVPGSIPNSWKWTVEHIVYKALRSHILPPKHFTEDGNILQLANLPDLY
KVFERC
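
Before submitting the record to GOR4, it can help to confirm that the wrapped FASTA lines join into one uninterrupted sequence. The following is a minimal sketch, not part of the Workbench itself; the function name `read_fasta` and the two-line demo record are illustrative assumptions.

```python
def read_fasta(text):
    """Parse FASTA-formatted text into a {label: sequence} dictionary."""
    records = {}
    label = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # Header line: keep the first word after '>' as the label.
            label = line[1:].split()[0]
            records[label] = []
        elif label is not None:
            # Sequence line: collect it under the current label.
            records[label].append(line)
    return {name: "".join(parts) for name, parts in records.items()}


# Only the first two sequence lines of the record above, as a short demo.
demo = """>MLH1_HUMAN
MSFVAGVIRRLDETVVNRIAAGEVIQRPANAIKEMIENCLDAKSTSIQVI
VKEGGLKLIQIQDNGTGIRKEDLDIVCERFTTSKLQSFEDLASISTYGFR
"""

seqs = read_fasta(demo)
print(len(seqs["MLH1_HUMAN"]))  # length of the joined demo sequence
```

Running the same function on the full record gives the total residue count, which is a quick check that no line was lost when pasting.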


2. Run a homology search for MLH1_HUMAN against the GenPept Full Release database. Import the MLH1-like proteins of C. elegans, S. cerevisiae, D. melanogaster, R. norvegicus and M. musculus into your workbench. Run CLUSTALW to produce a multiple sequence alignment of these six proteins.

Sequence alignment

Consensus key (see documentation for details)
* - single, fully conserved residue
: - conservation of strong groups
. - conservation of weak groups
  - no consensus
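
The key above can be read as a per-column rule. The sketch below shows one way to reproduce the symbols for a block of aligned rows. The strong and weak group lists are written from memory of the ClustalW documentation and should be treated as an assumption to check against the documentation for version 1.81. The function names are illustrative.

```python
# Residue groups (assumed; verify against the ClustalW documentation).
STRONG = ["STA", "NEQK", "NHQK", "NDEQ", "QHRK", "MILV", "MILF", "HY", "FYW"]
WEAK = ["CSA", "ATV", "SAG", "STNK", "STPA", "SGND",
        "SNDEQK", "NDEQHK", "NEQHRK", "FVLIM", "HFY"]


def column_symbol(column):
    """Return the consensus symbol for one alignment column."""
    residues = set(column)
    if "-" in residues:
        return " "  # any gap in the column means no consensus
    if len(residues) == 1:
        return "*"  # single, fully conserved residue
    if any(residues <= set(group) for group in STRONG):
        return ":"  # all residues fall inside one strong group
    if any(residues <= set(group) for group in WEAK):
        return "."  # all residues fall inside one weak group
    return " "


def consensus_line(rows):
    """Build the consensus line for equal-length aligned rows."""
    return "".join(column_symbol(col) for col in zip(*rows))


# First eight columns of the last alignment block in this document.
rows = ["RSHLLPPK", "RSHLLPPK", "RSHILPPK", "KKYLLPPR", "KRRFLAPR", "KRKFIPPE"]
print(repr(consensus_line(rows)))
```

For these eight columns the sketch yields the same symbols as the start of the consensus line under the last block of the alignment.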


CLUSTAL W (1.81) multiple sequence alignment


GENPEPT_1724118      -----------------MSFVAGVIRRLDETVVNRIAAGEVIQRPANAIKEMTENCLDAK
GENPEPT_7595954      -----------------MAFVAGVIRRLDETVVNRIAAGEVIQRPANAIKEMIENCLDAK
MLH1_HUMAN           -----------------MSFVAGVIRRLDETVVNRIAAGEVIQRPANAIKEMIENCLDAK
GENPEPT_3192877      ---------------MAEYLQPGVIRKLDEVVVNRIAAGEIIQRPANALKELLENSLDAQ
GENPEPT_460627       --------------------MSLRIKALDASVVNKIAAGEIIISPVNALKEMMENSIDAN
GENPEPT_3880333      MWHCGYRTRNCDEFSKIEFSLMGLIQRLPQDVVNRMAAGEVLARPCNAIKELVENSLDAG
                                             *: *   ***::****::  * **:**: **.:** 

GENPEPT_1724118      STNIQVIVREGGLKLIQIQDNGTGIRKEDLDIVCERFTTSKLQTFEDLAMISTYGFRGEA
GENPEPT_7595954      STNIQVVVKEGGLKLIQIQDNGTGIRKEDLDIVCERFTTSKLQTFEDLASISTYGFRGEA
MLH1_HUMAN           STSIQVIVKEGGLKLIQIQDNGTGIRKEDLDIVCERFTTSKLQSFEDLASISTYGFRGEA
GENPEPT_3192877      STHIQVQVKAGGLKLLQIQDNGTGIRREDLAIVCERFTTSKLTRFEDLSQIATFGFRGEA
GENPEPT_460627       ATMIDILVKEGGIKVLQITDNGSGINKADLPILCERFTTSKLQKFEDLSQIQTYGFRGEA
GENPEPT_3880333      ATEIMVNMQNGGLKLLQVSDNGKGIEREDFALVCERFATSKLQKFEDLMHMKTYGFRGEA
                     :* * : :: **:*::*: ***.**.: *: ::****:****  ****  : *:******

GENPEPT_1724118      LASISHVAHVTITTKTADGKCAYRASYSDGKLQAPPKPCAGNQGTLITVEDLFYNIITRK
GENPEPT_7595954      LASISHVAHVTITTKTADGKCAYRASYSDGKLQAPPKPCAGNQGTLITVEDLFYNIITRR
MLH1_HUMAN           LASISHVAHVTITTKTADGKCAYRASYSDGKLKAPPKPCAGNQGTQITVEDLFYNIATRR
GENPEPT_3192877      LASISHVAHLSIQTKTAKEKCGYKATYADGKLQGQPKPCAGNQGTIICIEDLFYNMPQRR
GENPEPT_460627       LASISHVARVTVTTKVKEDRCAWRVSYAEGKMLESPKPVAGKDGTTILVEDLFFNIPSRL
GENPEPT_3880333      LASLSHVAKVNIVSKRADAKCAYQANFLDGKMTADTKPAAGKNGTCITATDLFYNLPTRR
                     ***:****::.: :*  . :*.::..: :**:   .** **::** *   ***:*:  * 

GENPEPT_1724118      KALKNPSEEYGKILEVVGRYSIHNSGISFSVKKQGETVSDVRTLPNATTVDNIRSIFGNA
GENPEPT_7595954      KALKNPSEEYGKILEVVGRYSIHNSGISFSVKKQGETVSDVRTLPNATTVDNIRSIFGNA
MLH1_HUMAN           KALKNPSEEYGKILEVVGRYSVHNAGISFSVKKQGETVADVRTLPNASTVDNIRSIFGNA
GENPEPT_3192877      QALRSPAEEFQRLSEVLARYAVHNPRVGFTLRKQGDAQPALRTPVASSRSENIRIIYGAA
GENPEPT_460627       RALRSHNDEYSKILDVVGRYAIHSKDIGFSCKKFGDSNYSLSVKPSYTVQDRIRTVFNKS
GENPEPT_3880333      NKMTTHGEEAKMVNDTLLRFAIHRPDVSFALRQ--NQAGDFRTKGDGNFRDVVCNLLGRD
                     . : .  :*   : :.: *:::*   :.*: ::  :    . .    .  : :  : .  

GENPEPT_1724118      VSRELIEVG-CEDKTLAFK-MNGYISNANYSVKKCI----------FLLFINHRLVESAA
GENPEPT_7595954      VSRELIEVG-CEDKTLAFK-MNGYISNANYSVKKCI----------FLLFINHRLVESAA
MLH1_HUMAN           VSRELIEIG-CEDKTLAFK-MNGYISNANYSVKKCI----------FLLFINHRLVESTS
GENPEPT_3192877      ISKELLEFS-HRDEVYKFE-AECLITQVNYSAKKCQ----------MLLFINQRLVESTA
GENPEPT_460627       VASNLITFHISKVEDLNLESVDGKVCNLNFISKKSIS---------LIFFINNRLVTCDL
GENPEPT_3880333      VADTILPLS-LNSTRLKFT-FTGHISKPIASATAAIAQNRKTSRSFFSVFINGRSVRCDI
                     ::  :: .   .     :      : :     . .           : .*** * * .  

GENPEPT_1724118      LKKAIEAVYAAYLPKNTHPFLYLILEISPQNVDVNVHPTKHEVHFLHEESILERVQQHIE
GENPEPT_7595954      LRKAIETVYAAYLPKNTHPFLYLSLEISPQNVDVNVHPTKHEVHFLHEESILQRVQQHIE
MLH1_HUMAN           LRKAIETVYAAYLPKNTHPFLYLSLEISPQNVDVNVHPTKHEVHFLHEESILERVQQHIE
GENPEPT_3192877      LRTSVDSIYATYLPRGHHPFVYMSLTLPPQNLDVNVHPTKHEVHFLYQEEIVDSIKQQVE
GENPEPT_460627       LRRALNSVYSNYLPKGFRPFIYLGIVIDPAAVDVNVHPTKREVRFLSQDEIIEKIANQLH
GENPEPT_3880333      LKHPIDEVLG--ARQLHAQFCALHLQIDETRIDVNVHPTKNSVIFLEKEEIIEEIRAYFE
                     *: .:: : .    :    *  : : :    :********..* ** ::.*:: :   ..

GENPEPT_1724118      SKLLGSNSSRMYFTQTLLPGLAG------PSGEAVKSTTGIASSSTSGSGDKVHAYQMVR
GENPEPT_7595954      SKLLGSNSSRMYFTQTLLPGLAG------PSGEAARPTTGVASSSTSGSGDKVYAYQMVR
MLH1_HUMAN           SKLLGSNSSRMYFTQTLLPGLAG------PSGEMVKSTTSLTSSSTSGSSDKVYAHQMVR
GENPEPT_3192877      ARLLGSNATRTFYKQLRLPGAP-----------------DLDETQLADKTQRIYPKEMVR
GENPEPT_460627       AELSAIDTSRTFKASSISTNKPESLIPFNDTIESDRNRKSLRQAQVVENSYTTANSQLRK
GENPEPT_3880333      KVIGEIFGFEALDVEKPEEEQPD--------IENLVMIPMSQSLKSIEAIRKPDTKPEFK
                       :      .    .      .                    . .              :

GENPEPT_1724118      TDSRDQKLDAFMQPVSRRLPSQPQD--PVPGNRTEGSPEKAMQKDQEISELPAPMEAAAD
GENPEPT_7595954      TDSRDQKLDAFLQPVSSLVPSQPQDPAPVRGARTEGSPERATREDEEMLALPAPAEAAAE
MLH1_HUMAN           TDSREQKLDAFLQPLSKPLSSQPQ--AIVTEDKTDISSGRARQQDEEMLELPAPAEVAAK
GENPEPT_3192877      TDSTEQKLDKFLAPLVK-------------------------------------------
GENPEPT_460627       AKRQENKLVRIDASQAKITSFLSSS--QQFNFEGSSTKRQLSEPKVTNVSHSQEAEKLTL
GENPEPT_3880333      SSPSAWKSDKKRVDYMEVRTDAKERKIDEFVTRGGAVGPTTSNDDIFGGSGILKRARTED
                     :.    *                                                     

GENPEPT_1724118      SASLERESVIGASEVVAPQRHPSSPGSSRKRHPEDSDVEMMENDSRKEMTAACYPRRRII
GENPEPT_7595954      SENLERESLMETSDAAQKAAPTSSPGSSRKRHREDSDVEMVENASGKEMTAACYPRRRII
MLH1_HUMAN           NQSLEGDTTKGTSEMSEKRGPTSS--NPRKRHREDSDVEMVEDDSRKEMTAACTPRRRII
GENPEPT_3192877      ----------------SDSGVSSSSSQEASRLPEES------------FRVTAAKKSREV
GENPEPT_460627       NESEQPRDANTINDNDLKDQPKKKQKLGDYKVPSIADDEKNALPISKDGYIRVPKERVNV
GENPEPT_3880333      STGGEKEPEDLNTDFDDVSMVSLVSTADGRRLNESQD-----LGEDDDVDFEYGKTHREF
                                                   :  .                         .

GENPEPT_1724118      NLTSVLSLQEEINDRGHETLREMLRNHTFVGCVNPQW--ALAQHQTKLYLLNTTKLSEEL
GENPEPT_7595954      NLTSVLSLQEEISERCHETLREILRNHSFVGCVNPQW--ALAQHQTKLYLLNTTKLSEEL
MLH1_HUMAN           NLTSVLSLQEEINEQGHEVLREMLHNHSFVGCVNPQW--ALAQHQTKLYLLNTTKLSEEL
GENPEPT_3192877      RLSSVLDMRKRVERQCSVQLRSTLKNLVYVGCVDERR--ALFQHETRLYMCNTRSFSEEL
GENPEPT_460627       NLTSIKKLREKVDDSIHRELTDIFANLNYVGVVDEERRLAAIQHDLKLFLIDYGSVCYEL
GENPEPT_3880333      HFESIEVLRKEIIANSSQSLREMFKTSTFVGSINVKQ--VLIQFGTSLYHLDFSTVLREF
                     .: *:  :::.:       * . : .  :** :: .   .  *.   *:  :  ..  *:

GENPEPT_1724118      FYQILIYDFANFGVLRLPEPAPLFDFAMLALDSPESGWTEEDGPKEGLA-----EYIVEF
GENPEPT_7595954      FYQILIYDFANFGVLRLSEPAPLFDLAMLALDSPESGWTEDDGPKEGLA-----EYIVEF
MLH1_HUMAN           FYQILIYDFANFGVLRLSEPAPLFDLAMLALDSPESGWTEEDGPKEGLA-----EYIVEF
GENPEPT_3192877      FYQRMIYEFQNCSEITICPPLPLKELLILSLESRAAGWTPEDEDKAELA-----DGAADI
GENPEPT_460627       FYQIGLTDFANFGKINLQSTNVSDDIVLYNLLSEFDELN-DDASK---------EKIISK
GENPEPT_3880333      FYQISVFSFGNYGSYRLDE-EPPAIIEILELLGELSTREPNYAAFEVFANVENRFAAEKL
                     ***  : .* * .   :        : :  * .       :                 . 

GENPEPT_1724118      LKKKAKMLADYFSVEIDEEGN--------LIGLPLLIDSYVPPLEGLPIFILRLATEVNW
GENPEPT_7595954      LKKKAEMLADYFSVEIDEEGN--------LIGLPLLIDSYVPPLEGLPIFILRLATEVNW
MLH1_HUMAN           LKKKAEMLADYFSLEIDEEGN--------LIGLPLLIDNYVPPLEGLPIFILRLATEVNW
GENPEPT_3192877      LLKKAPIMREYFGLRISEDGM--------LESLPSLLHQHRPCVAHLPVYLLRLATEVDW
GENPEPT_460627       IWDMSSMLNEYYSIELVNDGLDNDLKSVKLKSLPLLLKGYIPSLVKLPFFIYRLGKEVDW
GENPEPT_3880333      LAEHADLLHDYFAIKLDQLENGR----LHITEIPSLVHYFVPQLEKLPFLIATLVLNVDY
                     : . : :: :*:.:.: :           :  :* *:. . * :  **. :  *  :*::

GENPEPT_1724118      DEE-ECFESLSKECAVFYSIRKQYILEESALSGQQSDMPGSPSKPWKWT--VEHIIYKAF
GENPEPT_7595954      DEEKECFESLSKECAMFYSIRKQYILEESTLSGQQSDMPGSTSKPWKWT--VEHIIYKAF
MLH1_HUMAN           DEEKECFESLSKECAMFYSIRKQYISEESTLSGQQSEVPGSIPNSWKWT--VEHIVYKAL
GENPEPT_3192877      EQETRCFETFCRETARFY--------------AQLDWREGATAVFSRWT--MEHVLFPAF
GENPEPT_460627       EDEQECLDGILREIALLYIPDMVPKVDTLDASLSEDEKAQFINRKEHISSLLEHVLFPCI
GENPEPT_3880333      DDEQNTFRTICRAIGDLFTLDTN---------FITLDKKISAFSATPWKTLIKEVLMPLV
                     ::* . :  : :  . ::                              .  ::.::   .

GENPEPT_1724118      RSHLLPPKHFTEDGNVLQLANLPDLCKVFERC--
GENPEPT_7595954      RSHLLPPKHFTEDGNVLQLANLPDLYKVFERC--
MLH1_HUMAN           RSHILPPKHFTEDGNILQLANLPDLYKVFERC--
GENPEPT_3192877      KKYLLPPR---IKDQIYELTNLPTLYKVFERC--
GENPEPT_460627       KRRFLAPRHILKD--VVEIANLPDLYKVFERC--
GENPEPT_3880333      KRKFIPPEHFKQAGVIRQLADSHDLYKVFERCGT
                     :  ::.*.       : ::::   * ******  
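
One simple way to summarize an alignment like the one above is the pairwise percent identity between two aligned rows. The sketch below counts identical residues over the columns where neither row has a gap. This is only one of several possible definitions of identity, and the function name `percent_identity` is an illustrative assumption.

```python
def percent_identity(row_a, row_b):
    """Percent of identical residues over columns without gaps in either row."""
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have the same length")
    # Keep only the columns where both sequences have a residue.
    pairs = [(a, b) for a, b in zip(row_a, row_b) if a != "-" and b != "-"]
    if not pairs:
        return 0.0
    matches = sum(a == b for a, b in pairs)
    return 100.0 * matches / len(pairs)


# Last alignment block only: rat (GENPEPT_1724118) versus MLH1_HUMAN.
rat = "RSHLLPPKHFTEDGNVLQLANLPDLCKVFERC--"
human = "RSHILPPKHFTEDGNILQLANLPDLYKVFERC--"
print(f"{percent_identity(rat, human):.1f}")  # prints 90.6
```

To compare two whole proteins, concatenate each sequence's rows across all blocks before calling the function.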


3. Run the BOXSHADE program to produce a color-coded plot of the alignment from question 2.


4. Draw a rooted phylogenetic tree for these proteins.
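
To illustrate how a rooted tree can be derived from pairwise distances, here is a minimal UPGMA sketch that returns the topology in Newick format. UPGMA is used here only because it is a simple rooted method; it is not necessarily the method the Workbench uses. The labels A to D and their distances are made-up illustrative values, not results computed from the alignment in this document.

```python
def upgma(names, dist):
    """Cluster by UPGMA and return the tree topology as a Newick string.

    dist maps frozenset({a, b}) to the distance between labels a and b.
    """
    sizes = {name: 1 for name in names}  # cluster label -> number of leaves
    d = dict(dist)
    while len(sizes) > 1:
        # Pick the closest pair of current clusters.
        pair = min(d, key=d.get)
        a, b = sorted(pair)
        merged = f"({a},{b})"
        size_a, size_b = sizes.pop(a), sizes.pop(b)
        # Distance from the merged cluster to each remaining cluster is the
        # size-weighted average of the two old distances.
        for other in sizes:
            da = d.pop(frozenset((a, other)))
            db = d.pop(frozenset((b, other)))
            d[frozenset((merged, other))] = (
                (size_a * da + size_b * db) / (size_a + size_b)
            )
        del d[pair]
        sizes[merged] = size_a + size_b
    return next(iter(sizes)) + ";"


# Hypothetical distances, for illustration only.
names = ["A", "B", "C", "D"]
dist = {
    frozenset(("A", "B")): 2.0,
    frozenset(("C", "D")): 4.0,
    frozenset(("A", "C")): 8.0,
    frozenset(("A", "D")): 8.0,
    frozenset(("B", "C")): 8.0,
    frozenset(("B", "D")): 8.0,
}
print(upgma(names, dist))  # prints ((A,B),(C,D));
```

In practice the distances could come from the alignment, for example 100 minus the percent identity of each pair, and the resulting Newick string can be loaded into a tree viewer.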


Fasta label (*)    Workbench label
MLH1_HUMAN         DNA MISMATCH REPAIR PROTEIN MLH1 (MUTL PROTEIN HOMOLOG 1) [Homo sapiens (Human)]
GENPEPT:7595954    Mus musculus MutL homolog 1 protein (MLH1) mRNA, complete cds.
GENPEPT:460627     Saccharomyces cerevisiae DNA mismatch repair (MLH1) gene, complete
GENPEPT:1724118    Rattus norvegicus mismatch repair protein (MLH1) mRNA, complete
GENPEPT:3192877    Drosophila melanogaster mutL homolog (Mlh1) gene, complete cds_
GENPEPT:3880333    Caenorhabditis elegans cosmid T28A8, complete sequence_