Assignment 6


1.
DNA MISMATCH REPAIR PROTEIN MLH1 (MUTL PROTEIN HOMOLOG 1) [Homo sapiens (Human)]

Sequence   MSFVAGVIRR LDETVVNRIA AGEVIQRPAN AIKEMIENCL DAKSTSIQVI
Structure  CCCCEEEEEE CCHHHHHHHH HCHHHHHHHH HHHHHHHHHC CCCCCCHHHH

Sequence   VKEGGLKLIQ IQDNGTGIRK EDLDIVCERF TTSKLQSFED LASISTYGFR
Structure  HHHCCCEEEE ECCCCCCCHH HHHHHHHCCC CCCCCCCCHH HHHHCCCCCC

Sequence   GEALASISHV AHVTITTKTA DGKCAYRASY SDGKLKAPPK PCAGNQGTQI
Structure  CCHHHHHHHE EEEEEEECCC CCCCEEEECC CCCCCCCCCC CCCCCCCCEE

Sequence   TVEDLFYNIA TRRKALKNPS EEYGKILEVV GRYSVHNAGI SFSVKKQGET
Structure  EEHHHHHHHH HHHHHHCCCC HHHHHEEEEE ECCCCCCCCE EEEECCCCCE

Sequence   VADVRTLPNA STVDNIRSIF GNAVSRELIE IGCEDKTLAF KMNGYISNAN
Structure  EEEEEECCCC CCCCCEEEEC CCCCCHHHHH HCCCHHHHHH CCCCCEECCC

Sequence   YSVKKCIFLL FINHRLVEST SLRKAIETVY AAYLPKNTHP FLYLSLEISP
Structure  CCCCCEEEEE ECCCCHHHHH HHHHHHHHHH HHCCCCCCCC EEEECCCCCC

Sequence   QNVDVNVHPT KHEVHFLHEE SILERVQQHI ESKLLGSNSS RMYFTQTLLP
Structure  CCCCEEECCC CCHHHHHHHH HHHHHHHHHH HHHHHCCCCC CEEEEEEECC

Sequence   GLAGPSGEMV KSTTSLTSSS TSGSSDKVYA HQMVRTDSRE QKLDAFLQPL
Structure  CCCCCCCCEE EEEEEEEEEC CCCCCCHHHH HHHHHHHHHH HHHHHHHCCC

Sequence   SKPLSSQPQA IVTEDKTDIS SGRARQQDEE MLELPAPAEV AAKNQSLEGD
Structure  CCCCCCCCCE EECCCCCCHH HHHHHHHHHH HHHCCCHHHH HHHHHCCCCC

Sequence   TTKGTSEMSE KRGPTSSNPR KRHREDSDVE MVEDDSRKEM TAACTPRRRI
Structure  CCCCCCHHHC CCCCCCCCCC CCCCCCCCHH HHHHHHHHHH HHHCCCCCEE

Sequence   INLTSVLSLQ EEINEQGHEV LREMLHNHSF VGCVNPQWAL AQHQTKLYLL
Structure  ECCCCHHHHH HHHHHHHHHH HHHHHCCCCE EEEECCCCHH HHHHHHHHHH

Sequence   NTTKLSEELF YQILIYDFAN FGVLRLSEPA PLFDLAMLAL DSPESGWTEE
Structure  HCCCCHHHHH HHHHHHCCCC CCEECCCCCC CHHHHHHHHC CCCCCCCCCC

Sequence   DGPKEGLAEY IVEFLKKKAE MLADYFSLEI DEEGNLIGLP LLIDNYVPPL
Structure  CCCCCCHHHH HHHHHHHHHH HHHHHHHHHH HHCCCCCCCC EEECCCCCCC

Sequence   EGLPIFILRL ATEVNWDEEK ECFESLSKEC AMFYSIRKQY ISEESTLSGQ
Structure  CCCCHHHHHH HHHHCHHHHH HCCCCCCCCC HHHHHCCCCC CCHHHHCCCC

Sequence   QSEVPGSIPN SWKWTVEHIV YKALRSHILP PKHFTEDGNI LQLANLPDLY
Structure  CCCCCCCCCC CCCEEEECHH HHHHHCCCCC CCCCCCCCHH HHHHCCCCCE

Sequence   KVFERC
Structure  EEEEEC

LEGEND: Alpha Helix = H, Beta Sheet = E, Random Coil = C
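
A prediction written in this H/E/C alphabet can be summarised as the fraction of residues in each state. The snippet below is a minimal sketch of that tally, not part of the prediction server's output. The function name ss_composition is hypothetical, and the example string is only the first twenty states of the prediction above.

```python
from collections import Counter


def ss_composition(structure: str) -> dict:
    """Return the fraction of H, E and C states in a structure string."""
    # Keep only the three state letters; spaces and line breaks are layout.
    states = [c for c in structure.upper() if c in "HEC"]
    if not states:
        raise ValueError("no H/E/C states found")
    counts = Counter(states)
    return {state: counts[state] / len(states) for state in "HEC"}


# First two ten-residue blocks of the structure line (illustrative input).
example = "CCCCEEEEEE CCHHHHHHHH"
print(ss_composition(example))
```

Pasting the full Structure rows into one string gives the composition for the whole protein.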

2.
Alignment of six proteins:
* - single, fully conserved residue
: - conservation of strong groups
. - conservation of weak groups
GENPEPT_7595954                   -----------------MAFVAGVIRRLDETVVNRIAAGEVIQRPANAIK       -----Rattus norvegicus mismatch repair protein (MLH1) mRNA, complete
GENPEPT_1724118                   -----------------MSFVAGVIRRLDETVVNRIAAGEVIQRPANAIK       -----Drosophila melanogaster mutL homolog (Mlh1) gene, complete cds.
MLH1_HUMAN                        -----------------MSFVAGVIRRLDETVVNRIAAGEVIQRPANAIK       -----Mus musculus  MutL homolog 1 protein (MLH1) mRNA, complete cds.
GENPEPT_3192877                   ---------------MAEYLQPGVIRKLDEVVVNRIAAGEIIQRPANALK       -----Saccharomyces cerevisiae  DNA mismatch repair (MLH1) gene, complete
GENPEPT_460627                    --------------------MSLRIKALDASVVNKIAAGEIIISPVNALK       -----DNA MISMATCH REPAIR PROTEIN MLH1 (MUTL PROTEIN HOMOLOG 1)[Homo sapiens (Human)] 
hypothetical_protein_T28A8.7      MWHCGYRTRNCDEFSKIEFSLMGLIQRLPQDVVNRMAAGEVLARPCNAIK       -----hypothetical protein T28A8.7 - Caenorhabditis elegans 
                                                          *: *   ***::****::  * **:*

GENPEPT_7595954                   EMIENCLDAKSTNIQVVVKEGGLKLIQIQDNGTGIRKEDLDIVCERFTTS
GENPEPT_1724118                   EMTENCLDAKSTNIQVIVREGGLKLIQIQDNGTGIRKEDLDIVCERFTTS
MLH1_HUMAN                        EMIENCLDAKSTSIQVIVKEGGLKLIQIQDNGTGIRKEDLDIVCERFTTS
GENPEPT_3192877                   ELLENSLDAQSTHIQVQVKAGGLKLLQIQDNGTGIRREDLAIVCERFTTS
GENPEPT_460627                    EMMENSIDANATMIDILVKEGGIKVLQITDNGSGINKADLPILCERFTTS
hypothetical_protein_T28A8.7      ELVENSLDAGATEIMVNMQNGGLKLLQVSDNGKGIEREDFALVCERFATS
                                  *: **.:** :* * : :: **:*::*: ***.**.: *: ::****:**

GENPEPT_7595954                   KLQTFEDLASISTYGFRGEALASISHVAHVTITTKTADGKCAYRASYSDG
GENPEPT_1724118                   KLQTFEDLAMISTYGFRGEALASISHVAHVTITTKTADGKCAYRASYSDG
MLH1_HUMAN                        KLQSFEDLASISTYGFRGEALASISHVAHVTITTKTADGKCAYRASYSDG
GENPEPT_3192877                   KLTRFEDLSQIATFGFRGEALASISHVAHLSIQTKTAKEKCGYKATYADG
GENPEPT_460627                    KLQKFEDLSQIQTYGFRGEALASISHVARVTVTTKVKEDRCAWRVSYAEG
hypothetical_protein_T28A8.7      KLQKFEDLMHMKTYGFRGEALASLSHVAKVNIVSKRADAKCAYQANFLDG
                                  **  ****  : *:*********:****::.: :*  . :*.::..: :*

GENPEPT_7595954                   KLQAPPKPCAGNQGTLITVEDLFYNIITRRKALKNPSEEYGKILEVVGRY
GENPEPT_1724118                   KLQAPPKPCAGNQGTLITVEDLFYNIITRKKALKNPSEEYGKILEVVGRY
MLH1_HUMAN                        KLKAPPKPCAGNQGTQITVEDLFYNIATRRKALKNPSEEYGKILEVVGRY
GENPEPT_3192877                   KLQGQPKPCAGNQGTIICIEDLFYNMPQRRQALRSPAEEFQRLSEVLARY
GENPEPT_460627                    KMLESPKPVAGKDGTTILVEDLFFNIPSRLRALRSHNDEYSKILDVVGRY
hypothetical_protein_T28A8.7      KMTADTKPAAGKNGTCITATDLFYNLPTRRNKMTTHGEEAKMVNDTLLRF
                                  *:   .** **::** *   ***:*:  * . : .  :*   : :.: *:

GENPEPT_7595954                   SIHNSGISFSVKKQGETVSDVRTLPNATTVDNIRSIFGNAVSRELIEVG-
GENPEPT_1724118                   SIHNSGISFSVKKQGETVSDVRTLPNATTVDNIRSIFGNAVSRELIEVG-
MLH1_HUMAN                        SVHNAGISFSVKKQGETVADVRTLPNASTVDNIRSIFGNAVSRELIEIG-
GENPEPT_3192877                   AVHNPRVGFTLRKQGDAQPALRTPVASSRSENIRIIYGAAISKELLEFS-
GENPEPT_460627                    AIHSKDIGFSCKKFGDSNYSLSVKPSYTVQDRIRTVFNKSVASNLITFHI
hypothetical_protein_T28A8.7      AIHRPDVSFALRQ--NQAGDFRTKGDGNFRDVVCNLLGRDVADTILPLS-
                                  ::*   :.*: ::  :    . .    .  : :  : .  ::  :: .  

GENPEPT_7595954                   CEDKTLAFK-MNGYISNANYSVKKCI----------FLLFINHRLVESAA
GENPEPT_1724118                   CEDKTLAFK-MNGYISNANYSVKKCI----------FLLFINHRLVESAA
MLH1_HUMAN                        CEDKTLAFK-MNGYISNANYSVKKCI----------FLLFINHRLVESTS
GENPEPT_3192877                   HRDEVYKFE-AECLITQVNYSAKKCQ----------MLLFINQRLVESTA
GENPEPT_460627                    SKVEDLNLESVDGKVCNLNFISKKSIS---------LIFFINNRLVTCDL
hypothetical_protein_T28A8.7      LNSTRLKFT-FTGHISKPIASATAAIAQNRKTSRSFFSVFINGRSVRCDI
                                   .     :      : :     . .           : .*** * * .  

GENPEPT_7595954                   LRKAIETVYAAYLPKNTHPFLYLSLEISPQNVDVNVHPTKHEVHFLHEES
GENPEPT_1724118                   LKKAIEAVYAAYLPKNTHPFLYLILEISPQNVDVNVHPTKHEVHFLHEES
MLH1_HUMAN                        LRKAIETVYAAYLPKNTHPFLYLSLEISPQNVDVNVHPTKHEVHFLHEES
GENPEPT_3192877                   LRTSVDSIYATYLPRGHHPFVYMSLTLPPQNLDVNVHPTKHEVHFLYQEE
GENPEPT_460627                    LRRALNSVYSNYLPKGFRPFIYLGIVIDPAAVDVNVHPTKREVRFLSQDE
hypothetical_protein_T28A8.7      LKHPIDEVLG--ARQLHAQFCALHLQIDETRIDVNVHPTKNSVIFLEKEE
                                  *: .:: : .    :    *  : : :    :********..* ** ::.

GENPEPT_7595954                   ILQRVQQHIESKLLGSNSSRMYFTQTLLPGLAG------PSGEAARPTTG
GENPEPT_1724118                   ILERVQQHIESKLLGSNSSRMYFTQTLLPGLAG------PSGEAVKSTTG
MLH1_HUMAN                        ILERVQQHIESKLLGSNSSRMYFTQTLLPGLAG------PSGEMVKSTTS
GENPEPT_3192877                   IVDSIKQQVEARLLGSNATRTFYKQLRLPGAP-----------------D
GENPEPT_460627                    IIEKIANQLHAELSAIDTSRTFKASSISTNKPESLIPFNDTIESDRNRKS
hypothetical_protein_T28A8.7      IIEEIRAYFEKVIGEIFGFEALDVEKPEEEQPD--------IENLVMIPM
                                  *:: :   ..  :      .    .      .                  

GENPEPT_7595954                   VASSSTSGSGDKVYAYQMVRTDSRDQKLDAFLQPVSSLVPSQPQDPAPVR
GENPEPT_1724118                   IASSSTSGSGDKVHAYQMVRTDSRDQKLDAFMQPVSRRLPSQPQD--PVP
MLH1_HUMAN                        LTSSSTSGSSDKVYAHQMVRTDSREQKLDAFLQPLSKPLSSQPQ--AIVT
GENPEPT_3192877                   LDETQLADKTQRIYPKEMVRTDSTEQKLDKFLAPLVK-------------
GENPEPT_460627                    LRQAQVVENSYTTANSQLRKAKRQENKLVRIDASQAKITSFLSSS--QQF
hypothetical_protein_T28A8.7      SQSLKSIEAIRKPDTKPEFKSSPSAWKSDKKRVDYMEVRTDAKERKIDEF
                                    . .              ::.    *                       

GENPEPT_7595954                   GARTEGSPERATREDEEMLALPAPAEAAAESENLERESLMETSDAAQKAA
GENPEPT_1724118                   GNRTEGSPEKAMQKDQEISELPAPMEAAADSASLERESVIGASEVVAPQR
MLH1_HUMAN                        EDKTDISSGRARQQDEEMLELPAPAEVAAKNQSLEGDTTKGTSEMSEKRG
GENPEPT_3192877                   ----------------------------------------------SDSG
GENPEPT_460627                    NFEGSSTKRQLSEPKVTNVSHSQEAEKLTLNESEQPRDANTINDNDLKDQ
hypothetical_protein_T28A8.7      VTRGGAVGPTTSNDDIFGGSGILKRARTEDSTGGEKEPEDLNTDFDDVSM


GENPEPT_7595954                   PTSSPGSSRKRHREDSDVEMVENASGKEMTAACYPRRRIINLTSVLSLQE
GENPEPT_1724118                   HPSSPGSSRKRHPEDSDVEMMENDSRKEMTAACYPRRRIINLTSVLSLQE
MLH1_HUMAN                        PTSS--NPRKRHREDSDVEMVEDDSRKEMTAACTPRRRIINLTSVLSLQE
GENPEPT_3192877                   VSSSSSQEASRLPEES------------FRVTAAKKSREVRLSSVLDMRK
GENPEPT_460627                    PKKKQKLGDYKVPSIADDEKNALPISKDGYIRVPKERVNVNLTSIKKLRE
hypothetical_protein_T28A8.7      VSLVSTADGRRLNESQD-----LGEDDDVDFEYGKTHREFHFESIEVLRK
                                            :  .                         ..: *:  :::

GENPEPT_7595954                   EISERCHETLREILRNHSFVGCVNPQW--ALAQHQTKLYLLNTTKLSEEL
GENPEPT_1724118                   EINDRGHETLREMLRNHTFVGCVNPQW--ALAQHQTKLYLLNTTKLSEEL
MLH1_HUMAN                        EINEQGHEVLREMLHNHSFVGCVNPQW--ALAQHQTKLYLLNTTKLSEEL
GENPEPT_3192877                   RVERQCSVQLRSTLKNLVYVGCVDERR--ALFQHETRLYMCNTRSFSEEL
GENPEPT_460627                    KVDDSIHRELTDIFANLNYVGVVDEERRLAAIQHDLKLFLIDYGSVCYEL
hypothetical_protein_T28A8.7      EIIANSSQSLREMFKTSTFVGSINVKQ--VLIQFGTSLYHLDFSTVLREF
                                  .:       * . : .  :** :: .   .  *.   *:  :  ..  *:

GENPEPT_7595954                   FYQILIYDFANFGVLRLSEPAPLFDLAMLALDSPESGWTEDDGPKEGLA-
GENPEPT_1724118                   FYQILIYDFANFGVLRLPEPAPLFDFAMLALDSPESGWTEEDGPKEGLA-
MLH1_HUMAN                        FYQILIYDFANFGVLRLSEPAPLFDLAMLALDSPESGWTEEDGPKEGLA-
GENPEPT_3192877                   FYQRMIYEFQNCSEITICPPLPLKELLILSLESRAAGWTPEDEDKAELA-
GENPEPT_460627                    FYQIGLTDFANFGKINLQSTNVSDDIVLYNLLSEFDELN-DDASK-----
hypothetical_protein_T28A8.7      FYQISVFSFGNYGSYRLDE-EPPAIIEILELLGELSTREPNYAAFEVFAN
                                  ***  : .* * .   :        : :  * .       :         

GENPEPT_7595954                   ----EYIVEFLKKKAEMLADYFSVEIDEEGN--------LIGLPLLIDSY
GENPEPT_1724118                   ----EYIVEFLKKKAKMLADYFSVEIDEEGN--------LIGLPLLIDSY
MLH1_HUMAN                        ----EYIVEFLKKKAEMLADYFSLEIDEEGN--------LIGLPLLIDNY
GENPEPT_3192877                   ----DGAADILLKKAPIMREYFGLRISEDGM--------LESLPSLLHQH
GENPEPT_460627                    ----EKIISKIWDMSSMLNEYYSIELVNDGLDNDLKSVKLKSLPLLLKGY
hypothetical_protein_T28A8.7      VENRFAAEKLLAEHADLLHDYFAIKLDQLENGR----LHITEIPSLVHYF
                                          . : . : :: :*:.:.: :           :  :* *:. .

GENPEPT_7595954                   VPPLEGLPIFILRLATEVNWDEEKECFESLSKECAMFYSIRKQYILEEST
GENPEPT_1724118                   VPPLEGLPIFILRLATEVNWDEE-ECFESLSKECAVFYSIRKQYILEESA
MLH1_HUMAN                        VPPLEGLPIFILRLATEVNWDEEKECFESLSKECAMFYSIRKQYISEEST
GENPEPT_3192877                   RPCVAHLPVYLLRLATEVDWEQETRCFETFCRETARFY------------
GENPEPT_460627                    IPSLVKLPFFIYRLGKEVDWEDEQECLDGILREIALLYIPDMVPKVDTLD
hypothetical_protein_T28A8.7      VPQLEKLPFLIATLVLNVDYDDEQNTFRTICRAIGDLFTLDTN-------
                                   * :  **. :  *  :*::::* . :  : :  . ::            

GENPEPT_7595954                   LSGQQSDMPGSTSKPWKWT--VEHIIYKAFRSHLLPPKHFTEDGNVLQLA
GENPEPT_1724118                   LSGQQSDMPGSPSKPWKWT--VEHIIYKAFRSHLLPPKHFTEDGNVLQLA
MLH1_HUMAN                        LSGQQSEVPGSIPNSWKWT--VEHIVYKALRSHILPPKHFTEDGNILQLA
GENPEPT_3192877                   --AQLDWREGATAVFSRWT--MEHVLFPAFKKYLLPPR---IKDQIYELT
GENPEPT_460627                    ASLSEDEKAQFINRKEHISSLLEHVLFPCIKRRFLAPRHILKD--VVEIA
hypothetical_protein_T28A8.7      --FITLDKKISAFSATPWKTLIKEVLMPLVKRKFIPPEHFKQAGVIRQLA
                                                    .  ::.::   .:  ::.*.       : :::

GENPEPT_7595954                   NLPDLYKVFERC--
GENPEPT_1724118                   NLPDLCKVFERC--
MLH1_HUMAN                        NLPDLYKVFERC--
GENPEPT_3192877                   NLPTLYKVFERC--
GENPEPT_460627                    NLPDLYKVFERC--
hypothetical_protein_T28A8.7      DSHDLYKVFERCGT
                                  :   * ******  
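
The consensus row under each block (*, :, .) can be recomputed from the aligned rows. The sketch below assumes the alignment uses ClustalW-style symbols. The strong and weak residue groups are the ones ClustalW documents, written here from memory, so they should be checked against the program's own documentation. The function name conservation_line is hypothetical.

```python
# Residue groups assumed to match ClustalW's "strong" and "weak" sets.
STRONG = ["STA", "NEQK", "NHQK", "NDEQ", "QHRK", "MILV", "MILF", "HY", "FYW"]
WEAK = ["CSA", "ATV", "SAG", "STNK", "STPA", "SGND",
        "SNDEQK", "NDEQHK", "NEQHRK", "FVLIM", "HFY"]


def conservation_line(rows):
    """Build the consensus-symbol row for equally long aligned sequences."""
    symbols = []
    for column in zip(*rows):
        residues = set(column)
        if "-" in residues:
            symbols.append(" ")   # any gap: column is not scored
        elif len(residues) == 1:
            symbols.append("*")   # single, fully conserved residue
        elif any(residues <= set(group) for group in STRONG):
            symbols.append(":")   # all residues fall in one strong group
        elif any(residues <= set(group) for group in WEAK):
            symbols.append(".")   # all residues fall in one weak group
        else:
            symbols.append(" ")
    return "".join(symbols)


# First ten columns of the second alignment block, in the same row order.
block = [
    "EMIENCLDAK",
    "EMTENCLDAK",
    "EMIENCLDAK",
    "ELLENSLDAQ",
    "EMMENSIDAN",
    "ELVENSLDAG",
]
print(repr(conservation_line(block)))
```

For these ten columns the result agrees with the consensus row printed under that block.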




3.


GENPEPT_7595954 -----Rattus norvegicus
GENPEPT_1724118 -----Drosophila melanogaster
MLH1_HUMAN       -----Mus musculus
GENPEPT_3192877 -----Saccharomyces cerevisiae
GENPEPT_460627   -----[Homo sapiens (Human)]
hypothetical_protein_T28A8.7 -----Caenorhabditis elegans


4.
GENPEPT_7595954 -----Rattus norvegicus
GENPEPT_1724118 -----Drosophila melanogaster
MLH1_HUMAN       -----Mus musculus
GENPEPT_3192877 -----Saccharomyces cerevisiae
GENPEPT_460627   -----[Homo sapiens (Human)]
hypothetical_protein_T28A8.7 -----Caenorhabditis elegans