Compare human colon cancer gene MLH1 with other genes.
- To use ORF Finder to translate a DNA sequence into protein sequences in all reading frames.
- To use blastn, blastp, CD-Search and BLAST 2 Sequences for searching and comparison.


1. Compare the MLH1 (answer to assignment 2.6) and mutS (answer to 2.7) sequences.

A: No significant alignment.

2. Translate the above two gene sequences to protein sequences.

A:
MLH1

>lcl|Sequence 1 ORF:22..2292 Frame +1
MSFVAGVIRRLDETVVNRIAAGEVIQRPANAIKEMIENCLDAKSTSIQVIVKEGGLKLIQIQDNGTGIRK
EDLDIVCERFTTSKLQSFEDLASISTYGFRGEALASISHVAHVTITTKTADGKCAYRASYSDGKLKAPPK
PCAGNQGTQITVEDLFYNIATRRKALKNPSEEYGKILEVVGRYSVHNAGISFSVKKQGETVADVRTLPNA
STVDNIRSIFGNAVSRELIEIGCEDKTLAFKMNGYISNANYSVKKCIFLLFINHRLVESTSLRKAIETVY
AAYLPKNTHPFLYLSLEISPQNVDVNVHPTKHEVHFLHEESILERVQQHIESKLLGSNSSRMYFTQTLLP
GLAGPSGEMVKSTTSLTSSSTSGSSDKVYAHQMVRTDSREQKLDAFLQPLSKPLSSQPQAIVTEDKTDIS
SGRARQQDEEMLELPAPAEVAAKNQSLEGDTTKGTSEMSEKRGPTSSNPRKRHREDSDVEMVEDDSRKEM
TAACTPRRRIINLTSVLSLQEEINEQGHEVLREMLHNHSFVGCVNPQWALAQHQTKLYLLNTTKLSEELF
YQILIYDFANFGVLRLSEPAPLFDLAMLALDSPESGWTEEDGPKEGLAEYIVEFLKKKAEMLADYFSLEI
DEEGNLIGLPLLIDNYVPPLEGLPIFILRLATEVNWDEEKECFESLSKECAMFYSIRKQYISEESTLSGQ
QSEVPGSIPNSWKWTVEHIVYKALRSHILPPKHFTEDGNILQLANLPDLYKVFERC*


mutS

>lcl|Sequence 1 ORF:1..2562 Frame +1
MSAIENFDAHTPMMQQYLRLKAQHPEILLFYRMGDFYELFYDDAKRASQLLDISLTKRGASAGEPIPMAG
IPYHAVENYLAKLVNQGESVAICEQIGDPATSKGPVERKVVRIVTPGTISDEALLQERQDNLLAAIWQDS
KGFGYATLDISSGRFRLSEPADRETMAAELQRTNPAELLYAEDFAEMSLIEGRRGLRRRPLWEFEIDTAR
QQLNLQFGTRDLVGFGVENAPRGLCAAGCLLQYAKDTQRTTLPHIRSITMEREQDSIIMDAATRRNLEIT
QNLAGGAENTLASVLDCTVTPMGSRMLKRWLHMPVRDTRVLLERQQTIGALQDFTAGLQPVLRQVGDLER
ILARLALRTARPRDLARMRHAFQQLPELRAQLETVDSAPVQALREKMGEFAELRDLLERAIIDTPPVLVR
DGGVIASGYNEELDEWRALADGATDYLERLEVRERERTGLDTLKVGFNAVHGYYIQISRGQSHLAPINYM
RRQTLKNAERYIIPELKEYEDKVLTSKGKALALEKQLYEELFDLLLPHLEALQQSASALAELDVLVNLAE
RAYTLNYTCPTFIDKPGIRITEGRHPVVEQVLNEPFIANPLNLSPQRRMLIITGPNMGGKSTYMRQTALI
ALMAYIGSYVPAQKVEIGPIDRIFTRVGAADDLASGRSTFMVEMTETANILHNATEYSLVLMDEIGRGTS
TYDGLSLAWACAENLANKIKALTLFATHYFELTQLPEKMEGVANVHLDALEHGDTIAFMHSVQDGAASKS
YGLAVAALAGVPKEVIKRARQKLRELESISPNAAATQVDGTQMSLLSVPEETSPAVEALENLDPDSLTPR
QALEWIYRLKSLV*
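
The translation step in question 2 can be approximated in a few lines of Python. This is only a minimal sketch, assuming the standard genetic code and an input made of A/C/G/T; it is not ORF Finder's own implementation. The function names and the short test sequence are hypothetical and are not taken from the assignment data.

```python
from itertools import product

# Standard genetic code, built in the conventional T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(dna):
    """Return the reverse complement of a DNA string."""
    return dna.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna):
    """Translate DNA codon by codon; unknown codons become 'X'."""
    dna = dna.upper()
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X")
                   for i in range(0, len(dna) - 2, 3))


def six_frame_translations(dna):
    """Map frame labels (+1..+3, -1..-3) to their translations."""
    rc = reverse_complement(dna)
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(dna[offset:])
        frames[f"-{offset + 1}"] = translate(rc[offset:])
    return frames


def longest_orf(dna):
    """Return (frame, protein) for the longest Met-to-stop stretch."""
    best = ("", "")
    for frame, protein in six_frame_translations(dna).items():
        # Only segments followed by a stop codon count as complete ORFs.
        for segment in protein.split("*")[:-1]:
            start = segment.find("M")
            if start != -1 and len(segment) - start > len(best[1]):
                best = (frame, segment[start:])
    return best


# Hypothetical toy sequence: ATG GCT TAA sits in frame +3.
toy = "CCATGGCTTAAGG"
frame, protein = longest_orf(toy)
```

Unlike the ORF Finder output above, the protein returned here omits the trailing `*` that marks the stop codon.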


3. Perform a protein sequence homology search for MLH1 in GenBank. Give the 10 highest hits.

A:
                                                              Score     E
Sequences producing significant alignments:                   (bits)  Value

gi|13878583|sp|Q9JK91|MLH1_MOUSE DNA MISMATCH REPAIR PROTEI... 1292 0.0
gi|13591989|ref|NP_112315.1| mismatch repair protein [Rattu... 1289 0.0
gi|4557757|ref|NP_000240.1| mutL homolog 1; mutL (E. coli) ... 1467 0.0
gi|466462|gb|AAA17374.1| (U07418) human homolog of E. coli ... 1466 0.0
gi|604369|gb|AAA85687.1| (U17857) hMLH1 gene product [Homo ... 1453 0.0
gi|12835158|dbj|BAB23172.1| (AK004105) putative [Mus musculus] 753 0.0
gi|13543339|gb|AAH05833.1|AAH05833 (BC005833) Similar to mu... 731 0.0
gi|7304079|gb|AAF59117.1| (AE003838) Mlh1 gene product [Dro... 615 e-175
gi|3192877|gb|AAC19117.1| (AF068257) mutL homolog [Drosophi... 608 e-173
gi|460627|gb|AAA16835.1| (U07187) Mlh1p [Saccharomyces cere... 471 e-132
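
Hit lines like the ones above can be split into a description, a bit score and an E value for further processing. The following is a rough sketch, assuming the last two whitespace-separated fields are always the score and the E value, and that an E value printed as `e-175` means `1e-175`. The function name `parse_hit` is hypothetical.

```python
def parse_hit(line):
    """Split one BLAST summary line into (description, score, e_value)."""
    description, score, evalue = line.rsplit(None, 2)
    # Older BLAST reports print small E values without a leading digit.
    if evalue.startswith("e"):
        evalue = "1" + evalue
    return description, float(score), float(evalue)


lines = [
    "gi|7304079|gb|AAF59117.1| (AE003838) Mlh1 gene product [Dro... 615 e-175",
    "gi|4557757|ref|NP_000240.1| mutL homolog 1; mutL (E. coli) ... 1467 0.0",
]
hits = [parse_hit(line) for line in lines]
# Rank by bit score, highest first.
ranked = sorted(hits, key=lambda hit: hit[1], reverse=True)
```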

4. Compare human MLH1 protein with MLH1 in M. musculus, R. norvegicus and D. melanogaster. Give the pairwise alignment and the percent sequence similarity.

A:
Human MLH1 protein compared with:

MLH1 in M. musculus: Identities = 651/760 (85%), Positives = 693/760 (90%), Gaps = 4/760 (0%)
MLH1 in R. norvegicus: Identities = 639/758 (84%), Positives = 684/758 (89%), Gaps = 3/758 (0%)
MLH1 in D. melanogaster: Identities = 335/751 (44%), Positives = 453/751 (59%), Gaps = 94/751 (12%)
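
Each percentage in these summaries is a count divided by the alignment length. A small sketch of that arithmetic, assuming the `Name = count/length` layout shown above; the helper names are hypothetical. BLAST prints whole-number percentages, so a value recomputed from the counts may differ from the printed figure by about a point.

```python
import re

# Matches e.g. "Identities = 651/760" and captures name, count, length.
PAIR = re.compile(r"(Identities|Positives|Gaps)\s*=\s*(\d+)/(\d+)")


def parse_blast_summary(line):
    """Return {'identities': (count, length), ...} from a summary line."""
    return {name.lower(): (int(count), int(length))
            for name, count, length in PAIR.findall(line)}


def percent(count, length):
    """Percentage of alignment columns in a given category."""
    return 100.0 * count / length


line = "Identities = 651/760 (85%), Positives = 693/760 (90%), Gaps = 4/760 (0%)"
stats = parse_blast_summary(line)
identity_pct = percent(*stats["identities"])
```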


5. Search for the conserved domain (CD) of MLH1. Give the position of the CD, the name of the CD and the Pfam ID number.

A:
position: amino acids 147-179
name: DNA_mis_repair, DNA mismatch repair protein. Also known as the mutL/hexB/PMS1 pfam family.
ID: pfam01119
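
To pull the domain residues out of a protein sequence, the reported position has to be converted to Python's 0-based, end-exclusive slicing. A minimal sketch, assuming the reported coordinates are 1-based and inclusive; the function name and the toy sequence are hypothetical.

```python
def domain_slice(protein, start, end):
    """Return residues start..end (1-based, inclusive) of a protein."""
    if not 1 <= start <= end <= len(protein):
        raise ValueError("domain coordinates fall outside the sequence")
    return protein[start - 1:end]


# Toy example: residues 3-5 of a 10-residue placeholder sequence.
toy_protein = "ACDEFGHIKL"
fragment = domain_slice(toy_protein, 3, 5)

# Under the same assumption, a domain at 147-179 spans this many residues.
domain_length = 179 - 147 + 1
```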

6. Show a multiple alignment of the MLH1 conserved domain with the 5 sequences from the top of the CD alignment.

A: You may go here.