HOMEWORK#7
Retrieving DNA sequence / GenBank
- due on 5/12
1. Sho-Hua cloned a gene in the lab. Part of its DNA sequence is listed here:
1 tttgaaagac cccacccgta ggtggcaagc tagcttaagt aacgccactt tgcaaggcat
61 ggaaaaatac ataactgaga ataggaaagt tcagatcaag gtcaggaaca aagaaacagc
121 tgaataccaa acaggatatc tgtggtaagc ggttcctgcc ccggctcagg gccaagaaca
181 gatgagacag ctgagtgatg ggccaaacag gatatctgtg gtaagcagtt cctgccccgg
241 ctcggggcca agaacagatg gtccccagat gcggtccagc cctcagcagt ttctagtgaa
301 tcatcagatg tttccagggt gccccaagga cctgaaaatg accctgtacc ttatttgaac
361 taaccaatca gttcgcttct cgcttctgtt cgcgcgcttc cgctctccga gctcaataaa
421 agagcccaca acccctcact cggcgcgcca gtcttccgat agactgcgtc gcccgggtac
481 ccgtattccc aataaagcct cttgctgttt gcatccgaat cgtggtctcg ctgttccttg
541 ggagggtctc ctctgagtga ttgactaccc acgacggggg tctttcattt gggggctcgt
601 ccgggatttg gagacccctg cccagggacc accgacccac caccgggagg taagctggcc
661 agcaacttat ctgtgtctgt ccgattgtct agtgtctatg tttgatgtta tgcgcctgcg
721 tctgtactag ttagctaact agctctgtat ctggcggacc cgtggtggaa ctgacgagtt
781 ctgaacaccc ggccgcaacc ctgggagacg tcccagggac tttgggggcc gtttttgtgg
841 cccgacctga ggaagggagt cgatgtggaa tccgaccccg tcaggatatg tggttctggt
901 aggagacgag aacctaaaac agttcccgcc tccgtctgaa tttttgcttt cggtttggaa
961 ccgaagccgc gcgtcttgtc tgctgcagca tcgttctgtg ttgtctctgt ctgactgtgt
1021 ttctgtattt gtctgaaaat tagggccaga ctgttaccac tcccttaagt ttgaccttag
1081 gtcactggaa agatgtcgag cggatcgctc acaaccagtc ggtagatgtc aagaagagac
1141 gttgggttac cttctgctct gcagaatggc caacctttaa cgtcggatgg ccgcgagacg
1201 gcacctttaa ccgagacctc atcacccagg ttaagatcaa ggtcttttca cctggcccgc
1261 atggacaccc agaccaggtc ccctacatcg tgacctggga agccttggct tttgaccccc
1321 ctccctgggt caagcccttt gtacacccta agcctccgcc tcctcttcct ccatccgccc
1381 cgtctctccc ccttgaacct cctcgttcga ccccgcctcg atcctccctt tatccagccc
1441 tcactccttc tctaggcggg aattcgttag cttggtaagt gaccagctac agtcggaaac
1501 catcagcaag caggtatgta ctctccaggg tgggcctggc ttccccagtc aagactccag
1561 ggatttgagg gacgctgtgg gctcttctct tacatgtacc ttttgctagc ctcaaccctg
1621 actatcttcc aggtcattgt tccaacatgg ccctgtggat cgacaggatg caactcctgt
1681 cttgcattgc actaagtctt gcacttgtca caaacagtgc acctacttca agttctacaa
1741 agaaaacaca gctgcaactg gagcatttac tgctggattt acagatgatt ttgaatggaa
1801 ttaataatta caagaatccc aaactcaccc gcatgctcac atttaagttt tacatgccca
1861 agaaggccac agaactgaaa catctgcagt gtctagaaga agaactcaaa cctctggagg
1921 aagtgctaaa tttagctcaa agcaaaaact ttcacttaag gcctagggac ttaatcagca
1981 atatcaacgt aatagttctc gagctaaagg gatctgaaac aacattcatg tgtgaatatg
2041 ctgatgagac agccaccatt gtggaatttc tgaacagatg gattaccttt tgtcaaagca
2101 tcatctcaac actaacttga taattaagtg cttcccactt aaaacatatc aggatccgct
2161 gtggaatgtg tgtcagttag ggtgtggaaa gtccccaggc tccccagcag gcagaagtat
2221 gcaaagcatg catctcaatt agtcagcaac caggtgtgga aagtccccag gctccccagc
2281 aggcagaagt atgcaaagca tgcatctcaa ttagtcagca accatagtcc cgcccctaac
2341 tccgcccatc ccgcccctaa ctccgcccag ttccgcccat tctccgcccc atggctgact
2401 aatttttttt atttatgcag aggccgaggc cgcctcggcc tctgagctat tccagaagta
2461 gtgaggaggc ttttttggag gcctaggctt ttgcaaaaag cttgggctgc aggtcgaggc
2521 ggatctgatc aagagacagg atgaggatcg tttcgcatga ttgaacaaga tggattgcac
2581 gcaggttctc cggccgcttg ggtggagagg ctattcggct atgactgggc acaacagaca
2641 atcggctgct ctgatgccgc cgtgttccgg ctgtcagcgc aggggcgccc ggttcttttt
2701 gtcaagaccg acctgtccgg tgccctgaat gaactgcagg acgaggcagc gcggctatcg
2761 tggctggcca cgacgggcgt tccttgcgca gctgtgctcg acgttgtcac tgaagcggga
2821 agggactggc tgctattggg cgaagtgccg gggcaggatc tcctgtcatc tcaccttgct
2881 cctgccgaga aagtatccat catggctgat gcaatgcggc ggctgcatac gcttgatccg
Please help him identify the gene and its possible function.
2. How many nucleotide and protein sequences of Lycopersicon esculentum are known?
Answer:
<1>
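The listing in question 1 is in GenBank layout, with a position number and space-separated blocks of ten bases on each line, so it has to be reduced to bare sequence before it can be pasted into a search form. A minimal sketch of that cleanup follows; the function name `clean_genbank_sequence` and the `>query` FASTA header are illustrative assumptions, and only the first two lines of the sequence are included to keep the example short.

```python
import re

def clean_genbank_sequence(text: str) -> str:
    """Remove position numbers and whitespace, keeping only base letters."""
    # Anything that is not a, c, g, t or n (digits, spaces, newlines) is dropped.
    return re.sub(r"[^acgtnACGTN]", "", text).lower()

# First two lines of the query from question 1 (positions 1-120).
fragment = """
1 tttgaaagac cccacccgta ggtggcaagc tagcttaagt aacgccactt tgcaaggcat
61 ggaaaaatac ataactgaga ataggaaagt tcagatcaag gtcaggaaca aagaaacagc
"""

seq = clean_genbank_sequence(fragment)
print(len(seq))              # number of bases kept
print(f">query\n{seq}")      # FASTA-style text ready to paste
```

The same function can be applied to the whole listing; the resulting length should equal the last position number plus the bases on the final line.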
*gb|J02263|MLM124
Moloney murine sarcoma virus clone 124 (proviral), complete genome.
Length = 5833
Plus Strand HSPs:
Score = 7183 (1984.8 bits), Expect = 0.0, P = 0.0
Identities = 1447/1460 (99%), Positives = 1447/1460 (99%), Strand = Plus / Plus
*emb|V01185|REMSVX
Genome of murine sarcoma virus (strain 124).
Contains genes for the gag polyprotein, which is post-translationally cleaved into the core proteins p15, p12, p30 and p10, for an unknown protein (gene X), and the transforming g...
Length = 5833
Plus Strand HSPs:
Score = 7165 (1979.8 bits), Expect = 0.0, P = 0.0
Identities = 1445/1460 (98%), Positives = 1445/1460 (98%), Strand = Plus / Plus
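The two hits differ by only two matched bases over the same 1460-base alignment, so it can help to recompute the identity percentages from the raw counts. The sketch below does that arithmetic; the helper name `percent_identity` is an assumption for illustration. The whole-number percentages in the report are consistent with truncating the exact value rather than rounding it.

```python
def percent_identity(matches: int, aligned: int) -> float:
    """Identity as a percentage of aligned positions."""
    return 100.0 * matches / aligned

# (identical positions, alignment length) taken from the two hits above.
hits = {
    "J02263": (1447, 1460),
    "V01185": (1445, 1460),
}

for accession, (matches, aligned) in hits.items():
    pct = percent_identity(matches, aligned)
    print(f"{accession}: {pct:.2f}% exact, {int(pct)}% truncated")
```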
Moloney murine sarcoma virus (M-MuSV) was obtained from sarcomas induced in BALB/c mice after passage of high doses of Moloney murine leukemia virus (M-MuLV; Moloney, 1966). M-MuSV is capable of transforming fibroblasts in vitro and inducing tumors in animals, but is unable to replicate. From the stocks of M-MuSV, J. Ball and colleagues isolated a strain, M-MuSV clone 124, which has an overabundance of the transforming virus relative to the helper M-MuLV (Ball et al., 1973). Retroviruses can serve as excellent eukaryotic vectors, and hence a detailed knowledge of the nucleotide sequence of pMSV-1L should be very beneficial for construction of appropriate plasmid vectors.
source : 1..5833
/organism="Murine sarcoma virus"
/strain="124."
misc_feature : 2..590
/note="5' terminal repeat"
CDS : 1042..2658
/note="gag polyprotein"
CDS : 3875..4999
/note="unknown reading frame (gene x)"
misc_feature : 5245..5833
/note="3' terminal repeat"
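The feature coordinates above can be sanity-checked with simple arithmetic: a coding region should span a multiple of three bases, and the two terminal repeats would be expected to have the same length. The following is a small sketch of that check; the dictionary labels and the helper `span_length` are illustrative names, and the coordinates are copied from the feature table.

```python
def span_length(start: int, end: int) -> int:
    """Length of an inclusive, 1-based coordinate range."""
    return end - start + 1

# Coordinates copied from the feature table of the top hit.
features = {
    "5' terminal repeat": (2, 590),
    "gag polyprotein CDS": (1042, 2658),
    "gene x CDS": (3875, 4999),
    "3' terminal repeat": (5245, 5833),
}

for name, (start, end) in features.items():
    n = span_length(start, end)
    print(f"{name}: {n} nt, {n // 3} codons, remainder {n % 3}")
```

Both CDS spans divide evenly by three, and both repeats come out at the same length, which is what the annotation implies.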
<2>
*Lycopersicon esculentum class II small heat shock protein Le-HSP17.6 mRNA, complete cds gi|1773290|gb|U72396|LEU72396 [1773290]
1 tacggctgcg agaagacgac agaaggggac tgcaattaca aatcaaacca aaattgacaa
61 atttcacgca caaaatcaca atatccaaaa atttctcaat actgaaaatg gatttgaggt
121 tgttgggtat cgataacaca ccactcttcc acactctcca ccatatgatg gaagctgccg
181 gtgaagattc cgacaagtct gtcaatgcac catcaaggaa ctatgttcgt gatgctaagg
241 ccatggctgc tacaccagcg gatgtgaagg agtatcctaa ttcgtatgtt tttgttgtgg
301 atatgccagg gttgaaatct ggagatatca aagtgcaggt ggaagaagac aatgtgctgt
361 tgattagtgg tgaaaggaag agggaagaag agaaagaagg tgcaaagttt attaggatgg
421 agagaagggt tgggaaattc atgaggaagt ttagtctgcc agagaatgcg aatactgatg
481 caatttctgc agtttgtcaa gatggagttc tgactgttac tgttcagaaa ttgcctcctc
541 ctgagccaaa gaaacccaaa acaattgagg tgaaagttgc ttgaagttat ggactctgtt
601 ttgatggttt gtggtatgat gtagtagaaa taaagttgta ggagtagtga acttttcctt
661 tcatctttct gctatgtttt cacgtctgtt tgaatgttac aatagccatg ggtattgttt
721 gttttgatgc caaaaaaa
1 MDLRLLGIDNTPLFHTLHHMMEAAGEDSDKSVNAPSRNYV
41 RDAKAMAATPADVKEYPNSYVFVVDMPGLKSGDIKVQVEE
81 DNVLLISGERKREEEKEGAKFIRMERRVGKFMRKFSLPEN
121 ANTDAISAVCQDGVLTVTVQKLPPPEPKKPKTIEVKVA
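The protein listing can be cross-checked against the mRNA by translating from the first ATG with the standard genetic code. The sketch below does this for the first 180 bases of U72396 only; the function name `translate_from_first_atg` is an assumption for illustration, and treating the first ATG as the start codon is a simplification that happens to hold for this record.

```python
from itertools import product

# Standard genetic code, codons ordered with bases t, c, a, g.
BASES = "tcag"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def translate_from_first_atg(seq: str) -> str:
    """Translate from the first ATG until a stop codon or the end of seq."""
    seq = seq.lower()
    start = seq.find("atg")
    if start < 0:
        return ""
    protein = []
    for i in range(start, len(seq) - 2, 3):
        aa = CODON_TABLE.get(seq[i:i + 3], "X")  # X marks an unknown codon
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)

# Positions 1-180 of the mRNA above, with numbering and spaces removed.
mrna_start = (
    "tacggctgcgagaagacgacagaaggggactgcaattacaaatcaaaccaaaattgacaa"
    "atttcacgcacaaaatcacaatatccaaaaatttctcaatactgaaaatggatttgaggt"
    "tgttgggtatcgataacacaccactcttccacactctccaccatatgatggaagctgccg"
)

print(translate_from_first_atg(mrna_start))  # MDLRLLGIDNTPLFHTLHHMMEAA
```

The first ATG lies at positions 108-110, and the 24 residues obtained from this fragment agree with the start of the protein listing.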