David, an undergraduate student in the Department of Life Science, sequenced a gene from an organism that he found near Cheng-Kung Lake. Here is the sequence:
GAAAAAAATAAAGAAAATGAACAAACCCCAAACTATTTATGAAAAGCTCGGAGGCGAAAATGCCATGAAG GCTGCCGTCCCTCTCTTCTACAAGAAGGTCTTAGCTGATGAAAGAGTCAAGCATTTCTTCAAGAACACCG ACATGGATCACCAAACCAAGCAATAAACTGACTTCCTCACCATGCTCTTAGGTGGTCCCAACCATTACAA GGGTAAAAATATGACTGAAGCTCACAAGGGTATGAACTTGCAAAACTTGCACTTTGATGCCATCATTGAA AACCTTGCTGCTACCCTTAAGGAGCTCGGTGTCACCGATGCTGTTATTAACGAGGCTGCTAAGGTCATCG AACACACCCGTAAGGATATGCTCGGCAAGTGAGATAGCTGCTGTTGCTGTTTATATTCTACTATTATTAA TTACTACTTAACACATCATCAAATAAATAGTAATTCTACTCAATTAAACTTGTCAGATTTCAATAAAAAT TATTTACTGTAATGAGAGTATTTATTGTGTATTGTTATGTATCGTTTATTGAAGATGATGATCAAGACAA ATCCCATGGTACCCGATCCTCGAATTC
1. Please help him identify this gene (name, accession number, authors, ...).
NAME: Tetrahymena pyriformis mRNA for hemoglobin
ACCESSION: D13920
AUTHORS: Takagi,T., Iwaasa,H., Yuasa,H., Shikama,K., Takemasa,T. and Watanabe,Y.
2. Which organism does this gene belong to?
ORGANISM: Tetrahymena pyriformis
Eukaryota; Alveolata; Ciliophora; Oligohymenophorea; Hymenostomatida; Tetrahymenina; Tetrahymena.
3. How many nucleotide sequences, protein sequences, and structures are known for this organism?
95 nucleotide sequences and 231 protein sequences are known for this organism, but no structures.
4. Run a BLAST search for this gene. List the top 10 most similar sequences.
Length = 494 Score = 50.1 bits (25), Expect = 0.002 Identities = 46/53 (86%) Strand = Plus / Plus
Length = 1384 Score = 48.1 bits (24), Expect = 0.010 Identities = 24/24 (100%) Strand = Plus / Minus
Length = 205035 Score = 48.1 bits (24), Expect = 0.010 Identities = 30/32 (93%) Strand = Plus / Plus
Length = 2087 Score = 46.1 bits (23), Expect = 0.039 Identities = 23/23 (100%) Strand = Plus / Plus
5) Mostuea brunonis chloroplast ndhF gene
Length = 2202 Score = 46.1 bits (23), Expect = 0.039 Identities = 23/23 (100%) Strand = Plus / Plus
Length = 2141 Score = 46.1 bits (23), Expect = 0.039 Identities = 23/23 (100%) Strand = Plus / Plus
Length = 2086 Score = 46.1 bits (23), Expect = 0.039 Identities = 23/23 (100%) Strand = Plus / Plus
Length = 2197 Score = 44.1 bits (22), Expect = 0.15 Identities = 22/22 (100%) Strand = Plus / Plus
9) Arabidopsis thaliana chromosome 1 BAC F28L5 genomic sequence, complete sequence
Length = 48404 Score = 34.2 bits (17), Expect(2) = 0.22 Identities = 17/17 (100%) Strand = Plus / Plus
10) Drosophila melanogaster genomic scaffold 142000013386035 section 67 of 105, complete sequence
Length = 233747 Score = 30.2 bits (15), Expect(5) = 0.24 Identities = 15/15 (100%) Strand = Plus / Plus
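The hit summaries above all follow the same layout (subject length, bit score with the raw score in parentheses, E-value, identities, strand), so they can be compared programmatically. The following is a minimal sketch in Python, assuming the one-line layout shown in this answer. The names `HIT_RE` and `parse_hit` are made up for this illustration and are not part of any BLAST tool.

```python
import re

# Pattern for one hit summary line as printed above.
# "Expect(2)" style labels (several HSPs grouped together) are also accepted.
HIT_RE = re.compile(
    r"Length = (?P<length>\d+)\s+"
    r"Score = (?P<bits>[\d.]+) bits \((?P<raw>\d+)\),\s+"
    r"Expect(?:\(\d+\))? = (?P<evalue>[\d.eE+-]+)\s+"
    r"Identities = (?P<ident>\d+)/(?P<aligned>\d+) \(\d+%\)\s+"
    r"Strand = (?P<query_strand>\w+) / (?P<subject_strand>\w+)"
)


def parse_hit(line):
    """Turn one hit summary line into a dictionary of typed fields."""
    match = HIT_RE.search(line)
    if match is None:
        raise ValueError("not a hit summary line: " + line)
    fields = match.groupdict()
    return {
        "length": int(fields["length"]),        # subject sequence length
        "bit_score": float(fields["bits"]),     # normalized score
        "raw_score": int(fields["raw"]),        # raw alignment score
        "evalue": float(fields["evalue"]),      # expected chance hits
        "identities": int(fields["ident"]),     # matching positions
        "aligned": int(fields["aligned"]),      # alignment length
        "strand": (fields["query_strand"], fields["subject_strand"]),
    }


if __name__ == "__main__":
    best = parse_hit(
        "Length = 494 Score = 50.1 bits (25), Expect = 0.002 "
        "Identities = 46/53 (86%) Strand = Plus / Plus"
    )
    print(best["bit_score"], best["evalue"], best["identities"], best["aligned"])
```

Sorting the parsed hits by `evalue` (ascending) or `bit_score` (descending) gives the same order as the list above. Note that most of these alignments cover only 15 to 53 bases, so they are short local matches rather than matches to the whole gene.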
5. Use ORF Finder to translate this gene. Show the correct protein sequence.
MNKPQTIYEKLGGENAMKAAVPLFYKKVLADERVKHFFKNTDMDHQTKQQTDFLTMLLGGPN
HYKGKNMTEAHKGMNLQNLHFDAIIENLAATLKELGVTDAVINEAAKVIEHTRKDMLGK
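The word "correct" matters here because Tetrahymena is a ciliate. In the ciliate nuclear genetic code (NCBI translation table 6), TAA and TAG encode glutamine (Q) and only TGA is a stop codon, so the in-frame TAA in the sequence (...AAG CAA TAA ACT GAC...) is read as Q, which gives the "KQQTD" stretch in the protein above. With the standard code, translation would stop at that TAA. The sketch below illustrates the difference on that fragment. It is a minimal Python illustration and not how ORF Finder works internally; the names `build_table` and `translate` are assumptions made for this example.

```python
from itertools import product

BASES = "TCAG"  # codon order used by the NCBI genetic code tables

# Amino acids for the 64 codons in TCAG order; '*' marks a stop codon.
STANDARD_AA = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Table 6 (ciliate nuclear): identical except that TAA and TAG encode Q.
CILIATE_AA = "FFLLSSSSYYQQCC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"


def build_table(amino_acids):
    """Map each of the 64 codons to its amino acid letter."""
    codons = ("".join(triplet) for triplet in product(BASES, repeat=3))
    return dict(zip(codons, amino_acids))


STANDARD = build_table(STANDARD_AA)
CILIATE = build_table(CILIATE_AA)


def translate(dna, table):
    """Translate from the first base, codon by codon, until a stop codon."""
    dna = "".join(dna.split()).upper()  # drop spaces and line breaks
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = table[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


if __name__ == "__main__":
    fragment = "AAGCAATAAACTGAC"  # in-frame fragment taken from the sequence above
    print(translate(fragment, STANDARD))  # KQ
    print(translate(fragment, CILIATE))   # KQQTD
```

In ORF Finder itself, the equivalent step is to choose the ciliate nuclear code (genetic code 6) instead of the standard code before searching for open reading frames.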