HOMEWORK #4

          1. Please find the genes for GroEL and GroES,
             whose structures you showed in homework 3.
 
           GroES: 2178..2471 color=green
           GroEL: 2515..4161 color=blue
>gi|1790582|gb|AE000487|AE000487 Escherichia coli K-12 MG1655 
 section 377 of 400 of the complete genome
     2101 tggtcaccag ccgggaaacc acgtaagctc cggcgtcacc cataacagat acggactttc
     2161 tcaaaggaga gttatcaatg aatattcgtc cattgcatga tcgcgtgatc gtcaagcgta
     2221 aagaagttga aactaaatct gctggcggca tcgttctgac cggctctgca gcggctaaat
     2281 ccacccgcgg cgaagtgctg gctgtcggca atggccgtat ccttgaaaat ggcgaagtga
     2341 agccgctgga tgtgaaagtt ggcgacatcg ttattttcaa cgatggctac ggtgtgaaat
     2401 ctgagaagat cgacaatgaa gaagtgttga tcatgtccga aagcgacatt ctggcaattg
     2461 ttgaagcgta atccgcgcac gacactgaac atacgaattt aaggaataaa gataatggca
     2521 gctaaagacg taaaattcgg taacgacgct cgtgtgaaaa tgctgcgcgg cgtaaacgta
     2581 ctggcagatg cagtgaaagt taccctcggt ccaaaaggcc gtaacgtagt tctggataaa
     2641 tctttcggtg caccgaccat caccaaagat ggtgtttccg ttgctcgtga aatcgaactg
     2701 gaagacaagt tcgaaaatat gggtgcgcag atggtgaaag aagttgcctc taaagcaaac
     2761 gacgctgcag gcgacggtac caccactgca accgtactgg ctcaggctat catcactgaa
     2821 ggtctgaaag ctgttgctgc gggcatgaac ccgatggacc tgaaacgtgg tatcgacaaa
     2881 gcggttaccg ctgcagttga agaactgaaa gcgctgtccg taccatgctc tgactctaaa
     2941 gcgattgctc aggttggtac catctccgct aactccgacg aaaccgtagg taaactgatc
     3001 gctgaagcga tggacaaagt cggtaaagaa ggcgttatca ccgttgaaga cggtaccggt
     3061 ctgcaggacg aactggacgt ggttgaaggt atgcagttcg accgtggcta cctgtctcct
     3121 tacttcatca acaagccgga aactggcgca gtagaactgg aaagcccgtt catcctgctg
     3181 gctgacaaga aaatctccaa catccgcgaa atgctgccgg ttctggaagc tgttgccaaa
     3241 gcaggcaaac cgctgctgat catcgctgaa gatgtagaag gcgaagcgct ggcaactctg
     3301 gttgttaaca ccatgcgtgg catcgtgaaa gtcgctgcgg ttaaagcacc gggcttcggc
     3361 gatcgtcgta aagctatgct gcaggatatc gcaaccctga ctggcggtac cgtgatctct
     3421 gaagagatcg gtatggagct ggaaaaagca accctggaag acctgggtca ggctaaacgt
     3481 gttgtgatca acaaagacac caccactatc atcgatggcg tgggtgaaga agctgcaatc
     3541 cagggccgtg ttgctcagat ccgtcagcag attgaagaag caacttctga ctacgaccgt
     3601 gaaaaactgc aggaacgcgt agcgaaactg gcaggcggcg ttgcagttat caaagtgggt
     3661 gctgctaccg aagttgaaat gaaagagaaa aaagcacgcg ttgaagatgc cctgcacgcg
     3721 acccgtgctg cggtagaaga aggcgtggtt gctggtggtg gtgttgcgct gatccgcgta
     3781 gcgtctaaac tggctgacct gcgtggtcag aacgaagacc agaacgtggg tatcaaagtt
     3841 gcactgcgtg caatggaagc tccgctgcgt cagatcgtat tgaactgcgg cgaagaaccg
     3901 tctgttgttg ctaacaccgt taaaggcggc gacggcaact acggttacaa cgcagcaacc
     3961 gaagaatacg gcaacatgat cgacatgggt atcctggatc caaccaaagt aactcgttct
     4021 gctctgcagt acgcagcttc tgtggctggc ctgatgatca ccaccgaatg catggttacc
     4081 gacctgccga aaaacgatgc agctgactta ggcgctgctg gcggtatggg cggcatgggt
     4141 ggcatgggcg gcatgatgta attgccctgc acctcgcaga aataaacaaa cccccgggca     
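
The two coordinate ranges listed above can be sanity-checked against the sequence itself. The following is a minimal sketch in Python, assuming the GenBank-style layout used here (a 1-based start position followed by blocks of ten bases) and assuming both genes lie on the forward strand. The helper names `parse_origin` and `check_cds` are hypothetical, and only three of the sequence lines are copied in, which is enough to reach the start and stop codons of both genes.

```python
# Three lines copied from the sequence above: they cover the start and stop
# codons of both genes (positions 2178..2180, 2469..2471, 2515..2517, 4159..4161).
ORIGIN_LINES = """
     2161 tcaaaggaga gttatcaatg aatattcgtc cattgcatga tcgcgtgatc gtcaagcgta
     2461 ttgaagcgta atccgcgcac gacactgaac atacgaattt aaggaataaa gataatggca
     4141 ggcatgggcg gcatgatgta attgccctgc acctcgcaga aataaacaaa cccccgggca
"""


def parse_origin(text):
    """Map each 1-based position to its base for GenBank-style sequence lines."""
    bases = {}
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        start = int(fields[0])
        for offset, base in enumerate("".join(fields[1:])):
            bases[start + offset] = base
    return bases


def check_cds(bases, start, end):
    """Return (first codon, last codon, protein length) for a forward-strand CDS."""
    length = end - start + 1
    if length % 3 != 0:
        raise ValueError("CDS length is not a multiple of three")
    first = "".join(bases[p] for p in range(start, start + 3))
    last = "".join(bases[p] for p in range(end - 2, end + 1))
    # The final codon is the stop codon, so it does not encode a residue.
    return first, last, length // 3 - 1


bases = parse_origin(ORIGIN_LINES)
groes = check_cds(bases, 2178, 2471)
groel = check_cds(bases, 2515, 4161)
print("GroES:", groes)
print("GroEL:", groel)
```

Both ranges begin with `atg` and end with `taa`, and their lengths correspond to 97 residues for GroES and 548 residues for GroEL.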

  
    2. Which organism do these genes come from? How many
       DNA and protein sequences are known for this
       organism?
    Ans:
     1. These genes come from Escherichia coli.
     2. Escherichia coli K-12 MG1655 complete genome
        Accession #: U00096   Length: 4639221 bp   G+C content: 50%
        4289 protein-coding regions and 4405 annotated genes
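
G+C content is the fraction of bases in a sequence that are G or C. As a minimal sketch in Python (the function name `gc_content` is hypothetical), it can be computed as shown below. The sample is only the first ten bases of the section listed under question 1, so its value is not expected to match the genome-wide figure quoted above.

```python
def gc_content(sequence):
    """Return the fraction of G and C bases, ignoring case and whitespace."""
    seq = "".join(sequence.split()).lower()
    if not seq:
        raise ValueError("empty sequence")
    return (seq.count("g") + seq.count("c")) / len(seq)


# First ten bases of the section shown under question 1 (positions 2101..2110).
fragment = "tggtcaccag"
print(f"G+C content: {gc_content(fragment):.0%}")
```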
     
    3. How many BLAST hits do you get if you use the gene
       you found for GroEL as the query sequence to search
       the non-redundant GenBank + EMBL + DDBJ + PDB
       sequence database? Please also show 10 sequences
       that produce significant alignments (you just need
       to show their names and ID numbers).
    Distribution of 387 Blast Hits on the Query Sequence

    ID number                Name
    gb|U14003|ECOUW93        Escherichia coli K-12 chromosomal region
    gb|AE000487|AE000487     Escherichia coli K-12 MG1655 section 377
    gb|U68778|SMU68778       Stenotrophomonas maltophilia GroEL and GroES
    gb|U01039|U01039         Salmonella typhi TY2 heat shock protein GroEL
    gb|U81143|KPU81143       Klebsiella pneumoniae heat shock protein
    emb|X68526|YEHSP60       Y.enterocolitica gene for heat shock protein 60
    emb|X82212|YEO8HSP60     Y.enterocolitica hsp60 gene
    gb|M11294|ECOGROELA      E.coli groEL (mopA) gene coding for
    emb|X59366|YE8HSP60      Y.enterocolitica (0:8) hsp60 gene for HSP60
    gb|AF005236|AF005236     Sitophilus oryzae principal endosymbiont G.
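
The ID numbers listed above follow the NCBI pipe-delimited layout, in which the first field names the source database (`gb` for GenBank, `emb` for EMBL). The following is a minimal sketch in Python that splits the ten IDs into their parts and counts the hits per database. The helper name `parse_hit_id` and the label "locus" for the third field are assumptions made for illustration.

```python
from collections import Counter

# Database tags for the databases named in the question.
DATABASES = {"gb": "GenBank", "emb": "EMBL", "dbj": "DDBJ", "pdb": "PDB"}

# The ten ID numbers from the answer above.
HIT_IDS = [
    "gb|U14003|ECOUW93",
    "gb|AE000487|AE000487",
    "gb|U68778|SMU68778",
    "gb|U01039|U01039",
    "gb|U81143|KPU81143",
    "emb|X68526|YEHSP60",
    "emb|X82212|YEO8HSP60",
    "gb|M11294|ECOGROELA",
    "emb|X59366|YE8HSP60",
    "gb|AF005236|AF005236",
]


def parse_hit_id(hit_id):
    """Split 'db|accession|locus' into a (database, accession, locus) tuple."""
    tag, accession, locus = hit_id.split("|")
    return DATABASES.get(tag, tag), accession, locus


hits = [parse_hit_id(h) for h in HIT_IDS]
per_database = Counter(db for db, _, _ in hits)

for db, accession, locus in hits:
    print(f"{db:8} {accession:10} {locus}")
print(dict(per_database))
```

Of the ten hits shown, seven come from GenBank and three from EMBL.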

