Homework 6

Multiple Sequence Alignment of the MLH1 protein.
- To perform a multiple sequence alignment in Biology Workbench
- To draw a phylogenetic tree from the alignment
- Deadline - 12/06/2001

  1. Add the MLH1_Human protein to the Biology Workbench. Predict its secondary structure with GOR4.
    Ans. 
               >MLH1_HUMAN
    Sequence   MSFVAGVIRR LDETVVNRIA AGEVIQRPAN AIKEMIENCL DAKSTSIQVI
    Structure  CCCCEEEEEE CCHHHHHHHH HCHHHHHHHH HHHHHHHHHC CCCCCCHHHH
    
    Sequence   VKEGGLKLIQ IQDNGTGIRK EDLDIVCERF TTSKLQSFED LASISTYGFR
    Structure  HHHCCCEEEE ECCCCCCCHH HHHHHHHCCC CCCCCCCCHH HHHHCCCCCC
    
    Sequence   GEALASISHV AHVTITTKTA DGKCAYRASY SDGKLKAPPK PCAGNQGTQI
    Structure  CCHHHHHHHE EEEEEEECCC CCCCEEEECC CCCCCCCCCC CCCCCCCCEE
    
    Sequence   TVEDLFYNIA TRRKALKNPS EEYGKILEVV GRYSVHNAGI SFSVKKQGET
    Structure  EEHHHHHHHH HHHHHHCCCC HHHHHEEEEE ECCCCCCCCE EEEECCCCCE
    
    Sequence   VADVRTLPNA STVDNIRSIF GNAVSRELIE IGCEDKTLAF KMNGYISNAN
    Structure  EEEEEECCCC CCCCCEEEEC CCCCCHHHHH HCCCHHHHHH CCCCCEECCC
    
    Sequence   YSVKKCIFLL FINHRLVEST SLRKAIETVY AAYLPKNTHP FLYLSLEISP
    Structure  CCCCCEEEEE ECCCCHHHHH HHHHHHHHHH HHCCCCCCCC EEEECCCCCC
    
    Sequence   QNVDVNVHPT KHEVHFLHEE SILERVQQHI ESKLLGSNSS RMYFTQTLLP
    Structure  CCCCEEECCC CCHHHHHHHH HHHHHHHHHH HHHHHCCCCC CEEEEEEECC
    
    Sequence   GLAGPSGEMV KSTTSLTSSS TSGSSDKVYA HQMVRTDSRE QKLDAFLQPL
    Structure  CCCCCCCCEE EEEEEEEEEC CCCCCCHHHH HHHHHHHHHH HHHHHHHCCC
    
    Sequence   SKPLSSQPQA IVTEDKTDIS SGRARQQDEE MLELPAPAEV AAKNQSLEGD
    Structure  CCCCCCCCCE EECCCCCCHH HHHHHHHHHH HHHCCCHHHH HHHHHCCCCC
    
    Sequence   TTKGTSEMSE KRGPTSSNPR KRHREDSDVE MVEDDSRKEM TAACTPRRRI
    Structure  CCCCCCHHHC CCCCCCCCCC CCCCCCCCHH HHHHHHHHHH HHHCCCCCEE
    
    Sequence   INLTSVLSLQ EEINEQGHEV LREMLHNHSF VGCVNPQWAL AQHQTKLYLL
    Structure  ECCCCHHHHH HHHHHHHHHH HHHHHCCCCE EEEECCCCHH HHHHHHHHHH
    
    Sequence   NTTKLSEELF YQILIYDFAN FGVLRLSEPA PLFDLAMLAL DSPESGWTEE
    Structure  HCCCCHHHHH HHHHHHCCCC CCEECCCCCC CHHHHHHHHC CCCCCCCCCC
    
    Sequence   DGPKEGLAEY IVEFLKKKAE MLADYFSLEI DEEGNLIGLP LLIDNYVPPL
    Structure  CCCCCCHHHH HHHHHHHHHH HHHHHHHHHH HHCCCCCCCC EEECCCCCCC
    
    Sequence   EGLPIFILRL ATEVNWDEEK ECFESLSKEC AMFYSIRKQY ISEESTLSGQ
    Structure  CCCCHHHHHH HHHHCHHHHH HCCCCCCCCC HHHHHCCCCC CCHHHHCCCC
    
    Sequence   QSEVPGSIPN SWKWTVEHIV YKALRSHILP PKHFTEDGNI LQLANLPDLY
    Structure  CCCCCCCCCC CCCEEEECHH HHHHHCCCCC CCCCCCCCHH HHHHCCCCCE
    
    Sequence   KVFERC
    Structure  EEEEEC
    
    LEGEND:
      Alpha Helix = H, Beta Sheet = E, Random Coil = C
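
A GOR4 prediction like the one above is often summarized as the percentage of residues in each state. The following is only a minimal sketch of that tally, not part of the Biology Workbench output: the function name `composition` and the `LABELS` table are hypothetical, and the input is assumed to be the "Structure" rows pasted together (spaces between the blocks of ten are ignored). It is shown here on the first row of the prediction only.

```python
from collections import Counter

# State codes taken from the LEGEND above.
LABELS = {"H": "Alpha Helix", "E": "Beta Sheet", "C": "Random Coil"}


def composition(structure):
    """Return {state: (count, percent)} for an H/E/C prediction string.

    Characters other than H, E, and C (spaces, newlines) are skipped.
    """
    states = [ch for ch in structure if ch in LABELS]
    if not states:
        raise ValueError("no H/E/C characters found")
    counts = Counter(states)
    total = len(states)
    return {s: (counts[s], 100.0 * counts[s] / total) for s in LABELS}


# First "Structure" row of the MLH1_HUMAN prediction (50 residues).
first_row = "CCCCEEEEEE CCHHHHHHHH HCHHHHHHHH HHHHHHHHHC CCCCCCHHHH"

result = composition(first_row)
for state, (n, pct) in result.items():
    print(f"{LABELS[state]:<12} {state}  {n:3d}  {pct:5.1f}%")
```

For this first row the tally is 30 H, 6 E, and 14 C out of 50 residues. Passing all of the "Structure" rows joined into one string would give the composition of the whole protein.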


  2. Perform a homology search for MLH1_Human against the GenPept Full Release database. Import the MLH1-like proteins of C. elegans, S. cerevisiae, D. melanogaster, R. norvegicus and M. musculus into your workbench. Run CLUSTALW to obtain a multiple sequence alignment of these six proteins.
    Ans. multiple sequence alignment for these six proteins
  3. Run the BOXSHADE program to produce a color-coded plot of the results from question 2.
    Ans.
  4. Draw a rooted phylogenetic tree for these proteins.
    Ans.