HOMEWORK #4 due on 5/19
Kevin Lyu, a graduate student in our department, obtained a cDNA from rice that encodes a protein. Here is the sequence:
CGGCACGAGCAGCAACTTAACTTGATCTTCGTGTGACCGATCGATGGCTCGCGCGGGACATAATAAGTAT
GTGGCGCGCGTGATGGTGGTGGCGCTGCTGTTGGCCGCGCGGGCACCCGTGACATGCGGGCAGGTGGTGA
GCACTTGGGCGCCGTGCATCATGTACGCCGACGGGGAGGGTGTCGCCCCCACCGGCGGCTGCTGCGACGG
GGTCAGGACCCTCAACTCCGCCGCCGCCACCACCGCCGACCGCCAGACCACCTGCGCCTGCCTCAAGCAG
CAGACCAAGGCCATGGGCCGGCTGAGGCCCGACCACGTCGCCGGCATCCCCTCCAAGTGCGGGGTCAACA
TCCCCTACGCTATCAGCCCTTCCACCGACTGCTCCAGGGTGCACTGAGTGGATCAACGTCAAGTGATGCC
ACAATAATAATGGAGAGATGGATCCATCGATCTGCGGCTCTCATTTTGCGGTTGCTATCTGCAATATTCG
TCGTCGTCGGAGAGATCGAGCTAGAAATGCATGTTACTCCTCCGTTCTGTTACTATCTGCTTACCTGTTG
CTTCGTGCGGTTTGATAGTGTCGTTATAGCTAGTGTAAGAGTGTGAGGGTTGATTTTGATCTGTCTCCTT
TACGGGACGAGGGGCACGGCGAATCATGCATGAATCTTAGAGGACCTGCTTGCATTGTACCTTACTCAGT
GCATGCTTCAATATATATCCATCAAATGAAGATCTTTTAATGAAAAAAAAAAAAAAAAAAAAAAAA
5'3' Frame 1
R H E Q Q L N L I F V Stop P I D G S R G T Stop Stop V C G A R D G G G A A V G R A G T R D Met R A G G E H L G A V H H V R R R G G C R P H R R L L R R G Q D P Q L R R R H H R R P P D H L R L P Q A A D Q G H G P A E A R P R R R H P L Q V R G Q H P L R Y Q P F H R L L Q G A L S G S T S S D A T I I Met E R W I H R S A A L I L R L L S A I F V V V G E I E L E Met H V T P P F C Y Y L L T C C F V R F D S V V I A S V R V Stop G L I L I C L L Y G T R G T A N H A Stop I L E D L L A L Y L T Q C Met L Q Y I S I K Stop R S F N E K K K K K K K

5'3' Frame 2
G T S S N L T Stop S S C D R S Met A R A G H N K Y V A R V Met V V A L L L A A R A P V T C G Q V V S T W A P C I Met Y A D G E G V A P T G G C C D G V R T L N S A A A T T A D R Q T T C A C L K Q Q T K A Met G R L R P D H V A G I P S K C G V N I P Y A I S P S T D C S R V H Stop V D Q R Q V Met P Q Stop Stop W R D G S I D L R L S F C G C Y L Q Y S S S S E R S S Stop K C Met L L L R S V T I C L P V A S C G L I V S L Stop L V Stop E C E G Stop F Stop S V S F T G R G A R R I Met H E S Stop R T C L H C T L L S A C F N I Y P S N E D L L Met K K K K K K K K

5'3' Frame 3
A R A A T Stop L D L R V T D R W L A R D I I S Met W R A Stop W W W R C C W P R G H P Stop H A G R W Stop A L G R R A S C T P T G R V S P P P A A A A T G S G P S T P P P P P P P T A R P P A P A S S S R P R P W A G Stop G P T T S P A S P P S A G S T S P T L S A L P P T A P G C T E W I N V K Stop C H N N N G E Met D P S I C G S H F A V A I C N I R R R R R D R A R N A C Y S S V L L L S A Y L L L R A V Stop Stop C R Y S Stop C K S V R V D F D L S P L R D E G H G E S C Met N L R G P A C I V P Y S V H A S I Y I H Q Met K I F Stop Stop K K K K K K K

3'5' Frame 1
F F F F F F F F H Stop K I F I Stop W I Y I E A C T E Stop G T Met Q A G P L R F Met H D S P C P S S R K G D R S K S T L T L L H Stop L Stop R H Y Q T A R S N R Stop A D S N R T E E Stop H A F L A R S L R R R R I L Q I A T A K Stop E P Q I D G S I S P L L L W H H L T L I H S V H P G A V G G R A D S V G D V D P A L G G D A G D V V G P Q P A H G L G L L L E A G A G G L A V G G G G G G G V E G P D P V A A A A G G G D T L P V G V H D A R R P S A H H L P A C H G C P R G Q Q Q R H H H H A R H I L I Met S R A S H R S V T R R S S Stop V A A R A

3'5' Frame 2
F F F F F F F F I K R S S F D G Y I L K H A L S K V Q C K Q V L Stop D S C Met I R R A P R P V K E T D Q N Q P S H S Y T S Y N D T I K P H E A T G K Q I V T E R R S N Met H F Stop L D L S D D D E Y C R Stop Q P Q N E S R R S Met D P S L H Y Y C G I T Stop R Stop S T Q C T L E Q S V E G L I A Stop G Met L T P H L E G Met P A T W S G L S R P Met A L V C C L R Q A Q V V W R S A V V A A A E L R V L T P S Q Q P P V G A T P S P S A Y Met Met H G A Q V L T T C P H V T G A R A A N S S A T T I T R A T Y L L C P A R A I D R S H E D Q V K L L L V P

3'5' Frame 3
F F F F F F F S L K D L H L Met D I Y Stop S Met H Stop V R Y N A S R S S K I H A Stop F A V P L V P Stop R R Q I K I N P H T L T L A I T T L S N R T K Q Q V S R Stop Stop Q N G G V T C I S S S I S P T T T N I A D S N R K Met R A A D R W I H L S I I I V A S L D V D P L S A P W S S R W K G Stop Stop R R G C Stop P R T W R G C R R R G R A S A G P W P W S A A Stop G R R R W S G G R R W W R R R S Stop G S Stop P R R S S R R W G R H P P R R R T Stop C T A P K C S P P A R Met S R V P A R P T A A P P P S R A P H T Y Y V P R E P S I G H T K I K L S C C S C
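The six reading frames above can be reproduced with a short script. The following is a minimal sketch, assuming the standard genetic code; the function names (`reverse_complement`, `translate`, `six_frames`) are illustrative and are not taken from the tool that produced the output above. Stop codons are written as `*` rather than `Stop`.

```python
# Standard genetic code, with codons enumerated in base order T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def reverse_complement(dna):
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return dna.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna, offset=0):
    """Translate complete codons starting at `offset`; '*' marks a stop."""
    dna = dna.upper()
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X")  # 'X' for codons with unknown bases
        for i in range(offset, len(dna) - 2, 3)
    )


def six_frames(dna):
    """Translate the three forward and three reverse-complement frames."""
    rc = reverse_complement(dna)
    frames = {}
    for k in range(3):
        frames[f"5'3' Frame {k + 1}"] = translate(dna, k)
        frames[f"3'5' Frame {k + 1}"] = translate(rc, k)
    return frames


# First 36 bases of the cDNA given above.
fragment = "CGGCACGAGCAGCAACTTAACTTGATCTTCGTGTGA"
for name, protein in six_frames(fragment).items():
    print(name, protein)
```

On this fragment, the three forward frames begin `RHEQQLNLIFV*`, `GTSSNLT*SSC`, and `ARAAT*LDLRV`, which agrees with the start of each 5'3' frame listed above.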
The most probable protein sequence, read from 5'3' Frame 2, is:
ARAGHNKYVARVMVVALLLAARAPVTCGQVVSTWAPCIMYADGEGVAPTGGCCDGVRTLNSAAATTADRQTTCACLKQQTKAMGRLRPDHVAGIPSKCGVNIPYAISPSTDCSRVH
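One common way to choose the most probable protein is to take the longest stretch that begins with Met and runs to the next stop codon. The sketch below assumes a one-letter translation with `*` for stops; `longest_orf` is a hypothetical helper, not the method the original tool necessarily used. It also treats an open-ended final segment as a candidate, which is a simplification. Its result includes the leading Met, whereas the sequence quoted above starts at the residue after it.

```python
def longest_orf(protein):
    """Return the longest Met-initiated stretch between stops ('*')."""
    best = ""
    for segment in protein.split("*"):
        start = segment.find("M")  # first Met in this stop-free segment
        if start != -1 and len(segment) - start > len(best):
            best = segment[start:]
    return best


# Start of 5'3' Frame 2 from the translation above, in one-letter code.
frame2_start = "GTSSNLT*SSCDRSMARAGHNKYVARVMVVALLLAARAPVTCGQVV"
print(longest_orf(frame2_start))
```

Applying the same function to each of the six full frames and keeping the longest result is one way to arrive at a single candidate.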
Negatively charged residues: 6
Positively charged residues: 15
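These counts can be checked directly against the protein sequence given above. This is a minimal sketch; `charge_counts` is an illustrative name. Counting Asp + Glu reproduces the 6 negatively charged residues. Arg + Lys alone gives 12, and the reported 15 is matched only when the three His residues are counted as well, so the sketch reports both conventions. Which convention the original tool used is an assumption.

```python
from collections import Counter

# Protein sequence as given in this homework (116 residues).
PROTEIN = (
    "ARAGHNKYVARVMVVALLLAARAPVTCGQVVSTWAPCIMYADGEGVA"
    "PTGGCCDGVRTLNSAAATTADRQTTCACLKQQTKAMGRLRPDHVAGIP"
    "SKCGVNIPYAISPSTDCSRVH"
)


def charge_counts(seq):
    """Count charged residues under two conventions for 'positive'."""
    c = Counter(seq.upper())
    return {
        "negative (D+E)": c["D"] + c["E"],
        "positive (R+K)": c["R"] + c["K"],
        "positive (R+K+H)": c["R"] + c["K"] + c["H"],  # assumes His is counted
    }


for label, n in charge_counts(PROTEIN).items():
    print(f"{label}: {n}")
```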

| N-glycosylation site | 5-8 NLTS |
| N-myristoylation site | 1-6 GTSSNL, 42-47 GQVVST, 59-64 GVAPTG, 65-70 GCCDGV, 185-190 GLIVSL |
| Casein kinase II phosphorylation site | 8-11 SSCD, 79-82 TTAD, 160-163 SSSE, 231-234 SNED |
| Protein kinase C phosphorylation site | 162-164 SER, 165-167 SSK, 202-204 TGR |
| Plant lipid transfer proteins signature | 108-129 IPSKCGVNIPYAISPSTDCSRV |
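Motif hits like these can be approximated with regular expressions. The sketch below is written under stated assumptions: the three patterns are my transcriptions of the PROSITE patterns N-{P}-[ST]-{P}, [ST]-x(2)-[DE], and [ST]-x-[RK], and should be checked against PROSITE before use; `scan` is a hypothetical helper. The positions in the table appear consistent with scanning 5'3' Frame 2 with the stop codons removed (NLTS at 5-8 spans a Stop), so the example input is prepared the same way.

```python
import re

# Assumed regex forms of three PROSITE patterns (verify against PROSITE).
PATTERNS = {
    "N-glycosylation site": r"N[^P][ST][^P]",
    "Casein kinase II phosphorylation site": r"[ST]..[DE]",
    "Protein kinase C phosphorylation site": r"[ST].[RK]",
}


def scan(seq, pattern):
    """Return (start, end, match) tuples, 1-based inclusive, overlaps allowed."""
    hits = []
    # A lookahead lets overlapping matches be reported.
    for m in re.finditer(f"(?=({pattern}))", seq):
        hit = m.group(1)
        start = m.start() + 1
        hits.append((start, start + len(hit) - 1, hit))
    return hits


# Start of 5'3' Frame 2 with the stop codon removed.
frame2_start = "GTSSNLTSSCDRSMARAGHNKYVARV"
for name, pattern in PATTERNS.items():
    print(name, scan(frame2_start, pattern))
```

On this prefix the sketch finds NLTS at 5-8 and SSCD at 8-11, matching the first entries in the table. Patterns this short match often by chance, so hits that fall outside the chosen open reading frame deserve little weight.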