Homework 5

Utilization of Tools in Workbench


We took the following protein as an example:

LKCNKLVPLFYKTCPAGKNICYKMFMVATP

KLPVKRGCIDVCPKSSLLVRYVCCNTDKCN

  1. Identify the protein

    We imported the sequence into Workbench and then ran BLASTP to identify it. The search returned several sequences that are similar, but not identical, to it.

  2. Align the sequences

    We ran MSA to align nine similar sequences with this protein, and then used MSASHADE to color the resulting alignment.

  3. Predict secondary structure

    Four tools are available in Workbench for predicting the secondary structure of a protein. Garnier is one of them, and its prediction is shown below.
     60 aa; DCH = 158, DCS = 0
     
               .   10    .   20    .   30    .   40    .   50    .   60
           LKCNKLVPLFYKTCPAGKNICYKMFMVATPKLPVKRGCIDVCPKSSLLVRYVCCNTDKCN
     helix                                                             
     sheet     EEEEEEEEE      EEEEEEEEE         EEEEE    EEEEEEE       
     turns TTTT          TTTTT          TT TTTTT     TTTT       TTTTTTT
     coil               C              C  C                            
    
           
           
     helix 
     sheet 
     turns 
     coil  
    
     Residue totals: H:  0   E: 30   T: 27   C:  3
            percent: H:  0.0 E: 68.2 T: 61.4 C:  6.8 
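
As a cross-check on the Garnier summary above, the following minimal sketch recounts the predicted states from the state rows and recomputes the percentages. The names `SHEET`, `TURNS`, `COIL`, `HELIX`, `state_totals` and `percentages` are illustrative and not part of Workbench or Garnier. The residue totals agree with the output (0, 30, 27, 3, summing to 60). The printed percent line, however, appears to match a denominator of 44 rather than the 60 residues of the sequence. The output does not state the reason, so the sketch computes both.

```python
# State rows copied from the Garnier output above. The exact column spacing
# does not matter here, because only the state letters are counted.
HELIX = ""
SHEET = "    EEEEEEEEE      EEEEEEEEE         EEEEE    EEEEEEE"
TURNS = "TTTT          TTTTT          TT TTTTT     TTTT       TTTTTTT"
COIL = "             C              C  C"

SEQ_LEN = 60  # "60 aa" in the Garnier header line


def state_totals():
    """Count how many residues were assigned to each predicted state."""
    return {
        "H": HELIX.count("H"),
        "E": SHEET.count("E"),
        "T": TURNS.count("T"),
        "C": COIL.count("C"),
    }


def percentages(totals, denominator):
    """Express each state total as a percentage of `denominator`."""
    return {state: round(100.0 * n / denominator, 1)
            for state, n in totals.items()}


totals = state_totals()
print("totals:", totals)
print("percent of 60 residues:", percentages(totals, SEQ_LEN))
# Assumption for comparison only: 44 is the denominator that reproduces
# the percent line printed by the program.
print("percent with denominator 44:", percentages(totals, 44))
```

Over the full 60 residues the prediction is half sheet and just under half turn, with no helix predicted.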
     
  4. Show charge distribution

    The following are the results of the charge distribution analysis by SAPS:
    
    SAPS.  Version of January 7, 1995.
    Date run: Tue Jan  6 04:15:33 1998
    
    
    
    SWISS-PROT ANNOTATION:
    ID   29018
    DE   29018, 60 bases, 1E82 checksum.
    
    number of residues:   60;   molecular weight:   6.8 kdal
     
           1  LKCNKLVPLF YKTCPAGKNI CYKMFMVATP KLPVKRGCID VCPKSSLLVR YVCCNTDKCN
    
    
    
    COMPOSITIONAL ANALYSIS (extremes relative to: swp23s.q)
    
    A  :  2( 3.3%); C  :  8(13.3%); D  :  2( 3.3%); E  :  0( 0.0%); F  :  2( 3.3%)
    G  :  2( 3.3%); H  :  0( 0.0%); I  :  2( 3.3%); K  :  9(15.0%); L  :  6(10.0%)
    M  :  2( 3.3%); N  :  4( 6.7%); P  :  5( 8.3%); Q  :  0( 0.0%); R  :  2( 3.3%)
    S  :  2( 3.3%); T  :  3( 5.0%); V  :  6(10.0%); W  :  0( 0.0%); Y  :  3( 5.0%)
    
    KR      :   11 ( 18.3%);   ED      :    2 (  3.3%);   AGP     :    9 ( 15.0%);
    KRED    :   13 ( 21.7%);   KR-ED   :    9 ( 15.0%);   FIKMNY  :   22 ( 36.7%);
    LVIFM   :   18 ( 30.0%);   ST      :    5 (  8.3%).
    
    
    
    CHARGE DISTRIBUTIONAL ANALYSIS
     
           1  0+00+00000 0+00000+00 00+0000000 +000++000- 000+00000+ 000000-+00
    
    A. CHARGE CLUSTERS.
    
    
    Positive charge clusters (cmin = 13/30 or 18/45 or 22/60):  none
    
    
    Negative charge clusters:  not evaluated (frequency of - < 5%, too low)
    
    
    Mixed charge clusters (cmin = 15/30 or 20/45 or 25/60):  none
    
    
    B. HIGH SCORING (UN)CHARGED SEGMENTS.
    
    
    
    
    High scoring positive charge segments:
    
    score=   2.00 frequency=   0.183  ( KR )
    score=   0.00 frequency=   0.000  ( BZX )
    score=  -1.00 frequency=   0.783  ( LAGSVTIPNFQYHMCW )
    score=  -2.00 frequency=   0.033  ( ED )
    
     Expected score/letter:  -0.483
     - now scoring for positive charge segments;    Average information/letter:   0.430
     Minimal length of displayed segments set to:  20
    
    M_0.01= 13.07  (cv=  7.77, lambda=  0.52686, k=  0.16371, x=  5.30;
                    90% confidence interval for segment length:  23 +-  25)
    M_0.05=  9.97  (x=  2.20)
    
    # of segments (>=20 residues) exceeding M_0.05: none
    
    
    
    
    High scoring negative charge segments:
    
    score=   2.00 frequency=   0.033  ( ED )
    score=   0.00 frequency=   0.000  ( BZX )
    score=  -1.00 frequency=   0.783  ( LAGSVTIPNFQYHMCW )
    score=  -2.00 frequency=   0.183  ( KR )
    
     Expected score/letter:  -1.083
     - now scoring for negative charge segments;    Average information/letter:   3.490
     Minimal length of displayed segments set to:  20
    
    M_0.01=  4.94  (cv=  2.54, lambda=  1.61122, k=  0.47671, x=  2.40;
                    90% confidence interval for segment length:   3 +-   3)
    M_0.05=  3.92  (x=  1.38)
    
    # of segments (>=20 residues) exceeding M_0.05: none
    
    
    
    
    High scoring mixed charge segments:
    
    score=   1.00 frequency=   0.217  ( KEDR )
    score=   0.00 frequency=   0.000  ( BZX )
    score=  -1.00 frequency=   0.783  ( LAGSVTIPNFQYHMCW )
    
     Expected score/letter:  -0.567
     - now scoring for mixed charge segments;    Average information/letter:   1.051
     Minimal length of displayed segments set to:  20
    
    M_0.01=  6.07  (cv=  3.19, lambda=  1.28520, k=  0.40993, x=  2.89;
                    90% confidence interval for segment length:  11 +-   9)
    M_0.05=  4.80  (x=  1.62)
    
    # of segments (>=20 residues) exceeding M_0.05: none
    
    
    
    
    High scoring uncharged segments:
    
    score=   1.00 frequency=   0.783  ( LAGSVTIPNFQYHMCW )
    score=   0.00 frequency=   0.000  ( BZX )
    score=  -8.00 frequency=   0.217  ( KEDR )
    
     Expected score/letter:  -0.950
     - now scoring for uncharged segments;    Average information/letter:   0.173
     Minimal length of displayed segments set to:  20
    
    M_0.01= 32.53  (cv= 20.56, lambda=  0.19916, k=  0.10900, x= 11.97;
                    90% confidence interval for segment length:  54 +-  44)
    M_0.05= 24.34  (x=  3.79)
    
    # of segments (>=20 residues) exceeding M_0.05: none
    
    
    C. CHARGE RUNS AND PATTERNS.
    
    pattern  (+)|  (-)|  (*)|  (0)| (+0)| (-0)| (*0)|(+00)|(-00)|(*00)|
    lmin0     5 |   3 |   6 |  29 |  10 |   6 |  10 |  11 |   6 |  12 | 
    lmin1     6 |   4 |   7 |  36 |  12 |   7 |  12 |  14 |   8 |  14 | 
    lmin2     7 |   4 |   8 |  39 |  13 |   8 |  14 |  15 |   9 |  16 | 
    
    There are no charge runs or patterns exceeding the given minimal lengths.
    
    Run count statistics:
    
      +  runs >=   3:   0
      -  runs >=   3:   0
      *  runs >=   4:   0
      0  runs >=  20:   0
    
    
    
    DISTRIBUTION OF OTHER AMINO ACID TYPES
    
    1. HIGH SCORING SEGMENTS.
    
    
    
    High scoring hydrophobic segments:
    
       2.00 (LVIFM)   1.00 (AGYCW)   0.00 (BZX)  -2.00 (PH)  -4.00 (STNQ)
      -8.00 (KEDR)
    
     Expected score/letter:  -1.650
     - now scoring for hydrophobic segments
    ........40........80.......120.......160.......200
    ******;    Average information/letter:   0.452
     Minimal length of displayed segments set to:  15
    
    M_0.01= 23.51  (cv= 13.41, lambda=  0.30526, k=  0.21941, x= 10.10;
                    90% confidence interval for segment length:  23 +-  17)
    M_0.05= 18.17  (x=  4.76)
    
    # of segments (>=15 residues) exceeding M_0.05: none
    
    
    
    
    High scoring transmembrane segments:
    
       5.00 (LVIF)   2.00 (AGM)   0.00 (BZX)  -1.00 (YCW)  -2.00 (ST)
      -6.00 (P)  -8.00 (H) -10.00 (NQ) -16.00 (KR) -17.00 (ED)
    
     Expected score/letter:  -3.483
     - now scoring for transmembrane segments
    ........40........80.......120.......160.......200
    ******;    Average information/letter:   0.513
     Minimal length of displayed segments set to:  15
    
    M_0.01= 45.47  (cv= 26.27, lambda=  0.15585, k=  0.20016, x= 19.19;
                    90% confidence interval for segment length:  20 +-  16)
    M_0.05= 35.01  (x=  8.74);     M_0.30= 22.56  (x= -3.71)
    
    # of segments (>=15 residues) exceeding M_0.30: none
    
    
    2. SPACINGS OF C.
    
    
    H2N-2-C-10-C-6-C-16-C-3-C-10
    CC   at   53
      -4-C-1-COOH
    
    
    
    REPETITIVE STRUCTURES.
    
    A. SEPARATED, TANDEM, AND PERIODIC REPEATS: amino acid alphabet.
    Repeat core block length:  4
    
    B. SEPARATED AND TANDEM REPEATS: 11-letter reduced alphabet.
       (i= LVIF; += KR; -= ED; s= AG; o= ST; n= NQ; a= YW; p= P; h= H; m= M; c= C)
    Repeat core block length:  8
    
    
    
    
    MULTIPLETS.
    
    A. AMINO ACID ALPHABET.
    
    1. Total number of amino acid multiplets:   3  (Expected range:   0-- 11)
    
    2. Histogram of spacings between consecutive amino acid multiplets:
       (1-5) 2   (6-10) 1   (11-20) 0   (>=21) 1
    
    3. Clusters of amino acid multiplets (cmin = 10/30 or 12/45 or 15/60):  none
    
    
    B. CHARGE ALPHABET.
    
    1. Total number of charge multiplets:   1  (Expected range:   0--  6)
       1 +plets (f+: 18.3%), 0 -plets (f-: 3.3%)
       Total number of charge altplets: 1 (Critical number: 4)
    
    2. Histogram of spacings between consecutive charge multiplets:
       (1-5) 0   (6-10) 0   (11-20) 0   (>=21) 2
    
    
    
    PERIODICITY ANALYSIS.
    
    A. AMINO ACID ALPHABET (core:  4; !-core: 4)
    
    Location        Period  Element         Copies  Core    Errors
    
    There are no periodicities of the prescribed length.
    
    B. CHARGE ALPHABET ({+= KR; -= ED; 0}; core:  5; !-core: 5)
       and HYDROPHOBICITY ALPHABET ({*= KRED; i= LVIF; 0}; core:  6; !-core: 8)
    
    Location        Period  Element         Copies  Core    Errors
    
    There are no periodicities of the prescribed length.
    
    
    
    SPACING ANALYSIS.
    
    Not evaluated (sequence length < 100 aa, too short).
    
    
    