NCSA Computational Biology - SAPS

                  Results

                 Statistical Analysis of Protein Sequences.
----------------------------------------------------------------------------
                                    toxin
----------------------------------------------------------------------------

toxin

SAPS.  Version of January 7, 1995.
Date run: Mon Dec  9 07:46:12 1996

SWISS-PROT ANNOTATION:
ID   4143
DE   4143, 60 bases.

number of residues:   60;   molecular weight:   6.7 kdal

       1  LKCNKLVPLF YKTCPAGKNL CYKMFMVATP KVPVKRGCID VCPKSSLLVK YVCCNTDRCN

--------------------------------------------------------------------------------
COMPOSITIONAL ANALYSIS (extremes relative to: swp23s.q)

A  :  2( 3.3%); C  :  8(13.3%); D  :  2( 3.3%); E  :  0( 0.0%); F  :  2( 3.3%)
G  :  2( 3.3%); H  :  0( 0.0%); I  :  1( 1.7%); K  :  9(15.0%); L  :  6(10.0%)
M  :  2( 3.3%); N  :  4( 6.7%); P  :  5( 8.3%); Q  :  0( 0.0%); R  :  2( 3.3%)
S  :  2( 3.3%); T  :  3( 5.0%); V  :  7(11.7%); W  :  0( 0.0%); Y  :  3( 5.0%)

KR      :   11 ( 18.3%);   ED      :    2 (  3.3%);   AGP     :    9 ( 15.0%);
KRED    :   13 ( 21.7%);   KR-ED   :    9 ( 15.0%);   FIKMNY  :   21 ( 35.0%);
LVIFM   :   18 ( 30.0%);   ST      :    5 (  8.3%).
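
The counts and percentages above can be reproduced directly from the 60-residue sequence. What follows is a minimal Python sketch, not SAPS code; the names `SEQ`, `composition`, and `group_count` are hypothetical and introduced only for illustration.

```python
from collections import Counter

# The 60-residue toxin sequence from the listing above.
SEQ = ("LKCNKLVPLF" "YKTCPAGKNL" "CYKMFMVATP"
       "KVPVKRGCID" "VCPKSSLLVK" "YVCCNTDRCN")


def composition(seq):
    """Return {residue: (count, percent)} for the residues present in seq."""
    counts = Counter(seq)
    return {aa: (n, 100.0 * n / len(seq)) for aa, n in counts.items()}


def group_count(seq, group):
    """Count residues of seq that belong to a residue group such as 'KR'."""
    return sum(seq.count(aa) for aa in group)


comp = composition(SEQ)
for aa in sorted(comp):
    n, pct = comp[aa]
    print(f"{aa}: {n:2d} ({pct:4.1f}%)")

kr = group_count(SEQ, "KR")   # basic residues
ed = group_count(SEQ, "ED")   # acidic residues
print("KR    :", kr)
print("ED    :", ed)
print("KR-ED :", kr - ed)     # difference reported in the KR-ED column
```

Residues that do not occur (E, H, Q, W) are simply absent from the dictionary, which corresponds to the 0.0% entries in the table.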

--------------------------------------------------------------------------------
CHARGE DISTRIBUTIONAL ANALYSIS

       1  0+00+00000 0+00000+00 00+0000000 +000++000- 000+00000+ 000000-+00
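
The charge string above maps each residue to `+` (K, R), `-` (E, D), or `0` (everything else). A small Python sketch of that mapping, assuming only the grouping stated in the report; `charge_string` and `in_blocks` are hypothetical helper names.

```python
# The 60-residue toxin sequence from the listing above.
SEQ = ("LKCNKLVPLF" "YKTCPAGKNL" "CYKMFMVATP"
       "KVPVKRGCID" "VCPKSSLLVK" "YVCCNTDRCN")


def charge_string(seq):
    """Map each residue to '+', '-', or '0' by its charge class."""
    symbols = []
    for aa in seq:
        if aa in "KR":
            symbols.append("+")
        elif aa in "ED":
            symbols.append("-")
        else:
            symbols.append("0")
    return "".join(symbols)


def in_blocks(text, width=10):
    """Split text into space-separated blocks of the given width."""
    return " ".join(text[i:i + width] for i in range(0, len(text), width))


print(in_blocks(charge_string(SEQ)))
```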

A. CHARGE CLUSTERS.

Positive charge clusters (cmin = 13/30 or 18/45 or 22/60):  none

Negative charge clusters:  not evaluated (frequency of - < 5%, too low)

Mixed charge clusters (cmin = 15/30 or 20/45 or 25/60):  none

B. HIGH SCORING (UN)CHARGED SEGMENTS.

______________________________________
High scoring positive charge segments:

score=   2.00 frequency=   0.183  ( KR )
score=   0.00 frequency=   0.000  ( BZX )
score=  -1.00 frequency=   0.783  ( LAGSVTIPNFQYHMCW )
score=  -2.00 frequency=   0.033  ( ED )

 Expected score/letter:  -0.483
 - now scoring for positive charge segments;    Average information/letter:   0.430
 Minimal length of displayed segments set to:  20

M_0.01= 13.07  (cv=  7.77, lambda=  0.52686, k=  0.16371, x=  5.30;
                90% confidence interval for segment length:  23 +-  25)
M_0.05=  9.97  (x=  2.20)

# of segments (>=20 residues) exceeding M_0.05: none
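
The thresholds M_0.01 and M_0.05 reported above are numerically consistent with the Karlin-Altschul extreme-value approximation M_p = [ln(nK) - ln(-ln(1 - p))] / lambda, taking n = 60 residues. This is an inference from the printed values, not something taken from the SAPS source, and `segment_threshold` is a hypothetical name. A minimal Python sketch:

```python
import math


def segment_threshold(n, lam, k, p):
    """Score a maximal segment exceeds with probability about p.

    Assumes the Karlin-Altschul approximation
    P(M > x) ~ 1 - exp(-n * k * exp(-lam * x)).
    """
    return (math.log(n * k) - math.log(-math.log(1.0 - p))) / lam


# lambda and k as printed for the positive charge scoring scheme.
m01 = segment_threshold(60, 0.52686, 0.16371, 0.01)
m05 = segment_threshold(60, 0.52686, 0.16371, 0.05)
print(f"M_0.01 = {m01:.2f}")
print(f"M_0.05 = {m05:.2f}")
```

Because lambda and k are printed to a limited number of digits, the recomputed thresholds should only be expected to agree with the report to about two decimal places.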

______________________________________
High scoring negative charge segments:

score=   2.00 frequency=   0.033  ( ED )
score=   0.00 frequency=   0.000  ( BZX )
score=  -1.00 frequency=   0.783  ( LAGSVTIPNFQYHMCW )
score=  -2.00 frequency=   0.183  ( KR )

 Expected score/letter:  -1.083
 - now scoring for negative charge segments;    Average information/letter:   3.490
 Minimal length of displayed segments set to:  20

M_0.01=  4.94  (cv=  2.54, lambda=  1.61122, k=  0.47671, x=  2.40;
                90% confidence interval for segment length:   3 +-   3)
M_0.05=  3.92  (x=  1.38)

# of segments (>=20 residues) exceeding M_0.05: none

___________________________________
High scoring mixed charge segments:

score=   1.00 frequency=   0.217  ( KEDR )
score=   0.00 frequency=   0.000  ( BZX )
score=  -1.00 frequency=   0.783  ( LAGSVTIPNFQYHMCW )

 Expected score/letter:  -0.567
 - now scoring for mixed charge segments;    Average information/letter:   1.051
 Minimal length of displayed segments set to:  20

M_0.01=  6.07  (cv=  3.19, lambda=  1.28520, k=  0.40993, x=  2.89;
                90% confidence interval for segment length:  11 +-   9)
M_0.05=  4.80  (x=  1.62)

# of segments (>=20 residues) exceeding M_0.05: none

________________________________
High scoring uncharged segments:

score=   1.00 frequency=   0.783  ( LAGSVTIPNFQYHMCW )
score=   0.00 frequency=   0.000  ( BZX )
score=  -8.00 frequency=   0.217  ( KEDR )

 Expected score/letter:  -0.950
 - now scoring for uncharged segments;    Average information/letter:   0.173
 Minimal length of displayed segments set to:  20

M_0.01= 32.53  (cv= 20.56, lambda=  0.19916, k=  0.10900, x= 11.97;
                90% confidence interval for segment length:  54 +-  44)
M_0.05= 24.34  (x=  3.79)

# of segments (>=20 residues) exceeding M_0.05: none
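
Each "Expected score/letter" value in part B is the frequency-weighted mean of the scores listed for that scheme. The following Python sketch recomputes the four values from the sequence; `expected_score`, `SCHEMES`, and `UNCHARGED` are hypothetical names, and the score tables are copied from the report above.

```python
# The 60-residue toxin sequence from the listing above.
SEQ = ("LKCNKLVPLF" "YKTCPAGKNL" "CYKMFMVATP"
       "KVPVKRGCID" "VCPKSSLLVK" "YVCCNTDRCN")

UNCHARGED = "LAGSVTIPNFQYHMCW"

# Residue group -> score, as listed for each scoring scheme in the report.
# The ambiguity codes B, Z, X score 0 and do not occur in this sequence.
SCHEMES = {
    "positive":  {"KR": 2, UNCHARGED: -1, "ED": -2},
    "negative":  {"ED": 2, UNCHARGED: -1, "KR": -2},
    "mixed":     {"KEDR": 1, UNCHARGED: -1},
    "uncharged": {UNCHARGED: 1, "KEDR": -8},
}


def expected_score(seq, scheme):
    """Mean score per residue of seq under a {group: score} scheme."""
    total = 0
    for group, score in scheme.items():
        total += score * sum(seq.count(aa) for aa in group)
    return total / len(seq)


for name, scheme in SCHEMES.items():
    print(f"{name:9s} {expected_score(SEQ, scheme):7.3f}")
```

A negative expected score is what keeps long high-scoring segments rare under the null model, which is why every scheme here has one.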

C. CHARGE RUNS AND PATTERNS.

pattern  (+)|  (-)|  (*)|  (0)| (+0)| (-0)| (*0)|(+00)|(-00)|(*00)|
lmin0     5 |   3 |   6 |  29 |  10 |   6 |  10 |  11 |   6 |  12 |
lmin1     6 |   4 |   7 |  36 |  12 |   7 |  12 |  14 |   8 |  14 |
lmin2     7 |   4 |   8 |  39 |  13 |   8 |  14 |  15 |   9 |  16 |

There are no charge runs or patterns exceeding the given minimal lengths.

Run count statistics:

  +  runs >=   3:   0
  -  runs >=   3:   0
  *  runs >=   4:   0
  0  runs >=  20:   0

--------------------------------------------------------------------------------
DISTRIBUTION OF OTHER AMINO ACID TYPES

1. HIGH SCORING SEGMENTS.

__________________________________
High scoring hydrophobic segments:

   2.00 (LVIFM)   1.00 (AGYCW)   0.00 (BZX)  -2.00 (PH)  -4.00 (STNQ)
  -8.00 (KEDR)

 Expected score/letter:  -1.650
 - now scoring for hydrophobic segments;    Average information/letter:   0.452
 Minimal length of displayed segments set to:  15

M_0.01= 23.51  (cv= 13.41, lambda=  0.30526, k=  0.21941, x= 10.10;
                90% confidence interval for segment length:  23 +-  17)
M_0.05= 18.17  (x=  4.76)

# of segments (>=15 residues) exceeding M_0.05: none

____________________________________
High scoring transmembrane segments:

   5.00 (LVIF)   2.00 (AGM)   0.00 (BZX)  -1.00 (YCW)  -2.00 (ST)
  -6.00 (P)  -8.00 (H) -10.00 (NQ) -16.00 (KR) -17.00 (ED)

 Expected score/letter:  -3.483
 - now scoring for transmembrane segments;    Average information/letter:   0.513
 Minimal length of displayed segments set to:  15

M_0.01= 45.47  (cv= 26.27, lambda=  0.15585, k=  0.20016, x= 19.19;
                90% confidence interval for segment length:  20 +-  16)
M_0.05= 35.01  (x=  8.74);     M_0.30= 22.56  (x= -3.71)

# of segments (>=15 residues) exceeding M_0.30: none

2. SPACINGS OF C.

H2N-2-C-10-C-6-C-16-C-3-C-10-CC-4-C-1-COOH
CC   at   53
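
The spacing line gives the number of residues before, between, and after successive cysteines, with the adjacent pair at positions 53 and 54 reported as CC. A short Python sketch that recomputes these numbers from the sequence; `residue_positions` and `gaps` are hypothetical helper names.

```python
# The 60-residue toxin sequence from the listing above.
SEQ = ("LKCNKLVPLF" "YKTCPAGKNL" "CYKMFMVATP"
       "KVPVKRGCID" "VCPKSSLLVK" "YVCCNTDRCN")


def residue_positions(seq, aa="C"):
    """1-based positions of residue aa in seq."""
    return [i for i, r in enumerate(seq, start=1) if r == aa]


def gaps(seq, aa="C"):
    """Residue counts before, between, and after occurrences of aa."""
    bounds = [0] + residue_positions(seq, aa) + [len(seq) + 1]
    return [b - a - 1 for a, b in zip(bounds, bounds[1:])]


print("C positions:", residue_positions(SEQ))
print("gaps       :", gaps(SEQ))
```

A gap of 0 in the result marks two adjacent cysteines, which is the CC pair the report places at position 53.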

--------------------------------------------------------------------------------
REPETITIVE STRUCTURES.

A. SEPARATED, TANDEM, AND PERIODIC REPEATS: amino acid alphabet.
Repeat core block length:  4

B. SEPARATED AND TANDEM REPEATS: 11-letter reduced alphabet.
   (i= LVIF; += KR; -= ED; s= AG; o= ST; n= NQ; a= YW; p= P; h= H; m= M; c= C)
Repeat core block length:  8

--------------------------------------------------------------------------------

MULTIPLETS.

A. AMINO ACID ALPHABET.

1. Total number of amino acid multiplets:   3  (Expected range:   0-- 11)

2. Histogram of spacings between consecutive amino acid multiplets:
   (1-5) 2   (6-10) 1   (11-20) 0   (>=21) 1

3. Clusters of amino acid multiplets (cmin = 10/30 or 12/45 or 15/60):  none

B. CHARGE ALPHABET.

1. Total number of charge multiplets:   1  (Expected range:   0--  6)
   1 +plets (f+: 18.3%), 0 -plets (f-: 3.3%)
   Total number of charge altplets: 1 (Critical number: 4)

2. Histogram of spacings between consecutive charge multiplets:
   (1-5) 0   (6-10) 0   (11-20) 0   (>=21) 2

--------------------------------------------------------------------------------
PERIODICITY ANALYSIS.

A. AMINO ACID ALPHABET (core:  4; !-core: 4)

Location        Period  Element         Copies  Core    Errors

There are no periodicities of the prescribed length.

B. CHARGE ALPHABET ({+= KR; -= ED; 0}; core:  5; !-core: 5)
   and HYDROPHOBICITY ALPHABET ({*= KRED; i= LVIF; 0}; core:  6; !-core: 8)

Location        Period  Element         Copies  Core    Errors

There are no periodicities of the prescribed length.

--------------------------------------------------------------------------------
SPACING ANALYSIS.

Not evaluated (sequence length < 100 aa, too short).

