PROSITE: PDOC00021 (documentation)
{PDOC00021}
{PS00022; EGF_1}
{PS01186; EGF_2}
{BEGIN}
******************************
* EGF-like domain signatures *
******************************
A sequence of about thirty to forty amino-acid residues found in epidermal
growth factor (EGF) has been shown [1 to 6] to be present, in a more or less
conserved form, in a large number of other, mostly animal, proteins. The
proteins currently known to contain one or more copies of an EGF-like pattern
are listed below.
- Adipocyte differentiation inhibitor (gene PREF-1) from mouse (6 copies).
- Agrin, a basal lamina protein that causes the aggregation of acetylcholine
receptors on cultured muscle fibers (4 copies).
- Amphiregulin, a growth factor (1 copy).
- Betacellulin, a growth factor (1 copy).
- Blastula proteins BP10 and Span from sea urchin which are thought to be
involved in pattern formation (1 copy).
- BM86, a glycoprotein antigen of cattle tick (7 copies).
- Bone morphogenic protein 1 (BMP-1), a protein which induces cartilage and
bone formation and which expresses metalloendopeptidase activity (1-2
copies). Homologous proteins are found in sea urchin - suBMP (1 copy) - and
in Drosophila - the dorsal-ventral patterning protein tolloid (2 copies).
- Caenorhabditis elegans developmental proteins lin-12 (13 copies) and glp-1
(10 copies).
- Caenorhabditis elegans APX-1 protein, a patterning protein (4.5 copies).
- Calcium-dependent serine proteinase (CASP) which degrades the extracellular
matrix proteins type I and IV collagen and fibronectin (1 copy).
- Cartilage matrix protein CMP (1 copy).
- Cartilage oligomeric matrix protein COMP (4 copies).
- Cell surface antigen 114/A10 (3 copies).
- Cell surface glycoprotein complex transmembrane subunit ASGP-2 from rat (2
copies).
- Coagulation associated proteins C, Z (2 copies) and S (4 copies).
- Coagulation factors VII, IX, X and XII (2 copies).
- Complement C1r components (1 copy).
- Complement C1s components (1 copy).
- Complement-activating component of Ra-reactive factor (RARF) (1 copy).
- Complement components C6, C7, C8 alpha and beta chains, and C9 (1 copy).
- Crumbs, an epithelial development protein from Drosophila (29 copies).
- Epidermal growth factor precursor (7-9 copies).
- Exogastrula-inducing peptides A, C, D and X from sea urchin (1 copy).
- Fat protein, a Drosophila cadherin-related tumor suppressor (5 copies).
- Fetal antigen 1, a probable neuroendocrine differentiation protein, which
is derived from the delta-like protein (DLK) (6 copies).
- Fibrillin 1 (47 copies) and fibrillin 2 (14 copies).
- Fibropellins IA (21 copies), IB (13 copies), IC (8 copies), II (4 copies)
and III (8 copies) from the apical lamina - a component of the
extracellular matrix - of sea urchin.
- Fibulin-1 and -2, two extracellular matrix proteins (9-11 copies).
- Giant-lens protein (protein Argos), which regulates cell determination and
axon guidance in the Drosophila eye (1 copy).
- Growth factor-related proteins from various poxviruses (1 copy).
- Gurken protein, a Drosophila developmental protein (1 copy).
- Heparin-binding EGF-like growth factor (HB-EGF), transforming growth factor
alpha (TGF-alpha), growth factors Lin-3 and Spitz (1 copy); the precursors
are membrane proteins, the mature forms are extracellular.
- Hepatocyte growth factor (HGF) activator (EC 3.4.21.-) (2 copies).
- LDL and VLDL receptors, which bind and transport low-density lipoproteins
and very low-density lipoproteins (3 copies).
- LDL receptor-related protein (LRP), which may act as a receptor for
endocytosis of extracellular ligands (22 copies).
- Leucocyte antigen CD97 (3 copies), cell surface glycoprotein EMR1 (6
copies) and cell surface glycoprotein F4/80 (7 copies).
- Limulus clotting factor C, which is involved in hemostasis and host defense
mechanisms in Japanese horseshoe crab (1 copy).
- Meprin A alpha subunit, a mammalian membrane-bound endopeptidase (1 copy).
- Milk fat globule-EGF factor 8 (MFG-E8) from mouse (2 copies).
- Neuregulin GGF-I and GGF-II, two human glial growth factors (1 copy).
- Neurexins from mammals (3 copies).
- Neurogenic proteins Notch, Xotch and the human homolog Tan-1 (36 copies),
Delta (9 copies) and the similar differentiation proteins Lag-2 from
Caenorhabditis elegans (2 copies), Serrate (14 copies) and Slit (7 copies)
from Drosophila.
- Nidogen (also called entactin), a basement membrane protein from chordates
(2-6 copies).
- Ookinete surface proteins (24 Kd, 25 Kd, 28 Kd) from Plasmodium (4 copies).
- Pancreatic secretory granule membrane major glycoprotein GP2 (1 copy).
- Perforin, which non-specifically lyses a variety of target cells (1 copy).
- Proteoglycans aggrecan (1 copy), versican (2 copies), perlecan (at least 2
copies), brevican (1 copy) and chondroitin sulfate proteoglycan (gene PG-M)
(2 copies).
- Prostaglandin G/H synthase 1 and 2 (EC 1.14.99.1) (1 copy), which are found
in the endoplasmic reticulum.
- S1-5, a human extracellular protein whose ultimate activity is probably
modulated by the environment (5 copies).
- Schwannoma-derived growth factor (SDGF), an autocrine growth factor as well
as a mitogen for different target cells (1 copy).
- Selectins. Cell adhesion proteins such as ELAM-1 (E-selectin), GMP-140
(P-selectin), or the lymph-node homing receptor (L-selectin) (1 copy).
- Serine/threonine-protein kinase homolog (gene Pro25) from Arabidopsis
thaliana, which may be involved in assembly or regulation of
light-harvesting chlorophyll A/B protein (2 copies).
- Sperm-egg fusion proteins PH-30 alpha and beta from guinea pig (1 copy).
- Stromal cell derived protein-1 (SCP-1) from mouse (6 copies).
- TDGF-1, human teratocarcinoma-derived growth factor 1 (1 copy).
- Tenascin (or neuronectin), an extracellular matrix protein from mammals
(14.5 copies), chicken (TEN-A) (13.5 copies) and the related proteins human
tenascin-X (18 copies) and tenascin-like proteins TEN-A and TEN-M from
Drosophila (8 copies).
- Thrombomodulin (fetomodulin), which together with thrombin activates
protein C (6 copies).
- Thrombospondin 1, 2 (3 copies), 3 and 4 (4 copies), adhesive glycoproteins
that mediate cell-to-cell and cell-to-matrix interactions.
- Thyroid peroxidase 1 and 2 (EC 1.11.1.8) from human (1 copy).
- Transforming growth factor beta-1 binding protein (TGF-B1-BP) (16 or 18
copies).
- Tyrosine-protein kinase receptors Tek and Tie (EC 2.7.1.112) (3 copies).
- Urokinase-type plasminogen activator (EC 3.4.21.73) (UPA) and tissue
plasminogen activator (EC 3.4.21.68) (TPA) (1 copy).
- Uromodulin (Tamm-Horsfall urinary glycoprotein) (THP) (3 copies).
- Vitamin K-dependent anticoagulants protein C (2 copies) and protein S (4
copies) and the similar protein Z, a single-chain plasma glycoprotein of
unknown function (2 copies).
- 63 Kd sperm flagellar membrane protein from sea urchin (3 copies).
- 93 Kd protein (gene nel) from chicken (5 copies).
- Hypothetical 337.6 Kd protein T20G5.3 from Caenorhabditis elegans (44
copies).
The functional significance of EGF domains in what appear to be unrelated
proteins is not yet clear. However, a common feature is that these repeats are
found in the extracellular domain of membrane-bound proteins or in proteins
known to be secreted (exception: prostaglandin G/H synthase). The EGF domain
includes six cysteine residues which have been shown (in EGF) to be involved
in disulfide bonds. The main structure is a two-stranded beta-sheet followed
by a loop to a C-terminal short two-stranded sheet. Subdomains between the
conserved cysteines vary greatly in length, as shown in the following
schematic representation of the EGF-like domain:
     +-------------------+                  +-------------------------+
     |                   |                  |                         |
x(4)-C-x(0,48)-C-x(3,12)-C-x(1,70)-C-x(1,6)-C-x(2)-G-a-x(0,21)-G-x(2)-C-x
               |                   | ************************************
               +-------------------+
'C': conserved cysteine involved in a disulfide bond.
'G': often conserved glycine
'a': often conserved aromatic amino acid
'*': position of both patterns.
'x': any residue
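
The loop lengths and the disulfide connectivity in the schematic can be
checked mechanically. The following is a minimal Python sketch, not part of
the PROSITE entry: the names GAP_RANGES, DISULFIDE_PAIRS, cysteine_gaps,
fits_schematic and disulfide_positions are hypothetical, and TOY_DOMAIN is an
illustrative sequence chosen to fit the schematic. It assumes the input is a
single domain containing exactly six cysteines, takes the pairing (1st-3rd,
2nd-4th, 5th-6th) from the brackets of the schematic, and ignores the
flanking x(4) and x as well as the identity of the 'G' and 'a' positions.

```python
# Allowed number of residues between consecutive conserved cysteines,
# read off the schematic. The last range is the sum
# x(2) + G + a + x(0,21) + G + x(2), i.e. 7 to 28 residues.
GAP_RANGES = [(0, 48), (3, 12), (1, 70), (1, 6), (7, 28)]

# Disulfide connectivity (cysteine ordinal numbers) assumed from the brackets.
DISULFIDE_PAIRS = [(1, 3), (2, 4), (5, 6)]

# Illustrative toy domain (an assumption, not taken from the entry).
TOY_DOMAIN = "SECPLSHDGYCLHDGVCMYIEALDKYACNCVVGYIGERCQ"


def cysteine_gaps(domain: str) -> list[int]:
    """Return the number of residues between each pair of adjacent cysteines."""
    positions = [i for i, aa in enumerate(domain) if aa == "C"]
    return [b - a - 1 for a, b in zip(positions, positions[1:])]


def fits_schematic(domain: str) -> bool:
    """True if the domain has six cysteines whose spacing fits GAP_RANGES."""
    gaps = cysteine_gaps(domain)
    if len(gaps) != len(GAP_RANGES):
        return False
    return all(lo <= gap <= hi for gap, (lo, hi) in zip(gaps, GAP_RANGES))


def disulfide_positions(domain: str) -> list[tuple[int, int]]:
    """Return 1-based sequence positions of the assumed disulfide partners."""
    positions = [i + 1 for i, aa in enumerate(domain) if aa == "C"]
    return [(positions[a - 1], positions[b - 1]) for a, b in DISULFIDE_PAIRS]


if __name__ == "__main__":
    print(cysteine_gaps(TOY_DOMAIN))
    print(fits_schematic(TOY_DOMAIN))
    print(disulfide_positions(TOY_DOMAIN))
```

A check of this kind only tests spacing; it says nothing about whether the
cysteines actually form the bonds shown.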
The region between the 5th and 6th cysteine contains two conserved glycines of
which at least one is present in most EGF-like domains. We created two
patterns for this domain, each including one of these C-terminal conserved
glycine residues.
-Consensus pattern: C-x-C-x(5)-G-x(2)-C
[The 3 C's are involved in disulfide bonds]
-Sequences known to belong to this class detected by the pattern: A majority,
but not those that have very long or very short regions between the last 3
conserved cysteines of their EGF-like domain(s).
-Other sequence(s) detected in SWISS-PROT: 87 proteins, of which 27 can be
considered as possible candidates.
-Consensus pattern: C-x-C-x(2)-[GP]-[FYW]-x(4,8)-C
[The three C's are involved in disulfide bonds]
-Sequences known to belong to this class detected by the pattern: A majority,
but not those that have very long or very short regions between the last 3
conserved cysteines of their EGF-like domain(s).
-Other sequence(s) detected in SWISS-PROT: 83 proteins, of which 49 can be
considered as possible candidates.
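
As an illustration of how the two consensus patterns can be applied, the
following Python sketch translates the subset of PROSITE pattern syntax used
here into a regular expression and scans a sequence with it. It is a rough
sketch rather than the scanning method used by PROSITE: the function names
prosite_to_regex and find_hits are hypothetical, TOY_FRAGMENT is an
illustrative fragment chosen to satisfy both patterns, and anchors ('<', '>')
and overlapping matches are not handled.

```python
import re

EGF_1 = "C-x-C-x(5)-G-x(2)-C"
EGF_2 = "C-x-C-x(2)-[GP]-[FYW]-x(4,8)-C"

# One pattern element: 'x', a residue, [allowed] or {forbidden},
# optionally followed by a repeat count (n) or range (n,m).
_ELEMENT = re.compile(
    r"(x|[A-Z]|\[[A-Z]+\]|\{[A-Z]+\})(?:\((\d+)(?:,(\d+))?\))?"
)


def prosite_to_regex(pattern: str) -> str:
    """Translate a simple PROSITE pattern into a Python regular expression."""
    parts = []
    for element in pattern.rstrip(".").split("-"):
        match = _ELEMENT.fullmatch(element)
        if match is None:
            raise ValueError(f"unsupported pattern element: {element!r}")
        core, low, high = match.groups()
        if core == "x":
            core = "."                          # any residue
        elif core.startswith("{"):
            core = "[^" + core[1:-1] + "]"      # any residue except these
        if low is not None:
            core += "{" + low + ("," + high if high else "") + "}"
        parts.append(core)
    return "".join(parts)


def find_hits(pattern: str, sequence: str) -> list[tuple[int, str]]:
    """Return (1-based start, matched text) for non-overlapping matches."""
    regex = re.compile(prosite_to_regex(pattern))
    return [(m.start() + 1, m.group()) for m in regex.finditer(sequence)]


# Illustrative toy fragment (an assumption, not taken from the entry).
TOY_FRAGMENT = "ACNCVVGYIGERCQ"

if __name__ == "__main__":
    for name, pattern in (("EGF_1", EGF_1), ("EGF_2", EGF_2)):
        print(name, prosite_to_regex(pattern), find_hits(pattern, TOY_FRAGMENT))
```

Because both patterns cover only the region around the last three conserved
cysteines, a hit marks the C-terminal part of a candidate domain, not the
whole domain.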
-Note: The beta chain of the integrin family of proteins contains 2
cysteine-rich repeats which were said to be dissimilar to the EGF pattern [7].
-Note: Laminin EGF-like repeats (see <PDOC00961>) are longer than the average
EGF module and contain a further disulfide bond C-terminal of the EGF-like
region. Perlecan and agrin contain both EGF-like domains and laminin-type
EGF-like domains.
-Note: the patterns do not detect all of the repeats in proteins with multiple
EGF-like repeats.
-Note: see <PDOC00913> for an entry describing specifically the subset of EGF-
like domains that bind calcium.
-Last update: November 1997 / Patterns and text revised.
[ 1] Davis C.G.
New Biol. 2:410-419(1990).
[ 2] Blomquist M.C., Hunt L.T., Barker W.C.
Proc. Natl. Acad. Sci. U.S.A. 81:7363-7367(1984).
[ 3] Barker W.C., Johnson G.C., Hunt L.T., George D.G.
Protein Nucl. Acid Enz. 29:54-68(1986).
[ 4] Doolittle R.F., Feng D.F., Johnson M.S.
Nature 307:558-560(1984).
[ 5] Appella E., Weber I.T., Blasi F.
FEBS Lett. 231:1-4(1988).
[ 6] Campbell I.D., Bork P.
Curr. Opin. Struct. Biol. 3:385-392(1993).
[ 7] Tamkun J.W., DeSimone D.W., Fonda D., Patel R.S., Buck C., Horwitz A.F.,
Hynes R.O.
Cell 46:271-282(1986).
{END}