===Primary Structure===
MEGF8 is composed of either 2845 amino acids (Isoform 1) or 2778 amino acids (Isoform 2). Isoform 2 lacks a 67-amino-acid segment (residues 700-766), which accounts for its shorter length; otherwise, the two isoforms are identical. Amino acid bias was determined using SAPS (Statistical Analysis of Protein Sequences). Isoform 1 is rich in cysteine and glycine and deficient in isoleucine and lysine. Isoform 2 has very high levels of cysteine, moderately high levels of glycine, and low levels of isoleucine and lysine. The high proportion of cysteine residues contributes to the numerous disulfide bonds found in the mature protein's folded structure. Overall, MEGF8 has an isoelectric point (pI) between 6.4 and 7.0, depending on the organism's sequence; the pI of human MEGF8 is 6.4. This nearly neutral value enables the protein to fold properly and inhibits denaturation. The twenty most conserved amino acids, identified through a multiple sequence alignment of 20 orthologs, are located in the CUB and transmembrane domains.
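The relationship between the two isoform lengths, and the idea of an amino acid composition, can be illustrated with a minimal Python sketch. It is not the SAPS method (SAPS evaluates statistical significance against reference data); it only computes raw percentages. The helper names `splice_out` and `composition` are hypothetical, and the sequence used is a placeholder of the correct length, not the real MEGF8 sequence.

```python
from collections import Counter


def splice_out(sequence: str, start: int, end: int) -> str:
    """Remove residues start..end (1-based, inclusive) from a protein sequence."""
    if not 1 <= start <= end <= len(sequence):
        raise ValueError("range must lie within the sequence")
    return sequence[:start - 1] + sequence[end:]


def composition(sequence: str) -> dict:
    """Return the percentage of each residue type present in the sequence."""
    counts = Counter(sequence)
    total = len(sequence)
    return {residue: 100.0 * n / total for residue, n in counts.items()}


# Placeholder with the length of Isoform 1; NOT the real MEGF8 sequence.
isoform1 = "A" * 2845

# Assumption: Isoform 2 equals Isoform 1 with residues 700-766 removed.
isoform2 = splice_out(isoform1, 700, 766)

print(len(isoform1) - len(isoform2))  # 67
print(len(isoform2))                  # 2778
```

Applied to the actual isoform sequences, `composition` would give the raw cysteine, glycine, isoleucine, and lysine percentages that a bias analysis starts from.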
===Secondary Structure===
The prediction software PELE from UCSC Biology Workbench indicated that MEGF8 is primarily composed of beta sheets, with occasional short alpha-helix segments. PELE runs eight different prediction programs so that their predictions can be compared and confirmed, which increases confidence in the result. The beta sheets occur in many of the key domains, including the EGF domains, kelch domains, and EGF-laminin domains. These results agree with the secondary structure and 3D structure predictions made by PHYRE2.
===Predicted Key Domains & Features===
MEGF8 is predicted to contain several types of features, domains, and motifs that play key roles in the protein's function, structure, and location. These are listed in Table 1. Their functions, as described by SMART, include:
• CUB domain: an extracellular domain present in proteins mostly known to be involved in development.
• Epidermal Growth Factor (EGF) domain: a short peptide with a distinctive motif of six cysteines, found in many different proteins with diverse functions.
• EGF-like domain: comprises several subfamilies whose functions vary with location and protein; the function is not specified for MEGF8.
• Calcium-binding EGF-like domain: present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function, and calcium-binding sites have been found at the N-terminus of particular EGF-like domains.
• Kelch motif: galactose oxidase, central domain; causes formation of the β-propeller tertiary structure of the protein.
• Leucine zipper: a motif found in regulatory proteins, as predicted by PSORT II.
• Laminin EGF-like domain: laminins are the major noncollagenous components of basement membranes and mediate cell adhesion, growth, migration, and differentiation. The laminin-type epidermal growth factor-like module occurs in tandem arrays; the domain contains four disulfide bonds (loops a-d), the first three of which resemble epidermal growth factor (EGF).
• PSI domain: a domain found in plexins, semaphorins, and integrins. Plexins are involved in the development of neural and epithelial tissues; semaphorins induce the collapse and paralysis of neuronal growth cones; and integrins may mediate adhesive or migratory functions of epithelial cells.
Predicted Domain & Motif Locations
===Tertiary Structure===
One of the key attributes of MEGF8's tertiary structure is its seven-bladed beta propeller, which is formed by the kelch motif found in its D1k3ia3 structural domain, as identified by SCOP. SCOP also indicated that the beta propeller in MEGF8 is a member of the galactose oxidase superfamily. Each of the seven blades is made up of a four-stranded beta-sheet motif. It is also important to note that although many phosphorylation sites are predicted with high confidence, several other topographic predictions (e.g. disulfide bonds, glycosylation, and other extracellular features) do not support these predictions.
Predicted Post-Translational Modifications
==Expression==