==First isolation and classification==
Proteins were recognized as a distinct class of biological molecules in the eighteenth century by
Antoine Fourcroy and others. Members of this class (called the "albuminoids",
Eiweisskörper, or
matières albuminoides) were recognized by their ability to
coagulate or
flocculate under various treatments such as heat or acid; well-known examples at the start of the nineteenth century included albumen from
egg whites,
blood serum albumin,
fibrin, and
wheat gluten. The similarity between the cooking of egg whites and the curdling of milk was recognized even in ancient times; for example, the name
albumen for the egg-white protein was coined by
Pliny the Elder from the
Latin albus ovi (egg white). With the advice of
Jöns Jakob Berzelius, the Dutch chemist
Gerhardus Johannes Mulder carried out
elemental analyses of common animal and plant proteins. To everyone's surprise, all proteins had nearly the same
empirical formula, roughly C400H620N100O120 with individual sulfur and phosphorus atoms. Mulder published his findings in two papers (1837, 1838) and hypothesized that there was one basic substance (
Grundstoff) of proteins, and that it was synthesized by plants and absorbed from them by animals in digestion. Berzelius was an early proponent of this theory and proposed the name "protein" for this substance in a letter dated 10 July 1838: "The name protein that I propose for the organic oxide of
fibrin and
albumin, I wanted to derive from [the
Greek word] πρωτειος, because it appears to be the primitive or principal substance of animal nutrition." Mulder went on to identify the products of protein degradation such as the
amino acid,
leucine, for which he found a (nearly correct) molecular weight of 131
Da.
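
The mass figures in this account can be checked with simple arithmetic. The sketch below sums standard atomic weights over Mulder's approximate empirical formula and over the modern formula of leucine. It makes two assumptions that are not in the original text: "individual sulfur and phosphorus atoms" is read as one atom of each, and leucine is taken as C6H13NO2 (its modern formula). The function name `formula_mass` and the dictionaries are illustrative only.

```python
# Standard atomic weights in daltons (rounded values).
ATOMIC_WEIGHTS = {
    "C": 12.011,
    "H": 1.008,
    "N": 14.007,
    "O": 15.999,
    "S": 32.06,
    "P": 30.974,
}


def formula_mass(composition):
    """Return the mass in Da of a formula given as {element: atom count}."""
    return sum(ATOMIC_WEIGHTS[element] * count
               for element, count in composition.items())


# Mulder's approximate empirical formula; one S and one P atom is an
# assumption about what "individual sulfur and phosphorus atoms" means.
mulder_unit = {"C": 400, "H": 620, "N": 100, "O": 120, "S": 1, "P": 1}

# Modern formula of leucine (not stated in the text above).
leucine = {"C": 6, "H": 13, "N": 1, "O": 2}

mulder_mass = formula_mass(mulder_unit)
leucine_mass = formula_mass(leucine)

print(f"Mulder's formula unit: {mulder_mass / 1000:.1f} kDa")
print(f"Leucine: {leucine_mass:.1f} Da")
```

Under these assumptions the formula unit comes to about 8.8 kDa, in line with the minimum molecular weight of roughly 9 kDa attributed to Mulder's analyses, and leucine comes to about 131 Da, matching the value he reported.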

==Purifications and measurements of mass==
The minimum molecular weight suggested by Mulder's analyses was roughly 9
kDa, hundreds of times larger than other molecules being studied. Hence, the chemical structure of proteins (their
primary structure) was an active area of research until 1949, when
Fred Sanger sequenced
insulin. The (correct) theory that proteins were linear polymers of
amino acids linked by
peptide bonds was proposed independently and simultaneously by
Franz Hofmeister and
Emil Fischer at the same conference in 1902. However, some scientists were sceptical that such long
macromolecules could be stable in solution. Consequently, numerous alternative theories of the protein
primary structure were proposed, e.g., the colloidal hypothesis that proteins were assemblies of small molecules, the
cyclol hypothesis of
Dorothy Wrinch, the diketopiperazine hypothesis of
Emil Abderhalden, and the pyrrol/piperidine hypothesis of Troensegaard (1942). Most of these theories had difficulty accounting for the fact that the digestion of proteins yielded
peptides and
amino acids. Proteins were finally shown to be macromolecules of well-defined composition (and not colloidal mixtures) by
Theodor Svedberg using
analytical ultracentrifugation. The possibility that some proteins are non-covalent associations of such macromolecules was shown by
Gilbert Smithson Adair (by measuring the
osmotic pressure of
hemoglobin) and, later, by
Frederic M. Richards in his studies of ribonuclease S. The
mass spectrometry of proteins has long been a useful technique for identifying
posttranslational modifications and, more recently, for probing protein structure. Most proteins are difficult to
purify in more than milligram quantities, even using the most modern methods. Hence, early studies focused on proteins that could be purified in large quantities, e.g., those of
blood,
egg white, various
toxins, and digestive/metabolic enzymes obtained from
slaughterhouses. Many techniques of protein purification were developed during
World War II in a project led by
Edwin Joseph Cohn to purify blood proteins to help keep soldiers alive. In the late 1950s, the
Armour Hot Dog Co. purified 1 kg (one million milligrams) of bovine pancreatic
ribonuclease A and made it available at low cost to scientists around the world. This generous act made RNase A the main protein for basic research for the next few decades, resulting in several Nobel Prizes.

==Protein folding and first structural models==
The study of protein folding began in 1910 with a famous paper by
Harriette Chick and
C. J. Martin, in which they showed that the
flocculation of a protein was composed of two distinct processes: the
precipitation of a protein from solution was
preceded by another process called
denaturation, in which the protein became much less soluble, lost its enzymatic activity and became more chemically reactive. In the mid-1920s,
Tim Anson and
Alfred Mirsky proposed that denaturation was a reversible process, a correct hypothesis that was initially lampooned by some scientists as "unboiling the egg". Anson also suggested that denaturation was a two-state ("all-or-none") process, in which one fundamental molecular transition resulted in the drastic changes in solubility, enzymatic activity and chemical reactivity; he further noted that the free energy changes upon denaturation were much smaller than those typically involved in chemical reactions. In 1929,
Hsien Wu hypothesized that denaturation was protein unfolding, a purely conformational change that resulted in the exposure of amino acid side chains to the solvent. According to this (correct) hypothesis, exposure of aliphatic and reactive side chains to solvent rendered the protein less soluble and more reactive, whereas the loss of a specific conformation caused the loss of enzymatic activity. Although considered plausible, Wu's hypothesis was not immediately accepted, since so little was known of protein structure and enzymology and other factors could account for the changes in solubility, enzymatic activity and chemical reactivity. In the early 1960s,
Chris Anfinsen showed that the folding of
ribonuclease A was fully reversible with no external cofactors needed, verifying the "thermodynamic hypothesis" of protein folding that the folded state represents the global minimum of
free energy for the protein. The hypothesis of protein folding was followed by research into the physical interactions that stabilize folded protein structures. The crucial role of
hydrophobic interactions was hypothesized by
Dorothy Wrinch and
Irving Langmuir, as a mechanism that might stabilize her
cyclol structures. Although supported by
J. D. Bernal and others, this (correct) hypothesis was rejected along with the cyclol hypothesis, which was disproven in the 1930s by
Linus Pauling (among others). Instead, Pauling championed the idea that protein structure was stabilized mainly by
hydrogen bonds, an idea advanced initially by
William Astbury (1933). Remarkably, Pauling's incorrect theory about H-bonds resulted in his
correct models for the
secondary structure elements of proteins, the
alpha helix and the
beta sheet. The hydrophobic interaction was restored to its correct prominence by a famous article in 1959 by
Walter Kauzmann on denaturation, based partly on work by
Kaj Linderstrøm-Lang. The ionic nature of proteins was demonstrated by Bjerrum, Weber and
Arne Tiselius, but Linderstrøm-Lang showed that the charges were generally accessible to solvent and not bound to each other (1949). The
secondary and low-resolution
tertiary structure of globular proteins was investigated initially by hydrodynamic methods, such as
analytical ultracentrifugation and
flow birefringence. Spectroscopic methods to probe protein structure (such as
circular dichroism, fluorescence, near-ultraviolet and infrared absorbance) were developed in the 1950s. The first atomic-resolution structures of proteins were solved by
X-ray crystallography in the 1960s and by
NMR in the 1980s. The
Protein Data Bank has over 150,000 atomic-resolution structures of proteins. In more recent times,
cryo-electron microscopy of large
macromolecular assemblies has achieved atomic resolution, and computational
protein structure prediction of small protein
domains is approaching atomic resolution.