1. Protein purification
Protein purification is a sequential process that uses biochemical methods to
enrich or purify a single protein from a homogenate or extract. These methods
exploit differences in protein size, physicochemical properties and binding
affinity. Homogenates typically comprise cell and tissue components, including
DNA, cell membranes and other proteins.
The first choice in the purification process is usually whether to purify the
protein from a natural source or to express it in an expression system using
recombinant DNA technology. If large amounts are needed, expressing the protein
in an expression system is usually worth the effort. For many low-abundance
proteins, recombinant expression is the only way to obtain the protein.
Recombinant expression also allows the protein to be tagged, e.g. with a His-tag
or GST-tag, to facilitate purification, so that the purification can be done in
fewer steps.
An analytical purification generally exploits three properties to separate
proteins. First, proteins may be purified according to their isoelectric points
by running them through a pH-graded gel (isoelectric focusing) or an ion
exchange column. Second, proteins can be separated according to their size or
molecular weight by size exclusion chromatography or by SDS-PAGE (sodium dodecyl
sulfate-polyacrylamide gel electrophoresis). Third, proteins may be separated by
polarity or hydrophobicity, for example by high performance liquid
chromatography. Proteins are often separated by 2D-PAGE and then analysed by
peptide mass fingerprinting to establish the protein identity.
Evaluating purification yield
The most general method to monitor the purification process is to run samples
from the different steps on SDS-PAGE. This method gives only a rough measure of
the amounts of the different proteins in the mixture, and it cannot distinguish
between proteins of similar molecular weight.
If the protein has a distinguishing spectroscopic feature or an enzymatic
activity, this property can be used to detect and quantify the specific protein,
and thus to select the fractions of the separation that contain the protein. If
antibodies against the target protein are available, western blotting and ELISA
can specifically detect and quantify the desired protein.
In order to evaluate the process of multistep
purification, the amount of the specific protein has to be compared to the
amount of total protein. The latter can be determined by the Bradford total protein assay or by
absorbance of light at 280 nm.
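The comparison of specific protein to total protein is often summarized as a purification table. The sketch below is one way to compute it, assuming the specific protein is tracked by an enzymatic activity (in arbitrary units) and total protein is measured in mg. The `Step` class, the `summarize` function and all numbers are hypothetical and only illustrate the arithmetic.

```python
from dataclasses import dataclass


@dataclass
class Step:
    name: str
    total_protein_mg: float      # e.g. from a Bradford assay or A280
    total_activity_units: float  # e.g. from an enzymatic assay


def summarize(steps):
    """Return (name, specific activity, fold purification, % yield) per step.

    Fold purification and yield are both relative to the first step.
    """
    first = steps[0]
    base_specific = first.total_activity_units / first.total_protein_mg
    rows = []
    for step in steps:
        specific = step.total_activity_units / step.total_protein_mg
        fold = specific / base_specific
        yield_pct = 100.0 * step.total_activity_units / first.total_activity_units
        rows.append((step.name, specific, fold, yield_pct))
    return rows


# Hypothetical measurements, for illustration only.
steps = [
    Step("Crude extract", 1000.0, 10000.0),
    Step("Ion exchange", 100.0, 8000.0),
    Step("Affinity", 5.0, 6000.0),
]

for name, specific, fold, yield_pct in summarize(steps):
    print(f"{name:15s} {specific:8.1f} U/mg  {fold:6.1f}x  {yield_pct:5.1f}%")
```

A rising specific activity with a modest loss of total activity indicates that a step removed contaminants rather than the target protein.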
Purification of a tagged protein
By adding a tag to the protein, the protein is given a binding affinity that it
would not otherwise have. Usually the recombinant protein is the only protein in
the mixture with this affinity, so it can be separated on the basis of this
property. The most common tag is the His-tag, which has affinity for nickel
ions. By immobilizing nickel ions on a resin packed in a column, the tagged
protein can be made to bind to the resin. Since the protein is the only
component with a His-tag, all other proteins pass through the column, leaving
the His-tagged protein bound to the resin. The protein is then released from the
column by elution, which for a His-tag is achieved by flushing the column with
imidazole, which competes with the His-tag for nickel binding.
Methods of protein purification
The methods used in protein purification can roughly be divided into analytical
and preparative methods. The distinction is not exact, but the deciding factor
is the amount of protein that can practically be purified with the method.
Analytical methods aim to detect and identify a protein in a mixture, whereas
preparative methods aim to produce a large amount of the protein for another
purpose, such as structural biology or industrial use. In general, preparative
methods can be used in analytical applications, but not the other way around.
- Separation on the basis of Size
Molecular Exclusion Chromatography:
Also known as gel permeation or gel filtration chromatography, this type of
chromatography lacks an attractive interaction (affinity) between the
stationary phase and the solute. The liquid phase passes through a porous gel
that separates the molecules according to their size. The pores are normally
small and exclude the larger solute molecules, but allow smaller molecules to
enter the gel, causing them to flow through a larger volume. The molecules that
are small enough to enter the pores linger inside successive beads (the
stationary phase) as they pass down the column, while larger molecules remain in
the solution flowing between the beads and therefore move more rapidly, emerging
from the column first.
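The elution behaviour described above is commonly expressed as a partition coefficient, K_av = (V_e - V_0) / (V_t - V_0), where V_e is the elution volume of the solute, V_0 the void volume (the liquid between the beads) and V_t the total column volume. The sketch below is only illustrative: the function name `k_av`, the column volumes and the peak positions are assumptions, not values from any particular column.

```python
def k_av(elution_ml, void_ml, total_ml):
    """Partition coefficient for size exclusion chromatography.

    About 0 for molecules fully excluded from the pores (they elute first),
    approaching 1 for molecules that can access the whole bead volume.
    """
    if not void_ml < total_ml:
        raise ValueError("void volume must be smaller than total column volume")
    return (elution_ml - void_ml) / (total_ml - void_ml)


# Hypothetical column and peaks, for illustration only.
VOID_ML, TOTAL_ML = 8.0, 24.0
peaks = {"large complex": 8.5, "mid-sized protein": 14.0, "small peptide": 21.0}

# Sorting by elution volume reproduces the "largest first" elution order.
for name, ve in sorted(peaks.items(), key=lambda item: item[1]):
    print(f"{name:18s} Ve = {ve:5.1f} mL  Kav = {k_av(ve, VOID_ML, TOTAL_ML):.2f}")
```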
- Separation on the basis of Physical Properties
Various other forms of chromatography can be used. The most common are methods
that separate on the basis of charge or hydrophobicity.
Ion-Exchange Chromatography: Ion exchange chromatography is commonly used in the
purification of biological materials. There are two types of exchange: cation
exchange, in which the stationary phase carries a negative charge, and anion
exchange, in which the stationary phase carries a positive charge. Charged
molecules in the liquid phase pass through the column until they encounter a
binding site of opposite charge on the stationary phase. Separation by this
method is highly selective. Since the resins are fairly inexpensive and have
high capacities, this method of separation is usually applied early in the
overall process.
- Hydrophobicity
High Performance Liquid Chromatography (HPLC) is a form of column chromatography used frequently in biochemistry
and analytical chemistry. The analyte is
forced through a column of the stationary phase in a liquid (mobile phase) at
high pressure, which decreases the time the separated components remain on the
stationary phase and thus the time they have to diffuse
within the column. This leads to narrower peaks in the resulting chromatogram
and hence to better resolution and sensitivity.
Solvents used include any miscible combination of water and various organic
liquids (the most common are methanol and acetonitrile). The water may contain
buffers or salts to assist in the separation of the analyte components.
Advances in HPLC technology have brought about the use of gradients in the
mobile phase composition. A typical gradient might run from 5% to 50% methanol
(depending on how hydrophobic the analyte is) over 25 minutes. The gradient
separates the analyte mixture as a function of how well the changing solvent
mobilizes each component. For instance, using a water/methanol gradient, the
more hydrophobic components will elute (come off the column) under conditions of
relatively high methanol, whereas the more hydrophilic components will elute
under conditions of relatively low methanol.
- Separation on the basis of affinity
Affinity Chromatography: Affinity chromatography involves the use of a packing
that has been chemically modified by attaching a compound with a specific
affinity for the desired molecules. The packing material, called the affinity
matrix, must be inert and easily modified. Agarose is the most common substance
used. The ligands or "affinity tails" that are inserted into the matrix can be
genetically engineered to possess a specific affinity. In a process similar to
ion exchange chromatography, the desired molecules remain bound to the ligands
on the matrix until a solution of high salt concentration is passed through the
column. This causes desorption of the molecules from the ligands, and they elute
from the column. Fouling of the matrix can occur when a large number of
unpurified proteins are present; therefore, this type of chromatography is
usually implemented late in the process.
- Metal-binding
Polyhistidine-tags are often used for affinity purification of
polyhistidine-tagged recombinant proteins that are expressed in Escherichia coli
or other prokaryotic expression systems. The bacterial cells are harvested by
centrifugation, and the resulting cell pellet can be lysed with detergents or
enzymes such as lysozyme. At this stage the raw lysate contains the recombinant
protein among several other proteins derived from the bacteria, and it is
incubated with affinity media. The affinity media have metal ions bound to them,
either nickel or cobalt, to which the polyhistidine-tag binds. The resin is then
washed with phosphate buffer to remove proteins that do not specifically
interact with the cobalt or nickel ions. The protein can be eluted by the
addition of imidazole, since imidazole competes with His-tagged proteins for
binding to the metal ions. The purity and amount of protein can be assessed by
SDS-PAGE and western blotting.
- Denaturing-Condition Electrophoresis
SDS Polyacrylamide-Gel Electrophoresis
Proteins usually possess a net positive or negative charge, depending on the
mixture of charged amino acids they contain. When an electric field is applied
to a solution containing a protein molecule, the protein migrates at a rate that
depends on its net charge and on its size and shape. This is called
electrophoresis.
SDS-PAGE uses a highly cross-linked gel of polyacrylamide as the inert matrix
through which the proteins migrate. The gel is made by polymerization from
monomers; the pore size of the gel can be adjusted so that it is small enough to
retard the migration of the protein molecules.
SDS is a strong, negatively charged detergent that binds to the hydrophobic
regions of the protein molecules, causing them to unfold into extended
polypeptide chains. After treatment with SDS, the proteins, now in the form of
extended polypeptide chains, bind large numbers of the negatively charged
detergent molecules, which mask the protein’s intrinsic charge and cause it to
migrate toward the positive electrode when voltage is applied.
(β-mercaptoethanol is a reducing agent that can break disulfide bonds. It can be
used in combination with SDS to disrupt protein structure.)
Proteins of the same size tend to move through the gel at similar speeds
because
- their native structure is completely unfolded by the SDS, making them
identical in shape
- they bind the same amount of SDS and therefore carry the same amount of
negative charge
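Because migration in SDS-PAGE depends mainly on size, the molecular weight of an unknown band is usually estimated from marker proteins run on the same gel. The sketch below assumes the common approximation that log10(MW) falls roughly linearly with relative mobility (Rf, the band distance divided by the dye front distance) over the useful range of the gel. The marker values and function names are hypothetical.

```python
import math


def fit_calibration(markers):
    """Least-squares line log10(MW) = slope * Rf + intercept.

    `markers` is a list of (Rf, MW in kDa) pairs from a marker lane.
    """
    xs = [rf for rf, _ in markers]
    ys = [math.log10(mw) for _, mw in markers]
    n = len(markers)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def estimate_mw(rf, slope, intercept):
    """Estimated molecular weight (kDa) of a band with relative mobility `rf`."""
    return 10 ** (slope * rf + intercept)


# Hypothetical marker lane: (Rf, MW in kDa), for illustration only.
MARKERS = [(0.15, 97.0), (0.25, 66.0), (0.38, 45.0),
           (0.52, 31.0), (0.68, 21.5), (0.82, 14.4)]

slope, intercept = fit_calibration(MARKERS)
print(f"unknown band at Rf 0.45: about {estimate_mw(0.45, slope, intercept):.0f} kDa")
```

The estimate is only as good as the linear approximation, so bands outside the range spanned by the markers should be treated with caution.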
Two-dimensional Polyacrylamide-Gel Electrophoresis
2DGE combines two different separation procedures.
1) The proteins are separated by their intrinsic charges. The sample is
dissolved in a small volume of a solution containing a nonionic (uncharged)
detergent, together with β-mercaptoethanol and the denaturing reagent urea. The
denatured polypeptide chains are separated by isoelectric focusing, which uses
the fact that the net charge on a protein molecule varies with the pH of the
surrounding solution (at low pH proteins are net positive; at high pH they are
net negative). In isoelectric focusing, proteins are separated
electrophoretically in a narrow tube of polyacrylamide gel, an inert support in
which a stable pH gradient has previously been established by a mixture of
special buffers.
The anode region (positive charge) is at a lower pH than the cathode region
(negative charge), and the pH range is chosen such that the proteins to be
separated have their isoelectric points (pI) within this range. A protein that
is in a pH region below its pI will be positively charged and so will migrate
towards the cathode. As it migrates, however, the surrounding pH increases until
the protein reaches a pH equal to its pI. At this point it has no net charge and
so migration ceases. Should it overshoot this point, it will enter a region of
pH above its pI and so become negatively charged. It will then reverse its
direction of migration and migrate towards the anode. Proteins therefore become
focused into sharp stationary bands, with each protein positioned at the point
in the pH gradient corresponding to its pI. The technique is capable of
extremely high resolution, with proteins differing by a single charge being
fractionated into separate bands.
2) SDS-PAGE is applied in the second step as explained above, separating the
proteins by their molecular weight.
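The dependence of net charge on pH that drives the isoelectric focusing step can be sketched with the Henderson-Hasselbalch relation. The code below is a rough estimate only: it treats every ionizable group as independent, and the pKa values are approximate textbook figures chosen for illustration (real values shift with the local environment). The function names are hypothetical.

```python
# Approximate pKa values, assumed for illustration only.
POSITIVE_PKA = {"K": 10.5, "R": 12.5, "H": 6.0}            # carry +1 when protonated
NEGATIVE_PKA = {"D": 3.9, "E": 4.1, "C": 8.3, "Y": 10.1}   # carry -1 when deprotonated
N_TERM_PKA = 9.0
C_TERM_PKA = 3.1


def net_charge(sequence, ph):
    """Estimated net charge of a peptide (one-letter codes) at a given pH."""
    charge = 1.0 / (1.0 + 10 ** (ph - N_TERM_PKA))    # free amino terminus
    charge -= 1.0 / (1.0 + 10 ** (C_TERM_PKA - ph))   # free carboxyl terminus
    for aa in sequence:
        if aa in POSITIVE_PKA:
            charge += 1.0 / (1.0 + 10 ** (ph - POSITIVE_PKA[aa]))
        elif aa in NEGATIVE_PKA:
            charge -= 1.0 / (1.0 + 10 ** (NEGATIVE_PKA[aa] - ph))
    return charge


def isoelectric_point(sequence, lo=0.0, hi=14.0, tol=1e-4):
    """Find the pH of zero net charge by bisection.

    Works because the estimated net charge only decreases as pH rises.
    """
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if net_charge(sequence, mid) > 0:
            lo = mid   # still positive: the pI lies at a higher pH
        else:
            hi = mid
    return (lo + hi) / 2.0


peptide = "ACDKRH"  # hypothetical sequence
pi = isoelectric_point(peptide)
print(f"estimated pI of {peptide}: {pi:.2f}")
print(f"net charge one pH unit below pI: {net_charge(peptide, pi - 1):+.2f}")
print(f"net charge one pH unit above pI: {net_charge(peptide, pi + 1):+.2f}")
```

The sign change around the pI is what makes a protein reverse direction if it overshoots its focusing position.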
Protein fingerprinting
The protein of interest is digested with trypsin to generate a mixture of
polypeptide fragments, which is then fractionated in two dimensions by
electrophoresis and partition chromatography. Partition chromatography separates
peptides on the basis of their differential solubilities in water, which is
preferentially bound to the solid matrix.
Protein fingerprinting is normally used to detect post-translational
modification.
2. Protein Sequencing
Protein sequencing means determining the amino acid sequence of a protein or of
its constituent peptides. More broadly, it can also include determining what
conformation the protein adopts and whether it is complexed with any non-peptide
molecules.
The two major direct methods of protein sequencing are mass spectrometry and
the Edman degradation reaction. It is also possible to generate an amino acid
sequence from the DNA or mRNA sequence encoding the protein.
The Edman degradation is a very important reaction for protein sequencing,
because it allows the ordered amino acid composition of a protein to be
discovered. Automated Edman sequencers are now in widespread use, and are able
to sequence peptides up to approximately 50 amino acids long. A reaction scheme
for sequencing a protein by the Edman degradation follows:
1. Break any disulfide bridges in the protein by oxidising with performic acid.
2. Separate and purify the individual chains of the protein complex, if there is
more than one.
3. Determine the amino acid composition of each chain.
4. Determine the terminal amino acids of each chain.
5. Break each chain into fragments under 50 amino acids long.
6. Separate and purify the fragments.
7. Determine the sequence of each fragment.
8. Repeat with a different pattern of cleavage.
9. Construct the sequence of the overall protein.
Peptides longer than about 50-70 amino acids cannot be sequenced reliably by the
Edman degradation. Because of this, long protein chains need to be broken up
into small fragments, which can then be sequenced individually. Digestion is
done either by endopeptidases such as trypsin or pepsin, or by chemical reagents
such as cyanogen bromide. Different enzymes give different cleavage patterns,
and the overlap between fragments can be used to construct an overall sequence.
1) The enzyme trypsin cleaves proteins at the carboxyl side of lysine and
arginine (except when these two residues are followed by proline). Trypsin is an
endopeptidase, i.e., cleavage occurs within the polypeptide chain rather than at
the terminal amino acids located at the ends of polypeptides.
2) Pepsin is a digestive protease released by the chief cells in the stomach
that functions to degrade food proteins into peptides.
3) Cyanogen bromide cuts peptide bonds on the carboxyl side of methionine
residues.
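The cleavage rules for trypsin and cyanogen bromide listed above can be sketched as a small in silico digestion. This is a simplified model that assumes complete cleavage at every eligible site and ignores missed cleavages; the helper names and the example sequence are hypothetical.

```python
def cleave_after(sequence, residues, blocked_by_next=""):
    """Split `sequence` after each residue in `residues`.

    A site is skipped when the following residue is in `blocked_by_next`.
    """
    fragments, start = [], 0
    for i, aa in enumerate(sequence):
        nxt = sequence[i + 1] if i + 1 < len(sequence) else ""
        if aa in residues and not (nxt and nxt in blocked_by_next):
            fragments.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        fragments.append(sequence[start:])  # C-terminal remainder
    return fragments


def trypsin_digest(sequence):
    """Cleave on the carboxyl side of K or R, except before proline."""
    return cleave_after(sequence, "KR", blocked_by_next="P")


def cnbr_digest(sequence):
    """Cleave on the carboxyl side of methionine."""
    return cleave_after(sequence, "M")


protein = "MAKRPGLMKTR"  # hypothetical sequence, one-letter codes
print("trypsin:", trypsin_digest(protein))
print("CNBr:   ", cnbr_digest(protein))
```

Because the two reagents cut at different positions, their fragments overlap, which is what allows the fragment sequences to be ordered into one overall sequence.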
A) The Edman degradation reaction
Edman degradation is a method of sequencing the amino acids in a peptide. In
this method, the amino-terminal residue is labeled and cleaved from the peptide
without disrupting the peptide bonds between the other amino acid residues.
Phenyl isothiocyanate reacts with the uncharged terminal amino group to form a
phenylthiocarbamoyl derivative. Then, under mildly acidic conditions, this
derivative of the terminal amino acid is cleaved. The cleaved derivative, known
as a phenylthiohydantoin (PTH)-amino acid, can be identified by chromatography.
The procedure can then be repeated to identify the next amino acid. A major
drawback of this technique is that the peptides being sequenced cannot have more
than 50 to 60 residues. This is because the Edman degradation reaction is not
100% efficient, meaning that the cleavage step does not occur every time.
However, this problem can be resolved by cleaving large peptides into smaller
peptides before proceeding with the reaction.
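The length limit follows from simple compounding: if each cycle succeeds on only a fraction of the molecules, the fraction still "in register" after n cycles is that efficiency raised to the power n. The sketch below illustrates this; the 98% per-cycle efficiency and the 30% usable-signal threshold are assumed values for illustration, not measured figures.

```python
def remaining_fraction(efficiency, cycles):
    """Fraction of peptide molecules still in step after `cycles` cycles."""
    return efficiency ** cycles


def max_cycles(efficiency, min_fraction):
    """Largest number of cycles that keeps at least `min_fraction` in step."""
    if not 0.0 < efficiency < 1.0:
        raise ValueError("efficiency must be strictly between 0 and 1")
    cycles, fraction = 0, 1.0
    while fraction * efficiency >= min_fraction:
        fraction *= efficiency
        cycles += 1
    return cycles


ASSUMED_EFFICIENCY = 0.98  # hypothetical per-cycle efficiency

for n in (10, 30, 50, 70):
    print(f"after {n:2d} cycles: {remaining_fraction(ASSUMED_EFFICIENCY, n):.1%} in step")
print("cycles before signal drops below 30%:", max_cycles(ASSUMED_EFFICIENCY, 0.30))
```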
Limitations of the Edman degradation
Because the Edman degradation proceeds from the N-terminus of the protein, it
will not work if the N-terminal amino acid has been chemically modified or if it
is concealed within the body of the protein. It also requires the use of either
guesswork or a separate procedure to determine the positions of disulfide
bridges.
B) Mass spectrometry
The other major direct method by which the sequence of a
protein can be determined is mass spectrometry. This method has been gaining
popularity in recent years as new techniques and increasing computing power
have facilitated it. Peptides are also easier to prepare for mass spectrometry
than whole proteins, because they are more soluble.
The protein is digested by an endoprotease, and the
resulting solution is passed through a high pressure liquid chromatography
column. At the end of this column, the solution is sprayed out of a narrow
nozzle charged to a high positive potential into the mass spectrometer. The
charge on the droplets causes them to fragment until only single ions remain.
The peptides are then fragmented and the mass-charge ratios of the fragments
measured. (It is possible to detect which peaks correspond to multiply charged
fragments, because these will have auxiliary peaks corresponding to other
isotopes - the distance between these other peaks is inversely proportional to
the charge on the fragment). The mass spectrum is analysed by computer and
often compared against a database of previously sequenced proteins in order to
determine the sequences of the fragments. This process is then repeated with a
different digestion enzyme, and the overlaps in the sequences used to construct
a sequence for the protein.
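The relation between charge state and isotope peak spacing mentioned above can be made concrete. A peptide of neutral mass M carrying z extra protons appears at m/z = (M + z * m_p) / z, and neighbouring isotope peaks (differing by about one 13C-for-12C substitution, roughly 1.00335 Da) are therefore about 1.00335 / z apart. The function names below are hypothetical, and positive-ion mode with protonation is assumed.

```python
PROTON_MASS = 1.007276      # Da
ISOTOPE_SPACING = 1.00335   # Da, approximate 13C minus 12C mass difference


def mz(neutral_mass, charge):
    """m/z of a peptide carrying `charge` extra protons (positive-ion mode)."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return (neutral_mass + charge * PROTON_MASS) / charge


def charge_from_isotope_spacing(spacing):
    """Infer the charge state from the m/z gap between adjacent isotope peaks."""
    return round(ISOTOPE_SPACING / spacing)


peptide_mass = 1000.0  # hypothetical neutral monoisotopic mass in Da
for z in (1, 2, 3):
    print(f"z = {z}: m/z = {mz(peptide_mass, z):.4f}, "
          f"isotope spacing = {ISOTOPE_SPACING / z:.4f}")
```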
Mass spectrometry relies on the fact that different molecules have different
masses, which a mass spectrometer uses to determine what molecules are present
in a sample. For example, table salt (NaCl) is vaporized (turned into gas) and
ionized (broken down into electrically charged particles, called ions) in the
first part of the mass spectrometer. The sodium ions and chloride ions have
specific masses. They also have a charge, which means that they can be moved
under the influence of an electric field. These ions are then sent into an ion
acceleration chamber and passed through a slit in a metal sheet. A magnetic
field is applied to the chamber, which pulls on each ion equally and deflects
them (makes them curve instead of traveling straight) onto a detector. The
lighter ions deflect farther than the heavier ions, because the force on each
ion is equal but their masses are not (this follows from the equation F = ma,
which states that if the force remains the same, the mass and acceleration are
inversely proportional). The detector measures exactly how far each ion has been
deflected, and from this measurement the ion's 'mass to charge ratio' can be
worked out. From this information it is possible to determine with a high level
of certainty what the chemical composition of the original sample was.
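For the table salt example, the deflection can be put in numbers. An ion of mass m and charge q accelerated through a potential V gains kinetic energy qV = (1/2)mv^2, and in a magnetic field B it follows a circle of radius r = mv / (qB), which combines to r = sqrt(2mV / q) / B. The sketch below assumes this idealized picture; the accelerating voltage and field strength are hypothetical, and only the magnitude of the charge is used.

```python
import math

ATOMIC_MASS_UNIT = 1.66053906660e-27  # kg
ELEMENTARY_CHARGE = 1.602176634e-19   # C


def sector_radius(mass_u, charge_e, accel_volts, field_tesla):
    """Radius (m) of an ion's circular path in an idealized magnetic sector."""
    m = mass_u * ATOMIC_MASS_UNIT
    q = abs(charge_e) * ELEMENTARY_CHARGE
    return math.sqrt(2.0 * m * accel_volts / q) / field_tesla


# Hypothetical instrument settings, for illustration only.
ACCEL_VOLTS = 2000.0
FIELD_TESLA = 0.2

r_na = sector_radius(22.98977, 1, ACCEL_VOLTS, FIELD_TESLA)  # sodium-23 ion
r_cl = sector_radius(34.96885, 1, ACCEL_VOLTS, FIELD_TESLA)  # chlorine-35 ion

print(f"Na ion path radius: {r_na * 100:.1f} cm")
print(f"Cl ion path radius: {r_cl * 100:.1f} cm")
```

The lighter sodium ion follows the tighter circle, which is the sense in which lighter ions are deflected more.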
This example was of a sector instrument. There are many other types of mass
spectrometer that not only analyze the ions differently but also produce
different types of ions; however, they all use electric and magnetic fields to
change the path of ions in some way.
3. X-ray crystallography
X-ray crystallography is used to determine the three-dimensional structure of
the protein of interest. It is a technique in which the pattern produced by the
diffraction of X-rays through the closely spaced lattice of atoms in a crystal
is recorded and then analyzed to reveal the nature of that lattice. This
generally leads to an understanding of the material and molecular structure of a
substance. When X-rays hit an atom, they make its electron cloud move, as does
any electromagnetic wave. The movement of these charges re-radiates waves with
the same frequency. These re-emitted X-rays interfere constructively or
destructively; this is the diffraction phenomenon. The spacings in the crystal
lattice can then be determined using Bragg's law. The electrons that surround
the atoms, rather than the atomic nuclei themselves, are the entities that
physically interact with the incoming X-ray photons. This technique is widely
used in chemistry and biochemistry to determine the structures of an immense
variety of molecules. X-ray diffraction is commonly carried out using single
crystals of a material.
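Bragg's law relates the lattice spacing d to the X-ray wavelength λ and the angle θ at which constructive interference occurs: nλ = 2d sin θ, where n is the order of the reflection. The sketch below solves it for d. The wavelength used is the commonly quoted Cu Kα value of about 1.5418 Å, assumed here only as an example source; the function name is hypothetical.

```python
import math


def bragg_spacing(wavelength_angstrom, theta_deg, order=1):
    """Lattice spacing d (in angstrom) from Bragg's law, n * lambda = 2 d sin(theta).

    `theta_deg` is the angle between the incident beam and the lattice planes,
    i.e. half of the scattering angle 2-theta.
    """
    sin_theta = math.sin(math.radians(theta_deg))
    if sin_theta <= 0:
        raise ValueError("theta must be between 0 and 180 degrees, exclusive")
    return order * wavelength_angstrom / (2.0 * sin_theta)


CU_K_ALPHA = 1.5418  # angstrom, assumed X-ray source

for theta in (5.0, 10.0, 20.0, 30.0):
    print(f"theta = {theta:4.1f} deg -> d = {bragg_spacing(CU_K_ALPHA, theta):.3f} angstrom")
```

Reflections at larger angles correspond to smaller spacings, which is why high-angle data carry the fine detail of a structure.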
Crystallisation of proteins
In order to solve a protein crystal structure, you must first crystallise the
protein. This is because a single molecule in solution has insufficient
scattering power by itself. A crystal can be considered to be an (effectively)
infinite repeating array of the molecule of interest. Diffracted X-rays that are
in phase interfere constructively and reinforce each other, so that the
diffraction pattern becomes detectable. The geometric conditions where
diffraction occurs can be described in terms of
(a) the wavelength of the incident X-ray beam,
(b) the angle of diffraction for a given reflection,
(c) the unit cell and reciprocal unit cell of the crystal, and
(d) the distance between the crystal and the film.
4. Protein-Protein Interaction
A) Protein affinity chromatography is one method that can be used to isolate and
identify proteins that interact physically. To capture interacting proteins, the
target protein is attached to polymer beads that are packed into a column.
Cellular proteins are washed through the column, and those proteins that
interact with the target adhere to the affinity matrix (as explained above for
affinity chromatography). These proteins can then be eluted and their identity
determined by mass spectrometry.
B) Co-immunoprecipitation is another way of finding out whether two proteins
bind to each other.
The figure on the left shows the method used for traditional IP, whereas the
figure below shows the method for co-IP.
Immunoprecipitation, also referred to as "IP", is the technique of precipitating
an antigen out of solution using an antibody specific to that antigen. This
process can be used to identify protein complexes present in cell extracts by
targeting any one of the proteins believed to be in the complex. Insoluble
antibody-binding proteins isolated initially from bacteria, such as Protein A
and Protein G, are used to bring the antibody-antigen complexes out of solution.
These can also be coupled to sepharose beads that can easily be isolated from
solution. After washing, the precipitate can be analyzed using mass
spectrometry, western blotting, or any number of other methods for identifying
constituents in the complex.
Co-immunoprecipitation (Co-IP) is a popular technique for protein interaction
discovery. Co-IP is conducted in essentially the same manner as an IP. However,
in a co-IP, the target antigen precipitated by the antibody “co-precipitates” a
binding partner or protein complex from a lysate; i.e., the interacting protein
is bound to the target antigen, which is bound by the antibody, which in turn is
captured on the Protein A or G gel support.
Protein A/G: The gel support is prepared with a recombinant streptococcal
Protein A/G, which retains its high affinity for the Fc region of IgG and/or IgM
and lacks albumin and Fab binding sites and membrane-binding regions.
C) Purification of protein complexes using a GST-tagged fusion protein (the GST
pull-down assay) is another way to study protein-protein interactions.
The pull-down assay is an in vitro technique that uses a fusion-tagged "bait"
protein for which a binding partner ("prey") is being sought. The GST-tagged
bait protein is bound to an immobilized glutathione support. In a typical
pull-down assay, the immobilized bait protein is incubated with a protein pool
derived from a cell lysate. After the prescribed washing steps, the
“interactors” are selectively eluted by the addition of glutathione (for
competitive binding). The interacting proteins are then detected in-gel.