1. Protein Purification

Protein purification is a sequential process that uses biochemical methods to enrich or purify a single protein from a homogenate or extract. These methods exploit differences between proteins in size, physico-chemical properties and binding affinity. Homogenates typically comprise cell and tissue components, including DNA, cell membranes and other proteins.

Strategies

The first choice in the purification process is usually whether to purify the protein from a natural source or to express it in an expression system using recombinant DNA technology. If large amounts are needed, expressing the protein in an expression system is usually worth the effort. For many low-abundance proteins, recombinant expression is the only way to obtain the protein. Recombinant expression also allows the protein to be tagged, for example with a His-tag or a GST-tag, which facilitates purification and means that it can be done in fewer steps.

An analytical purification generally utilizes three properties to separate proteins. First, proteins may be separated according to their isoelectric points by running them through a pH-graded gel (isoelectric focusing) or an ion exchange column. Second, proteins can be separated according to their size or molecular weight by size exclusion chromatography or by SDS-PAGE (sodium dodecyl sulfate-polyacrylamide gel electrophoresis). Third, proteins can be separated by hydrophobicity, for example by high performance liquid chromatography. Proteins are often separated by 2D-PAGE and then analysed by peptide mass fingerprinting to establish their identity.

Evaluating purification yield

The most general method to monitor the purification process is to run an SDS-PAGE gel of samples from the different steps. This method gives only a rough measure of the amounts of the different proteins in the mixture, and it cannot distinguish between proteins of similar molecular weight.

If the protein has a distinguishing spectroscopic feature or an enzymatic activity, this property can be used to detect and quantify the specific protein, and thus to select the fractions of the separation that contain it. If antibodies against the target protein are available, western blotting and ELISA can specifically detect and quantify the desired protein.

In order to evaluate the process of multistep purification, the amount of the specific protein has to be compared to the amount of total protein. The latter can be determined by the Bradford total protein assay or by absorbance of light at 280 nm.
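The comparison of specific protein to total protein is conventionally summarized in a purification table. The following is a minimal sketch, assuming the specific protein is quantified by an activity assay in arbitrary units; the step names and all numbers are hypothetical, and the quantities computed (specific activity, yield, fold purification) are the usual bookkeeping terms rather than anything prescribed by the text above.

```python
from dataclasses import dataclass


@dataclass
class Step:
    name: str
    total_protein_mg: float      # e.g. from a Bradford assay or A280
    total_activity_units: float  # e.g. from an enzyme assay (arbitrary units)


def purification_table(steps):
    """Return one row per step, measured relative to the first step."""
    first = steps[0]
    base_specific = first.total_activity_units / first.total_protein_mg
    rows = []
    for step in steps:
        specific = step.total_activity_units / step.total_protein_mg
        rows.append({
            "step": step.name,
            "specific_activity": specific,  # units per mg total protein
            "yield_percent": 100.0 * step.total_activity_units
                             / first.total_activity_units,
            "fold_purification": specific / base_specific,
        })
    return rows


# Hypothetical measurements, for illustration only.
example_steps = [
    Step("crude extract", 1000.0, 2000.0),
    Step("ion exchange", 100.0, 1500.0),
    Step("affinity column", 5.0, 1000.0),
]

for row in purification_table(example_steps):
    print(f"{row['step']:<16} "
          f"specific activity {row['specific_activity']:8.1f}  "
          f"yield {row['yield_percent']:5.1f}%  "
          f"fold {row['fold_purification']:6.1f}")
```

A rising fold purification with a falling yield is the typical trade-off: each step removes contaminating protein but also loses some of the target.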

Purification of a tagged protein

Adding a tag to the protein confers on it a binding affinity that it would not otherwise have. Usually the recombinant protein is the only protein in the mixture with this affinity, so it can be separated on the basis of this property. The most common tag is the His-tag, which has affinity for nickel ions; by immobilizing nickel ions on a resin packed in a column, the tagged protein can be made to bind the resin. Since the protein is the only component with a His-tag, the other proteins pass through the column and leave the His-tagged protein bound to the resin. The protein is then released from the column by elution, which for a His-tag is achieved by flushing the column with imidazole, which competes with the His-tag for nickel binding.

Methods of protein purification

The methods used in protein purification can roughly be divided into analytical and preparative methods. The distinction is not exact, but the deciding factor is the amount of protein that can practically be purified with the method. Analytical methods aim to detect and identify a protein in a mixture, whereas preparative methods aim to produce a large amount of the protein for another purpose, such as structural biology or industrial use. In general, preparative methods can be used in analytical applications, but not the other way around.

- Separation on the basis of Size

Molecular Exclusion Chromatography: Also known as gel permeation or gel filtration chromatography, this type of chromatography lacks an attractive interaction (affinity) between the stationary phase and the solute. The liquid phase passes through a porous gel which separates the molecules according to their size. The pores are normally small and exclude the larger solute molecules, but allow smaller molecules to enter the gel, causing them to flow through a larger volume. The molecules that are small enough to enter the pores linger inside successive beads (the stationary phase) as they pass down the column, while larger molecules remain in the solution flowing between the beads and therefore move more rapidly, emerging from the column first.

- Separation on the basis of Physical Properties

Various other forms of chromatography can be used. The most common methods separate on the basis of charge or hydrophobicity.

Ion-Exchange Chromatography: Ion exchange chromatography is commonly used in the purification of biological materials. There are two types of exchange: cation exchange, in which the stationary phase carries a negative charge, and anion exchange, in which the stationary phase carries a positive charge. Charged molecules in the liquid phase travel through the column until they encounter an oppositely charged binding site on the stationary phase. Separation by this method is highly selective. Because the resins are fairly inexpensive and have high binding capacities, this method of separation is usually applied early in the overall process.

- Hydrophobicity

High Performance Liquid Chromatography (HPLC) is a form of column chromatography used frequently in biochemistry and analytical chemistry. The analyte is forced through a column of the stationary phase in a liquid (mobile phase) at high pressure, which decreases the time the separated components remain on the stationary phase and thus the time they have to diffuse within the column. This leads to narrower peaks in the resulting chromatogram and hence to better resolution and sensitivity.

Solvents used include any miscible combination of water or various organic liquids (the most common are methanol and acetonitrile). Water may contain buffers or salts to assist in the separation of the analyte components.

Advances in HPLC technology have brought about the use of gradients in the mobile phase composition. A normal gradient might be 5-50% methanol (depending on how hydrophobic the analyte is) over 25 minutes. The gradient separates the analyte mixtures as a function of how well the changing solvent mobilizes the analyte. For instance, using a water/methanol gradient, the more hydrophobic components will elute (come off the column) under conditions of relatively high methanol, whereas the more hydrophilic will elute under conditions of relatively low methanol.

- Separation on the basis of affinity

Affinity Chromatography: Affinity chromatography uses a packing that has been chemically modified by attaching a compound with a specific affinity for the desired molecules. The packing material, called the affinity matrix, must be inert and easily modified; agarose is the most common substance used. The ligands, or "affinity tails", attached to the matrix can be genetically engineered to possess a specific affinity. In a process similar to ion exchange chromatography, the desired molecules remain bound to the ligands on the matrix until a solution of high salt concentration is passed through the column. This causes desorption of the molecules from the ligands, and they elute from the column. Fouling of the matrix can occur when a large number of unpurified proteins are present; this type of chromatography is therefore usually implemented late in the process.

- Metal-binding

Polyhistidine-tags are often used for affinity purification of recombinant proteins that are expressed in Escherichia coli or other prokaryotic expression systems. The bacterial cells are harvested by centrifugation, and the resulting cell pellet can be lysed with detergents or with enzymes such as lysozyme. At this stage the raw lysate contains the recombinant protein among many other proteins derived from the bacteria, and it is incubated with an affinity medium. The affinity medium carries bound metal ions, either nickel or cobalt, to which the polyhistidine-tag binds. The resin is then washed with phosphate buffer to remove proteins that do not specifically interact with the cobalt or nickel ions. The protein can be eluted by the addition of imidazole, which competes with His-tagged proteins for binding to the metal ions. The purity and amount of protein can be assessed by SDS-PAGE and western blotting.

- Denaturing-Condition Electrophoresis

SDS Polyacrylamide-Gel electrophoresis

Proteins usually possess a net positive or negative charge, depending on the mixture of charged amino acids they contain. When an electric field is applied to a solution containing a protein molecule, the protein migrates at a rate that depends on its net charge and on its size and shape. This is called electrophoresis. SDS-PAGE uses a highly cross-linked gel of polyacrylamide as the inert matrix through which the proteins migrate. The gel is made by polymerization from monomers; the pore size of the gel can be adjusted so that it is small enough to retard the migration of the protein molecules.

SDS is a strong, negatively charged detergent that binds to the hydrophobic regions of protein molecules, causing them to unfold into extended polypeptide chains. On treatment with SDS, the proteins, now in the form of extended polypeptide chains, bind large numbers of the negatively charged detergent molecules, which mask the protein’s intrinsic charge and cause it to migrate toward the positive electrode when voltage is applied. (β-mercaptoethanol, a reducing agent rather than a detergent, breaks disulfide bonds and can be used in combination with SDS to disrupt protein structure.)

Proteins of the same size tend to move through the gel at similar speeds because:

- their native structure is completely unfolded by the SDS, making them identical in shape

- they bind the same amount of SDS and therefore carry the same amount of negative charge
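Because SDS-coated proteins separate essentially by size, a common practice (not described in the text above, so treat it as an assumption) is to run marker proteins of known mass alongside the sample and fit log10 of molecular weight against relative mobility (Rf). The sketch below does that fit with a plain least-squares line; the marker values are hypothetical and were chosen to be exactly log-linear so the result is easy to check by hand.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def estimate_mw_kda(marker_rf, marker_kda, sample_rf):
    """Estimate molecular weight (kDa) from relative mobility.

    Assumes log10(MW) is linear in Rf over the range covered by the markers.
    """
    log_mw = [math.log10(m) for m in marker_kda]
    slope, intercept = fit_line(marker_rf, log_mw)
    return 10 ** (slope * sample_rf + intercept)


# Hypothetical marker lane: relative mobility and known mass in kDa.
marker_rf = [0.2, 0.4, 0.6, 0.8]
marker_kda = [100.0, 50.0, 25.0, 12.5]

# A sample band halfway between the 50 kDa and 25 kDa markers.
print(f"{estimate_mw_kda(marker_rf, marker_kda, 0.5):.1f} kDa")
```

Real gels deviate from linearity at the extremes of the mass range, so an estimate is only as good as the markers that bracket the band.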

Two-dimensional Polyacrylamide-Gel Electrophoresis

2DGE combines two different separation procedures.

1) The proteins are separated by their intrinsic charges. The sample is dissolved in a small volume of a solution containing a nonionic (uncharged) detergent, together with β-mercaptoethanol and the denaturing reagent urea. The denatured polypeptide chains are then separated by isoelectric focusing, which exploits the fact that the net charge on a protein molecule varies with the pH of the surrounding solution (at low pH a protein carries a net positive charge, at high pH a net negative charge). In isoelectric focusing, proteins are separated electrophoretically in a narrow tube of polyacrylamide gel, an inert support in which a stable pH gradient has previously been established by a mixture of special buffers.

The anode region (positive charge) is at a lower pH than the cathode (negative charge), and the pH range is chosen such that the proteins to be separated have their isoelectric points (pI) within this range. A protein which is in a pH region below its pI will be positively charged and so will migrate towards the cathode. As it migrates, however, the surrounding pH increases and the protein's net positive charge decreases, until the protein reaches the pH that equals its pI. At this point it has no net charge and so migration ceases. Should it overshoot this point, it will enter a region of pH above its pI and so become negatively charged. It will then reverse its direction of migration and migrate towards the anode. Proteins therefore become focused into sharp stationary bands, with each protein positioned at the point in the pH gradient corresponding to its pI. The technique is capable of extremely high resolution, with proteins differing by a single charge being fractionated into separate bands.

2) SDS-PAGE is applied in the second step as explained above, separating the proteins by their molecular weight.
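The isoelectric focusing step in 1) relies on the net charge of a protein falling steadily as pH rises and crossing zero at the pI. The sketch below illustrates this with the Henderson-Hasselbalch relation for each ionizable group and a bisection search for the zero crossing. The pKa values are approximate and are an assumption (published tables differ slightly), and the model ignores effects of the local environment on each group, so it gives a rough estimate only.

```python
# Approximate pKa values; published tables differ, so treat these as assumptions.
POSITIVE_PKA = {"K": 10.5, "R": 12.5, "H": 6.0}           # protonated form is +1
NEGATIVE_PKA = {"D": 3.9, "E": 4.1, "C": 8.3, "Y": 10.1}  # deprotonated form is -1
N_TERMINUS_PKA = 9.0
C_TERMINUS_PKA = 2.3


def net_charge(sequence, ph):
    """Net charge of a peptide (one-letter codes) at the given pH."""
    positive = [N_TERMINUS_PKA] + [POSITIVE_PKA[aa] for aa in sequence
                                   if aa in POSITIVE_PKA]
    negative = [C_TERMINUS_PKA] + [NEGATIVE_PKA[aa] for aa in sequence
                                   if aa in NEGATIVE_PKA]
    charge = sum(1.0 / (1.0 + 10 ** (ph - pka)) for pka in positive)
    charge -= sum(1.0 / (1.0 + 10 ** (pka - ph)) for pka in negative)
    return charge


def isoelectric_point(sequence, tolerance=1e-4):
    """Find the pH at which the net charge is zero, by bisection on pH 0-14."""
    low, high = 0.0, 14.0
    while high - low > tolerance:
        mid = (low + high) / 2.0
        if net_charge(sequence, mid) > 0:
            low = mid    # still positively charged: the pI lies at higher pH
        else:
            high = mid   # negatively charged: the pI lies at lower pH
    return (low + high) / 2.0


# Hypothetical peptides, for illustration only.
for peptide in ("GKKRA", "GDDEA"):
    print(peptide, round(isoelectric_point(peptide), 2))
```

The bisection mirrors what happens in the gel: a positively charged protein moves toward higher pH, a negatively charged one toward lower pH, and both converge on the same point.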

Protein fingerprinting

The protein of interest is digested with trypsin to generate a mixture of polypeptide fragments, which is then fractionated in two dimensions by electrophoresis and partition chromatography. Partition chromatography separates peptides on the basis of their differing solubilities in water, which is preferentially bound to the solid matrix, relative to the mobile solvent.

Protein fingerprinting is normally used to detect post-translational modification.

 

2. Protein Sequencing

Protein sequencing means determining the amino acid sequences of a protein's constituent peptides. In a broader sense it also covers determining what conformation the protein adopts and whether it is complexed with any non-peptide molecules.

The two major direct methods of protein sequencing are mass spectrometry and the Edman degradation reaction. It is also possible to generate an amino acid sequence from the DNA or mRNA sequence encoding the protein.

A) Edman degradation

The Edman degradation is a very important reaction for protein sequencing, because it allows the ordered amino acid composition of a protein to be discovered. Automated Edman sequencers are now in widespread use, and are able to sequence peptides up to approximately 50 amino acids long. A reaction scheme for sequencing a protein by the Edman degradation follows:

1.  Break any disulfide bridges in the protein by oxidising with performic acid.

2.  Separate and purify the individual chains of the protein complex, if there is more than one.

3.  Determine the amino acid composition of each chain.

4.  Determine the terminal amino acids of each chain.

5.  Break each chain into fragments under 50 amino acids long.

6.  Separate and purify the fragments.

7.  Determine the sequence of each fragment.

8.  Repeat with a different pattern of cleavage.

9.  Construct the sequence of the overall protein.

Peptides longer than about 50-70 amino acids cannot be sequenced reliably by the Edman degradation. Because of this, long protein chains need to be broken up into small fragments which can then be sequenced individually. Digestion is done either by endopeptidases such as trypsin or pepsin or by chemical reagents such as cyanogen bromide. Different enzymes give different cleavage patterns, and the overlap between fragments can be used to construct an overall sequence.

Terminology

1) The enzyme trypsin cleaves proteins at the carboxyl side of lysine and arginine (except when these two residues are followed by proline). Trypsin is considered an endopeptidase, i.e., cleavage occurs within the polypeptide chain rather than at the terminal amino acids located at the ends of polypeptides.

2) Pepsin is a digestive protease released by the chief cells in the stomach that functions to degrade food proteins into peptides.

3) Cyanogen bromide cuts peptide bonds on the carboxyl side of methionine residues.
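The cleavage rules listed above can be tried out on a sequence in a few lines. This is a minimal sketch, assuming one-letter amino acid codes, complete digestion, and that cyanogen bromide cuts on the carboxyl side of methionine; the helper names and the example peptide are hypothetical.

```python
def cleave_after(sequence, residues, blocked_by=""):
    """Split `sequence` after each residue in `residues`.

    A site is skipped when the following residue is in `blocked_by`.
    No cut is made after the final residue.
    """
    fragments = []
    start = 0
    for i, aa in enumerate(sequence):
        is_last = i == len(sequence) - 1
        if aa in residues and not is_last and sequence[i + 1] not in blocked_by:
            fragments.append(sequence[start:i + 1])
            start = i + 1
    fragments.append(sequence[start:])
    return fragments


def trypsin_digest(sequence):
    """Cut on the carboxyl side of K and R, except when followed by P."""
    return cleave_after(sequence, "KR", blocked_by="P")


def cnbr_cleave(sequence):
    """Cut on the carboxyl side of M (cyanogen bromide)."""
    return cleave_after(sequence, "M")


# Hypothetical peptide, for illustration only.
peptide = "GAMKRPLKMWRAG"
print("trypsin:", trypsin_digest(peptide))
print("CNBr:   ", cnbr_cleave(peptide))
```

Because the two reagents cut at different positions, each CNBr fragment spans at least one trypsin cut site (and vice versa), which is what lets overlapping fragments be ordered into a full sequence.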

The Edman degradation reaction

Edman degradation is a method of sequencing amino acids in a peptide. In this method, the amino-terminal residue is labeled and cleaved from the peptide without disrupting the peptide bonds between the other amino acid residues. Phenyl isothiocyanate reacts with the uncharged terminal amino group to form a phenylthiocarbamoyl derivative. Then, under mildly acidic conditions, this derivative of the terminal amino acid is cleaved. The cleaved derivative is a phenylthiohydantoin (PTH)-amino acid, which can be identified by chromatography. The procedure can then be repeated to identify the next amino acid. A major drawback of this technique is that the peptides being sequenced cannot have more than 50 to 60 residues. This is because the Edman degradation reaction is not 100% efficient, meaning that the cleavage step does not occur every time. However, this problem can be resolved by cleaving large peptides into smaller peptides before proceeding with the reaction.
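The length limit follows from simple compounding: if each cycle succeeds with some efficiency below 1, the fraction of peptide molecules still in register after n cycles is that efficiency raised to the power n. The sketch below works this through, assuming a per-cycle efficiency of 98%; that figure is illustrative only and is not taken from the text.

```python
def cumulative_yield(per_cycle_efficiency, cycles):
    """Fraction of molecules that completed every one of `cycles` cycles."""
    return per_cycle_efficiency ** cycles


def max_cycles(per_cycle_efficiency, min_yield):
    """Largest cycle count whose cumulative yield is still >= min_yield."""
    if not 0.0 < per_cycle_efficiency < 1.0:
        raise ValueError("per_cycle_efficiency must be between 0 and 1")
    if not 0.0 < min_yield <= 1.0:
        raise ValueError("min_yield must be in (0, 1]")
    n = 0
    while cumulative_yield(per_cycle_efficiency, n + 1) >= min_yield:
        n += 1
    return n


EFFICIENCY = 0.98  # assumed per-cycle efficiency, for illustration only

for n in (10, 30, 50, 60):
    print(f"after {n:2d} cycles: {cumulative_yield(EFFICIENCY, n):.3f}")
print("cycles until yield drops below 50%:", max_cycles(EFFICIENCY, 0.5))
```

Under this assumption only about a third of the molecules are still in phase after 50 cycles, so the signal from the correct residue is increasingly buried under out-of-phase background.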

Limitations of the Edman degradation

Because the Edman degradation proceeds from the N-terminus of the protein, it will not work if the N-terminal amino acid has been chemically modified or if it is concealed within the body of the protein. It also requires the use of either guesswork or a separate procedure to determine the positions of disulfide bridges.

 

B) Mass spectrometry

The other major direct method by which the sequence of a protein can be determined is mass spectrometry. This method has been gaining popularity in recent years as new techniques and increasing computing power have facilitated it. Peptides are also easier to prepare for mass spectrometry than whole proteins, because they are more soluble.

The protein is digested by an endoprotease, and the resulting solution is passed through a high pressure liquid chromatography column. At the end of this column, the solution is sprayed out of a narrow nozzle charged to a high positive potential into the mass spectrometer. The charge on the droplets causes them to fragment until only single ions remain. The peptides are then fragmented and the mass-to-charge ratios of the fragments measured. (It is possible to detect which peaks correspond to multiply charged fragments, because these will have auxiliary peaks corresponding to other isotopes; the distance between these peaks is inversely proportional to the charge on the fragment.) The mass spectrum is analysed by computer and often compared against a database of previously sequenced proteins in order to determine the sequences of the fragments. This process is then repeated with a different digestion enzyme, and the overlaps in the sequences are used to construct a sequence for the protein.
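The remark about isotope peaks can be made concrete. For an ion carrying z charges, neighbouring isotope peaks differ by roughly one neutron-equivalent mass, so they sit about 1/z apart on the m/z axis. The sketch below assumes positive-mode ions formed by adding z protons and uses the 13C-12C mass difference as the peak spacing; the function names and the example peptide mass are hypothetical.

```python
PROTON_MASS = 1.007276   # Da
C13_C12_DIFF = 1.003355  # Da, mass difference between 13C and 12C


def mz(neutral_mass, charge):
    """m/z of a peptide of the given neutral mass carrying `charge` protons."""
    return (neutral_mass + charge * PROTON_MASS) / charge


def isotope_spacing(charge):
    """Expected m/z distance between adjacent isotope peaks."""
    return C13_C12_DIFF / charge


def charge_from_spacing(spacing):
    """Infer the charge state from the observed isotope peak spacing."""
    return round(C13_C12_DIFF / spacing)


def neutral_mass(observed_mz, charge):
    """Recover the neutral mass from an observed m/z and its charge state."""
    return charge * (observed_mz - PROTON_MASS)


# Hypothetical peptide of neutral monoisotopic mass 1000 Da.
for z in (1, 2, 3):
    print(f"z={z}: m/z {mz(1000.0, z):.4f}, "
          f"isotope spacing {isotope_spacing(z):.4f}")
```

Reading the spacing off the spectrum therefore gives the charge, and the charge converts the measured m/z back into a mass.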

Mass spectrometry works because different molecules have different masses, and a mass spectrometer uses this fact to determine what molecules are present in a sample. For example, table salt (NaCl) is vaporized (turned into gas) and broken down (ionized) into electrically charged particles, called ions, in the first part of the mass spectrometer. The sodium ions and chloride ions have specific masses. They also have a charge, which means that they will be moved under the influence of an electric field. These ions are then sent into an ion acceleration chamber and passed through a slit in a metal sheet. A magnetic field is applied to the chamber, which pulls on each ion equally and deflects them (makes them curve instead of traveling straight) onto a detector. The lighter ions deflect farther than the heavy ions, because the force on each ion is equal but their masses are not (this is derived from the equation F = ma, which states that if the force remains the same, the mass and acceleration are inversely proportional). The detector measures exactly how far each ion has been deflected, and from this measurement the ion's mass-to-charge ratio can be worked out. From this information it is possible to determine with a high level of certainty what the chemical composition of the original sample was.

This example describes a sector instrument. There are many other types of mass spectrometer, which differ both in how they analyze ions and in the types of ions they produce, but all of them use electric and magnetic fields to change the path of ions in some way.

 

3. X-ray crystallography

X-ray crystallography is used to determine the three-dimensional structure of the protein of interest. It is a technique in which the pattern produced by the diffraction of X-rays through the closely spaced lattice of atoms in a crystal is recorded and then analyzed to reveal the nature of that lattice. This generally leads to an understanding of the material and molecular structure of a substance. When X-rays hit an atom, they make its electron cloud oscillate, as does any electromagnetic wave. The movement of these charges re-radiates waves with the same frequency. These re-emitted X-rays interfere constructively or destructively; this is the diffraction phenomenon. The spacings in the crystal lattice can be determined from the diffraction angles using Bragg's law. The electrons that surround the atoms, rather than the atomic nuclei themselves, are the entities which physically interact with the incoming X-ray photons. This technique is widely used in chemistry and biochemistry to determine the structures of an immense variety of molecules. X-ray diffraction is commonly carried out using single crystals of a material.
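Bragg's law relates the lattice spacing d to the X-ray wavelength λ and the angle θ between the incident beam and the lattice planes: n λ = 2 d sin θ, where n is the order of the reflection. The sketch below rearranges it in both directions. The wavelength used in the example (about 1.5418 Å, copper K-alpha, a common laboratory source) is an assumption chosen for illustration, as are the function names.

```python
import math


def d_spacing(wavelength, theta_degrees, order=1):
    """Lattice spacing from Bragg's law: d = n * wavelength / (2 * sin(theta)).

    The result is in the same length unit as `wavelength`.
    """
    return order * wavelength / (2.0 * math.sin(math.radians(theta_degrees)))


def bragg_angle(wavelength, spacing, order=1):
    """Diffraction angle theta in degrees for a given lattice spacing."""
    ratio = order * wavelength / (2.0 * spacing)
    if ratio > 1.0:
        raise ValueError("no diffraction: n * wavelength exceeds 2 * d")
    return math.degrees(math.asin(ratio))


CU_K_ALPHA = 1.5418  # angstrom; assumed source wavelength for this example

for theta in (5.0, 15.0, 30.0):
    print(f"theta = {theta:4.1f} deg  ->  d = {d_spacing(CU_K_ALPHA, theta):.3f} A")
```

Small spacings diffract to large angles, which is why the reflections farthest from the beam centre carry the finest structural detail.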

Crystallisation of proteins

In order to solve a protein crystal structure, you must first crystallise the protein. This is because a single molecule in solution has insufficient scattering power by itself. A crystal can be considered to be an (effectively) infinite repeating array of the molecule of interest. Diffracted X-rays that are in phase interfere constructively and reinforce each other, so that the diffraction pattern becomes detectable. The geometric conditions under which diffraction occurs can be visualized using:

(a) the wavelength of the incident beams of light,

(b) the angle of diffraction for a given reflection,

(c) the unit cell and reciprocal unit cell of the crystal, and

(d) the distance between the crystal and the film.

 

4. Protein-Protein Interaction

A) Protein affinity chromatography is one method that can be used to isolate and identify proteins that interact physically. To capture interacting proteins, the target protein is attached to polymer beads that are packed into a column. Cellular proteins are washed through the column, and those that interact with the target adhere to the affinity matrix (as explained above for affinity chromatography). These proteins can then be eluted and their identity determined by mass spectrometry.

B) Co-immunoprecipitation is another way of finding out whether two proteins bind to each other.

The figure on the left shows the method used for traditional IP, while the figure below shows the method for co-IP.

Immunoprecipitation, also referred to as "IP", is the technique of precipitating an antigen out of solution using an antibody specific to that antigen. This process can be used to identify protein complexes present in cell extracts by targeting any one of the proteins believed to be in the complex. Insoluble antibody-binding proteins isolated initially from bacteria, such as Protein A and Protein G, are used to bring the antibody-antigen complexes out of solution. These can also be coupled to sepharose beads, which can easily be isolated from solution. After washing, the precipitate can be analyzed using mass spectrometry, western blotting, or any number of other methods for identifying the constituents of the complex.

Co-immunoprecipitation (Co-IP) is a popular technique for protein interaction discovery. Co-IP is conducted in essentially the same manner as an IP. However, in a co-IP, the target antigen precipitated by the antibody “co-precipitates” a binding partner/protein complex from a lysate, i.e., the interacting protein is bound to the target antigen, which becomes bound by the antibody that becomes captured on the Protein A or G gel support.

Protein A/G: Prepared with a recombinant streptococcal Protein A/G which retains its high affinity for the Fc region of IgG and/or IgM but lacks albumin-binding sites, Fab-binding sites and membrane-binding regions.

C) Purification of protein complexes using a GST-tagged fusion protein is another way to study protein-protein interaction (the GST pull-down assay).

The pull-down assay is an in vitro technique that uses a fusion-tagged "bait" protein to find a binding partner ("prey"). The GST-tagged bait protein is bound to an immobilized glutathione support. In a typical pull-down assay, the immobilized bait protein is exposed to a protein pool derived from a cell lysate. After the prescribed washing steps, the “interactors” are selectively eluted by the addition of glutathione (for competitive binding). The interacting proteins are then detected in-gel.