Genomics and Proteomics
© Florence Folmer, March 2001
The rapid development of biological research and of the biotechnology industry has led, in the last decade, to the advent of two closely related sub-disciplines of the biological sciences which, taken together, promise to reveal many secrets about human life and about life in general. These two research areas are genomics and proteomics.
Genomics is the study of the cloning and characterisation of whole genomes. A genome can be defined as the complete DNA sequence of an organism, including nuclear DNA, organelle DNA (such as mitochondrial DNA), plasmid DNA, and viral DNA. Genomics can be divided into two areas: structural genomics, which characterises the physical nature of whole genomes, and functional genomics, which characterises the overall patterns of gene expression.
Structural genomics consists of several steps that lead, eventually, to the physical map of the genome under study. The first step is the assignment of genes to chromosomes, using techniques that include linkage to standard markers, in situ hybridisation, pulsed-field gel electrophoresis, and human-rodent cell hybridisation. The arrangement of loci along a chromosome can then be determined by various types of meiotic recombination mapping. Restriction fragment length polymorphisms (RFLPs) and simple sequence length polymorphisms (SSLPs) provide heterozygous loci that can serve as molecular markers in mapping. The DNA fragments of the genome of interest are cloned in vectors such as cosmids, YACs (yeast artificial chromosomes), or BACs (bacterial artificial chromosomes), thereby producing sets of overlapping clones, called contigs, that encompass the entire genome or an entire chromosome of the genome. The contigs are positioned along the chromosomes by anchoring them with STS (sequence-tagged site) markers, or by fluorescence in situ hybridisation (FISH). The sequence of the genomic DNA is obtained by shotgun sequencing.
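The idea of joining overlapping fragments into a longer contiguous sequence can be illustrated with a small sketch. The code below is only a toy, assuming error-free reads from a single strand and a simple greedy merging rule; the function names (overlap, greedy_assemble) and the example reads are hypothetical, and real shotgun assemblers are considerably more sophisticated.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of `a` that equals a prefix of `b`.

    Returns 0 if no overlap of at least `min_len` bases exists.
    """
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads: list[str], min_len: int = 3) -> list[str]:
    """Repeatedly merge the two fragments with the largest overlap.

    Assumption: reads are error-free and come from the same strand.
    Reads fully contained inside another read are not treated specially.
    """
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while True:
        best_n, best_i, best_j = 0, -1, -1
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_n:
                        best_n, best_i, best_j = n, i, j
        if best_n == 0:
            return contigs  # nothing left to merge
        merged = contigs[best_i] + contigs[best_j][best_n:]
        contigs = [c for k, c in enumerate(contigs) if k not in (best_i, best_j)]
        contigs.append(merged)


if __name__ == "__main__":
    reads = ["ATGGCGT", "GCGTACG", "TACGTTA"]  # made-up example fragments
    print(greedy_assemble(reads))  # ['ATGGCGTACGTTA']
```

Fragments that share no sufficiently long overlap are returned unmerged, which mirrors the way gaps between contigs remain until further clones are sequenced.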
Functional genomics consists of the analysis of the large-scale sequence data obtained by structural genomics. In particular, it uses ORF (open reading frame) analysis, gene knockouts, and DNA microarrays to probe gene functions and gene interactions, and to determine which genes are actually transcribed (Griffiths et al., 1999). A third, recently established and rapidly expanding branch of genomics is pharmacogenomics ("the genomics of drug response" (March, 2000)). Pharmacogenomics focuses on the interplay between genotype and drug efficacy, and on the design of personalised medicine based on the individual patient's genotype. Indeed, variability in the human genome, including single nucleotide polymorphisms (SNPs), has been shown to affect the pharmacokinetics and pharmacodynamics associated with cytochrome P450 isoenzymes, dihydropyrimidine dehydrogenase, thiopurine methyltransferase, serotonin transporters, angiotensin-converting enzymes, and many other proteins involved in drug response (Rioux, 2000), and a better understanding of these effects may allow clinicians to find the right drug for the right patient (March, 2000). Genomes that have been completely sequenced to date include those of the bacterium Escherichia coli, the fruit fly Drosophila melanogaster, the baker's yeast Saccharomyces cerevisiae, the nematode Caenorhabditis elegans, the plant Arabidopsis thaliana, the mouse, and the human (Homo sapiens) (Griffiths et al., 1999).
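The ORF analysis mentioned above can be sketched in a few lines. This is a minimal illustration, assuming the simplest possible definition of an ORF (an ATG start codon followed by the first in-frame stop codon) and scanning only the forward strand; the function name find_orfs, the min_codons threshold, and the example sequence are hypothetical choices, not part of any cited method.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna: str, min_codons: int = 3) -> list[tuple[int, int, str]]:
    """Return (start, end, sequence) for each ORF on the forward strand.

    Assumption: an ORF starts at ATG and ends at the first in-frame stop
    codon. `min_codons` counts the codons before the stop codon.
    Coordinates are 0-based, with `end` exclusive.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):  # the three forward reading frames
        i = frame
        while i + 3 <= len(dna):
            if dna[i:i + 3] == "ATG":
                # Walk codon by codon until an in-frame stop is found.
                for j in range(i + 3, len(dna) - 2, 3):
                    if dna[j:j + 3] in STOP_CODONS:
                        if (j - i) // 3 >= min_codons:
                            orfs.append((i, j + 3, dna[i:j + 3]))
                        i = j  # resume scanning after this stop codon
                        break
            i += 3
    return orfs


if __name__ == "__main__":
    print(find_orfs("CCATGAAATTTGGGTAACC"))
    # [(2, 17, 'ATGAAATTTGGGTAA')]
```

A fuller analysis would also scan the reverse complement and apply a much larger minimum length, since short ORFs arise frequently by chance.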
Proteomics is the study of the proteome. The term proteome was introduced by Marc Wilkins in 1994, and it can be defined as the complete set of proteins encoded by the genome of a given organism (Neumann, 2001). Proteomics is very closely related to functional genomics, and some authors, including Stephens (2001), even consider it a subdivision of functional genomics, next to "cytomics", which is defined as the research area that encompasses the dynamics of the living cell. Proteomics focuses on the structure, the function, and the regulation of the proteins encoded by the genome. It investigates, for instance, posttranslational modifications (such as phosphorylation, glycosylation, acetylation, and sulphation), as well as the three-dimensional folding, degradation, and interactions of proteins. Furthermore, proteomics studies the (sub-)cellular or tissue-specific localisation of proteins, their relative abundances, and the changes in their abundance in response to stimuli, in order to identify their function within the whole organism (Giometti, 2001; Stephens, 2001). The procedures involved in proteomics include the extraction of the proteins from a given tissue, followed by their separation and quantitation by two-dimensional gel electrophoresis (isoelectric focusing followed by separation according to size on an SDS-PAGE gel). The proteins recovered from the two-dimensional gel are digested with sequence-specific proteolytic enzymes, and the resulting peptides are identified by mass spectrometry, by comparing them with peptides stored in databases (Giometti, 2001). DNA microarrays are used to monitor genome-wide transcriptional responses of cells at the mRNA level.
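The identification step described above (digest, weigh the peptides, compare with a database) can be sketched as follows. This is a simplified illustration under several assumptions that are not stated in the sources cited here: the enzyme is taken to be trypsin (cleaving after K or R, but not before P), masses are monoisotopic and unmodified, and the "database" is a made-up dictionary of two invented sequences. All function names are hypothetical.

```python
# Monoisotopic residue masses in daltons (standard values).
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # added once per peptide for the free N- and C-termini


def tryptic_digest(protein: str) -> list[str]:
    """Cleave after K or R unless the next residue is P (assumed trypsin rule)."""
    peptides, start = [], 0
    for i, aa in enumerate(protein):
        nxt = protein[i + 1] if i + 1 < len(protein) else ""
        if aa in "KR" and nxt != "P":
            peptides.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        peptides.append(protein[start:])
    return peptides


def peptide_mass(peptide: str) -> float:
    """Monoisotopic mass of an unmodified peptide."""
    return sum(MONO[aa] for aa in peptide) + WATER


def fingerprint_score(observed: list[float], protein: str, tol: float = 0.5) -> int:
    """Count observed masses that match a theoretical peptide within `tol` Da."""
    theoretical = [peptide_mass(p) for p in tryptic_digest(protein)]
    return sum(any(abs(m - t) <= tol for t in theoretical) for m in observed)


def best_match(observed: list[float], database: dict[str, str], tol: float = 0.5) -> str:
    """Name of the database entry whose digest explains the most observed masses."""
    return max(database, key=lambda name: fingerprint_score(observed, database[name], tol))


if __name__ == "__main__":
    database = {"protA": "AAKGGRPLKMM", "protB": "WWKYYRFF"}  # invented sequences
    observed = [peptide_mass(p) for p in tryptic_digest(database["protA"])]
    print(tryptic_digest(database["protA"]))  # ['AAK', 'GGRPLK', 'MM']
    print(best_match(observed, database))     # protA
```

Counting matched masses is the crudest possible score; it is used here only to show why a set of peptide masses can act as a fingerprint for a protein.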
Edman sequence tags, protein crystallisation combined with X-ray crystallography, and powerful bioinformatics software can be used to characterise the proteins and to determine their amino acid sequence and their posttranslational modifications. X-ray crystallography is also used to track the mechanisms by which proteins interact with each other or with their environment. Studies based on X-ray crystallography have revealed, for instance, how cholera toxin interacts with the epithelial cells of the bowel lining, and this understanding has been successfully applied in the design of cholera treatments. In 2000, one hundred genome-derived protein structures were determined at Argonne's National Structural Biology Centre in the United States of America, including the structure of CD4 T-helper cell receptors, of the cholera toxin produced by the bacterium Vibrio cholerae, and of the detoxifying enzyme cyanase found in bacteria (Fields, 2001).
Both genomics and proteomics have advantages and disadvantages. Those of the former approach, in particular, are discussed in the following section.
From a practical point of view, genomics has the advantage of being less complex than proteomics, the genome being more "static" and also significantly smaller than the proteome (on average, a single gene encodes three different proteins (Fields, 2001)). Genomics has become a relatively straightforward method, with the advent of extremely powerful biotechnological tools tailored to genomic studies, and with the routine sequencing carried out in the Human Genome Project (Fields, 2001). In terms of applications, genomics has numerous advantages. It can be applied in pharmacology, toxicology, medical diagnostics, agriculture, environmental sciences, and many other fields, ranging all the way to sociology. In pharmacology, genomics can be used to develop innovative and powerful drugs, such as treatments for cancer, cholera, malaria, or AIDS. Genomics makes it possible to identify new targets for drugs, as well as new sources of drugs. In medicine, a complete understanding of the human genome can be used to diagnose and to treat genetic diseases, some 5000 of which have been identified so far, including cystic fibrosis, phenylketonuria, adrenoleukodystrophy, and chromosome 21-related diseases such as Down syndrome, epilepsy, Alzheimer's disease, and Lou Gehrig's disease. The comparison of a patient's genes with the human genome may lead to the diagnosis of a disease at a very early stage, thereby allowing the early application of treatments or diets, or early warnings of potential health risks. In agriculture, genomics can be used to control and to improve the quality of crops. Knowledge derived from microbial genomics can be used in the production of genetically modified food. The comparison of the human genome with the genomes of other species may redefine the place of humans in nature, and lead humans to show more respect for their environment.
Human genomics may also allow us to identify the genetic prerequisites that make us different from all other animals, especially with regard to cultural evolution and geographic expansion. Last but not least, genomics may help human society to overcome racial problems by showing that people who look very different from each other may have sets of genes that are more closely related than those of people who, superficially, look very much alike (Fields, 2001).
The main disadvantage of genomics is that, for all its efforts, it cannot reveal everything about life. The genome is static, while proteins, which represent the actual cellular state, have to adapt dynamically to the environment. Proteins can move through the body and interact with other proteins to trigger a reaction. Hence, for any genome, there is an effectively infinite number of proteomes, each representing a snapshot of the cellular situation. A single gene can encode multiple different proteins, either by alternative splicing of the mRNA transcript, by varying the translational start and stop sites, or by frameshifting. Posttranslational events, such as phosphorylation, glycosylation, degradation, and protein folding, are vital for enzymatic activity, cell signalling, cellular interactions, and immune responses, but they cannot be captured by genomics. It is often the structure of proteins, including protein-based drugs, that determines their activity (Fields, 2001). Only proteomics can reveal the efficacy of a new treatment or its distribution in the body, tissues, or cells, and determine optimal concentrations, toxic effects, and so on (Stephens, 2001). From a sociological point of view, the genomic approach (and proteomics as well) has the disadvantage that, by trying to decrypt all the secrets of nature, we may end up reducing the richness and the diversity of life, for example by preventing the development of savants in our society as a consequence of early-diagnosed autism. Genomics will most probably also raise problems in the fields of legislation, health insurance, and patenting rights in the near future (Fields, 2001).
Fields, S. 2001. Proteomics in genomeland. Science, 291:1221-1224.
Griffiths, A., Gelbart, W., Miller, J., Lewontin, R. 1999. Modern genetic analysis. New York: W.H. Freeman & Co., pp. 373-412.
Giometti, C.S. 2001. Argonne Functional Genomics Link (http://www.ipd.anl.gov) (last visited on 13/03/01).
March, R. 2000. Pharmacogenomics: The genomics of drug response. Yeast, 17:16-21.
Neumann, T. 2001. Adventis Research & Development, Department of Biological Chemistry & Proteomics website (http://www.proteomics.com/intro.htm) (last visited on 13/03/01).
Rioux, P. 2000. Clinical trials in pharmacogenetics and pharmacogenomics: Methods and applications. American Journal of Health-System Pharmacy, 57:887-898.
Stephens, F. 2001. Argonne Functional Genomics Link (http://www.ipd.anl.gov) (last visited on 13/03/01).