I am very sorry for posting this note so late.

 

I am responsible for the retrovirus note. The retrovirus book is about 800 pages long, but I think we are only responsible for the first 7 chapters, which are only 300 pages.

 

Even 300 pages are ridiculous to read through, so I will only mention the important points and/or processes I think we will need to learn.

 

Remember you can access the book online at any time if you want to know all the painful details.


Models for retroviral structure. (Top) The immature HIV-1 virion. The Gag and Gag-Pol proteins are shown with different colors to suggest the domains corresponding to the mature proteins formed from these precursors. The SU and TM components of Env are shown jutting out from the lipid membrane, as are HLA host proteins selectively incorporated into HIV particles. (Bottom) The mature HIV-1 virion. The major Gag and Pol proteins and the cone-shaped core characteristic of viruses of this genus are shown. Vpr is an HIV accessory protein.

 

Models for Virion Structure

The mature virion is composed of "shells" of individual Gag proteins: MA, which forms the outer shell, lies just underneath the lipid membrane and makes contact with it via the amino-terminal myristylated and positively charged segment. The myristate moiety is presumably buried in the lipid. Jutting through the membrane is the TM component of Env, with its internal domain contacting MA in an unspecified way. The external portion of TM is bound to the SU component of Env. The SU/TM complex is shown as a dimer to indicate an oligomeric status. Farther inside the MA layer is a shell of CA protein that forms the capsid, defined as the outer layer of the core. The shape of the core is conical in HIV, but cores from other retroviruses may have different shapes. In the center of the core is the complex of NC protein and RNA. Associated with this complex are smaller numbers of RT and IN molecules. Both of these proteins bind to nucleic acids and thus may be expected to be in contact with the ribonucleoprotein complex. The third enzyme in virions, PR, is shown because as a mature protein it is a dimer. The location of PR in virions is not known, but since PR must have had access to all the cleavage sites on the Gag and Gag-Pol proteins before maturation, some PR molecules are shown inside the core and some outside. Finally, the model shows several small peptides (short bars) that are derived from the spacer regions between CA and NC, between NC and p6, and amino-terminal to PR. Not drawn in the model are tRNAs or other small host RNAs. Some host-cell proteins are shown incorporated in the viral membrane, as demonstrated for the MHC class I proteins for HIV-1.

The immature virion is composed of exactly the same polypeptide sequences. However, before processing has occurred, its structure is quite different. Only three types of proteins make up the particle, in addition to the lipid envelope and the RNA. Env is the same as in the mature virion, although its contacts with Gag underneath the membrane may be different. Gag is shown as an elongated tooth-like object. The MA domain with its myristate abuts the membrane, consistent both with its ability to be crosslinked to lipid in immature MLV and with the expectation that the myristate would interact with the lipid environment of the membrane. The NC domain at the other end of the molecule is rendered as a bicuspate shape, reflecting the presence of two Cys-His motifs and two basic RNA-binding sites. The RNA in the center of the virion is not drawn in a distinct way. Presumably, it is bound to many of the NC domains, but perhaps not to all of them. CA is drawn not as a ball but as an elongated connector between MA and NC to suggest possible conformational changes that might occur upon proteolytic processing. The p6 domain is shown arbitrarily as bending back to contact CA. Gag-Pol is rendered as a dimer. Although there is no direct evidence for the multimeric status of Gag-Pol in immature virions, as mature proteins, IN, RT, and PR are dimeric. The Gag-Pol protein that is their precursor is likely to be dimeric as well. In particular, the PR domains must dimerize to initiate proteolytic processing.

 

Genetic organization of generalized provirus. The proviral DNA as it is inserted into host DNA is shown at the top, with the long terminal repeats (LTRs) composed of U3, R, and U5 elements at each end abutting cellular sequences. Sequences in the LTR that are important for transcription, for example, enhancers, the promoter, and the poly(A) addition signal, are marked. The gag, pro, pol, and env sequences are located invariably in the positions shown in all retroviruses. Accessory genes are located as shown, and also overlapping env and U3 and each other, and occasionally in other locations. The RNA that is the primary transcriptional product is shown on the second line. Sequences that are important for replication and gene expression are shown in the approximate locations in which they are typically found. (PBS) Primer-binding site; (ψ) encapsidation sequence; (SD) splice donor site; (FS) frameshift site; (SA) splice acceptor site; (PPT) polypurine tract; (PA) polyadenylation signal; (AAA) poly(A) tail. The spliced messenger RNA for the Env protein is shown on the third line. Retroviruses with accessory genes have other spliced mRNAs and thus other splice donor and splice acceptor sites as well.

The primary transcript of retroviral DNA is modified in several ways and closely resembles a cellular mRNA. It is "capped" at its 5′ end (bearing a methylated GDP attached to the first encoded nucleotide by a 5′-5′ linkage), polyadenylated at its 3′ end (bearing a poly[A] "tail" of about 50-200 noncoding A residues), and methylated at several specific sites internally. Some of the primary transcripts are spliced to give subgenomic mRNAs. Unlike most cellular mRNAs, in which all introns are efficiently spliced out, newly synthesized retroviral RNA must be diverted into two populations. One population remains unspliced, to serve as the genomic RNA and as mRNA for gag and pol. The other population is spliced, fusing the 5′ portion of the genomic RNA to the downstream genes, most commonly env. The intron between the splice donor and splice acceptor sites (SD and SA) that is removed by splicing contains the gag, pro, and pol genes. This splicing event creates the mRNA for envelope protein.

Translation of the retroviral pro and pol genes is controlled by cis-acting sequences around the gag-pro and pro-pol borders. In each case, a small fraction of the ribosomes translating gag continues on to translate the downstream genes, thereby generating a fusion protein with Gag at its amino terminus. In most retroviruses, the gag termination codon is bypassed by frameshifting: Ribosomes stall and then shift their reading frame back one nucleotide before continuing into the downstream gene. In both types of readthrough, essential consensus secondary structures have been identified. In retroviruses in which the pro gene is in a reading frame by itself (including M-PMV, MMTV, and HTLV), there are two frameshift signals, one before pro and the other before pol. In this case, frameshifting at each site is very efficient (up to 30%) to ensure that enough of the Gag-Pro-Pol protein is made after the two serial shifts. Frameshifting and readthrough apparently have evolved as simple strategies to provide the proper ratios of Gag, Gag-Pro, and Gag-Pro-Pol polypeptides in the infected cell. A consequence of this strategy is that the enzymatic proteins required by the virus are fused to the Gag polyprotein, which provides a direct way to incorporate enzymes into the virion during assembly.
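As a rough worked example of the "two serial shifts" arithmetic, the sketch below computes the expected share of each polyprotein when ribosomes frameshift with a fixed probability at the gag-pro border and again at the pro-pol border. The 30% value is the upper figure quoted above; the function name and the assumption that the two shifts are independent are mine, for illustration only.

```python
def polyprotein_fractions(p_first: float, p_second: float) -> dict:
    """Expected share of translation events ending as each product.

    p_first  -- probability of frameshifting at the gag-pro border
    p_second -- probability of frameshifting at the pro-pol border
    Assumes (for illustration) that the two shifts are independent.
    """
    gag = 1.0 - p_first                     # no shift: stop at the end of gag
    gag_pro = p_first * (1.0 - p_second)    # one shift: stop at the end of pro
    gag_pro_pol = p_first * p_second        # shifted at both sites
    return {"Gag": gag, "Gag-Pro": gag_pro, "Gag-Pro-Pol": gag_pro_pol}


fractions = polyprotein_fractions(0.30, 0.30)
for name, value in fractions.items():
    print(f"{name}: {value:.0%}")
```

With 30% at both sites, only about 9% of translation events go all the way to Gag-Pro-Pol, which is why each individual shift has to be fairly efficient.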

In viral assembly, the genomic RNA is identified as an RNA to be packaged by virtue of a complex sequence in the leader region, called ψ (from the original work on MLV) or E (for encapsidation, in the spleen necrosis virus [SNV] system). It is not known how retroviruses incorporate two genomic RNAs. In the mature virion, these two molecules are believed to be held together primarily by a sequence called the "dimer linkage," which is in the leader region or in some cases in gag. The location of the dimer linkage is based both on early work in which purified virion RNA was partially denatured and then examined by electron microscopy and on the dimerization of in-vitro-synthesized RNAs. Among the small RNAs included in retroviral virions is the primer for initiation of reverse transcription, a specific cellular tRNA discovered in the initial experiments on reverse transcriptase in permeabilized virions. In all retroviruses, the primer tRNA is associated with the genome by base pairing between its 3′-terminal 18 nucleotides and the complementary "primer-binding site" (PBS) sequence on the genome. Retroviruses have evolved to use one of several different tRNAs as primers. Just as a tRNA is used as a primer for synthesis of the first DNA strand (i.e., "minus" strand), an RNA fragment derived from the genome itself is used as the primer for the second DNA strand (i.e., "plus" strand). This RNA is the polypurine tract (PPT), the sequence immediately preceding U3. During reverse transcription, the PPT sequence survives the RNase H activity of the reverse transcriptase, remaining to be used as a primer. Some retroviruses have a second PPT-like sequence, positioned near the middle of the genome, that can serve as a second site for initiation of plus-strand synthesis. Although proper reverse transcription requires PBS, PPT, and R sequences and virion proteins, it can occur normally in the complete absence of gag, pro, pol, and env genes. This is the basis of retroviral vectors, which are constructed to deliver genes of choice into cells infected in culture or in animals or people.

 

 

Genetic organization of prototypic retroviruses. Prototype examples from the several genera of retroviruses are shown. An open rectangle indicates the open reading frame for the gene marked. If the rectangles are offset vertically, their reading frames are different. Horizontal lines connecting two rectangles indicate that this segment is spliced out. (ALV) Avian leukemia virus; (MLV) murine leukemia virus; (HIV) human immunodeficiency virus type 1; (M-PMV) Mason-Pfizer monkey virus; (MMTV) mouse mammary tumor virus; (HTLV) human T-cell leukemia virus; (WDSV) walleye dermal sarcoma virus; (HFV) human foamy virus.

The Gag protein is the precursor to the internal structural protein of all retroviruses. Expression of gag alone leads to assembly of immature virus-like particles that bud from the plasma membrane. In virion assembly, Gag proteins must interact with each other, with components in the plasma membrane, with the genomic RNA, and probably with Env proteins and with cellular proteins as well. Fundamental to understanding the function of Gag is the fact that this protein is organized into regions, which are proteolytically liberated as the separate mature Gag proteins during viral maturation. Because proteolytic cleavage occurs late in assembly, during or after the last stages of budding, virions contain equimolar mixtures of the mature proteins. All Gag proteins are organized in the same order from the amino terminus to the carboxyl terminus, with domains that are cleaved into the following proteins: (NH2)-MA-X-CA-NC-Y-(COOH). X and Y represent segments that each may be cleaved into one or more small proteins or peptides or may be absent altogether. Thus, the "minimal" Gag protein is the unit MA-CA-NC. Examples of the structural organization of Gag proteins of prototypic retroviruses are given in the figure below.

Organization of Gag proteins. Schematic representations of Gag proteins are drawn for examples from each retroviral genus. Vertical solid lines mark cleavage sites for the viral protease. The sequences representing the mature proteins MA, CA, NC, and PR are indicated, along with the older naming of these proteins based on their approximate molecular weight. (Ac) Acetylation at the amino terminus; (Myr) myristylation at the amino terminus.

MA Protein

In all retroviruses, the amino-terminal domain of Gag gives rise to the MA protein (membrane-associated, or matrix). The finding that most Gag proteins are modified by myristylation at their amino termini provided a major clue to MA function. Myristate is a 14-carbon fatty acid that is added cotranslationally to many cellular proteins associated with membranes and also to some proteins that remain. The consensus sequence for myristylation is Met-Gly-X-X..Ser/Thr. After the initiating methionyl residue is removed, the fatty acid is linked via an amide bond to the free amino group of the glycyl residue. Prevention of myristylation of the M-PMV Gag protein does not prevent formation of the immature particles in the cytoplasm, but rather prevents their transport to, or stable association with, the plasma membrane.

MA proteins of at least some viruses can bind RNA in vitro. In HIV-1, MA has been found to accompany the newly synthesized viral DNA into the nucleus and, along with Vpr, may be a factor that directs migration of the preintegration complex. There is a nuclear localization signal in HIV MA, mapping to the same highly basic stretch of amino acids near the amino terminus that has been inferred to be important for membrane interaction.

CA Protein

The exact structural function of CA in the mature viral particle has not been elucidated, but the protein is believed to form a shell surrounding the ribonucleoprotein complex that contains the genomic RNA. The capsid together with the components it encloses are then referred to as the "core." These two terms frequently have been used interchangeably, but such usage promotes confusion between the proteins that form the shell and the proteins and RNA inside the shell. The biological function of the capsid shell is not known.

NC Protein

The nucleocapsid (NC) protein is a small basic protein, typically about 60-90 amino acid residues long. This protein is tightly bound to the genomic RNA. In all retroviruses except those of the spumavirus group, NC has one or two characteristic motifs made of regularly spaced cysteine and histidine residues. The retroviral Cys-His motif has the structure CX2CX4HX4C (here abbreviated CCHC), where most of the residues designated by Xs are not conserved either among retroviruses or between the two motifs of a single NC. An exception is the common placement of an aromatic residue between the first two C residues. The CCHC motif is similar to other short cysteine- and histidine-containing structures, called "zinc fingers," that coordinate a Zn++ ion and that have a role in binding of certain proteins to nucleic acids. Indeed, NC has been shown to bind Zn++ ions tightly both in vitro and in virions. Typically, clusters of lysine or arginine residues follow the CCHC motifs. Deletions or major alterations of the CCHC result in the absence of viral RNA in virions or alterations of the specificity of RNA packaging. Thus, this NC motif probably interacts with the "packaging sequences" near the 5′ end of retroviral genomic RNAs when it is still part of the Gag
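To make the CX2CX4HX4C spacing concrete, here is a minimal sketch that scans a one-letter protein sequence for that pattern with a regular expression. The function name and the test sequence are invented for illustration; the sequence is not a real NC protein, and a match only means the spacing fits, not that the site binds zinc.

```python
import re

# CX2CX4HX4C in one-letter code:
# Cys, 2 any residues, Cys, 4 any, His, 4 any, Cys.
CCHC = re.compile(r"C.{2}C.{4}H.{4}C")


def find_cchc(protein: str) -> list:
    """Return (0-based start, matched text) for each non-overlapping motif."""
    return [(m.start(), m.group()) for m in CCHC.finditer(protein.upper())]


# Invented toy sequence with two motifs, each followed by basic residues.
toy_nc = "GSCFNCQQQQHQQQQCKRRGSCWKCAAAAHAAAACKK"
for start, motif in find_cchc(toy_nc):
    print(start, motif)
```

In the toy sequence an aromatic residue (F or W) sits between the first two cysteines of each motif, matching the placement described above.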

Other Gag Proteins and Peptides

In addition to the proteins discussed above, many retroviral gag genes encode polypeptide segments that lie between MA and CA, between CA and NC, and/or downstream from NC. In most cases, the functions of these segments are poorly understood.

In HIV-1 and other lentiviruses, a polypeptide of approximately 60 amino acids is cleaved from the Gag protein downstream from NC in a region partially overlapping the pro reading frame. This "p6" domain appears to have a role in release of virus in the final steps of budding and in incorporation of the Vpr and Vpx proteins into the virion. Viral particles from mutants with p6 deleted or altered remain tethered to the plasma membrane.

Proteins Derived from pol and pro

All infectious retroviruses carry three enzymes: reverse transcriptase (RT), integrase (IN), and protease (PR). The RT protein also contains an additional enzymatic activity, RNase H, which has been mapped to a separate, contiguous portion of the polypeptide, and the conventional designation "RT" always implies the protein with both reverse transcriptase and RNase H activities. The enzymes form domains on the Gag-Pro or Gag-Pro-Pol precursor polypeptide. In most genera, all enzymes are translated together as a Gag-Pro-Pol precursor, which is processed late in assembly to yield the mature forms of the enzymes. Whether expression of pro and pol is by frameshifting or termination suppression, approximately 5% as much RT and IN on a molecular basis is synthesized and packaged into a virion as Gag protein.

Many viruses encode proteases, which typically have roles in processing of the primary translation product and maturation of the viral particle. Retroviral proteases are homodimers, with each subunit corresponding to approximately half of the cellular enzyme. As a consequence, dimerization is crucial for enzymatic activity and is likely to be involved in the regulation of proteolysis of Gag and Pol proteins, and therefore essential for proper virion formation. Premature activation of PR in the infected cell leads to premature cleavage of Gag, thus aborting assembly.

Proteins Derived from env

Like all animal viruses that carry a lipid envelope, the surface of retroviral virions is studded with glycoproteins (envelope or Env proteins), whose function is to mediate the adsorption to and the penetration of host cells susceptible to infection. All retroviruses contain two different types of Env proteins, now called SU and TM, that are derived from a common precursor polypeptide. Like cellular proteins destined for secretion, the nascent Env polypeptide binds to a signal recognition particle via its amino-terminal leader segment and then becomes associated with the membrane of the endoplasmic reticulum (ER). There, further translation extrudes most of the polypeptide through the membrane into the lumen of the ER. The protein remains anchored in the membrane by a hydrophobic segment near the carboxyl terminus that spans the membrane once, leaving the carboxy-terminal "tail" of Env in the cytoplasmic compartment. Once in the ER, Env forms the oligomer found in virions, a trimer in the case of ASLV and a multimer of inadequately determined size in HIV. After cleavage of the leader sequence, Env is transported by vesicular traffic through the Golgi apparatus to the plasma membrane, in the process becoming N-glycosylated at the consensus sequences Asn-X-Ser or Asn-X-Thr. Env is cleaved while in the Golgi by a cellular protease, either furin or a related enzyme, to yield the mature SU and TM found in virions. Although uncleaved Env proteins are able to bind to the receptor, the cleavage event is necessary to activate the fusion potential of the protein, which is required for entry of the virus into the host cell. SU and TM remain attached to each other by noncovalent interactions, and in some
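The Asn-X-Ser / Asn-X-Thr rule can be applied directly to a sequence. The sketch below lists candidate N-glycosylation positions in a one-letter protein string, exactly as the consensus is stated above. The function name and test sequence are made up for illustration. Many prediction tools additionally exclude proline at the X position; that refinement is not stated in these notes, so it is left out here.

```python
import re

# Asn-X-Ser/Thr in one-letter code. The lookahead makes each match
# zero-width, so overlapping sites (e.g. "NNTS") are all reported.
SEQUON = re.compile(r"(?=(N.[ST]))")


def n_glyc_sites(protein: str) -> list:
    """Return 1-based positions of Asn residues that start an N-X-S/T site."""
    return [m.start() + 1 for m in SEQUON.finditer(protein.upper())]


toy_env = "MANGSLLNNTSQQNPT"   # invented sequence, not a real Env protein
print(n_glyc_sites(toy_env))
```

A hit only marks a candidate site; whether it is actually glycosylated depends on the protein passing through the ER and Golgi as described above.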

Once at the plasma membrane, the SU/TM oligomers are incorporated into the budding viral particle. The cytoplasmic "tail" distal to the membrane-spanning segment of TM remains on the internal side of the viral membrane. It has been suggested that the cytoplasmic tail of TM is in contact with the MA domain of the Gag protein in virions, but the nature of this contact is unknown. Analogy suggests that retroviral Env proteins also are likely to be incorporated into virions by this means. The Env protein is the primary determinant of the type of cell that a retrovirus can infect, because it recognizes the cell surface protein that is the viral receptor. All enveloped viruses have glycoproteins that bind specifically to receptors on the host-cell membrane. In some cases, these receptors are common and in others, they are rare. An example of the former is sialic acid, found on numerous cell surface proteins, which is recognized by influenza hemagglutinin. Several retroviral receptors have been identified, the best studied of which is the CD4 protein found on helper T lymphocytes and macrophages. CD4 is required for binding and entry of HIV although this virus also requires a second, quite different membrane protein for entry.

Other Virus-encoded Proteins in Virions

Products of most retroviral accessory genes are not incorporated into virions. The same is true for the products of oncogenes. The only accessory proteins found in substantial amounts in the viral particle are the related lentiviral products Vpx, present in most SIV strains but not HIV-1, and Vpr, found in all primate lentiviruses. The products of the remaining five HIV and SIV accessory genes are thought to act from within the infected cell, although some evidence suggests that the Vif and Nef proteins may be present in virions as well. However, the numbers of molecules of these proteins are so low (less than about 1% of the Gag molecules) that contamination is difficult to rule out.

The protein products of vpr and vpx are found in large quantities in virions, approaching those of Gag. Vpr expressed by itself has a complex distribution in cells, but much of it is in the nucleus. However, when particles are produced at the plasma membrane, Vpr is efficiently recruited into them, and only Gag protein is needed for this recruitment. Vpr also seems to have a role later in infection, in affecting transit of the cell through the cell cycle.

Although other accessory proteins are not incorporated into virions in substantial amounts, three HIV-1 proteins besides Vpr appear to affect the structure, morphogenesis, or biological function of the mature viral particle and therefore deserve mention here. The vif gene is needed for efficient replication of the virus in primary CD4 cells and in some but not all established cell lines. Vif affects the infectivity of released viral particles; i.e., the requirement for this protein depends on the cells from which the virus is released, rather than on the cells being infected. The cell lines that support replication of vif-defective HIV-1 thus appear to be able to supply a "Vif-like" function. In the restrictive cells, it is not the level of virion formation that is reduced in the absence of Vif, but rather the specific infectivity of the particles. This phenotype suggests that the virions are modified in some way by Vif. The vpu gene is found in HIV-1 and very closely related viruses but not in other primate or nonprimate lentiviruses. The product of this gene is a small integral membrane protein. Vpu downregulates the levels of the CD4 receptor by accelerating its destruction. This activity is carried out in association with the endoplasmic reticulum or other membranes internal to the cell, and it requires association between SU and CD4. However, Vpu also promotes release of the budding virion at the plasma membrane. Vpu action is not specific for HIV-1, since it enhances release of other lentiviruses as well as MLV. The mechanisms underlying this final stage in budding are not known for any enveloped virus.

The product of the HIV nef gene is also membrane-associated. Nef has complex effects on signal transduction pathways in the cell and, like Vpu, leads to loss of the CD4 receptor, in this case directly from the cell surface. An additional result of Nef expression is the increased specific infectivity of viral particles.

Viral Entry and Receptors

The process of retroviral entry into a target cell represents the first step in the viral infection cycle. It is characterized by a complex series of events that are initiated through the binding of the viral surface glycoproteins to specific receptor molecules on the cell's outer membrane. This interaction is thought to trigger a conformational change in the viral glycoprotein, which then mediates fusion of the lipid bilayers of the cell and viral membranes and allows the genetic material of the virus to be introduced into the host-cell cytoplasm.

The envelope glycoprotein complex of retroviruses includes two polypeptides, an external, glycosylated hydrophilic polypeptide (SU) and a membrane-spanning protein (TM), that together form an oligomeric knob or knobbed spike on the surface of the virion. Both polypeptides are encoded by the env gene and are synthesized in the form of a polyprotein precursor that is proteolytically cleaved during its transport to the surface of the cell. These proteins are not required for the assembly of enveloped viral particles, but they do have an essential role in the entry process. The SU domain binds to a specific receptor molecule on the target cell. This binding event appears to activate the membrane fusion-inducing potential of the TM protein and, by a process that remains largely undefined, the viral and cell membranes then fuse. The specificity of the SU/receptor interaction defines the host range and tissue tropism of a retrovirus; viral particles lacking envelope glycoproteins are noninfectious, and cells lacking a receptor are nonpermissive for viral entry. Viruses may bind weakly to resistant cells through relatively nonspecific interactions, but, in the absence of a specific receptor molecule, they are unable to initiate the infection process.

The receptors for retroviral entry that have been identified and characterized to date appear to be distinct for the different major viral subgroups, and there is no clear association between their normal cell function and their receptor activity. For human and simian immunodeficiency viruses (HIV and SIV), the CD4 antigen found on T-helper cells, macrophages, and a few other cells is a high-affinity receptor molecule involved in cell-cell recognition, whereas the receptors for mammalian C-type viruses, including the ecotropic (MLV-E) and amphotropic (MLV-A) murine leukemia viruses, the gibbon ape leukemia virus (GALV), and feline leukemia virus subgroup B (FeLV-B), are three different membrane transporter molecules. For the subgroup A avian sarcoma/leukosis viruses (ASLV-A), the receptor is a small plasma-membrane protein of unknown function containing a single copy of a sequence repeated multiple times in the receptor for low-density lipoprotein. The receptor for the closely related subgroup B ASLV is an unrelated protein bearing resemblance to cytokine receptors. For bovine leukemia virus (BLV), a novel receptor protein has been cloned with no similarity to other receptors and no known cell function.

Binding of the viral glycoprotein to its cognate receptor is not by itself necessarily sufficient to trigger viral entry. Activation of the fusion potential of the TM protein requires a functional association between the receptor and SU. The mechanistic aspects of this process are not yet clearly defined, but they may involve conformational changes within the oligomeric glycoprotein complex, as shown for HIV-1 and SIV. The molecular events involved in the merging of the apposing lipid bilayers also remain to be defined. Nevertheless, it is likely that several glycoprotein/receptor oligomers must associate within the plane of the membrane for an effective "fusion-pore" to form.

Overview of Reverse Transcription

Reverse transcription begins when the viral particle enters the cytoplasm of a target cell. The viral RNA genome enters the cytoplasm as part of a nucleoprotein complex that has not been well characterized. The process of reverse transcription generates, in the cytoplasm, a linear DNA duplex via an intricate series of steps. This DNA is colinear with its RNA template, but it contains terminal duplications known as the long terminal repeats (LTRs) that are not present in viral RNA. Extant models for reverse transcription propose that two specialized template switches known as strand-transfer reactions or "jumps" are required to generate the LTRs.

Retroviral DNA synthesis is absolutely dependent on the two distinct enzymatic activities of RT: a DNA polymerase that can use either RNA or DNA as a template, and a nuclease, termed ribonuclease H (RNase H), that is specific for the RNA strand of RNA:DNA duplexes. Although a role for other proteins cannot be ruled out, and it is likely that certain viral proteins (e.g., nucleocapsid, NC) increase the efficiency of reverse transcription, all of the enzymatic functions required to complete the series of steps involved in the generation of a retroviral DNA can be attributed to either the DNA polymerase or the RNase H of RT. The process of retroviral DNA synthesis is believed to follow the scheme outlined below:

1. Minus-strand DNA synthesis is initiated using the 3′ end of a partially unwound transfer RNA, which is annealed to the primer-binding site (PBS) in genomic RNA, as a primer. Minus-strand DNA synthesis proceeds until the 5′ end of genomic RNA is reached, generating a DNA intermediate of discrete length termed minus-strand strong-stop DNA (−sssDNA). Since the binding site for the tRNA primer is near the 5′ end of viral RNA, −sssDNA is relatively short, on the order of 100-150 bases.

2. Following RNase-H-mediated degradation of the RNA strand of the RNA:−sssDNA duplex, the first strand transfer causes −sssDNA to be annealed to the 3′ end of a viral genomic RNA. This transfer is mediated by identical sequences known as the repeated (R) sequences, which are present at the 5′ and 3′ ends of the RNA genome. The 3′ end of −sssDNA was copied from the R sequences at the 5′ end of the viral genome and therefore contains sequences complementary to R. After the RNA template has been removed, −sssDNA can anneal to the R sequences at the 3′ end of the RNA genome. The annealing reaction appears to be facilitated by the NC protein.

3. Once the −sssDNA has been transferred to the 3′ R segment on viral RNA, minus-strand DNA synthesis resumes, accompanied by RNase H digestion of the template strand. This degradation is not complete, however.

4. The RNA genome contains a short polypurine tract (PPT) that is relatively resistant to RNase H degradation. A defined RNA segment derived from the PPT primes plus-strand DNA synthesis. Plus-strand synthesis is halted after a portion of the primer tRNA is reverse-transcribed, yielding a DNA called plus-strand strong-stop DNA (+sssDNA). Although all strains of retroviruses generate a defined plus-strand primer from the PPT, some viruses generate additional plus-strand primers from the RNA genome.

5. RNase H removes the primer tRNA, exposing sequences in +sssDNA that are complementary to sequences at or near the 3′ end of minus-strand DNA.

6. Annealing of the complementary PBS segments in +sssDNA and minus-strand DNA constitutes the second strand transfer.

7. Plus- and minus-strand syntheses are then completed, with the plus and minus strands of DNA each serving as a template for the other strand.
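The seven steps above can be followed at the level of named segments. The sketch below is a deliberately simplified bookkeeping model, not a biochemical simulation: each strand is just a list of the plus-sense segments it covers, and "genes" is my placeholder for everything between PBS and PPT. Its only purpose is to show how two jumps turn R-U5 ... U3-R in the RNA into U3-R-U5 at both ends of the DNA.

```python
# Segment order of the genomic RNA, 5' to 3' (plus sense).
# "genes" is a placeholder label for gag, pro, pol, env and any other
# sequence lying between PBS and PPT.
RNA = ["R", "U5", "PBS", "genes", "PPT", "U3", "R"]


def reverse_transcribe(rna):
    """Return the segment order of the finished linear viral DNA."""
    pbs = rna.index("PBS")

    # Step 1: tRNA annealed at PBS primes minus-strand DNA to the 5' end.
    minus_sss = rna[:pbs]                      # covers R, U5

    # Step 2: first jump; only possible because R is present at both ends.
    assert minus_sss[0] == rna[-1] == "R"

    # Step 3: minus strand is extended from the 3' R back as far as PBS.
    minus = rna[pbs:] + minus_sss[1:]          # PBS, genes, PPT, U3, R, U5

    # Step 4: PPT primes plus-strand DNA, which copies U3-R-U5 and then
    # the start of the tRNA, giving a plus-strand copy of PBS.
    plus_sss = minus[minus.index("PPT") + 1:] + ["PBS"]   # U3, R, U5, PBS

    # Steps 5-6: tRNA removed; PBS in +sssDNA anneals to PBS in the
    # minus strand (second jump).
    assert plus_sss[-1] == minus[0] == "PBS"

    # Step 7: each strand is completed using the other as template.
    return plus_sss[:-1] + minus


dna = reverse_transcribe(RNA)
print("RNA:", "-".join(RNA))
print("DNA:", "-".join(dna))
```

The DNA begins and ends with the same U3-R-U5 block (the LTR), which the RNA template does not contain as such.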

 

Completion of Integration-competent DNA

 

Once the second jump has occurred, elongation of the plus and minus strands can continue. When RT extends the minus strand on the plus-strand template, the minus-strand DNA from which the plus-strand was copied must be displaced. RT can carry out displacement synthesis in vitro under appropriate conditions, and it seems reasonable to assume that RT (perhaps in concert with NC) carries out this displacement reaction in vivo. The DNA copy of the viral genome is completed when RT copies the plus and minus strands entirely. The final product is a blunt-ended linear duplex DNA. This linear product can have a variety of different fates: normal integration, aberrant integrase-mediated circularization (known as autointegration), or joining of the ends by a host ligation activity, forming one circular DNA product with one LTR and another circular DNA product with two LTRs. It has been proposed that the 1-LTR circles could arise through errors during reverse transcription; however, the data indicate that 1-LTR circles are formed in the nucleus after reverse transcription has been completed, suggesting that the 1-LTR circles are formed by host enzymes that can mediate homologous recombination.

 

Outline of the Integration Process

 

1. The viral DNA molecule at the completion of its synthesis is a blunt-ended linear molecule whose termini, corresponding to the boundaries of the long terminal repeats, are specified by the primers for plus- and minus-strand DNA synthesis. Viral DNA synthesis begins in the cytoplasm of the infected cell and may be completed before (typically, in the case of MLV) or after (typically, in the case of Rous sarcoma virus [RSV]) entry into the nucleus. The linear viral DNA is the proximal precursor to the integrated provirus and is contained in a specific nucleoprotein complex. This preintegration complex is derived in part from the virion core particle and retains a subset of the virion proteins. The preintegration complex probably also contains specific cellular proteins.

2. Soon after completion of viral DNA synthesis, usually while still in the cytoplasm, a viral enzyme, integrase, cleaves the 3′ termini of the viral DNA, eliminating the terminal two (or, rarely, three) bases from each 3′ end. The resulting recessed 3′-OH groups provide the sites of attachment of the provirus to host DNA and thus ultimately define the ends of the integrated provirus.

3. The viral nucleoprotein complex enters the nucleus. This step probably precedes 3′-end processing in the RSV life cycle, and usually follows the end-processing step for MLV. Oncoretroviruses gain access to the nucleus during mitosis, when the nuclear membrane is disassembled. HIV, and probably other lentiviruses, can likewise enter the nucleus during mitosis, but in addition they can enter the nucleus during interphase, by active transport through the nuclear pore, probably mediated by signals in the viral MA protein and Vpr.

4. Upon entry into the nucleus, the preintegration complex encounters the host DNA. Although specific target sequences are not required for integration, the host genome is not uniformly used as a target. Highly bent DNA sites, such as are found at specific positions in nucleosomes, are strongly preferred. Host-cell DNA-binding proteins may occlude potential target sites, preventing their use. In some cases, cellular proteins that bind to host DNA may be recognized by the viral integration machinery, directing integration to specific sites. Neither ongoing cellular DNA synthesis nor transcription of the target DNA sequences is required.

5. Binding of host DNA by the integrase-viral DNA complex is followed by a concerted, integrase-catalyzed reaction in which the 3′-OH groups at the viral DNA ends are used to attack phosphodiester bonds on opposite strands of the target DNA, at positions staggered by four to six bases in the 5′ direction, and therefore on the same face of the double helix, separated by the major groove. In this direct transesterification reaction, the energy of the broken phosphodiester bonds in the target DNA is used for formation of new bonds joining the viral 3′ ends to the target DNA.

6. DNA synthesis, perhaps guided by viral proteins or carried out by the viral reverse transcriptase, extends from the host DNA 3′-OH groups that flank the host-viral DNA junctions, filling in the gaps that flank the viral DNA and displacing the (usually) mismatched viral 5′ ends. Following a ligation step, proviral integration is complete.
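
The end processing, staggered joining and gap repair in steps 2, 5 and 6 can be traced on a toy sequence. The Python sketch below follows only the top strand and is a simplification, not a model of the actual chemistry. The function name integrate, its parameters and the example sequences are hypothetical, chosen only for illustration. It assumes that two bases are lost from each viral end, and that gap filling copies the target bases lying between the two staggered cuts, so that those bases end up on both sides of the provirus.

```python
def integrate(viral_dna, target_dna, site, stagger=4, trim=2):
    """Return the top strand of the target after a simplified integration.

    viral_dna  : top strand of the blunt-ended linear viral DNA
    target_dna : top strand of the host target
    site       : index of the cut on the top strand
    stagger    : distance between the cuts on the two strands (4 to 6)
    trim       : bases lost from each viral end (2, rarely 3)
    """
    if not 4 <= stagger <= 6:
        raise ValueError("stagger is expected to be 4 to 6 bases")
    if site < 0 or site + stagger > len(target_dna):
        raise ValueError("both cuts must fall inside the target")
    if len(viral_dna) <= 2 * trim:
        raise ValueError("viral DNA too short to trim")

    # Integrase removes `trim` bases from each 3' end; the unpaired
    # bases left on the 5' ends are displaced during repair. The net
    # effect is that the provirus is shorter at both ends.
    provirus = viral_dna[trim:len(viral_dna) - trim]

    # The cuts on the two strands are `stagger` bases apart, so after
    # gap filling the bases between them sit on both sides.
    left_flank = target_dna[:site + stagger]
    right_flank = target_dna[site:]
    return left_flank + provirus + right_flank


viral = "AA" + "TG" + "C" * 8 + "CA" + "TT"   # made-up sequence
target = "GGGG" + "ACGT" + "TTTT"             # made-up sequence
print(integrate(viral, target, site=4, stagger=4))
```

In the example, the four target bases between the cuts (ACGT) appear immediately before and immediately after the provirus, and the outermost two bases at each end of the viral DNA are gone.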

 

Three distinct pathways for retroviral assembly. C-type viruses and lentiviruses appear to assemble the internal structures of their particles concurrently with envelopment at the plasma membrane. IAPs are similar except that they bud exclusively into internal membranes. In contrast, D-type and B-type viruses assemble immature particles in the cytoplasm prior to envelopment at the plasma membrane. Spumaviruses, such as HFV, also assemble immature particles in the cytoplasm but do not undergo an obvious maturation step after budding.

 

Variations of the Morphogenetic Pathways of Retroviruses

Electron microscopy has revealed a wealth of information about how retroviruses assemble. Although there are many subtle variations, two major patterns have been observed. In the first type, the Gag-containing polyproteins assemble within the cytoplasm to form an obvious and stable structure, often called an intracytoplasmic A-type particle (ICAP) or A particle. ICAPs are subsequently transported to the plasma membrane where the envelope is acquired during budding. The best characterized examples of viruses using this pathway are mouse mammary tumor virus (MMTV) and Mason-Pfizer monkey virus (M-PMV), the prototypes for the type-B and type-D viruses, respectively, as well as spumaviruses. Retroviruses using the other pathway do not appear to assemble their Gag and Gag-Pro-Pol proteins into ICAPs, at least on the basis of electron microscopy. Instead, macromolecular aggregates of these molecules are first evident as discrete, electron-dense patches intimately associated with the cytoplasmic face of the plasma membrane. As budding proceeds at these sites, dome-shaped structures arise with the concurrent formation of the immature viral core and the acquisition of the envelope. For viruses such as avian sarcoma/leukosis virus (ASLV) and murine leukemia virus (MLV), this pattern of assembly has been designated type C, and it is the pattern most often seen for retroviruses, including human immunodeficiency virus type 1 (HIV-1) and the other lentiviruses.

Regardless of the pathway utilized, the internal appearance of all immature retroviral particles is fundamentally the same and characterized by spherical structures with an electron-lucent center. These are similar to the ICAPs of the type-B/D viruses, except that they are surrounded by a lipid bilayer. The Gag (and Gag-Pro-Pol) proteins contained in such particles are intact, but during the late stages of budding, or immediately thereafter, they are rapidly cleaved by the viral PR to generate the smaller species characteristic of the mature virion. Mutants with inactive PR invariably release particles of immature morphology. Processing of the Gag and Gag-Pro-Pol molecules causes the internal structure of the particle to condense into an electron-dense core, the shape and location of which are characteristic of the retroviral type. Spumaviruses differ from the standard retroviral pattern in that processing of Gag proteins into the separate domains does not occur and extracellular virions retain an "immature" morphology.