I am very sorry for posting this note so late.
I am responsible for the retrovirus note. The retrovirus book is about 800 pages long. However, I think we are only responsible for the first 7 chapters, which are only 300 pages.
Even 300 pages is ridiculous to read through, so I will only mention the important points and/or processes I think we will need to learn.
Remember you can access the book online at any time if you want to know all the painful details.
Models for retroviral structure. (Top) The immature HIV-1 virion. The Gag and Gag-Pol proteins are shown with different colors to suggest the domains corresponding to the mature proteins formed from these precursors. The SU and TM components of Env are shown jutting out from the lipid membrane, as are HLA host proteins selectively incorporated into HIV particles. (Bottom) The mature HIV-1 virion. The major Gag and Pol proteins and the cone-shaped core characteristic of viruses of this genus are shown. Vpr is an HIV accessory protein.
The mature virion is composed of "shells" of individual Gag proteins: MA, which forms the outer shell, lies just underneath the lipid membrane and makes contact with it via the amino-terminal myristylated and positively charged segment. The myristate moiety is presumably buried in the lipid. Jutting through the membrane is the TM component of Env, with its internal domain contacting MA in an unspecified way. The external portion of TM is bound to the SU component of Env. The SU/TM complex is shown as a dimer to indicate an oligomeric status. Farther inside the MA layer is a shell of CA protein that forms the capsid, defined as the outer layer of the core. The shape of the core is conical in HIV, but cores from other retroviruses may have different shapes. In the center of the core is the complex of NC protein and RNA. Associated with this complex are smaller numbers of RT and IN molecules. Both of these proteins bind to nucleic acids and thus may be expected to be in contact with the ribonucleoprotein complex. The third enzyme in virions, PR, is shown because as a mature protein it is a dimer. The location of PR in virions is not known, but since PR must have had access to all the cleavage sites on the Gag and Gag-Pol proteins before maturation, some PR molecules are shown inside the core and some outside. Finally, the model shows several small peptides (short bars) that are derived from the spacer regions between CA and NC, between NC and p6, and amino-terminal to PR. Not drawn in the model are tRNAs or other small host RNAs. Some host-cell proteins are shown incorporated in the viral membrane, as demonstrated for the MHC class I proteins for HIV-1.
The immature virion is composed of exactly the same polypeptide sequences. However, before processing has occurred, its structure is quite different. Only three types of proteins make up the particle, in addition to the lipid envelope and the RNA. Env is the same as in the mature virion, although its contacts with Gag underneath the membrane may be different. Gag is shown as an elongated tooth-like object. The MA domain with its myristate abuts the membrane, consistent both with its ability to be crosslinked to lipid in immature MLV and with the expectation that the myristate would interact with the lipid environment of the membrane. The NC domain at the other end of the molecule is rendered as a bicuspate shape, reflecting the presence of two Cys-His motifs and two basic RNA-binding sites. The RNA in the center of the virion is not drawn in a distinct way. Presumably, it is bound to many of the NC domains, but perhaps not to all of them. CA is drawn not as a ball but as an elongated connector between MA and NC to suggest possible conformational changes that might occur upon proteolytic processing. The p6 domain is shown arbitrarily as bending back to contact CA. Gag-Pol is rendered as a dimer. Although there is no direct evidence for the multimeric status of Gag-Pol in immature virions, as mature proteins, IN, RT, and PR are dimeric. The Gag-Pol protein that is their precursor is likely to be dimeric as well. In particular, the PR domains must dimerize to initiate proteolytic processing.
Genetic organization of generalized provirus. The proviral DNA as it is inserted into host DNA is shown at the top, with the long terminal repeats (LTRs) composed of U3, R, and U5 elements at each end abutting cellular sequences. Sequences in the LTR that are important for transcription, for example, enhancers, the promoter, and the poly(A) addition signal, are marked. The gag, pro, pol, and env sequences are located invariably in the positions shown in all retroviruses. Accessory genes are located as shown, and also overlapping env and U3 and each other, and occasionally in other locations. The RNA that is the primary transcriptional product is shown on the second line. Sequences that are important for replication and gene expression are shown in the approximate locations in which they are typically found. (PBS) Primer-binding site; (Ψ) encapsidation sequence; (SD) splice donor site; (FS) frameshift site; (SA) splice acceptor site; (PPT) polypurine tract; (PA) polyadenylation signal; (AAA) poly(A) tail. The spliced messenger RNA for the Env protein is shown on the third line. Retroviruses with accessory genes have other spliced mRNAs and thus other splice donor and splice acceptor sites as well.
The primary transcript of retroviral DNA is modified in several ways and closely resembles a cellular mRNA. It is "capped" at its 5' end (bearing a methylated guanosine attached to the first encoded nucleotide by a 5'-5' triphosphate linkage), polyadenylated at its 3' end (bearing a poly[A] "tail" of about 50 to 200 noncoding A residues), and methylated at several specific sites internally. Some of the primary transcripts are spliced to give subgenomic mRNAs. Unlike most cellular mRNAs, in which all introns are efficiently spliced out, newly synthesized retroviral RNA must be diverted into two populations. One population remains unspliced, to serve as the genomic RNA and as mRNA for gag and pol. The other population is spliced, fusing the 5' portion of the genomic RNA to the downstream genes, most commonly env. The intron between the splice donor and splice acceptor sites (SD and SA) that is removed by splicing contains the gag, pro, and pol genes. This splicing event creates the mRNA for the envelope protein.
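The split into unspliced and spliced RNA can be pictured with a toy model. The Python sketch below treats the primary transcript as an ordered list of labeled segments and removes everything between SD and SA. The segment labels, their exact order, and the helper name `splice` are illustrative assumptions, not something taken from the book.

```python
# Toy model of the two retroviral RNA populations.
# Segment labels and order are illustrative; real genomes differ in detail.
PRIMARY_TRANSCRIPT = [
    "R", "U5", "PBS", "SD", "gag", "pro", "pol", "SA", "env", "PPT", "U3", "R",
]

def splice(transcript, donor="SD", acceptor="SA"):
    """Remove the intron lying between the splice donor and acceptor markers."""
    d = transcript.index(donor)
    a = transcript.index(acceptor)
    if d >= a:
        raise ValueError("splice donor must lie upstream of splice acceptor")
    # Keep everything before the donor and everything after the acceptor;
    # the marker labels themselves are dropped in this toy model.
    return transcript[:d] + transcript[a + 1:]

unspliced = list(PRIMARY_TRANSCRIPT)   # genomic RNA, also the gag/pol mRNA
env_mrna = splice(PRIMARY_TRANSCRIPT)  # subgenomic mRNA for Env

print(env_mrna)
```

In this sketch the spliced product keeps the 5' leader and env but loses gag, pro, and pol, which matches the description of the intron above.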
Translation of the retroviral pro and pol genes is controlled by cis-acting sequences around the gag-pro and pro-pol borders. In each case, a small fraction of the ribosomes translating gag continues on to translate the downstream genes, thereby generating a fusion protein with Gag at its amino terminus. In most retroviruses, the gag termination codon is bypassed by frameshifting: ribosomes stall and then shift their reading frame back one nucleotide before continuing into the downstream gene. In other retroviruses (MLV, for example), the termination codon is instead read through by termination suppression. For both mechanisms, essential consensus secondary structures have been identified. In retroviruses in which the pro gene is in a reading frame by itself (including M-PMV, MMTV, and HTLV), there are two frameshift signals, one before pro and the other before pol. In this case, frameshifting at each site is very efficient (up to 30%) to ensure that enough of the Gag-Pro-Pol protein is made after the two serial shifts. Frameshifting and readthrough apparently have evolved as simple strategies to provide the proper ratios of Gag, Gag-Pro, and Gag-Pro-Pol polypeptides in the infected cell. A consequence of this strategy is that the enzymatic proteins required by the virus are fused to the Gag polyprotein, which provides a direct way to incorporate enzymes into the virion during assembly.
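To see why two serial shifts need to be efficient, it helps to do the arithmetic. The sketch below is a simple bookkeeping model: it assumes every ribosome that reaches a gene boundary either terminates or continues with a fixed probability, and it uses the 30% figure quoted above as an upper bound. The function name and the independence assumption are mine.

```python
def polyprotein_fractions(continue_probs):
    """Split ribosomes among the products ending at successive gene boundaries.

    continue_probs[i] is the assumed probability that a ribosome reaching
    boundary i frameshifts (or reads through) instead of terminating.
    Returns one fraction per product, shortest product first.
    """
    fractions = []
    still_translating = 1.0
    for p in continue_probs:
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must lie between 0 and 1")
        fractions.append(still_translating * (1.0 - p))  # terminate here
        still_translating *= p                           # continue downstream
    fractions.append(still_translating)                  # longest fusion protein
    return fractions

# Two serial shifts (gag-pro, then pro-pol), each at the 30% upper bound.
gag, gag_pro, gag_pro_pol = polyprotein_fractions([0.30, 0.30])
print(f"Gag {gag:.2f}, Gag-Pro {gag_pro:.2f}, Gag-Pro-Pol {gag_pro_pol:.2f}")
# prints: Gag 0.70, Gag-Pro 0.21, Gag-Pro-Pol 0.09
```

Under these assumptions only about 9% of ribosomes make the full Gag-Pro-Pol protein even at 30% per site. If each site shifted only 5% of the time, the full-length product would drop to 0.25%.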
In viral assembly, the genomic RNA is identified as an RNA to be packaged by virtue of a complex sequence in the leader region, called Ψ (from the original work on MLV) or E (for encapsidation, in the spleen necrosis virus [SNV] system). It is not known how retroviruses incorporate two genomic RNAs. In the mature virion, these two molecules are believed to be held together primarily by a sequence called the "dimer linkage," which is in the leader region or in some cases in gag. The location of the dimer linkage is based both on early work in which purified virion RNA was partially denatured and then examined by electron microscopy and on the dimerization of in-vitro-synthesized RNAs. Among the small RNAs included in retroviral virions is the primer for initiation of reverse transcription, a specific cellular tRNA discovered in the initial experiments on reverse transcriptase in permeabilized virions. In all retroviruses, the primer tRNA is associated with the genome by base pairing between its 3'-terminal 18 nucleotides and the complementary "primer-binding site" (PBS) sequence on the genome. Retroviruses have evolved to use one of several different tRNAs as primers. Just as a tRNA is used as a primer for synthesis of the first DNA strand (i.e., "minus" strand), an RNA fragment derived from the genome itself is used as the primer for the second DNA strand (i.e., "plus" strand). This RNA is the polypurine tract (PPT), the sequence immediately preceding U3. During reverse transcription, the PPT sequence survives the RNase H activity of the reverse transcriptase, remaining to be used as a primer. Some retroviruses have a second PPT-like sequence, positioned near the middle of the genome, that can serve as a second site for initiation of plus-strand synthesis. Although proper reverse transcription requires PBS, PPT, and R sequences and virion proteins, it can occur normally in the complete absence of gag, pro, pol, and env genes. This is the basis of retroviral vectors, which are constructed to deliver genes of choice into cells infected in culture or in animals or people.
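The primer-binding site is simply the reverse complement of the last 18 nucleotides of the primer tRNA. The sketch below shows that relationship. Both sequences are invented toy sequences, not real tRNA or viral sequences, and the helper names are hypothetical.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}

def reverse_complement(rna):
    """Return the reverse complement of an RNA string (A/U/G/C only)."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))

def find_pbs(genome, trna, length=18):
    """Return the 0-based start of the PBS in genome, or -1 if absent.

    The PBS is taken to be the reverse complement of the 3'-terminal
    `length` nucleotides of the primer tRNA.
    """
    pbs = reverse_complement(trna[-length:])
    return genome.find(pbs)

# Invented toy sequences, written 5' to 3'.
trna = "GGGAUUGUAGCUCAG" + "AUCGGAUUCGAACCGCCA"  # last 18 nt pair with the PBS
leader = "GGUCUCUCUGGUUAGACC"
genome = leader + reverse_complement(trna[-18:]) + "UUGAAAGCGAAAGGGAAAC"

print(find_pbs(genome, trna))  # start of the PBS, just after the toy leader
```

Because the toy genome was built by pasting in the complementary sequence, the match is guaranteed here. With real sequences the same search would be the first check of which tRNA a given virus uses.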
Genetic organization of prototypic retroviruses. Prototype examples from the several genera of retroviruses are shown. An open rectangle indicates the open reading frame for the gene marked. If the rectangles are offset vertically, their reading frames are different. Horizontal lines connecting two rectangles indicate that this segment is spliced out. (ALV) Avian leukemia virus; (MLV) murine leukemia virus; (HIV) human immunodeficiency virus type 1; (M-PMV) Mason-Pfizer monkey virus; (MMTV) mouse mammary tumor virus; (HTLV) human T-cell leukemia virus; (WDSV) walleye dermal sarcoma virus; (HFV) human foamy virus.
The Gag protein is the precursor to the internal structural protein of all retroviruses. Expression of gag alone leads to assembly of immature virus-like particles that bud from the plasma membrane. In virion assembly, Gag proteins must interact with each other, with components in the plasma membrane, with the genomic RNA, and probably with Env proteins and with cellular proteins as well. Fundamental to understanding the function of Gag is the fact that this protein is organized into regions, which are proteolytically liberated as the separate mature Gag proteins during viral maturation. Because proteolytic cleavage occurs late in assembly, during or after the last stages of budding, virions contain equimolar mixtures of the mature proteins. All Gag proteins are organized in the same order from the amino terminus to the carboxyl terminus, with domains that are cleaved into the following proteins: (NH2)-MA-X-CA-NC-Y-(COOH). X and Y represent segments that each may be cleaved into one or more small proteins or peptides or may be absent altogether. Thus, the "minimal" Gag protein is the unit MA-CA-NC. Examples of the structural organization of Gag proteins of prototypic retroviruses are given in the figure below.
Organization of Gag proteins. Schematic representations of Gag proteins are drawn for examples from each retroviral genus. Vertical solid lines mark cleavage sites for the viral protease. The sequences representing the mature proteins MA, CA, NC, and PR are indicated, along with the older naming of these proteins based on their approximate molecular weight. (Ac) Acetylation at the amino terminus; (Myr) myristylation at the amino terminus.
MA Protein
In all retroviruses, the amino-terminal domain of Gag gives rise to the MA protein (membrane-associated, or matrix). The finding that most Gag proteins are modified by myristylation at their amino termini provided a major clue to MA function. Myristate is a 14-carbon fatty acid that is added cotranslationally to many cellular proteins associated with membranes and also to some proteins that remain soluble. The consensus sequence for myristylation is Met-Gly-X-X-X-Ser/Thr. After the initiating methionyl residue is removed, the fatty acid is linked via an amide bond to the free amino group of the glycyl residue. Prevention of myristylation of the M-PMV Gag protein does not prevent formation of the immature particles in the cytoplasm, but rather prevents their transport to, or stable association with, the plasma membrane.
MA proteins of at least some viruses can bind RNA in vitro. In HIV-1, MA has been found to accompany the newly synthesized viral DNA into the nucleus and, along with Vpr, may be a factor that directs migration of the preintegration complex. There is a nuclear localization signal in HIV MA, mapping to the same highly basic stretch of amino acids near the amino terminus that has been inferred to be important for membrane interaction.
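A consensus like the myristylation signal is easy to express as a pattern. The sketch below takes the consensus to be Met-Gly, then any three residues, then Ser or Thr, anchored at the amino terminus. The test sequences are invented, and a real predictor would use more context than this simple match.

```python
import re

# Met-Gly, any three residues, then Ser or Thr, at the very start of the protein.
MYR_CONSENSUS = re.compile(r"^MG.{3}[ST]")

def has_myristylation_signal(protein):
    """Return True if the one-letter protein sequence starts with the consensus."""
    return MYR_CONSENSUS.match(protein) is not None

# Invented toy amino termini.
for seq in ["MGAKLSPEQ", "MAAKLSPEQ", "MGAKLAPEQ"]:
    print(seq, has_myristylation_signal(seq))
```

Only the first toy sequence matches. The second lacks the glycine that receives the fatty acid, and the third lacks Ser/Thr at the sixth position.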
CA Protein
The exact structural
function of CA in the mature viral particle has not been elucidated, but the
protein is believed to form a shell surrounding the ribonucleoprotein complex
that contains the genomic RNA. The capsid together with the components it
encloses are then referred to as the "core."
These two terms frequently have been used interchangeably, but such usage
promotes confusion between the proteins that form the shell and the proteins
and RNA inside the shell. The biological function of the capsid shell is not
known.
NC Protein
The nucleocapsid (NC) protein is a small basic protein, typically about 60 to 90 amino acid residues long. This protein is tightly bound to the genomic RNA. In all retroviruses except those of the spumavirus group, NC has one or two characteristic motifs made of regularly spaced cysteine and histidine residues. The retroviral Cys-His motif has the structure CX2CX4HX4C (here abbreviated CCHC), where most of the residues designated by Xs are not conserved either among retroviruses or between the two motifs of a single NC. An exception is the common placement of an aromatic residue between the first two C residues. The CCHC motif is similar to other short cysteine- and histidine-containing structures, called "zinc fingers," that coordinate a Zn++ ion and that have a role in binding of certain proteins to nucleic acids. Indeed, NC has been shown to bind Zn++ ions tightly both in vitro and in virions. Typically, clusters of lysine or arginine residues follow the CCHC motifs. Deletions or major alterations of the CCHC result in the absence of viral RNA in virions or alterations of the specificity of RNA packaging. Thus, this NC motif probably interacts with the "packaging sequences" near the 5' end of retroviral genomic RNAs when it is still part of the Gag precursor.
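The CCHC spacing (C, two residues, C, four residues, H, four residues, C) can be written as a pattern and counted in a sequence. The sketch below does that for an invented toy NC sequence with two motifs. The sequence and helper name are illustrative assumptions, not real viral data.

```python
import re

# C-X2-C-X4-H-X4-C, where X is any residue.
CCHC = re.compile(r"C.{2}C.{4}H.{4}C")

def find_cchc_motifs(protein):
    """Return (start, matched_text) for each non-overlapping CCHC motif."""
    return [(m.start(), m.group()) for m in CCHC.finditer(protein)]

# Invented toy NC sequence: two motifs separated by a basic linker.
motif_1 = "CFNCGREGHLARNC"
motif_2 = "CWKCGKTGHVMKDC"
toy_nc = "MKRGK" + motif_1 + "RAPRKKG" + motif_2 + "TERQ"

for start, text in find_cchc_motifs(toy_nc):
    print(start, text)
```

Both toy motifs carry an aromatic residue (F or W) between the first two cysteines, as the text describes for real NC proteins.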
Other Gag Proteins and Peptides
In addition to the
proteins discussed above, many retroviral gag genes encode polypeptide
segments that lie between MA and CA, between CA and NC, and/or downstream from
NC. In most cases, the functions of these segments are poorly understood.
In HIV-1 and other
lentiviruses, a polypeptide of approximately 60 amino acids is cleaved from the
Gag protein downstream from NC in a region partially overlapping the pro
reading frame. This "p6" domain appears to have a role in release of
virus in the final steps of budding and in incorporation of the Vpr and Vpx
proteins into the virion. Viral particles from mutants with p6 deleted or
altered remain tethered to the plasma membrane.
Proteins Derived from pol and pro
All infectious
retroviruses carry three enzymes: reverse transcriptase (RT), integrase (IN),
and protease (PR). The RT protein also contains an additional enzymatic
activity, RNase H, which has been mapped to a separate, contiguous portion of
the polypeptide, and the conventional designation "RT" always implies
the protein with both reverse transcriptase and RNase H activities. The enzymes
form domains on the Gag-Pro or Gag-Pro-Pol precursor polypeptide. In most
genera, all enzymes are translated together as a Gag-Pro-Pol precursor, which
is processed late in assembly to yield the mature forms of the enzymes. Whether
expression of pro and pol is by frameshifting or termination
suppression, approximately 5% as much RT and IN on a molecular basis is
synthesized and packaged into a virion as Gag protein.
Many viruses encode
proteases, which typically have roles in processing of the primary translation
product and maturation of the viral particle. Retroviral proteases are
homodimers, with each subunit corresponding to approximately half of the
related cellular enzyme (an aspartic protease). As a consequence, dimerization
is crucial for enzymatic activity and is likely to be involved in the
regulation of proteolysis of Gag and Pol proteins, and therefore essential for
proper virion formation. Premature activation of PR in the infected cell leads
to premature cleavage of Gag, thus aborting assembly.
Proteins Derived from env
As in all animal viruses
that carry a lipid envelope, the surface of retroviral virions is studded with
glycoproteins (envelope or Env proteins), whose function is to mediate the
adsorption to and the penetration of host cells susceptible to infection. All
retroviruses contain two different types of Env proteins, now called SU and TM, that are derived from a common precursor polypeptide.
Like cellular proteins destined for secretion, the nascent Env polypeptide
binds to a signal recognition particle via its amino-terminal leader segment
and then becomes associated with the membrane of the endoplasmic reticulum
(ER). There, further translation extrudes most of the polypeptide through the
membrane into the lumen of the ER. The protein remains anchored in the membrane
by a hydrophobic segment near the carboxyl terminus that spans the membrane
once, leaving the carboxy-terminal "tail" of Env in the cytoplasmic
compartment. Once in the ER, Env forms the oligomer found in virions, a trimer
in the case of ASLV and a multimer of inadequately determined size in HIV.
After cleavage of the leader sequence, Env is transported by vesicular traffic
through the Golgi apparatus to the plasma membrane, in the process becoming N-glycosylated
at the consensus sequences Asn-X-Ser or Asn-X-Thr. Env is cleaved while in the
Golgi by a cellular protease, either furin or a related enzyme, to yield the
mature SU and TM found in virions. Although uncleaved Env proteins are able to
bind to the receptor, the cleavage event is necessary to activate the fusion
potential of the protein, which is required for entry of the virus into the
host cell. SU and TM remain attached to each other by noncovalent interactions,
and in some
Once at the plasma
membrane, the SU/TM oligomers are incorporated into the budding viral particle.
The cytoplasmic "tail" distal to the membrane-spanning segment of TM
remains on the internal side of the viral membrane. It has been suggested that
the cytoplasmic tail of TM is in contact with the MA domain of the Gag protein
in virions, but the nature of this contact is unknown. Analogy suggests that
retroviral Env proteins also are likely to be incorporated into virions by this
means. The Env protein is the primary determinant of the type of cell that a
retrovirus can infect, because it recognizes the cell surface protein that is
the viral receptor. All enveloped viruses have glycoproteins that bind
specifically to receptors on the host-cell membrane. In some cases, these
receptors are common and in others, they are rare. An example of the former is
sialic acid, found on numerous cell surface proteins, which is recognized by
influenza hemagglutinin. Several retroviral receptors have been identified, the
best studied of which is the CD4 protein found on
helper T lymphocytes and macrophages. CD4 is required for binding and entry of
HIV although this virus also requires a second, quite different membrane
protein for entry.
Other Virus-encoded Proteins in Virions
Products of most
retroviral accessory genes are not incorporated into virions. The same is true
for the products of oncogenes. The only accessory proteins found in substantial
amounts in the viral particle are the related lentiviral products Vpx, present
in most SIV strains but not HIV-1, and Vpr, found in all primate lentiviruses.
The products of the remaining five HIV and SIV accessory genes are thought to
act from within the infected cell, although some evidence suggests that the Vif
and Nef proteins may be present in virions as well. However, the numbers of
molecules of these proteins are so low (less than about 1% of the Gag molecules)
that contamination is difficult to rule out.
The protein products of
vpr and vpx are found in large quantities in virions, approaching
those of Gag. Vpr expressed by itself has a complex
distribution in cells, but much of it is in the nucleus. However, when
particles are produced at the plasma membrane, Vpr is efficiently recruited
into them, and only Gag protein is needed for this recruitment. Vpr also seems
to have a role later in infection, in affecting transit of the cell through the
cell cycle.
Although other
accessory proteins are not incorporated into virions in substantial amounts, three
HIV-1 proteins besides Vpr appear to affect the structure, morphogenesis, or
biological function of the mature viral particle and therefore deserve mention
here. The vif gene is needed for efficient replication of the virus in
primary CD4+ cells and in some but not all established cell lines. Vif affects
the infectivity of released viral particles; i.e., the requirement for this
protein depends on the cells from which the virus is released, rather than on
the cells being infected. The cell lines that support replication of vif-defective
HIV-1 thus appear to be able to supply a "Vif-like" function. In the
restrictive cells, it is not the level of virion formation that is reduced in
the absence of Vif, but rather the specific infectivity of the particles. This
phenotype suggests that the virions are modified in some way by Vif. The vpu
gene is found in HIV-1 and very closely related viruses but not in other
primate or nonprimate lentiviruses. The product of this gene is a small
integral membrane protein. Vpu downregulates the levels of
the CD4 receptor by accelerating its destruction. This activity is
carried out in association with the endoplasmic reticulum or other membranes
internal to the cell, and it requires association between SU and CD4. However,
Vpu also promotes release of the budding virion at the plasma membrane. Vpu
action is not specific for HIV-1, since it enhances release of other
lentiviruses as well as MLV. The mechanisms underlying this final stage in
budding are not known for any enveloped virus.
The product of the HIV nef
gene is also membrane-associated. Nef has complex effects on signal
transduction pathways in the cell and, like Vpu, leads to loss of the CD4
receptor, in this case directly from the cell surface. An additional result of
Nef expression is the increased specific infectivity of viral particles.
The process of retroviral entry into a target cell represents the first step in the viral infection cycle. It is characterized by a complex series of events that are initiated through the binding of the viral surface glycoproteins to specific receptor molecules on the cell's outer membrane. This interaction is thought to trigger a conformational change in the viral glycoprotein, which then mediates fusion of the lipid bilayers of the cell and viral membranes and allows the genetic material of the virus to be introduced into the host-cell cytoplasm.
The envelope glycoprotein complex of retroviruses includes two polypeptides, an external, glycosylated hydrophilic polypeptide (SU) and a membrane-spanning protein (TM), that together form an oligomeric knob or knobbed spike on the surface of the virion. Both polypeptides are encoded by the env gene and are synthesized in the form of a polyprotein precursor that is proteolytically cleaved during its transport to the surface of the cell. These proteins are not required for the assembly of enveloped viral particles, but they do have an essential role in the entry process. The SU domain binds to a specific receptor molecule on the target cell. This binding event appears to activate the membrane fusion-inducing potential of the TM protein and, by a process that remains largely undefined, the viral and cell membranes then fuse. The specificity of the SU/receptor interaction defines the host range and tissue tropism of a retrovirus; viral particles lacking envelope glycoproteins are noninfectious, and cells lacking a receptor are nonpermissive for viral entry. Viruses may bind weakly to resistant cells through relatively nonspecific interactions, but, in the absence of a specific receptor molecule, they are unable to initiate the infection process.
The receptors for retroviral entry that have been identified and characterized to date appear to be distinct for the different major viral subgroups, and there is no clear association between their normal cell function and their receptor activity. For human and simian immunodeficiency viruses (HIV and SIV), the CD4 antigen found on T-helper cells, macrophages, and a few other cells is a high-affinity receptor molecule involved in cell-cell recognition, whereas the receptors for mammalian C-type viruses, including the ecotropic (MLV-E) and amphotropic (MLV-A) murine leukemia viruses, the gibbon ape leukemia virus (GALV), and feline leukemia virus subgroup B (FeLV-B), are three different membrane transporter molecules. For the subgroup A avian sarcoma/leukosis viruses (ASLV-A), the receptor is a small plasma-membrane protein of unknown function containing a single copy of a sequence repeated multiple times in the receptor for low-density lipoprotein. The receptor for the closely related subgroup B ASLV is an unrelated protein bearing resemblance to cytokine receptors. For bovine leukemia virus (BLV), a novel receptor protein has been cloned with no similarity to other receptors and no known cell function.
Binding of the viral glycoprotein to its cognate
receptor is not by itself necessarily sufficient to trigger viral entry.
Activation of the fusion potential of the TM protein requires a functional
association between the receptor and SU. The mechanistic aspects of this
process are not yet clearly defined, but they may involve conformational
changes within the oligomeric glycoprotein complex, as shown for HIV-1 and SIV.
The molecular events involved in the merging of the apposing lipid bilayers
also remain to be defined. Nevertheless, it is likely that several
glycoprotein/receptor oligomers must associate within the plane of the membrane
for an effective "fusion-pore" to form.
Reverse transcription begins when the viral particle enters the cytoplasm of a target cell. The viral RNA genome enters the cytoplasm as part of a nucleoprotein complex that has not been well characterized. The process of reverse transcription generates, in the cytoplasm, a linear DNA duplex via an intricate series of steps. This DNA is colinear with its RNA template, but it contains terminal duplications known as the long terminal repeats (LTRs) that are not present in viral RNA. Extant models for reverse transcription propose that two specialized template switches known as strand-transfer reactions or "jumps" are required to generate the LTRs.
Retroviral DNA synthesis is absolutely dependent on the two distinct enzymatic activities of RT: a DNA polymerase that can use either RNA or DNA as a template, and a nuclease, termed ribonuclease H (RNase H), that is specific for the RNA strand of RNA:DNA duplexes. Although a role for other proteins cannot be ruled out, and it is likely that certain viral proteins (e.g., nucleocapsid, NC) increase the efficiency of reverse transcription, all of the enzymatic functions required to complete the series of steps involved in the generation of a retroviral DNA can be attributed to either the DNA polymerase or the RNase H of RT. The process of retroviral DNA synthesis is believed to follow the scheme outlined in figure below:
1. Minus-strand DNA synthesis is initiated using, as a primer, the 3' end of a
partially unwound transfer RNA that is annealed to the primer-binding site
(PBS) in genomic RNA. Minus-strand DNA synthesis proceeds until the 5' end of
genomic RNA is reached, generating a DNA intermediate of discrete length
termed minus-strand strong-stop DNA (-sssDNA). Since the binding site for the
tRNA primer is near the 5' end of viral RNA, -sssDNA is relatively short, on
the order of 100 to 150 bases.
2. Following RNase-H-mediated degradation of the RNA strand of the
RNA:-sssDNA duplex, the first strand transfer causes -sssDNA to be annealed to
the 3' end of a viral genomic RNA. This transfer is mediated by identical
sequences known as the repeated (R) sequences, which are present at the 5' and
3' ends of the RNA genome. The 3' end of -sssDNA was copied from the R
sequences at the 5' end of the viral genome and therefore contains sequences
complementary to R. After the RNA template has been removed, -sssDNA can
anneal to the R sequences at the 3' end of the RNA genome. The annealing
reaction appears to be facilitated by the NC protein.
3. Once the -sssDNA has been transferred to the 3' R segment on viral RNA,
minus-strand DNA synthesis resumes, accompanied by RNase H digestion of the
template strand. This degradation is not complete, however.
4. The
RNA genome contains a short polypurine tract (PPT) that is relatively resistant
to RNase H degradation. A defined RNA segment derived from the PPT primes
plus-strand DNA synthesis. Plus-strand synthesis is halted after a portion of
the primer tRNA is reverse-transcribed, yielding a DNA called plus-strand
strong-stop DNA (+sssDNA). Although all strains of retroviruses generate a
defined plus-strand primer from the PPT, some viruses generate additional
plus-strand primers from the RNA genome.
5. RNase H removes the primer tRNA, exposing sequences in +sssDNA that are
complementary to sequences at or near the 3' end of minus-strand DNA.
6. Annealing
of the complementary PBS segments in +sssDNA and minus-strand DNA constitutes
the second strand transfer.
7. Plus-
and minus-strand syntheses are then completed, with the plus and minus strands
of DNA each serving as a template for the other strand.
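The net effect of the steps above is that an RNA with R at both ends becomes a DNA with a full LTR (U3-R-U5) at both ends. The sketch below is pure segment bookkeeping, not a biochemical simulation: it tracks which labeled segments each intermediate contains, always written in plus-sense order for readability. The segment labels and the function name are illustrative assumptions.

```python
# Genomic RNA as labeled segments, 5' to 3' (toy labels).
RNA = ["R", "U5", "PBS", "gag", "pro", "pol", "env", "PPT", "U3", "R"]

def reverse_transcribe(rna):
    """Follow the segment content of each intermediate; return the final DNA."""
    pbs = rna.index("PBS")
    ppt = rna.index("PPT")

    # Step 1: -sssDNA copies from the PBS to the 5' end of the RNA (R-U5).
    minus_sss = rna[:pbs]

    # Steps 2-3: first jump via R, then minus-strand synthesis copies the RNA
    # from the 3' R back to the PBS. The 3' R pairs with the R in -sssDNA,
    # so R is represented once, followed by the U5 carried by -sssDNA.
    if rna[-1] != minus_sss[0]:
        raise ValueError("first jump needs the same R at both ends")
    minus_strand = rna[pbs:-1] + minus_sss

    # Step 4: +sssDNA is primed at the PPT and copies U3-R-U5 plus the part
    # of the tRNA primer that matches the PBS.
    plus_sss = rna[ppt + 1:-1] + minus_sss + ["PBS"]

    # Steps 5-7: second jump via the PBS, then both strands are completed,
    # which adds the U3-R-U5 block upstream of the PBS.
    ltr = plus_sss[:-1]
    return ltr + minus_strand

provirus = reverse_transcribe(RNA)
print(provirus)
```

In this toy model the output begins and ends with U3, R, U5, which is the terminal duplication that the two jumps are proposed to generate.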
Once the second jump has occurred, elongation of the plus and minus strands can continue. When RT extends the minus strand on the plus-strand template, the minus-strand DNA from which the plus-strand was copied must be displaced. RT can carry out displacement synthesis in vitro under appropriate conditions, and it seems reasonable to assume that RT (perhaps in concert with NC) carries out this displacement reaction in vivo. The DNA copy of the viral genome is completed when RT copies the plus and minus strands entirely. The final product is a blunt-ended linear duplex DNA. This linear product can have a variety of different fates: normal integration, aberrant integrase-mediated circularization (known as autointegration), or joining of the ends by a host ligation activity, forming one circular DNA product with one LTR and another circular DNA product with two LTRs. It has been proposed that the 1-LTR circles could arise through errors during reverse transcription; however, the data indicate that 1-LTR circles are formed in the nucleus after reverse transcription has been completed, suggesting that the 1-LTR circles are formed by host enzymes that can mediate homologous recombination.
1. The
viral DNA molecule at the completion of its synthesis is a blunt-ended linear
molecule whose termini, corresponding to the boundaries of the long terminal
repeats, are specified by the primers for plus- and minus-strand DNA synthesis.
Viral DNA synthesis begins in the cytoplasm of the infected cell and may be
completed before (typically, in the case of MLV) or after (typically, in the
case of Rous sarcoma virus [RSV]) entry into the nucleus. The linear viral DNA
is the proximal precursor to the integrated provirus and is contained in a
specific nucleoprotein complex. This preintegration complex is derived in part
from the virion core particle and retains a subset of the virion proteins. The
preintegration complex probably also contains specific cellular proteins.
2. Soon after completion of viral DNA synthesis, usually while still in the
cytoplasm, a viral enzyme, integrase, cleaves the 3' termini of the viral DNA,
eliminating the terminal two (or, rarely, three) bases from each 3' end. The
resulting recessed 3'-OH groups provide the sites of attachment of the provirus
to host DNA and thus ultimately define the ends of the integrated provirus.
3. The viral nucleoprotein complex enters the nucleus. This step probably
precedes 3'-end processing in the RSV life cycle, and usually follows the
end-processing step for MLV.
Oncoretroviruses gain access to the nucleus during mitosis, when the nuclear
membrane is disassembled. HIV, and probably other lentiviruses, can likewise
enter the nucleus during mitosis, but in addition they can enter the nucleus
during interphase, by active transport through the nuclear pore, probably
mediated by signals in the viral MA protein and Vpr.
4. Upon entry into the nucleus, the
preintegration complex encounters the host DNA. Although specific target
sequences are not required for integration, the host genome is not uniformly
used as a target. Highly bent DNA sites, such as are found at specific
positions in nucleosomes, are strongly preferred.
Host-cell DNA-binding proteins may occlude potential target sites, preventing
their use. In some cases, cellular proteins that bind to host DNA may be
recognized by the viral integration machinery, directing integration to
specific sites. Neither ongoing cellular DNA synthesis nor transcription of the
target DNA sequences is required.
5. Binding of host DNA by the integrase-viral DNA complex is followed by a
concerted, integrase-catalyzed reaction in which the 3'-OH groups at the viral
DNA ends are used to attack phosphodiester bonds on opposite strands of the
target DNA, at positions staggered by four to six bases in the 5' direction,
and therefore on the same face of the double helix, separated by the major
groove. In this direct transesterification reaction, the energy of the broken
phosphodiester bonds in the target DNA is used for formation of new bonds joining
the viral 3' ends to the target DNA.
6. DNA synthesis, perhaps guided by viral proteins or carried out by the viral
reverse transcriptase, extends from the host DNA 3'-OH groups that flank the
host-viral DNA junctions, filling in the gaps that flank the viral DNA and
displacing the (usually) mismatched viral 5' ends. Following a ligation step,
proviral integration is complete.
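To see why the staggered cut matters, here is a small Python sketch of integration as described in steps 2, 5 and 6. It only tracks the top strand of the final product. The sequences, the function name integrate, and its parameters (cut, stagger, trim) are invented for illustration. The 4-base stagger is just one value from the four-to-six range given above.

```python
# Toy model of integration, following only the top strand (5'->3').
# All sequences and names here are made up for illustration.


def integrate(target, viral, cut, stagger=4, trim=2):
    """Return (integrated top strand, duplicated target bases).

    target  : host DNA top strand
    viral   : blunt-ended linear viral DNA top strand
    cut     : index of the top-strand bond attacked by the viral 3'-OH
    stagger : offset of the cut on the opposite strand (4 to 6 bases)
    trim    : bases removed from each viral 3' end by integrase (step 2)
    """
    # Step 2 removes `trim` bases from each 3' end. The unpaired 5' bases
    # left opposite them are displaced in step 6, so the provirus is
    # shorter by `trim` bases at both ends.
    provirus = viral[trim:-trim]

    # Step 5: the cuts on the two target strands are `stagger` bases apart.
    # Step 6: gap repair copies the bases between the cuts on both sides.
    left_flank = target[:cut + stagger]
    right_flank = target[cut:]
    duplicated = target[cut:cut + stagger]
    return left_flank + provirus + right_flank, duplicated


host = "GATTACAGGCTTAACCGT"
viral_dna = "ACTGGAAGGGCTAATTCAGT"

product, repeat = integrate(host, viral_dna, cut=5)
print("integrated DNA:", product)
print("duplicated host bases:", repeat)
```

In this sketch the host bases lying between the two staggered cuts end up on both sides of the provirus, which follows directly from the gap filling in step 6.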
Three distinct pathways for retroviral assembly. C-type viruses and lentiviruses appear to assemble the internal structures of their particles concurrently with envelopment at the plasma membrane. IAPs are similar except that they bud exclusively into internal membranes. In contrast, D-type and B-type viruses assemble immature particles in the cytoplasm prior to envelopment at the plasma membrane. Spumaviruses, such as HFV, also assemble immature particles in the cytoplasm but do not undergo an obvious maturation step after budding.
Electron microscopy has revealed a wealth of information about how retroviruses assemble. Although there are many subtle variations, two major patterns have been observed. In the first type, the Gag-containing polyproteins assemble within the cytoplasm to form an obvious and stable structure, often called an intracytoplasmic A-type particle (ICAP) or A particle. ICAPs are subsequently transported to the plasma membrane, where the envelope is acquired during budding. The best characterized examples of viruses using this pathway are mouse mammary tumor virus (MMTV) and Mason-Pfizer monkey virus (M-PMV), the prototypes for the type-B and type-D viruses, respectively, as well as spumaviruses. Retroviruses using the other pathway do not appear to assemble their Gag and Gag-Pro-Pol proteins into ICAPs, at least on the basis of electron microscopy. Instead, macromolecular aggregates of these molecules are first evident as discrete, electron-dense patches intimately associated with the cytoplasmic face of the plasma membrane. As budding proceeds at these sites, dome-shaped structures arise with the concurrent formation of the immature viral core and the acquisition of the envelope. For viruses such as avian sarcoma/leukosis virus (ASLV) and murine leukemia virus (MLV), this pattern of assembly has been designated type C, and it is the pattern most often seen for retroviruses, including human immunodeficiency virus type 1 (HIV-1) and the other lentiviruses.
Regardless of the pathway used, the internal appearance of all immature retroviral particles is fundamentally the same: spherical structures with an electron-lucent center. These are similar to the ICAPs of the type-B/D viruses, except that they are surrounded by a lipid bilayer. The Gag (and Gag-Pro-Pol) proteins contained in such particles are intact, but during the late stages of budding, or immediately thereafter, they are rapidly cleaved by the viral PR to generate the smaller species characteristic of the mature virion. Mutants with inactive PR invariably release particles of immature morphology. Processing of the Gag and Gag-Pro-Pol molecules causes the internal structure of the particle to condense into an electron-dense core, the shape and location of which are characteristic of the retroviral type. Spumaviruses differ from the standard retroviral pattern in that processing of Gag proteins into the separate domains does not occur and extracellular virions retain an "immature" morphology.