DNA replication is the biological process, occurring in all living organisms, by which a cell copies its DNA. It is the basis for biological inheritance. During replication, one double-stranded DNA molecule gives rise to two identical copies of itself. Each strand of the original double-stranded DNA molecule serves as a template for the production of the complementary strand, a process referred to as semiconservative replication. Cellular proofreading and error-checking mechanisms ensure near-perfect fidelity for DNA replication.[1][2]
In a cell, DNA replication begins at specific locations in the genome, called origins of replication.[3] Unwinding of DNA at the origin, together with synthesis of new strands, forms a replication fork. A number of proteins are associated with the fork and assist in the initiation and continuation of DNA synthesis. Most prominently, DNA polymerase synthesizes the new DNA by adding nucleotides matched to the template strand.
DNA replication can also be performed in vitro (artificially, outside a cell). DNA polymerases isolated from cells and artificial DNA primers can be used to initiate DNA synthesis at known sequences in a template DNA molecule. The polymerase chain reaction (PCR), a common laboratory technique, cyclically applies such artificial synthesis to amplify a specific target DNA fragment from a pool of DNA.
DNA usually exists as a double-stranded structure, with both strands coiled together to form the characteristic double helix. Each single strand of DNA is a chain of four types of nucleotides. Nucleotides in DNA contain a deoxyribose sugar, a phosphate, and a nucleobase. The four types of nucleotide correspond to the four nucleobases adenine, cytosine, guanine, and thymine, commonly notated as A, C, G, and T. These nucleotides form phosphodiester bonds, creating the phosphate-deoxyribose backbone of the DNA double helix with the nucleobases pointing inward. Nucleobases on opposite strands are matched through hydrogen bonds to form base pairs. Adenine pairs with thymine (two hydrogen bonds), and cytosine pairs with guanine (three hydrogen bonds); in each pair, a purine is matched with a pyrimidine.
DNA strands have a directionality, and the two ends of a single strand are called the "3' (three-prime) end" and the "5' (five-prime) end". By convention, a strand is named and written in the 5' to 3' direction. The two strands of the helix are antiparallel: one runs 5' to 3' and the opposite strand runs 3' to 5'. The terms 3' and 5' refer to the carbon atoms in deoxyribose to which the adjacent phosphates in the chain attach. Directionality has consequences in DNA synthesis, because DNA polymerase can synthesize DNA in only one direction, by adding nucleotides to the 3' end of a DNA strand.
The pairing of bases in DNA through hydrogen bonding means that the information contained within each strand is redundant. The nucleotides on a single strand can be used to reconstruct nucleotides on a newly synthesized partner strand.[4]
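The redundancy described above can be illustrated with a minimal Python sketch. It assumes a strand is represented as a plain string of the letters A, C, G and T written 5' to 3'; the function name `partner_strand` and the example sequence are hypothetical and chosen only for illustration.

```python
# Base-pairing rules from the text: A pairs with T, and C pairs with G.
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}

def partner_strand(template_5to3: str) -> str:
    """Return the partner strand, written 5'->3', for a template written 5'->3'."""
    # The two strands are antiparallel, so the partner is read in reverse order.
    return "".join(PAIRS[base] for base in reversed(template_5to3.upper()))

template = "ATGGCA"                # hypothetical example sequence
partner = partner_strand(template)
print(partner)                     # TGCCAT
print(partner_strand(partner))     # ATGGCA, the original strand is recovered
```

Applying the function twice returns the original sequence, which is the sense in which either strand carries enough information to rebuild the other.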
DNA polymerases are a family of enzymes that carry out all forms of DNA replication.[6] However, a DNA polymerase can only extend an existing DNA strand paired with a template strand; it cannot begin the synthesis of a new strand. To begin synthesis, a short fragment of DNA or RNA, called a primer, must be created and paired with the template DNA strand.
DNA polymerase then synthesizes a new strand of DNA by extending the 3' end of an existing nucleotide chain, adding new nucleotides matched to the template strand one at a time via the creation of phosphodiester bonds. The energy for DNA polymerization comes from two of the three phosphates attached to each unincorporated nucleotide. (Free nucleotides carrying three phosphate groups are called nucleoside triphosphates.) When a nucleotide is added to a growing DNA strand, two of the phosphates are released as pyrophosphate, and the energy released drives the formation of a phosphodiester bond that attaches the remaining phosphate to the growing chain. The energetics of this process also help explain the directionality of synthesis: if DNA were synthesized in the 3' to 5' direction, the energy for the process would have to come from the 5' end of the growing strand rather than from free nucleotides.
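The primer requirement and the one-at-a-time extension of the 3' end can be sketched as follows. This is a simplified model, not a description of any real enzyme: the function name `extend_primer` is hypothetical, the template is assumed to be written 3' to 5' so that it lines up with the new strand written 5' to 3', and the primer is modeled as DNA letters even though cellular primers are usually RNA.

```python
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}

def extend_primer(template_3to5: str, primer_5to3: str) -> str:
    """Extend a primer along a template by adding bases to its 3' end."""
    if not primer_5to3:
        # A polymerase can only extend an existing strand.
        raise ValueError("synthesis cannot start without a primer")
    # The primer must already be paired with the start of the template.
    for template_base, primer_base in zip(template_3to5, primer_5to3):
        if PAIRS[template_base] != primer_base:
            raise ValueError("primer is not paired with the template")
    strand = list(primer_5to3)
    # Add one matched nucleotide at a time to the growing 3' end.
    for template_base in template_3to5[len(strand):]:
        strand.append(PAIRS[template_base])
    return "".join(strand)

print(extend_primer("TACGGT", "AT"))  # ATGCCA
```

The new strand only ever grows at its right-hand (3') end, which mirrors the single direction of synthesis described in the text.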
In general, DNA polymerases are extremely accurate, making fewer than one mistake for every 10^7 nucleotides added.[7] In addition, some DNA polymerases have proofreading ability: they can remove nucleotides from the end of a strand in order to correct mismatched bases. If synthesis ran in the 3' to 5' direction, removing the 5' nucleotide during proofreading would also remove the triphosphate end, and with it the energy source needed to add the next nucleotide.
The rate of DNA replication in a living cell was first measured as the rate of phage T4 DNA elongation in phage-infected E. coli.[8] During the period of exponential DNA increase at 30°C, the rate was 749 nucleotides per second. The mutation rate per base pair per replication during phage T4 DNA synthesis is 1.7 per 10^8.[9] Thus DNA replication is both impressively fast and accurate.
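To make these two figures concrete, the following Python sketch turns them into a copying time and an expected mutation count. The rate and mutation frequency are the values quoted above; the template length of 100,000 base pairs and the function names are assumptions made purely for the arithmetic and do not describe any particular genome.

```python
# Figures quoted in the text for phage T4 DNA synthesis in E. coli at 30 °C.
ELONGATION_RATE_NT_PER_S = 749
MUTATION_RATE_PER_BP = 1.7e-8  # 1.7 mutations per 10^8 base pairs per replication

def copy_time_seconds(length_bp: int, forks: int = 1) -> float:
    """Seconds needed to copy length_bp when `forks` forks work in parallel."""
    return length_bp / (ELONGATION_RATE_NT_PER_S * forks)

def expected_mutations(length_bp: int) -> float:
    """Expected number of mutations in one round of replication."""
    return length_bp * MUTATION_RATE_PER_BP

# Hypothetical template length, chosen only to make the numbers concrete.
length_bp = 100_000
print(f"{copy_time_seconds(length_bp):.1f} s")            # about 133.5 s
print(f"{expected_mutations(length_bp):.4f} mutations")   # about 0.0017
```

Under these assumptions a single fork would copy the template in a little over two minutes, with far less than one expected mutation per round.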
DNA replication, like all biological polymerization processes, proceeds in three enzymatically catalyzed and coordinated steps: initiation, elongation and termination.
For a cell to divide, it must first replicate its DNA.[10] This process is initiated at particular points in the DNA, known as "origins", which are targeted by proteins that initiate DNA synthesis.[3] Origins contain DNA sequences recognized by replication initiator proteins (e.g., DnaA in E. coli and the Origin Recognition Complex in yeast).[11] Sequences used by initiator proteins tend to be "AT-rich" (rich in adenine and thymine bases), because A-T base pairs have two hydrogen bonds (rather than the three formed in a C-G pair). AT-rich sequences are easier to unzip because less energy is required to break fewer hydrogen bonds.[12] Once the origin has been located, these initiators recruit other proteins and form the pre-replication complex, which unzips, or separates, the DNA strands at the origin.
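The hydrogen-bond argument can be checked with a short Python sketch that counts the bonds holding a duplex together, using the two-versus-three rule stated above. The function names and both example sequences are hypothetical, and bond count is used only as a rough proxy for how easily a stretch of DNA opens.

```python
# Hydrogen bonds per base pair, as stated in the text: A-T has 2, C-G has 3.
H_BONDS = {"A": 2, "T": 2, "C": 3, "G": 3}

def hydrogen_bonds(sequence: str) -> int:
    """Total hydrogen bonds between this strand and its partner strand."""
    return sum(H_BONDS[base] for base in sequence.upper())

def at_fraction(sequence: str) -> float:
    """Fraction of positions that are A or T."""
    if not sequence:
        raise ValueError("sequence must not be empty")
    seq = sequence.upper()
    return (seq.count("A") + seq.count("T")) / len(seq)

at_rich = "TTATATAA"  # hypothetical AT-rich stretch
gc_rich = "GGCGCGCC"  # hypothetical GC-rich stretch of the same length
print(hydrogen_bonds(at_rich), at_fraction(at_rich))  # 16 1.0
print(hydrogen_bonds(gc_rich), at_fraction(gc_rich))  # 24 0.0
```

For two stretches of equal length, the AT-rich one is held together by fewer hydrogen bonds, which is the property the text links to easier strand separation at origins.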
All known DNA replication systems require a free 3' hydroxyl group before synthesis can be initiated. (Note that the template strand is read in the 3' to 5' direction, whereas the new strand is synthesized in the 5' to 3' direction; the two are often confused.) Four distinct mechanisms for providing this 3' hydroxyl group have been described.
- All cellular life forms and many DNA viruses, phages and plasmids use a primase to synthesize a short RNA primer with a free 3′ OH group which is subsequently elongated by a DNA polymerase.
- The retroelements (including retroviruses) employ a transfer RNA that primes DNA replication by providing a free 3′ OH that is used for elongation by the reverse transcriptase.
- In the adenoviruses and the φ29 family of bacteriophages, the 3' OH group is provided by the side chain of an amino acid of the genome-attached protein (the terminal protein), to which nucleotides are added by the DNA polymerase to form a new strand.
- In the single-stranded DNA viruses (a group that includes the circoviruses, the geminiviruses, the parvoviruses and others) and also the many phages and plasmids that use the rolling circle replication (RCR) mechanism, the RCR endonuclease creates a nick in the genome strand (single-stranded viruses) or in one of the DNA strands (plasmids). The 5′ end of the nicked strand is transferred to a tyrosine residue on the nuclease, and the free 3′ OH group is then used by the DNA polymerase for new strand synthesis.
The first of these mechanisms is the best known and is the one used by cellular organisms. In this mechanism, once the two strands are separated, primase adds RNA primers to the template strands. The leading strand receives one RNA primer, while the lagging strand receives several. The leading strand is extended continuously from its primer by DNA polymerase, while the lagging strand is extended discontinuously from each primer, forming Okazaki fragments. RNase removes the primer RNA fragments, and another DNA polymerase fills the resulting gaps. When this is complete, a single nick remains on the leading strand and several nicks remain on the lagging strand. DNA ligase seals these nicks, completing the newly replicated DNA molecule.
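The discontinuous synthesis of the lagging strand can be pictured with a small bookkeeping sketch in Python. It only counts fragments and junctions and models no enzymes; the function name `okazaki_fragments`, the template length and the fragment length of 1,000 nucleotides are assumptions for illustration and are not taken from the text.

```python
def okazaki_fragments(lagging_length: int, fragment_length: int) -> list:
    """Split a lagging-strand template into (start, end) spans, one per primer."""
    if lagging_length < 0 or fragment_length <= 0:
        raise ValueError("lengths must be positive")
    return [
        (start, min(start + fragment_length, lagging_length))
        for start in range(0, lagging_length, fragment_length)
    ]

# Hypothetical numbers: a 3,500-nucleotide template copied in 1,000-nucleotide pieces.
fragments = okazaki_fragments(lagging_length=3500, fragment_length=1000)
primers_needed = len(fragments)             # one RNA primer starts each fragment
junctions = max(len(fragments) - 1, 0)      # joins between neighbouring fragments

print(fragments)       # [(0, 1000), (1000, 2000), (2000, 3000), (3000, 3500)]
print(primers_needed)  # 4
print(junctions)       # 3
```

In this toy model the leading strand would correspond to a single span started by one primer, while each additional lagging-strand fragment adds another primer to remove and another junction for ligase to seal.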
The primase used in this process differs significantly between bacteria and archaea/eukaryotes. Bacteria use a primase belonging to the DnaG protein superfamily which contains a catalytic domain of the TOPRIM fold type. The TOPRIM fold contains an α/β core with four conserved strands in a Rossmann-like topology. This structure is also found in the catalytic domains of topoisomerase Ia, topoisomerase II, the OLD-family nucleases and DNA repair proteins related to the RecR protein.
The primase used by archaea and eukaryotes, in contrast, contains a highly derived version of the RNA recognition motif (RRM). This primase is structurally similar to many viral RNA-dependent RNA polymerases, reverse transcriptases, cyclic nucleotide generating cyclases and DNA polymerases of the A/B/Y families that are involved in DNA replication and repair. All these proteins share a catalytic mechanism of di-metal-ion-mediated nucleotide transfer, in which two acidic residues, located at the end of the first strand and between the second and third strands of the RRM-like unit respectively, chelate two divalent cations.
As DNA synthesis continues, the original DNA strands continue to unwind on each side of the bubble, forming a replication fork with two prongs. In bacteria, which have a single origin of replication on their circular chromosome, this process eventually creates a "theta structure" (resembling the Greek letter theta: θ). In contrast, eukaryotes have longer linear chromosomes and initiate replication at multiple origins within these.[13]
| Enzyme | Function in DNA replication |
| --- | --- |
| DNA helicase | Also known as helix destabilizing enzyme. Unwinds the DNA double helix at the replication fork. |
| DNA polymerase | Builds a new DNA strand by adding nucleotides in the 5' to 3' direction. Also performs proofreading and error correction. |
| DNA clamp | A protein which prevents DNA polymerase III from dissociating from the DNA parent strand. |
| Single-strand binding (SSB) proteins | Bind to ssDNA and prevent the DNA double helix from re-annealing after DNA helicase unwinds it, thus maintaining the strand separation. |
| Topoisomerase | Relaxes the DNA from its supercoiled nature. |
| DNA gyrase | Relieves the strain of unwinding by DNA helicase; this is a specific type of topoisomerase. |
| DNA ligase | Seals nicks in the DNA backbone and joins the Okazaki fragments of the lagging strand. |
| Primase | Provides a starting point of RNA (or DNA) for DNA polymerase to begin synthesis of the new DNA strand. |
| Telomerase | Lengthens telomeric DNA by adding repetitive nucleotide sequences to the ends of eukaryotic chromosomes. |
The replication fork is a structure that forms during DNA replication (within the nucleus in eukaryotes). It is created by helicases, which break the hydrogen bonds holding the two DNA strands together. The resulting structure has two branching "prongs", each one made up of a single strand of DNA. These two strands serve as the templates for the leading and lagging strands, which are created as DNA polymerase matches complementary nucleotides to the templates; the templates are properly referred to as the leading strand template and the lagging strand template.
The leading strand template is the strand of the DNA double helix that is oriented so that the replication fork moves along it in the 3' to 5' direction. This allows the new strand complementary to it to be synthesized 5' to 3', in the same direction as the movement of the replication fork.
On the leading strand, a polymerase "reads" the DNA and adds nucleotides to it continuously. This polymerase is DNA polymerase III (DNA Pol III) in prokaryotes and presumably Pol ε[7][15] in yeasts. In human cells the leading and lagging strands are synthesized by Pol α and Pol δ within the nucleus and Pol γ in the mitochondria. Pol ε can substitute for Pol δ in special circumstances.[16]
The lagging strand template is the strand of the DNA double helix that is oriented so that the replication fork moves along it in the 5' to 3' direction. Because this orientation is opposite to the working direction of DNA polymerase III, which moves along a template in the 3' to 5' direction, replication of the lagging strand is more complicated than that of the leading strand.
On the lagging strand, primase "reads" the DNA and adds RNA to it in short, separated segments. In eukaryotes, primase is intrinsic to Pol α.[17] DNA polymerase III or Pol δ lengthens the primed segments, forming Okazaki fragments. Primer removal in eukaryotes is also performed by Pol δ.[18] In prokaryotes, DNA polymerase I "reads" the fragments, removes the RNA primers using its 5' to 3' exonuclease activity (Weaver, 2005), and replaces the RNA nucleotides with DNA nucleotides (this is necessary because RNA and DNA use slightly different kinds of nucleotides). DNA ligase joins the fragments together.
As helicase unwinds DNA at the replication fork, the DNA ahead is forced to rotate. This process results in a build-up of twists in the DNA ahead.[19] This build-up would form a resistance that would eventually halt the progress of the replication fork. DNA gyrase is an enzyme that temporarily breaks the strands of DNA, relieving the tension caused by unwinding the two strands of the DNA helix; it achieves this by adding negative supercoils to the DNA helix.[20]
Bare single-stranded DNA tends to fold back on itself and form secondary structures; these structures can interfere with the movement of DNA polymerase. To prevent this, single-strand binding proteins bind to the DNA until a second strand is synthesized, preventing secondary structure formation.[21]
Clamp proteins form a sliding clamp around DNA, helping the DNA polymerase maintain contact with its template, thereby assisting with processivity. The inner face of the clamp enables DNA to be threaded through it. Once the polymerase reaches the end of the template or detects double-stranded DNA, the sliding clamp undergoes a conformational change that releases the DNA polymerase. Clamp-loading proteins are used to initially load the clamp, recognizing the junction between template and RNA primers.[2]:274-5
Within eukaryotes, DNA replication is controlled within the context of the cell cycle. As the cell grows and divides, it progresses through stages in the cell cycle; DNA replication occurs during the S phase (synthesis phase). The progress of the eukaryotic cell through the cycle is controlled by cell cycle checkpoints. Progression through checkpoints is controlled through complex interactions between various proteins, including cyclins and cyclin-dependent kinases.[22]
The G1/S checkpoint (or restriction checkpoint) regulates whether eukaryotic cells enter the process of DNA replication and subsequent division. Cells that do not proceed through this checkpoint remain in the G0 stage and do not replicate their DNA.
Replication of chloroplast and mitochondrial genomes occurs independently of the cell cycle, through the process of D-loop replication.
Most bacteria do not go through a well-defined cell cycle but instead continuously copy their DNA; during rapid growth, this can result in multiple rounds of replication occurring concurrently.[23] In E. coli, the best-characterized bacterium, DNA replication is regulated through several mechanisms, including the hemimethylation and sequestering of the origin sequence, the ratio of ATP to ADP, and the levels of the protein DnaA. All of these control the binding of initiator proteins to the origin sequences.
Because E. coli methylates GATC DNA sequences, DNA synthesis results in hemimethylated sequences. This hemimethylated DNA is recognized by the protein SeqA, which binds and sequesters the origin sequence; in addition, DnaA (required for initiation of replication) binds less well to hemimethylated DNA. As a result, newly replicated origins are prevented from immediately initiating another round of DNA replication.[24]
ATP builds up when the cell is in a rich medium, triggering DNA replication once the cell has reached a specific size. ATP competes with ADP to bind to DnaA, and the DnaA-ATP complex is able to initiate replication. A certain number of DnaA proteins are also required for DNA replication — each time the origin is copied, the number of binding sites for DnaA doubles, requiring the synthesis of more DnaA to enable another initiation of replication.
Eukaryotes initiate DNA replication at multiple points in the chromosome, so replication forks meet and terminate at many points in the chromosome; these are not known to be regulated in any particular way. Because eukaryotes have linear chromosomes, DNA replication is unable to reach the very end of the chromosomes, but ends at the telomere region of repetitive DNA close to the end. This shortens the telomere of the daughter DNA strand. This is a normal process in somatic cells. As a result, cells can only divide a certain number of times before the DNA loss prevents further division. (This is known as the Hayflick limit.) Within the germ cell line, which passes DNA to the next generation, telomerase extends the repetitive sequences of the telomere region to prevent degradation. Telomerase can become mistakenly active in somatic cells, sometimes leading to cancer formation.
Additionally, for termination to occur, the progress of the DNA replication fork must stop or be blocked. Two elements work together to achieve this: a termination site sequence in the DNA, and a protein that binds to this sequence and physically stops DNA replication from proceeding. This protein is named the DNA replication terminus site-binding protein, or Ter protein.
Because bacteria have circular chromosomes, termination of replication occurs when the two replication forks meet each other at the opposite end of the parental chromosome. E. coli regulates this process through the use of termination sequences that, when bound by the Tus protein, allow a replication fork to pass through in only one direction. As a result, the replication forks are constrained to always meet within the termination region of the chromosome.[25]
Researchers commonly replicate DNA in vitro using the polymerase chain reaction (PCR). PCR uses a pair of primers to span a target region in template DNA, and then polymerizes partner strands in each direction from these primers using a thermostable DNA polymerase. Repeating this process through multiple cycles produces amplification of the targeted DNA region. At the start of each cycle, the mixture of template and primers is heated, separating the newly synthesized molecule and template. Then, as the mixture cools, both of these become templates for annealing of new primers, and the polymerase extends from these. As a result, the number of copies of the target region doubles each round, increasing exponentially.[26]
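The doubling described above can be written as copies = initial × 2^cycles. The Python sketch below computes this; the function name `pcr_copies` is hypothetical, and the optional `efficiency` parameter (the fraction of templates copied in each cycle) is an added assumption that reduces to the perfect doubling of the text when left at 1.0.

```python
def pcr_copies(initial_copies: int, cycles: int, efficiency: float = 1.0) -> float:
    """Idealized number of target copies after a given number of PCR cycles."""
    if cycles < 0 or not 0.0 <= efficiency <= 1.0:
        raise ValueError("cycles must be >= 0 and efficiency between 0 and 1")
    # Each cycle multiplies the copy number by (1 + efficiency); 2 when perfect.
    return initial_copies * (1.0 + efficiency) ** cycles

for cycles in (1, 10, 30):
    print(cycles, pcr_copies(1, cycles))
```

Starting from a single template, 30 perfect cycles give 2^30 copies, a little over one billion, which shows how quickly exponential amplification grows.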
- Imperfect DNA replication results in mutations. Berg JM, Tymoczko JL, Stryer L, Clarke ND (2002). "Chapter 27: DNA Replication, Recombination, and Repair". Biochemistry. W.H. Freeman and Company. ISBN 0-7167-3051-0.
- Alberts B, Johnson A, Lewis J, Raff M, Roberts K, Walter P (2002). "Chapter 5: DNA Replication, Repair, and Recombination". Molecular Biology of the Cell. Garland Science. ISBN 0-8153-3218-1.
- Berg JM, Tymoczko JL, Stryer L, Clarke ND (2002). "Chapter 27, Section 4: DNA Replication of Both Strands Proceeds Rapidly from Specific Start Sites". Biochemistry. W.H. Freeman and Company. ISBN 0-7167-3051-0.
- Alberts B, et al. (2002). Molecular Biology of the Cell (4th ed.). Garland Science. pp. 238-240. ISBN 0-8153-3218-1.
- Allison, Lizabeth A. Fundamental Molecular Biology. Blackwell Publishing. 2007. p.112 ISBN 978-1-4051-0379-4
- Berg JM, Tymoczko JL, Stryer L, Clarke ND (2002). Biochemistry. W.H. Freeman and Company. ISBN 0-7167-3051-0. Chapter 27, Section 2: DNA Polymerases Require a Template and a Primer
- McCulloch SD, Kunkel TA (January 2008). "The fidelity of DNA synthesis by eukaryotic replicative and translesion synthesis polymerases". Cell Research 18 (1): 148–61. doi:10.1038/cr.2008.4. PMID 18166979.
- McCarthy D, Minner C, Bernstein H, Bernstein C (1976). "DNA elongation rates and growing point distributions of wild-type phage T4 and a DNA-delay amber mutant". J Mol Biol 106 (4): 963–81. PMID 789903.
- Drake JW (1970) The Molecular Basis of Mutation. Holden-Day, San Francisco ISBN 0816224501 ISBN 978-0816224500
- Alberts B, Johnson A, Lewis J, Raff M, Roberts K, Walter P (2002). Molecular Biology of the Cell. Garland Science. ISBN 0-8153-3218-1. Chapter 5: DNA Replication Mechanisms
- Weigel C, Schmidt A, Rückert B, Lurz R, Messer W (November 1997). "DnaA protein binding to individual DnaA boxes in the Escherichia coli replication origin, oriC". The EMBO Journal 16 (21): 6574–83. doi:10.1093/emboj/16.21.6574. PMC 1170261. PMID 9351837.
- Lodish H, Berk A, Zipursky LS, Matsudaira P, Baltimore D, Darnell J (2000). Molecular Cell Biology. W. H. Freeman and Company. ISBN 0-7167-3136-3. Section 12.1: General Features of Chromosomal Replication: Three Common Features of Replication Origins
- Huberman JA, Riggs AD (1968). "On the mechanism of DNA replication in mammalian chromosomes". J Mol Biol 32 (2): 327–341. PMID 5689363.
- Griffiths AJF, Wessler SR, Lewontin RC, Carroll SB (2008). Introduction to Genetic Analysis. W. H. Freeman and Company. ISBN 0-7167-6887-9. Chapter 7: DNA: Structure and Replication, pp. 283-290
- Pursell, Z.F. et al. (2007). "Yeast DNA Polymerase ε Participates in Leading-Strand DNA Replication". Science 317 (5834): 127–130. doi:10.1126/science.1144067. PMC 2233713. PMID 17615360.
- Hansen, Barbara (2011). Biochemistry and Medical Genetics: Lecture Notes. Kaplan Medical. p. 21.
- Elizabeth R. Barry; Stephen D. Bell (12/2006). "DNA Replication in the Archaea". Microbiology and Molecular Biology Reviews 70 (4): 876–887. doi:10.1128/MMBR.00029-06. PMC 1698513. PMID 17158702.
- Rossi, Marie Louise (2009). Distinguishing the pathways of primer removal during eukaryotic Okazaki fragment maturation (PhD thesis). School of Medicine and Dentistry, University of Rochester. Faculty advisor: Robert A. Bambara. http://hdl.handle.net/1802/6537.
- Alberts B, Johnson A, Lewis J, Raff M, Roberts K, Walter P (2002). Molecular Biology of the Cell. Garland Science. ISBN 0-8153-3218-1. DNA Replication Mechanisms: DNA Topoisomerases Prevent DNA Tangling During Replication
- "DNA gyrase: structure and function". Crit Rev Biochem Mol Biol. 1991.
- Alberts B, Johnson A, Lewis J, Raff M, Roberts K, Walter P (2002). Molecular Biology of the Cell. Garland Science. ISBN 0-8153-3218-1. DNA Replication Mechanisms: Special Proteins Help to Open Up the DNA Double Helix in Front of the Replication Fork
- Alberts B, Johnson A, Lewis J, Raff M, Roberts K, Walter P (2002). Molecular Biology of the Cell. Garland Science. ISBN 0-8153-3218-1. Intracellular Control of Cell-Cycle Events: S-Phase Cyclin-Cdk Complexes (S-Cdks) Initiate DNA Replication Once Per Cycle
- Tobiason DM, Seifert HS (2006). "The Obligate Human Pathogen, Neisseria gonorrhoeae, Is Polyploid". PLoS Biology 4 (6): e185. doi:10.1371/journal.pbio.0040185. PMC 1470461. PMID 16719561.
- Slater S, Wold S, Lu M, Boye E, Skarstad K, Kleckner N (September 1995). "E. coli SeqA protein binds oriC in two different methyl-modulated reactions appropriate to its roles in DNA replication initiation and origin sequestration". Cell 82 (6): 927–36. doi:10.1016/0092-8674(95)90272-4. PMID 7553853.
- Brown TA (2002). Genomes. BIOS Scientific Publishers. ISBN 1-85996-228-9. Section 13.2.3: Termination of replication
- Saiki, RK; Gelfand DH, Stoffel S, Scharf SJ, Higuchi R, Horn GT, Mullis KB, Erlich HA (1988). "Primer-directed enzymatic amplification of DNA with a thermostable DNA polymerase". Science 239 (4839): 487–91. doi:10.1126/science.2448875. PMID 2448875.