Mismatch Repair

Organisms can repair mismatched base pairs in their DNA. Mismatches arise by several processes, one of the most important being replication errors. In this case, the correct base of the mismatched pair is located in the parental strand of the newly replicated DNA, and proper correction of the mismatch helps maintain the fidelity of the genetic information. Here is a summary of mismatch repair of newly synthesized DNA in E. coli (Figure K1).

Quick summary of proteins involved in mismatch repair

E. coli Proteins involved in Mismatch Repair
Protein: Function
dam methylase (DNA adenine methylase): Methylates adenine to create 6-methyladenine in the sequence GATC. In wild-type E. coli cells, the DNA is normally fully methylated (both strands are methylated). Newly synthesized strands are temporarily unmethylated after DNA replication, and DNA with one strand methylated and the other unmethylated is called hemi-methylated. dam methylase quickly methylates hemi-methylated DNA. Fully unmethylated DNA occurs only in cells lacking dam methylase activity, and those cells have increased rates of mutation (mutator phenotype) and increased recombination frequencies.
MutH: Endonuclease that cleaves the unmethylated strand just 5' to the G in the sequence GATC (that is, N | GATC), leaving a 3'-OH and 5'-P at the cleavage site. Requires MutL and MutS to activate its latent endonuclease activity.
MutL: Adds to the complex of MutS at the mismatch in an ATP-dependent (but not hydrolysis-dependent) step. Acts as a "molecular matchmaker" and uses ATP hydrolysis to bring MutS and MutH together and to stimulate MutH endonuclease activity. Also binds to and loads helicase II.
MutS: Binds to all mismatches except C-C; also binds to small insertion or deletion mismatches in which one strand contains one, two, or three extra nucleotides; heteroduplexes with four extra nucleotides are weakly repaired, but larger heterologies do not appear to be recognized.
helicase II: Also known as the mutU/uvrD gene product. Requires MutS and MutL to load on at the endonucleolytic cleavage site ("nick"). Unwinds the incised strand to make it sensitive to the appropriate single-strand specific exonuclease activity.
exonuclease VII: Hydrolyzes single-stranded DNA in the 5'-to-3' direction.
RecJ: Hydrolyzes single-stranded DNA in the 5'-to-3' direction.
exonuclease I: Hydrolyzes single-stranded DNA in the 3'-to-5' direction.
DNA polymerase III holoenzyme: The replicative DNA polymerase in E. coli.
SSB: Single-strand binding protein.
DNA ligase: Uses NAD+ to form phosphodiester bonds at "nicks".

The interaction of E. coli MutS and MutL with heteroduplex DNA has been visualized by electron microscopy. In a reaction dependent on ATP hydrolysis, complexes between a MutS dimer and a DNA heteroduplex are converted to protein-stabilized, alpha-shaped loop structures, with the mismatch in most cases located within the DNA loop (Figure 25.16). These observations suggest a translocation mechanism in which a MutS dimer bound to a mismatch subsequently leaves this site by ATP-dependent movement that is in most cases bidirectional from the mispair. The rate of MutS-mediated DNA loop growth is enhanced by MutL. When both proteins are present, both are found at the base of alpha-loop structures, and both can remain associated with excision intermediates produced in later stages of the reaction.

Review of Mismatch Repair

For use of BB451/551 students only. Excerpted with modifications from P. Modrich and R. Lahue (1996) Mismatch repair in replication fidelity, genetic recombination, and cancer biology. Annu. Rev. Biochem. 65: 101-133.

Mismatch repair stabilizes the cellular genome by correcting DNA replication errors and by blocking recombination events between divergent sequences. The reaction responsible for strand-specific correction of mispaired bases has been highly conserved during evolution, and homologs of bacterial mutS and mutL, which play key roles in mismatch recognition and initiation of repair, have been identified in yeast and mammalian cells. Inactivation of genes encoding these activities results in a large increase in spontaneous mutability, and in the case of mice and men, predisposition to tumor development.

Bacteria and eukaryotic cells possess several distinct mismatch repair pathways, but we intend to focus on the MutS- and MutL-dependent, so-called long-patch system. This pathway is characterized by broad mismatch specificity and is believed to be responsible for correcting DNA biosynthetic errors and processing recombination heteroduplexes that contain mismatched base pairs.

The prototypic long-patch mismatch correction system is the E. coli methyl-directed pathway, the inactivation of which results in a strong mutator phenotype. Biological observations indicating action of the system on newly replicated DNA confirmed the suspicion that this pathway stabilizes the bacterial genome by correcting DNA biosynthetic errors. Figure AT illustrates the mechanism of the mismatch-provoked methyl-directed excision reaction.

The strand specificity necessary for repair of DNA biosynthetic errors is provided by patterns of adenine methylation in GATC sequences. Since this is a postsynthetic modification, recently synthesized sequences exist in a transiently unmodified state, and the absence of methylation on newly synthesized DNA targets correction to this strand. The methyl-directed pathway has broad mismatch specificity. Although the efficiency of repair of certain transversion mismatches can depend on sequence context, the only base-base mispair for which correction has not been reported is C-C. The system also repairs small insertion/deletion mismatches in which one strand contains one, two, or three extra nucleotides; heteroduplexes with four extra nucleotides are weakly repaired, but larger heterologies do not appear to be recognized.
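The mismatch specificity described above can be written out as a small lookup. This is only a sketch: the function names are hypothetical, and the categories flatten what the text says more carefully (correction of C-C "has not been reported", and the efficiency for some transversion mismatches depends on sequence context, which is ignored here).

```python
# Sketch of the substrate range of the methyl-directed pathway as stated in
# the text. Function names are illustrative, not from the source.

WATSON_CRICK = {("A", "T"), ("T", "A"), ("G", "C"), ("C", "G")}
BASES = {"A", "C", "G", "T"}


def base_mispair_repaired(base1: str, base2: str) -> bool:
    """Return True if correction of this base-base mispair has been reported.

    Per the text, C-C is the only base-base mispair for which correction
    has not been reported. Sequence-context effects are not modeled.
    """
    pair = (base1.upper(), base2.upper())
    if pair[0] not in BASES or pair[1] not in BASES:
        raise ValueError("bases must be one of A, C, G, T")
    if pair in WATSON_CRICK:
        raise ValueError(f"{pair[0]}-{pair[1]} is a Watson-Crick pair, not a mismatch")
    return pair != ("C", "C")


def indel_repair_class(extra_nucleotides: int) -> str:
    """Classify an insertion/deletion heteroduplex by its number of extra nucleotides."""
    if extra_nucleotides < 1:
        raise ValueError("an insertion/deletion mismatch has at least one extra nucleotide")
    if extra_nucleotides <= 3:
        return "repaired"
    if extra_nucleotides == 4:
        return "weakly repaired"
    return "not recognized"
```

For example, base_mispair_repaired("G", "T") is True, base_mispair_repaired("C", "C") is False, and indel_repair_class(4) gives "weakly repaired".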

A single GATC sequence is sufficient to direct mismatch repair. The distance separating the strand signal and the mismatch can be substantial: A GATC site can direct correction of a mispair a kilobase (kb) away, but the strength of the strand signal is greatly reduced when separation distance exceeds two kb. Methyl-directed repair initiates via a mismatch-provoked incision of the unmethylated strand at a GATC sequence in a reaction that requires the mutS, mutL, and mutH gene products and is dependent on ATP hydrolysis. MutS, which exists in solution as oligomers of a 95 kD polypeptide, binds to the mismatch, and MutL, a homodimer of a 68 kD polypeptide, adds to this complex in a reaction that depends on ATP but not on ATP hydrolysis. Interaction of MutS and MutL with the heteroduplex activates a latent endonuclease associated with the 25 kD MutH protein, which cleaves the unmethylated strand at a GATC site. The resulting strand break apparently serves as the primary signal that directs correction to the unmethylated strand, because model heteroduplexes that contain a strand-specific incision, but lack a GATC sequence, are subject to a nick-directed MutH-independent reaction that otherwise appears identical to methyl-directed repair. This finding suggests that strand discontinuities, other than those generated by GATC cleavage, might serve to target mismatch repair to new DNA strands, and evidence for involvement of supplemental signals in mutation avoidance is available. The nature of the additional strand signal(s) has not been determined, but the 3'-terminus of the leading strand or discontinuities on the lagging strand might serve in this regard.
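To make the geometry of the strand signal concrete, the sketch below locates GATC sequences on a strand and reports how far each lies from a mismatch. It assumes the strand passed in is the unmethylated (newly synthesized) strand written 5' to 3' and that every GATC on it is still unmethylated. The function names and the 0-based coordinates are illustrative choices, not part of the source, and the code says nothing about which site MutH actually uses.

```python
# Sketch: candidate MutH incision sites on an unmethylated strand.
# Assumption: `strand_5to3` is the unmethylated strand, written 5'-to-3'.


def muth_cut_sites(strand_5to3: str) -> list[int]:
    """Return the 0-based index of the G of every GATC in the strand.

    MutH cleaves just 5' to the G (N | GATC), so a returned index i means
    the cut falls between positions i - 1 and i.
    """
    seq = strand_5to3.upper()
    sites = []
    start = seq.find("GATC")
    while start != -1:
        sites.append(start)
        start = seq.find("GATC", start + 1)
    return sites


def gatc_distances(strand_5to3: str, mismatch_index: int) -> list[tuple[int, int]]:
    """Pair each candidate cut site with its distance (in nucleotides) from the mismatch."""
    return [(site, abs(site - mismatch_index)) for site in muth_cut_sites(strand_5to3)]
```

With the strand "AAGATCTTTTTTTTGATCAA" and a mismatch at index 9, gatc_distances returns one site on each side of the mismatch, which is the situation the bidirectional excision step has to handle. Real separations are far larger: the text puts useful strand signals up to about a kilobase away, weakening beyond two kb.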

Excision, which is strictly exonucleolytic, initiates at the strand break and proceeds toward the mismatch to terminate at several discrete sites beyond the mispair. A surprising feature of methyl-directed repair is that the strand signal may reside on either side of the mismatch, reflecting a bidirectional capability of the system. Excision initiating from either side of the mismatch requires MutS, MutL, and the mutU/uvrD gene product helicase II, but distinct exonucleases are required in the two cases. Excision from the 5' side of the mispair depends on RecJ exonuclease or exonuclease VII, both of which possess 5'-to-3' hydrolytic activity, whereas excision from the 3' side depends on exonuclease I, which hydrolyzes DNA with 3'-to-5' directionality. Inasmuch as each of these exonucleases is single-strand specific, helicase II is thought to unwind the incised strand in order to render it sensitive to the appropriate exonuclease. Since helicase II is loaded into the heteroduplex at the site of the strand break in a reaction dependent on MutS and MutL, excision may involve concerted unwinding and hydrolysis.
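The choice of exonuclease follows directly from which side of the mismatch the strand break lies on, and can be sketched as a small function. The coordinate convention (positions numbered 5' to 3' along the incised strand) and the function name are assumptions made for illustration.

```python
# Sketch: which excision activities the text assigns to each nick position.
# Assumption: indices count nucleotides 5'-to-3' along the incised strand.


def excision_plan(nick_index: int, mismatch_index: int) -> dict:
    """Describe excision for a nick on the 5' or 3' side of the mismatch."""
    if nick_index == mismatch_index:
        raise ValueError("the nick must lie to one side of the mismatch")

    # Required regardless of which side the strand break is on.
    common = ("MutS", "MutL", "helicase II")

    if nick_index < mismatch_index:
        # Nick is 5' of the mispair: excision runs 5'-to-3' toward the mismatch.
        return {
            "nick_side": "5'",
            "direction": "5'-to-3'",
            "exonucleases": ("RecJ", "exonuclease VII"),  # either one suffices
            "also_required": common,
        }

    # Nick is 3' of the mispair: excision runs 3'-to-5' toward the mismatch.
    return {
        "nick_side": "3'",
        "direction": "3'-to-5'",
        "exonucleases": ("exonuclease I",),
        "also_required": common,
    }
```

Calling excision_plan(2, 9) selects the 5'-to-3' exonucleases (RecJ or exonuclease VII), whereas excision_plan(14, 9) selects exonuclease I. In both cases helicase II is needed because all three exonucleases are single-strand specific.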

The last step of the methyl-directed reaction is gap repair by DNA polymerase III holoenzyme and ligation of the repair product. The requirement for polymerase III holoenzyme in the purified system is quite specific because DNA polymerase I, T7 DNA polymerase, T4 DNA polymerase, and AMV reverse transcriptase will not substitute. The molecular basis of this specificity for the replication polymerase is not understood but could be indicative of a special affiliation of the mismatch repair system with the replication apparatus.

Textbook description of mismatch repair (pages 942-944)

Mismatches, or non-Watson-Crick base pairs in a DNA duplex, can arise through replication errors, through deamination of 5-methylcytosine in DNA to yield thymine, or through recombination between DNA segments that are not completely homologous. We best understand the correction of replication errors, so that is what we describe here.

If DNA polymerase introduces an incorrect nucleotide, creating a non-Watson-Crick base pair, the error is normally corrected by 3' exonucleolytic proofreading. If the error is not corrected immediately, the fully replicated DNA will contain a mismatch at that site. This error can be corrected by another process, called mismatch repair. In E. coli the proteins that participate include the products of genes mutH, mutL, and mutS. (Mutations in any of these genes generate a mutator phenotype, meaning that they enhance spontaneous mutagenesis at many loci; hence, the name mut.) Another required gene product, originally called MutU, has now been identified as DNA helicase II (also identified as the product of the uvrD gene).

The mismatch correction system scans newly replicated DNA, looking for both mismatched bases and single-base insertions or deletions. When it finds a mismatch, part of one strand containing the mismatched region is cut out and replaced (Figure 25.16). How does the mismatch repair system recognize the right strand to repair? If it chose either strand randomly, it would choose incorrectly half the time and there would be no gain in replication accuracy. The answer is that the mismatch repair enzymes identify the newly replicated strand, because for a short period that DNA is unmethylated. In E. coli the sequence GATC is crucial, because that is the site methylated soon after replication, by action of the product of the dam gene (DNA adenine methylase). The mismatch repair enzymes look for GATC sequences that are not methylated. Recognition of an unmethylated GATC can target that strand for mismatch correction at a site as far as 1 kbp or more away from the GATC site, in either direction. Once the methylation system has acted on all GATC sites in the daughter strand, it is too late for the mismatch repair system to recognize the more recently synthesized DNA strand, and any advantage in total DNA replication fidelity is lost. When the system functions properly, it has the effect of increasing overall replication fidelity by about 100-fold, from about 1 error in 10^8 base pairs replicated to about 1 in 10^10.
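The fidelity figures quoted above can be checked with a few lines of arithmetic. The two error rates come from the text. The genome size of about 4.6 million base pairs for E. coli is an assumption added here for illustration, and the function names are hypothetical. Exact fractions are used so the 100-fold ratio comes out without floating-point rounding.

```python
from fractions import Fraction

# Error rates per base pair replicated, as quoted in the text.
ERROR_RATE_WITHOUT_REPAIR = Fraction(1, 10**8)
ERROR_RATE_WITH_REPAIR = Fraction(1, 10**10)

# Assumption (not from the text): approximate E. coli genome size in base pairs.
ASSUMED_GENOME_BP = 4_600_000


def fold_improvement(rate_before: Fraction, rate_after: Fraction) -> Fraction:
    """Factor by which the error rate drops."""
    return rate_before / rate_after


def expected_errors(genome_bp: int, error_rate: Fraction) -> Fraction:
    """Expected number of errors in one round of replication of the genome."""
    return genome_bp * error_rate
```

Under these assumptions, fold_improvement gives exactly 100, and the expected number of errors per genome replication falls from about 0.046 to about 0.00046, that is, from roughly one error in every 22 replications to one in every 2,200.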

The protein products of genes mutH, mutL, and mutS have been isolated by cloning and overexpression of the genes (see Tools of Biochemistry 25B). Repair of mismatches in vitro occurs in the presence of these purified proteins, plus helicase II and the E. coli single-strand DNA-binding protein (SSB). The MutH protein binds specifically to DNA at the GATC sequence, and it has a nuclease activity that cleaves just 5' to the G in an unmethylated GATC sequence. The MutS protein binds to DNA specifically at the site of the mismatch. The MutL protein, which has been termed a "molecular matchmaker," somehow uses the energy of ATP hydrolysis to bring together the MutH and MutS proteins and stimulate the MutH nuclease activity.

Current attention is focused on the mechanism of signal transduction, as well as the repair process itself. How do the proteins communicate with each other over the several hundred base pairs that may lie between the GATC sequence and the site of the mismatch? A mechanism akin to type I restriction endonucleases has been suggested. Binding of one protein or the other would stimulate a DNA-looping process, with the protein remaining bound at the first site until the protein bound at the other site was reached.

Eukaryotic cells also possess mismatch repair systems. The nature of the strand selection mechanism remains to be clarified. It is not a methyladenine residue, since this substance is not found in eukaryotic DNA. Great excitement occurred in late 1993, with reports that certain forms of human colon cancer are triggered by genetic defects in a mismatch repair protein related to the E. coli MutS protein.