Generating plasmid clones of DNA sequences is one of the most common and important methods in biology research. The method is indispensable because, with modest effort, propagating a single DNA plasmid that was transformed into E. coli can produce millions of exact copies, or clones, of that DNA. However, it has also been shown that the fidelity of DNA replication in E. coli is not perfect. Secondary structure and damage to the DNA through oxidation and other sources can lead to errors in replication. When these errors occur early during colony propagation, they can become a significant proportion of the plasmid sequence population by the time colonies are selected for sequence purification.
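To illustrate why the timing of a replication error matters, the following is a minimal sketch under deliberately simplified assumptions: a colony grows from a single cell by synchronous doubling, the error arises in exactly one cell, mutant and non-mutant cells grow at the same rate, and each cell is treated as carrying a single plasmid sequence. The function name `mutant_fraction` is hypothetical and introduced only for this illustration. Real plasmid populations are multi-copy and will not follow this model exactly.

```python
def mutant_fraction(divisions_before_error: int) -> float:
    """Fraction of the final population descended from one mutant cell.

    Assumption (illustrative only): the colony starts from one cell and
    doubles synchronously, so after n divisions there are 2**n cells.
    If the error arises in one of those cells, its descendants make up
    1 / 2**n of every later generation.
    """
    if divisions_before_error < 0:
        raise ValueError("divisions_before_error must be non-negative")
    return 1.0 / (2 ** divisions_before_error)


# Print the mutant share for errors arising after 0 to 5 divisions.
for n in range(6):
    print(f"error after {n} divisions: {mutant_fraction(n):.1%} of population")
```

Under these assumptions, an error arising within the first few divisions yields a subpopulation of several percent or more, while later errors are diluted quickly.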
Sanger Sequencing vs. Next Generation Sequencing
Because Sanger sequencing relies on a consensus signal generated from a population of molecules, small variations within the population can be obscured within the signal. In contrast, next generation sequencing (NGS) analyzes sequence reads from thousands of individual molecules in a massively parallel fashion. This makes it possible to detect small variant subpopulations within populations of molecules.
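The difference between a consensus call and per-molecule counting can be sketched with toy data. The example below is an illustration only, not an NGS analysis pipeline: the reads are made-up, pre-aligned strings of equal length, and the names `base_frequencies` and `consensus` are hypothetical helpers. A majority-vote consensus stands in loosely for a population-level signal, while counting bases across individual reads exposes the minority variant.

```python
from collections import Counter


def base_frequencies(reads, position):
    """Return the fraction of reads carrying each base at one position."""
    counts = Counter(read[position] for read in reads)
    total = sum(counts.values())
    return {base: n / total for base, n in counts.items()}


def consensus(reads):
    """Return the majority base at every position (reads must be equal length)."""
    length = len(reads[0])
    return "".join(
        Counter(read[i] for read in reads).most_common(1)[0][0]
        for i in range(length)
    )


# Toy data (assumed): 93 reads match the reference and 7 reads carry
# an A-to-T change at position 4 (0-based).
reference = "ACGTACGT"
variant = "ACGTTCGT"
reads = [reference] * 93 + [variant] * 7

freqs = base_frequencies(reads, 4)
print("consensus:", consensus(reads))
print("frequencies at position 4:", freqs)
```

In this toy case the consensus is identical to the reference sequence, so the variant is invisible at that level, while the per-read counts report it at 7 of 100 reads.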
Examples of sequencing results from samples with contaminating subpopulations can be seen in Figures 1 and 2. In Figure 1, the Sanger trace data for a synthetic gene allows detection of a mixed population of plasmid sequences, visible as strong overlapping peaks of different bases (Figure 1B). A quantitative analysis of the same plasmid preparation using NGS shows that this subpopulation is actually ~17% of the total population (Figure 1A).
Figure 2 shows data from a cloned and purified plasmid that contains a subpopulation with one or more sequence mutations, comprising ~7% of the total sequence population. While detectable by NGS analysis (Figure 2A), this contamination is not visible by Sanger sequencing of the same DNA preparation (Figure 2B). While many subpopulations are identifiable by Sanger sequencing, surprisingly large subpopulations can go undetected. Minor subpopulations, however, are easily identified in NGS data.
Next Generation Sequencing (NGS) is more sensitive for detection of subpopulations, but Sanger sequencing is still useful
While Sanger sequencing has been the standard for sequence confirmation for decades, these examples illustrate the superior ability of NGS analysis to detect subpopulations within cloned plasmid preparations that would go undetected by Sanger sequencing.
It is important to note that Sanger sequencing continues to be a useful tool for many labs and for molecular cloning applications. However, as with any method, it is necessary to understand the limitations of the technology and to verify important results through replication and/or with other methods.