The CRISPR-Cas9 system is at the forefront of genome editing research. This nuclease system is often used to target large genomes, where it cannot retain the high specificity seen in smaller genomes. This creates a need for a protocol that identifies and quantifies on- and off-target editing. The traditional method for evaluating editing efficiency is a DNA mismatch detection assay, but this method is purely qualitative. Another method is Sanger sequencing combined with a software analysis program. Depending on the number of edits, confirming every edit by Sanger sequencing could require an impractically large number of primers, reactions, and interpretations. Next generation sequencing (NGS) can provide more detailed and quantitative information about the editing events; however, its perceived high cost can be prohibitive. The rhAmpSeq Amplicon Sequencing System provides a cost-effective method for NGS detection of editing events. The rhAmpSeq rapid library preparation protocol allows amplification of up to 5000 targets in a single PCR, which enables simultaneous testing of multiple variables. The rhAmpSeq system detects edits with high specificity.
rhAmpSeq amplicon sequencing is a rapid and accurate targeted sequencing method based on RNase H2–dependent (rhAmp) PCR technology. Proprietary blocked primers (rhAmp primers), which contain RNA bases, hybridize to their target. The resulting DNA:RNA heteroduplex is recognized by the thermostable RNase H2 enzyme, and perfectly matched primers are activated by RNase H2 cleavage at the DNA:RNA duplex. The activated rhAmp primers are then extended to create amplicons containing the target regions of interest. These amplicons can then be sequenced by next generation sequencing on Illumina instruments. rhAmpSeq technology can be used to confirm the efficiency and specificity of CRISPR-Cas9 edits under varying experimental conditions in a single, multiplex reaction. Here, we evaluate how sequencing coverage affects the accurate quantification of editing.
Decreasing the number of reads in a sequencing experiment can save time and money, but at the cost of accuracy. The required accuracy of the results depends on the experiment. Because we are testing the coverage required to quantify editing accurately, we chose to evaluate targets that had varying fractions of edited DNA. For example, an editing level of 0.6% means that 0.6% of the total DNA has been edited. The percent of DNA edited affects the coverage needed to reach a given level of accuracy. To determine the relationship between accuracy and sequence coverage, 27 targets were sequenced with the rhAmpSeq system. The total number of reads per sample varied (1936–109,771 reads). From the total number of reads, 4–1936 read pairs were subsampled to observe the editing at each coverage level. Several different editing levels were chosen to test how well editing could be quantified at a given coverage. Table 1 shows 3 editing levels. Figure 1 shows the whole range of editing levels; for clarity, Figure 2 displays 4 of them. As coverage (the number of subsampled reads at the editing site) increased, the overall standard deviation and the percent variability decreased. The smaller the percent of total DNA edited, the more observations, or reads, are needed to quantify the editing accurately (Figure 2). In this example, observing editing at 0.6% requires ~1000 reads at each target. For any given sequencing experiment, the percent of DNA that is edited will determine the required coverage.
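The subsampling procedure described above can be illustrated with a small simulation. This is a minimal sketch, not the analysis used to produce the data here: it assumes each subsampled read is edited independently with a fixed probability (a binomial model), and the function name `relative_sd` and its parameters are hypothetical.

```python
import random
import statistics


def relative_sd(edit_fraction, coverage, replicates=1000, rng=None):
    """Estimate the SD of the observed editing level, as a fraction of the
    true editing level, across repeated subsamples of `coverage` reads.

    Assumption (not from the source): every read is edited independently
    with probability `edit_fraction`.
    """
    if rng is None:
        rng = random.Random(0)
    estimates = []
    for _ in range(replicates):
        # Draw one subsample and count how many of its reads carry an edit.
        edited = sum(rng.random() < edit_fraction for _ in range(coverage))
        estimates.append(edited / coverage)
    # Normalize by the editing level measured with all available reads.
    return statistics.pstdev(estimates) / edit_fraction


if __name__ == "__main__":
    rng = random.Random(42)
    for coverage in (16, 64, 256, 1024):
        value = relative_sd(0.006, coverage, rng=rng)
        print(f"coverage={coverage:5d}  relative SD={value:.2f}")
```

Under this assumption, the relative standard deviation for a 0.6% editing level falls steadily as coverage grows, which is the same qualitative trend described for the measured data.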
Table 1. Recommended coverage to achieve a standard deviation <40% of the editing mean.
| Percent of DNA edited when total available reads are used | Coverage (number of subsampled read pairs) | Relative standard deviation (SD) |
Figure 1. Target coverage is inversely related to the standard deviation. As the number of reads sampled at a given target increases, the standard deviation from the editing mean decreases. If the percent of DNA edited and the number of reads are both low, the standard deviation will be high. If the percent of DNA edited is low, the standard deviation will decrease as coverage increases. If the percent of DNA edited is high, the standard deviation will remain relatively low, regardless of the number of reads.
Figure 2. Normalized standard deviation in true editing levels decreases as the number of reads increases. The data represent the 4 observed true editing levels that were analyzed. The horizontal line marks a standard deviation equal to 0.4 of the mean for the subsampled read pairs.
From these data, we find that to achieve a standard deviation of 40% of the editing mean, the required read depth will vary depending on the percent of targeted DNA molecules that have been edited. If only 0.6% of the total DNA is edited, we recommend 1024 reads. If the fraction of DNA molecules edited is increased to 3.04%, only 256 reads are required. If a third of the targeted DNA molecules are edited, only 16 reads are required. To find the recommended coverage at a different limit, please contact email@example.com.
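For a rough estimate at other editing levels or limits, the relationship can be approximated with a simple binomial sampling model. This is a hedged sketch and not the method used to derive the recommendations above: it assumes reads are independent draws, in which case the relative standard deviation at editing fraction p and coverage n is sqrt((1 - p) / (n * p)). The function names below are hypothetical.

```python
import math


def binomial_relative_sd(edit_fraction, coverage):
    """Relative SD of the observed editing level under a binomial model."""
    return math.sqrt((1 - edit_fraction) / (coverage * edit_fraction))


def min_coverage(edit_fraction, target_relative_sd=0.4):
    """Smallest coverage whose binomial relative SD is at or below the target.

    Rearranges sqrt((1 - p) / (n * p)) <= target to solve for n.
    """
    return math.ceil((1 - edit_fraction) / (edit_fraction * target_relative_sd ** 2))


if __name__ == "__main__":
    # Editing levels and recommended coverages quoted in the text above.
    for fraction, recommended in ((0.006, 1024), (0.0304, 256), (1 / 3, 16)):
        model_sd = binomial_relative_sd(fraction, recommended)
        print(
            f"edited={fraction:.4f}  recommended reads={recommended}  "
            f"model relative SD={model_sd:.2f}  model minimum reads={min_coverage(fraction)}"
        )
```

Under this model, the three recommended coverages give relative standard deviations of about 0.40, 0.35, and 0.35, so the idealized estimate is broadly consistent with the measured recommendations. Real libraries can show additional variability (for example, from PCR bias or sequencing error), so the model output is best read as a lower bound on the coverage needed.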