Consider SNPs when designing PCR and qPCR assays

The rapidly increasing number of known SNPs

Widespread adoption of next generation sequencing (NGS) has led to an exponential increase in cataloged sequence data. One consequence has been a dramatic increase in the number of identified single nucleotide polymorphisms (SNPs; see sidebar, SNPs defined). As of Nov 7, 2016, Build 149 of the NCBI dbSNP reference database listed 558 million submitted SNPs (subSNP) for Homo sapiens, of which 154 million were referenced (refSNP) [1]. This represents a >19X increase in the number of subSNPs over 10 years, up from 28 million subSNPs (2006, Build 126), and a ~13X increase in the number of refSNPs (Figure 1). Based on the number of refSNPs in Build 149 and a genome size of 3.4 × 10^9 bp [2], the human genome contains, on average, a SNP approximately once every 22 bases. Other common model systems show a similarly high frequency of SNPs (Table 1).

* NCBI dbSNP Build 149 (Nov 7, 2016); www.ncbi.nlm.nih.gov/dbvar/content/org_summary/ (accessed Dec 19, 2016).

www.ncbi.nlm.nih.gov/projects/SNP/snp_summary.cgi (accessed Dec 19, 2016).

refSNP, or reference SNP cluster, is defined as a SNP or group of SNPs that map to a specific genomic sequence region. The SNPs of an existing build are all refSNPs. In creating a new build, the refSNPs from the prior build and new subSNPs are both compared to updated genome sequence data to minimize duplications among refSNPs and subSNPs. This process will assign subSNPs to existing refSNP clusters or new refSNPs.

§ subSNP stands for “submitted SNP” and is defined as a SNP submitted since the last build that was found to be distinct from refSNPs after multiple cycles of BLAST analyses.

Figure 1. Dramatic increase in the number of human SNPs over the past 5 and 10 years.


Table 1. High frequency of SNPs in common model systems.

| Species | NCBI dbSNP build* | subSNP† (million) | refSNP‡ (million) | Genome size§ (bp) | SNP frequency |
|---|---|---|---|---|---|
| Homo sapiens (human) | Build 149 (Nov 7, 2016) | 557.9 | 154.2 | 3.40 × 10^9 | 1 in 22 bases |
| Bos taurus (cow) | Build 148 (Jun 24, 2016) | 293.8 | 100.2 | 3.62 × 10^9 | 1 in 36 bases |
| Mus musculus (mouse) | Build 146 (Nov 24, 2015) | 135.7 | 80.4 | 3.23 × 10^9 | 1 in 40 bases |
| Sus scrofa (pig) | Build 145 (Jul 31, 2015) | 135.5 | 60.4 | 3.13 × 10^9 | 1 in 52 bases |
| Drosophila melanogaster (fruit fly) | Build 148 (Jun 24, 2016) | 5.2 | 5.2 | 0.176 × 10^9 | 1 in 34 bases |

* Taken from NCBI dbSNP; www.ncbi.nlm.nih.gov/dbvar/content/org_summary/ (accessed Dec 19, 2016).

† subSNP stands for “submitted SNP” and is defined as a SNP submitted since the last build that was found to be distinct from refSNPs after multiple cycles of BLAST analyses.

‡ refSNP, or reference SNP cluster, is defined as a SNP or group of SNPs that map to a specific genomic sequence region. The SNPs of an existing build are all refSNPs. In creating a new build the refSNPs from the prior build and new subSNPs are both compared to updated genome sequence data to minimize duplications among refSNPs and subSNPs. This process will assign subSNPs to existing refSNP clusters or new refSNPs.

§ Gregory, T.R. (2005). Animal Genome Size Database; www.genomesize.com (accessed Dec 19, 2016); where genome size (bp) = (0.978 × 10^9) × DNA content (pg).
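
The SNP frequencies in Table 1 can be reproduced by dividing each genome size by the corresponding refSNP count. The following is a minimal sketch of that arithmetic, using the values from Table 1 and the conversion factor from footnote §. The function and variable names are illustrative only, and the result is a genome-wide average that says nothing about how SNPs are distributed along any particular sequence.

```python
# Values copied from Table 1: (refSNPs in millions, genome size in bp).
TABLE_1 = {
    "Homo sapiens": (154.2, 3.40e9),
    "Bos taurus": (100.2, 3.62e9),
    "Mus musculus": (80.4, 3.23e9),
    "Sus scrofa": (60.4, 3.13e9),
    "Drosophila melanogaster": (5.2, 0.176e9),
}

# Conversion factor from footnote §: base pairs per picogram of DNA.
BP_PER_PG = 0.978e9


def genome_size_bp(dna_content_pg: float) -> float:
    """Convert DNA content (pg) to genome size (bp), as in footnote §."""
    return BP_PER_PG * dna_content_pg


def bases_per_snp(refsnp_millions: float, genome_bp: float) -> int:
    """Average number of bases per refSNP, rounded to the nearest base."""
    return round(genome_bp / (refsnp_millions * 1e6))


if __name__ == "__main__":
    for species, (refsnps, size) in TABLE_1.items():
        print(f"{species}: 1 refSNP per {bases_per_snp(refsnps, size)} bases")
```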

SNPs—defined

Single nucleotide polymorphisms, or “SNPs”, are single base positions within a stretch of DNA that differ in sequence among individuals in a population. They define distinct alleles or mutations, and can occur within coding or noncoding sequences. The effect of a SNP is variable. SNPs within coding sequences do not always affect the translated amino acid sequence; that is, some SNPs are silent. Conversely, SNPs outside coding regions may still affect gene expression by altering gene splicing, transcription factor binding, or mRNA degradation.

SNPs play an extremely valuable role in vivo in evolving species diversity and, in vitro, as a tool for individual and species identification. SNPs occur at different frequencies in different populations, and SNP profiles can serve as a powerful diagnostic tool for certain diseases. Humans share 99.9% of their DNA sequence. Of the 0.1% that differs among us, >80% is due to SNPs [3]. Other sequence variations include insertions, deletions, and transpositions.

Taking SNPs into account when designing PCR/qPCR assays

Given the high frequency of SNP occurrence, it is unrealistic to try to avoid SNPs altogether when designing PCR/qPCR assays. However, it is important to consider where a SNP falls when it is located within a primer or probe sequence. Primer and probe sequences that overlie SNP sites can dramatically impact a reaction or can have little to no impact at all. Specifically, the position of a SNP underlying a primer or probe can influence primer and probe Tm, efficiency of polymerase extension, and even target specificity. To obtain the most accurate data, you need to know where your assay designs overlie SNPs and manage this positioning.

Positional effects. SNPs that occur in primer and probe binding sites can destabilize oligonucleotide binding and reduce target specificity. Mismatches can affect the hybridization of oligos, reducing the Tm of an oligonucleotide by as much as 5–18°C (Figure 2). The degree of effect on Tm depends on the mismatch position, type of mismatch (e.g., A/A, A/C, G/T), as well as the surrounding environment/sequence [4].

When probes hybridize, the destabilizing effects are highest for mismatches located in the interior of the duplex [5,6,7]. Mismatches at the terminal or penultimate position (1 or 2 base pairs from the terminus) are less discriminatory [5,8]. Use the free, online IDT OligoAnalyzer® tool (available at www.idtdna.com/scitools) to make such predictions.

A. Example oligonucleotide sequence.

B. Effect of a C-A mismatch.

C. Effect of a C-T mismatch.

D. Effect of a C-C mismatch.

Figure 2. Significant decrease in probe or primer melting temperature from a single mismatch. The example shows how a single mismatch can alter probe or primer melting temperature, affecting the efficiency of the PCR and, ultimately, the interpretation of experimental results. These particular mismatches create non-standard base pairing that should not disrupt the helix. However, a single mismatch can still decrease melting temperature substantially, by over 8°C (compare the Tm values highlighted with green and red arrows). The screenshots show output from the free, online OligoAnalyzer® tool available at www.idtdna.com/scitools.

SNPs underlying primers can exert additional positional effects on polymerase binding. (Note that the OligoAnalyzer tool does not take polymerase destabilization effects into account.) In their article, “Single-nucleotide polymorphisms and other mismatches reduce performance of quantitative PCR assays” [9], Lefever and colleagues demonstrated that the impact of a given mismatch correlated with its distance from the 3’ end of the primer. Mismatches closest to the 3’ end, typically within the last 5 bases, had the most dramatic effect on amplification. Mismatches at the terminal 3’ base produced the largest shift in Cq relative to perfect matches, altering Cq by as much as 5–7 cycles (a 32- to 128-fold difference, dependent on the master mix used; Figure 3).

Figure 3. Mismatches at the 3’ end of primers reduce qPCR performance. The data show the difference in Cq (ΔCq) between perfect match and mismatch primers as a function of the position of a single mismatch, using 5 different master mixes (A, B, C, D, and E). p values were calculated using one-way analysis of variance (ANOVA). The shift due to a SNP at the 3’ end of a primer can be as large as 7 Cq, representing a 128-fold change in apparent gene expression, dependent on the master mix used. (Data adapted from Lefever et al. [9], with permission of the publisher.)
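
The fold differences quoted for these Cq shifts follow from the doubling of product in each PCR cycle. The following minimal sketch shows the conversion, assuming 100% amplification efficiency by default. The function name and the optional efficiency parameter are illustrative and not taken from the cited study.

```python
def fold_difference(delta_cq: float, efficiency: float = 1.0) -> float:
    """Fold difference in apparent target quantity implied by a Cq shift.

    efficiency is the per-cycle amplification efficiency as a fraction,
    where 1.0 (an assumption, not a measured value) means perfect doubling.
    """
    if not 0.0 < efficiency <= 1.0:
        raise ValueError("efficiency must be in the interval (0, 1]")
    return (1.0 + efficiency) ** delta_cq


if __name__ == "__main__":
    # Cq shifts of 5 and 7 cycles, the range quoted for 3' terminal mismatches.
    for delta in (5, 7):
        print(f"dCq = {delta}: {fold_difference(delta):.0f}-fold")
```

With perfect doubling, shifts of 5 and 7 cycles correspond to 32-fold and 128-fold differences. A lower efficiency gives a smaller fold difference for the same ΔCq.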

Base composition effects. Lefever and colleagues also showed that reactions containing purine/purine and pyrimidine/pyrimidine mismatches at the 3’ terminal position in the primer produced larger ΔCq values (mismatch vs. perfect match) and reduced end-point fluorescence values, with A/G and C/C showing the largest Cq differences compared to perfect matches [9].

Their data demonstrated that the shift in Cq between a perfect-matched oligo/target and an oligo/target with a single mismatch decreased with increasing distance of the mismatch from the 3’ end. Single mismatches located more than 5 nucleotides from the 3’ end could still have a moderate effect on qPCR amplification. Further experiments by this group showed that the reduction in Tm and shift in Cq were exacerbated when SNPs occur in both primers (forward and reverse) or when more than one mismatch occurs within a given primer.
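
When reviewing a primer that overlies a SNP, it can help to label the mismatch type at the affected position. The following is a minimal sketch of that classification. It assumes the mismatch is written as the primer base followed by the base it faces on the strand the primer anneals to; that notation and the function name are illustrative assumptions, and the labels alone do not predict a Cq shift.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
WATSON_CRICK = {("A", "T"), ("T", "A"), ("G", "C"), ("C", "G")}


def classify_pair(primer_base: str, template_base: str) -> str:
    """Classify the pairing between a primer base and the base it faces."""
    p, t = primer_base.upper(), template_base.upper()
    valid = PURINES | PYRIMIDINES
    if p not in valid or t not in valid:
        raise ValueError("bases must be one of A, C, G, T")
    if (p, t) in WATSON_CRICK:
        return "match"
    if p in PURINES and t in PURINES:
        return "purine/purine"
    if p in PYRIMIDINES and t in PYRIMIDINES:
        return "pyrimidine/pyrimidine"
    return "purine/pyrimidine"


if __name__ == "__main__":
    for pair in (("A", "G"), ("C", "C"), ("G", "T"), ("A", "T")):
        print(f"{pair[0]}/{pair[1]}: {classify_pair(*pair)}")
```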

The free, online OligoAnalyzer® tool allows researchers to set mismatches and then calculate Tm. You can also use this tool to examine potential hairpin and dimer formation. The DECODED article, Using the OligoAnalyzer program, provides guidance on how to identify these characteristics.

Effect on qPCR amplification. In many cases, a single SNP may not prevent amplification, but can cause inefficient annealing and amplification [5]. This can delay Cq and lead to underestimation of gene expression, or even apparent copy number loss, in SNP-containing sequences.

Using a modified single-base extension assay, Wu and colleagues [10] investigated how the type and position of a mismatch affected extension efficiency during the initial PCR cycle. They concluded that mismatches within the last 3–4 bases of the 3’ end of the primer blocked primer extension. Wu et al. attributed the low extension efficiency to reduced binding of the DNA polymerase. While other research groups have contested this finding, describing a similar affinity of DNA polymerase for correctly paired and mispaired duplexes [11], Lefever and colleagues [9] confirm and extend the results from Wu et al.

Safeguard your experiments

Researchers often adopt primer and probe sequences identified in prior publications. It can be tempting to use legacy published or “lab-validated” RT-PCR assay designs. However, given the continual addition of new sequence information, it is important to reevaluate and understand the location of SNPs relative to primer and probe sequences in your PCR/qPCR assays. The following are tips for managing SNP impact on your assay results:

  • To obtain an up-to-date list of possible SNPs in your sequence, scroll down to the Alignments section of your BLAST search results page, and click on Graphics at the top left. At the top right of the sequence graphic, click on Tracks and select the Variation tab. From there you can select the type of SNPs for which you want information.
  • If the “rs” number—the Reference SNP cluster ID (accession number) that refers to a specific SNP—is known, check SNP information in NCBI dbSNP (www.ncbi.nlm.nih.gov/snp).
  • If a SNP is identified, check whether the frequency of the SNP (minor allele frequency, or MAF) is relevant in your population.
  • When you cannot avoid a SNP underlying your probe sequence, use the free, online IDT OligoAnalyzer Tool to predict the Tm of mismatched probe sequences.
  • In cases where a SNP underlies a primer sequence, minimize or eliminate SNP effects by positioning the SNP towards the 5’ end of the primer. For help with such designs, contact our technical support group at applicationsupport@idtdna.com, or by phone, using the local phone number on our Contact page.
  • For genotyping experiments where relevant SNPs occur adjacent to your SNP of interest, avoid allele dropout by using mixed bases (Ns) or inosines in the primer or probe to cover the adjacent site(s).
  • Since genomic information is constantly in flux, it is important to recheck previously used primer and probe sequences for underlying SNPs.
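
To apply the positioning tip above, you need to know how far a SNP lies from the 3’ end of a primer. The following minimal sketch computes that distance and flags SNPs inside the last 5 bases, the window described by Lefever and colleagues [9]. The function names, category labels, and example primer sequence are hypothetical and shown for illustration only.

```python
def distance_from_3prime(primer: str, snp_index: int) -> int:
    """Distance of a SNP from the primer's 3' terminal base.

    primer is written 5'->3'; snp_index is 0-based from the 5' end.
    A return value of 0 means the SNP lies under the 3' terminal base.
    """
    if not 0 <= snp_index < len(primer):
        raise ValueError("snp_index must fall within the primer")
    return len(primer) - 1 - snp_index


def flag_primer_snp(primer: str, snp_index: int, window: int = 5) -> str:
    """Label a SNP position as 'terminal', 'near-3prime', or 'distal'."""
    distance = distance_from_3prime(primer, snp_index)
    if distance == 0:
        return "terminal"      # largest Cq shifts reported at this position
    if distance < window:
        return "near-3prime"   # within the last `window` bases of the primer
    return "distal"            # smaller, but not necessarily zero, effect


if __name__ == "__main__":
    primer = "AGCTGACCTGAAGCTCATCG"  # hypothetical 20-nt primer
    for index in (19, 16, 3):
        print(index, flag_primer_snp(primer, index))
```

A primer flagged as “terminal” or “near-3prime” is a candidate for redesign so that the SNP sits closer to the 5’ end.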

Adopting a new paradigm in assay design

SNPs are now a regular occurrence, with more discovered every day. It is no longer practical, or even possible, to avoid them when designing PCR/qPCR assays. This means we must adjust our thinking about experimental design, and design our PCR/qPCR assays intelligently, with SNPs in mind.

References

  1. NCBI Variation Summary. www.ncbi.nlm.nih.gov/dbvar/content/org_summary/. (Accessed Dec. 19, 2016).
  2. Gregory TR. (2005) Animal Genome Size Database. www.genomesize.com/. (Accessed Dec 19, 2016).
  3. Piazza A. (2012) Theory of evolution and genetics. In: Fasolo A Ed. The Theory of Evolution and Its Impact. Italy:Springer-Verlag. p. 119.
  4. Owczarzy R, Tataurov AV, et al. (2008) IDT SciTools: a suite for analysis and design of nucleic acid oligomers. Nucleic Acids Res, 36 (suppl 2):W163–169.
  5. Letowski J, Brousseau R, Masson L. (2004) Designing better probes: effect of probe size, mismatch position and number on hybridization in DNA oligonucleotide microarrays. J Microbiol Methods, 57:269–278.
  6. You Y, Moreira BG, et al. (2006) Design of LNA probes that improve mismatch discrimination. Nucleic Acids Res, 34:e60.
  7. SantaLucia J Jr, Hicks D. (2004) The thermodynamics of DNA structural motifs. Annu Rev Biophys Biomol Struct, 33:415–440.
  8. Urakawa H, Noble PA, et al. (2002) Single-base-pair discrimination of terminal mismatches by using oligonucleotide microarrays and neural network analyses. Appl Environ Microbiol, 68(1):235–244.
  9. Lefever S, Pattyn F, et al. (2013) Single-nucleotide polymorphisms and other mismatches reduce performance of quantitative PCR assays. Clin Chem, 59(10):1470–1480.
  10. Wu J-H, Hong P-Y, Liu W-T. (2009) Quantitative effects of position and type of single mismatch on single base primer extension. J Microbiol Methods, 77:267–275.
  11. Huang MM, Arnheim N, Goodman MF. (1992) Extension of base mispairs by Taq DNA polymerase: implications for single nucleotide discrimination in PCR. Nucleic Acids Res, 20:4567–4573.


Author: Ellen Prediger, PhD, is a senior scientific writer at IDT.

© 2017 Integrated DNA Technologies. All rights reserved. Trademarks contained herein are the property of Integrated DNA Technologies, Inc. or their respective owners. For specific trademark and licensing information, see www.idtdna.com/trademarks.

