Target enrichment facilitates focused next generation sequencing

Understand the rationale and benefits of enriching subsets of the genome (target enrichment by hybrid capture) prior to sequencing. Use this strategy for genotyping, identifying splice variants and indels, and profiling genomic recombination events as well as viral and transposon integration sites.

Jan 15, 2014

Revised/updated Nov 17, 2016

Although the decreasing cost of DNA sequencing has enabled more mainstream research applications [1], the upfront cost of sequencing large numbers of samples is still prohibitive for many research labs. As a result, researchers often enrich subsets of the genome (target enrichment by hybrid capture) before sequencing (Figure 1), which reduces cost and focuses sequencing efforts on the genomic regions relevant to their study. Focusing on specific genomic regions also enables multiplexing, the simultaneous sequencing of many samples. Applications that benefit from target enrichment include genotyping, identification of splice variants and indels, and profiling of genomic recombination events and viral and transposon integration sites.


Figure 1. Target capture using xGen Lockdown Probes.


Target enrichment increases throughput

Genotyping many targets in many samples

A typical genotyping experiment of 100 samples performed by PCR requires many individually optimized PCR assays. Developing these assays is time consuming and prone to error due to the high number of individual reactions. Furthermore, an important discovery in the course of the experiment can necessitate expanding the sample size. Finally, genotyping with PCR can fail if mutations exist in the priming sites. Typically, primers are optimized using a few samples and then applied to a larger sample set. Unfortunately, this practice is based on the assumption that priming sites are constant across all samples. This might be true for a small sample set, or for regions that are critical for function, but as the number of assays multiplied by the number of samples increases, the probability of a variation skewing any given assay also increases (Figure 2A).
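The scaling argument above can be made concrete with a small calculation. The following is a minimal sketch, assuming that each assay/sample combination independently has the same small chance of carrying a variant in a priming site; the function name and the example probability of 0.001 are hypothetical values chosen only for illustration, not measured rates.

```python
def prob_any_reaction_affected(n_assays, n_samples, p_variant):
    """Chance that at least one assay/sample combination is hit by a
    priming-site variant, assuming independent and identical risk
    (a simplifying assumption, not a measured model)."""
    n_reactions = n_assays * n_samples
    return 1.0 - (1.0 - p_variant) ** n_reactions


if __name__ == "__main__":
    # Hypothetical per-reaction probability, for illustration only.
    p = 0.001
    for n_assays, n_samples in [(10, 10), (50, 100), (200, 100)]:
        risk = prob_any_reaction_affected(n_assays, n_samples, p)
        print(f"{n_assays} assays x {n_samples} samples: {risk:.3f}")
```

Under these assumptions the risk depends only on the product of assays and samples, which is why a primer set validated on a few samples can still fail when applied to a much larger set.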

As an alternative to PCR, targeted next generation sequencing (NGS) consolidates all the genotyping reactions for different genes and mutations into a single, focused sequencing run using the design shown in Figure 2B. Hybrid capture of SNP sequences uses probes centered on the SNP and is often unaffected by additional mutations that create mismatches between the target and the capture probe. In fact, IDT xGen® Lockdown® Probes can tolerate as many as 7 mutations within the target region without affecting hybrid capture. The enriched DNA is then sequenced in a single run, which makes the laborious setup of individual PCRs and analysis gels unnecessary. Finally, it is easy to expand an experiment with additional probes targeting other SNPs without having to re-optimize the assay.


Figure 2. Hybrid capture simplifies genotyping analysis. (A) PCR primers are very sensitive to mismatches in their binding sequence. A single mismatch within a PCR primer binding site can cause a ΔTm of up to 7°C, which can dramatically reduce PCR efficiency or cause PCR failure. (B) Target capture probes are centered on the mutation under investigation. IDT xGen Lockdown Probes used for this purpose tolerate as many as 7 mutations within the target region without affecting hybrid capture.
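As a rough illustration of the mismatch tolerance described above, the sketch below counts substitutions between a probe and a variant target of equal length and compares the count with the 7-mutation figure quoted in the text. The function names are hypothetical, and the simple threshold is an assumption made for illustration: real hybridization behavior also depends on factors such as mismatch position, sequence composition, and hybridization conditions, and this sketch does not handle indels.

```python
MAX_TOLERATED_MISMATCHES = 7  # figure quoted in the text for xGen Lockdown Probes


def count_mismatches(probe, target):
    """Count substitution differences between two equal-length sequences."""
    if len(probe) != len(target):
        raise ValueError("probe and target must have the same length")
    return sum(1 for a, b in zip(probe.upper(), target.upper()) if a != b)


def within_tolerance(probe, target, max_mismatches=MAX_TOLERATED_MISMATCHES):
    """Crude check (illustrative assumption): is the mismatch count at or
    below the quoted tolerance?"""
    return count_mismatches(probe, target) <= max_mismatches


if __name__ == "__main__":
    probe = "ACGT" * 30                 # hypothetical 120 nt probe sequence
    variant = "TTT" + probe[3:]         # target carrying 3 substitutions
    print(count_mismatches(probe, variant), within_tolerance(probe, variant))
```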

Identifying viral and transposon integration sites

Experiments studying gene function often involve ectopic expression of the gene and its subsequent knockdown using RNA interference (RNAi). A retrovirus is used to introduce the exogenous DNA into the genome of the organism under study. The DNA fragment may be integrated into an innocuous intergenic region or into a transcribed region [2], which can alter gene function, causing spurious unrelated phenotypes. To screen for problematic integration sites across multiple samples, targeted sequencing using capture probes that correspond to regions of the known exogenous DNA sequence can be performed to identify the flanking integration sites (Figure 3). Samples that have exogenous DNA incorporated within acceptable genomic regions can then be selected for further study.


Figure 3. Probe design for identification of integration sites. Capture probes are designed to enrich for regions of exogenous, known DNA (viral/transposon). These fragments can then be used to identify flanking integration site sequences by next generation sequencing.
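Once captured fragments have been sequenced, reads that cross the boundary between the exogenous DNA and the host genome reveal the integration site. The following is a minimal sketch of that idea, assuming exact, forward-strand matches to short anchors taken from each end of the known exogenous sequence; the function name, the anchor length, and the toy sequences are hypothetical. A real analysis would typically use a read aligner and handle both strands and sequencing errors.

```python
def find_host_flanks(read, exogenous, anchor_len=20):
    """Return (upstream_flank, downstream_flank) of host sequence in a read.

    upstream_flank is the read sequence preceding the start of the
    exogenous DNA; downstream_flank is the sequence following its end.
    Either value is None when the corresponding junction is not in the read.
    """
    left_anchor = exogenous[:anchor_len]    # start of the exogenous DNA
    right_anchor = exogenous[-anchor_len:]  # end of the exogenous DNA
    upstream = downstream = None

    i = read.find(left_anchor)
    if i > 0:                               # host bases precede the insert
        upstream = read[:i]

    j = read.find(right_anchor)
    if j != -1 and j + anchor_len < len(read):  # host bases follow the insert
        downstream = read[j + anchor_len:]

    return upstream, downstream


if __name__ == "__main__":
    exogenous = "AACCGGTTACGTACGTTTGGCCAA"      # toy "viral" sequence
    read = "TTTTTCCCCC" + exogenous[:12]        # read crossing the left junction
    print(find_host_flanks(read, exogenous, anchor_len=8))
```

The recovered flanks can then be mapped to the host reference genome to decide whether the integration lies in an acceptable region.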

The method is also important for identifying integration sites of transposons and infection-causing viruses. These insights may improve understanding of the associated diseases, especially by providing information about the molecular mechanisms of integration, which may be leveraged for use in other applications. Identification of viral DNA in a host organism can also be used to diagnose a disease state [3]. Performing targeted sequencing of human DNA that has been enriched using probes designed to target viral genomes can identify viral sequences within the human genome and, therefore, confirm infection by the virus.

Alternative splicing

Alternative splicing occurs in ~95% of human genes that have more than 1 exon, and erroneous splicing is implicated in many diseases [4]. Some of these diseases can be diagnosed using an assay that detects alternatively spliced genes. RNA sequencing (RNA-seq) is a suitable NGS technique for this purpose. RNA-seq provides a snapshot of the quantity of RNA present at a specific moment in time; therefore, it is sensitive to differential gene expression. To perform RNA-seq, RNA is extracted from the sample, converted to DNA, and processed for sequencing according to standard DNA sequencing procedures. To ensure that low levels of disease-related splice variants are easily detected, and particularly for multiplex sequencing of different samples, probes specific to the transcripts under investigation can be used to enrich for those transcripts, facilitating their detection (Figure 4). Additionally, if different protein isoforms are detected in a proportion of samples, the remaining samples can be more easily screened for alternatively spliced transcripts rather than trying to detect the different proteins using western blots.


Figure 4. Multiple methods for examining alternative splicing. (A) To capture splice variants, probes can be designed to span exon junctions, comprising sequence from both exons of interest. (B) Unknown regions that have been incorporated into a transcript can be identified by designing probes that target the known transcript. Subsequent sequencing will extend the reads into the unknown region, helping to identify its source. (C) Recombination can be identified by designing probes that target different exons and examining the relative positions of the sequence reads.
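The junction-spanning design in panel A can be sketched in a few lines. This is an illustrative example only: the function name is hypothetical, the default probe length of 120 nt is an assumption, and the probe is simply split evenly across the junction. An actual design would also consider factors such as sequence composition and specificity.

```python
def junction_probe(upstream_exon, downstream_exon, probe_len=120):
    """Build a probe sequence centered on the junction between two exons.

    Takes the last half of the probe from the 3' end of the upstream exon
    and the remainder from the 5' end of the downstream exon.
    probe_len=120 is an assumed default, not a value from the text.
    """
    left_len = probe_len // 2
    right_len = probe_len - left_len
    if len(upstream_exon) < left_len or len(downstream_exon) < right_len:
        raise ValueError("exons are too short for the requested probe length")
    return upstream_exon[-left_len:] + downstream_exon[:right_len]


if __name__ == "__main__":
    exon_a = "ACGT" * 25   # toy 100 nt exon
    exon_b = "TTGCA" * 20  # toy 100 nt exon
    probe = junction_probe(exon_a, exon_b, probe_len=40)
    print(len(probe), probe)
```

Because such a probe carries sequence from both exons, it matches its full length only on transcripts in which those two exons are joined.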

Hybrid capture improves flexibility

Targeted NGS using hybrid capture offers the flexibility to combine applications (i.e., to use the same probe set to answer different questions simultaneously) and to easily increase the number of targeted sites. For example, if relevant SNP data for a new genomic region become available, additional probes can be ordered to detect these regions and added to the existing capture panel for use in subsequent genotyping experiments. Generally, adding probes does not diminish the performance of the existing probes, and it allows researchers to extract more data per sequencing experiment. This flexibility enables researchers to respond quickly to the rapid rate of NGS publications, reducing the wait time for making new discoveries.

Can your workflow benefit from target capture?

Although target capture has many applications, it may not be the most appropriate solution for all researchers. One consideration is panel size: the percentage of sequence reads obtained for a given region of interest increases with increasing probe number. Larger probe sets are usually more efficient and cost-effective because users can make greater use of their sequencing capacity. Smaller target capture panels are useful for retroviral and transposon applications; however, results may vary with panel size and application (e.g., repeat regions may result in higher on-target rates because they exist in higher numbers within the genome), and researchers must be able to interpret these differences.

Target capture is also not the best option for every type of experiment; e.g., identification of a single target in thousands of samples may be better achieved through PCR-based methods than through in-solution hybridization for target enrichment. xGen Lockdown Probes are fully customizable probe sets that can be used to create capture panels of any size. If you have questions about your application and how to fit hybrid capture with xGen Lockdown Probes into your workflow, contact us at


  1. Hayden EC. (2013) Gene sequencing leaves the laboratory. Nature 494(7437):290–291.
  2. Ambrosi A, Cattoglio C, Di Serio C. (2008) Retroviral integration process in the human genome: is it really non-random? A new statistical approach. PLoS Comput Biol 4(8):e1000144.
  3. Depledge DP, Palser AL, et al. (2011) Specific capture and whole-genome sequencing of viruses from clinical samples. PLoS ONE 6(11):e27805.
  4. Matlin AJ, Clark F, Smith CW. (2005) Understanding alternative splicing: towards a cellular code. Nat Rev Mol Cell Biol 6(5):386–398.
  5. Cancer Genome Atlas Research Network. (2013) Genomic and epigenomic landscapes of adult _de novo_ acute myeloid leukemia. N Engl J Med 368(22):2059–2074.