Frequently asked questions

How do I use UMIs for error correction?

Depending on your sample type or experimental goals, you can either use UMIs or ignore them altogether. The xGen Prism DNA Library Prep Kit Analysis Guidelines document leads you through the recommended analysis pipeline, which uses open-source tools, starts from FASTQ files, and ends with variant calling.

As an overview, fixed UMI sequences, such as those used with the xGen Prism DNA Library Prep Kit, enable identification and correction of sequencing or PCR errors, even if they appear within the UMI sequence.
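As a rough illustration of why a fixed UMI set allows errors inside the UMI itself to be corrected, the sketch below matches an observed UMI to the closest sequence in a known list. The UMI sequences, the `correct_umi` helper, and the one-mismatch threshold are all hypothetical choices made for this example; they are not the actual xGen Prism UMIs or the method used by any specific tool.

```python
from typing import Optional, Sequence

# Hypothetical fixed UMI set, for illustration only (not the real kit sequences).
KNOWN_UMIS = ["ACGTACGT", "TTGCAAGC", "GGATCCTA"]


def hamming(a: str, b: str) -> int:
    """Count mismatched positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def correct_umi(observed: str,
                known_umis: Sequence[str],
                max_mismatches: int = 1) -> Optional[str]:
    """Return the single closest known UMI, or None if none is close enough.

    A UMI is rejected (None) when the best match exceeds max_mismatches
    or when two known UMIs are equally close, since the call is ambiguous.
    """
    scored = sorted(
        (hamming(observed, umi), umi)
        for umi in known_umis
        if len(umi) == len(observed)
    )
    if not scored or scored[0][0] > max_mismatches:
        return None
    if len(scored) > 1 and scored[1][0] == scored[0][0]:
        return None
    return scored[0][1]


if __name__ == "__main__":
    # One sequencing error in the last base is mapped back to the known UMI.
    print(correct_umi("ACGTACGA", KNOWN_UMIS))
    # A sequence far from every known UMI is left uncorrected.
    print(correct_umi("CCCCCCCC", KNOWN_UMIS))
```

With random (non-fixed) UMIs there is no reference list to compare against, so this kind of lookup is not possible and errors in the UMI are harder to tell apart from genuinely different molecules.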

  • Single read families analysis: UMIs can be used to correct errors in sequencing data at the same time as duplicate reads are removed. For example, all reads with the same start-stop position and UMI can be grouped as a single read family and then collapsed (Figure 2C). Rather than simply choosing the highest-quality read, this method uses all reads within the single read family to choose the most likely base at each position from beginning to end. This process yields a collapsed single read family that can be used for variant calling. This approach is taken by the fgbio tools GroupReadsByUmi plus CallMolecularConsensusReads.
  • Combined read families analysis: The xGen Prism DNA Library Prep Kit also enables a more stringent method of error correction. During Ligation 1, UMIs are added to the top and bottom strands by single-stranded ligation; during Ligation 2, these UMIs are added to the other strand by gap filling. Thus, both strands can be tracked back to the same original molecule. This approach uses the start-stop position together with the two single read families that originate from the same original molecule (Figure 2D). Again, rather than simply choosing the highest-quality read, this method uses all reads within the combination of both single read families to choose the most likely base at each position from beginning to end. This process yields a collapsed combined read family that can be used for variant calling, which greatly decreases the chance of false positives. The fgbio tools GroupReadsByUmi plus CallDuplexConsensusReads can be used for this analysis.
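The two approaches above can be sketched in a few lines of Python. This is a simplified illustration, not how fgbio works internally: it uses a plain majority vote and ignores base qualities, alignment, and UMI mismatches. The `Read` record, the `strand` label, the function names, and the assumption that both strands of a molecule share one UMI key are all hypothetical choices made for this example.

```python
from collections import Counter, defaultdict
from typing import Dict, List, NamedTuple, Tuple


class Read(NamedTuple):
    start: int   # alignment start position
    end: int     # alignment stop position
    umi: str     # UMI assigned to the original molecule (assumed shared by both strands)
    strand: str  # "top" or "bottom": which original strand the read derives from
    seq: str     # read bases, assumed already aligned and of equal length


def consensus(seqs: List[str]) -> str:
    """Majority base at each position; 'N' when no base has a strict majority."""
    out = []
    for column in zip(*seqs):
        base, count = Counter(column).most_common(1)[0]
        out.append(base if count * 2 > len(column) else "N")
    return "".join(out)


def single_read_families(reads: List[Read]) -> Dict[Tuple[int, int, str, str], str]:
    """Group by start-stop position, UMI, and strand, then collapse each group."""
    families = defaultdict(list)
    for r in reads:
        families[(r.start, r.end, r.umi, r.strand)].append(r.seq)
    return {key: consensus(seqs) for key, seqs in families.items()}


def combined_read_families(single: Dict[Tuple[int, int, str, str], str]) -> Dict[Tuple[int, int, str], str]:
    """Combine top and bottom families of one molecule; mask bases that disagree."""
    by_molecule = defaultdict(dict)
    for (start, end, umi, strand), seq in single.items():
        by_molecule[(start, end, umi)][strand] = seq
    combined = {}
    for key, strands in by_molecule.items():
        if "top" in strands and "bottom" in strands:
            combined[key] = "".join(
                a if a == b else "N"
                for a, b in zip(strands["top"], strands["bottom"])
            )
    return combined


# Toy data: molecule 1 has random errors; molecule 2 has a top-strand-only artifact.
reads = [
    Read(100, 108, "ACGT", "top", "AACCGGTT"),
    Read(100, 108, "ACGT", "top", "AACCGGTT"),
    Read(100, 108, "ACGT", "top", "AACTGGTT"),     # isolated error, outvoted
    Read(100, 108, "ACGT", "bottom", "AACCGGTT"),
    Read(100, 108, "ACGT", "bottom", "AACCGGTT"),
    Read(100, 108, "ACGT", "bottom", "AACCGGTA"),  # isolated error, outvoted
    Read(200, 208, "TTAA", "top", "AACTGGTT"),     # artifact copied into every top read
    Read(200, 208, "TTAA", "top", "AACTGGTT"),
    Read(200, 208, "TTAA", "bottom", "AACCGGTT"),
    Read(200, 208, "TTAA", "bottom", "AACCGGTT"),
]

single = single_read_families(reads)
combined = combined_read_families(single)
print(single)
print(combined)
```

In the toy data, the second molecule carries a change present in every top-strand read. The single read family consensus keeps it, while the combined read family masks it because the bottom strand disagrees, which is the intuition behind the lower false-positive rate of the more stringent method.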

Note: Using UMIs for error correction analysis usually requires significantly deeper sequencing and may not be appropriate for damaged samples like low-quality formalin-fixed paraffin-embedded (FFPE) samples.

Figure 1. Schematic of error correction methods with UMIs.


