Submitted to: Methods in Molecular Biology
Publication Type: Book / Chapter
Publication Acceptance Date: 6/11/2018
Publication Date: N/A
Interpretive Summary: Detecting genetic variants larger than a single base pair requires many different computational methods and tools. This book chapter provides details on how to install and operate RAPTR-SV, software released by USDA-ARS. The software will enable molecular biologists to detect large genetic variants from DNA sequencing data, where such variants are currently difficult to identify. More information on their presence in natural populations is likely to allow us to better quantify their impacts on livestock genetic selection.
Technical Abstract: High-throughput short-read sequencing technologies remain the leading cost-effective means of assessing variation in individual samples. Although such technologies are well suited to detecting single nucleotide polymorphisms (SNPs) and small insertions and deletions, the detection of large copy number variants (CNVs) with them is prone to numerous false positives. CNV detection tools that incorporate multiple variant signals and exclude regions of systematic bias in the genome tend to reduce the probability of false positive calls and therefore represent the best means of ascertaining true CNV regions. To this end, we provide instructions and details on the use of the RAPTR-SV CNV detection pipeline, a tool that incorporates read-pair and split-read signals to identify high-confidence CNV regions in a sequenced sample. By combining two different structural variant (SV) signals in variant calling, RAPTR-SV makes it easy to filter artifact CNV calls out of large datasets.
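The idea of requiring two independent SV signals before accepting a call can be illustrated with a minimal Python sketch. This is not RAPTR-SV's code, algorithm, or output format: the `CandidateCall` record, the `filter_by_dual_signal` function, the support thresholds, and the example coordinates are all hypothetical and serve only to show why a call backed by both read-pair and split-read evidence is easier to trust than one backed by a single signal.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CandidateCall:
    """Hypothetical candidate CNV region with per-signal read support."""
    chrom: str
    start: int
    end: int
    read_pair_support: int   # discordant read pairs spanning the event
    split_read_support: int  # split reads mapping across the breakpoint


def filter_by_dual_signal(calls, min_read_pairs=2, min_split_reads=1):
    """Keep only calls supported by BOTH signals (thresholds are assumptions)."""
    return [
        c for c in calls
        if c.read_pair_support >= min_read_pairs
        and c.split_read_support >= min_split_reads
    ]


# Made-up example data: only the first call has support from both signals.
candidates = [
    CandidateCall("chr1", 10_000, 25_000, read_pair_support=6, split_read_support=3),
    CandidateCall("chr1", 80_000, 81_500, read_pair_support=5, split_read_support=0),
    CandidateCall("chr2", 40_000, 47_000, read_pair_support=0, split_read_support=4),
]

kept = filter_by_dual_signal(candidates)
for call in kept:
    print(f"{call.chrom}:{call.start}-{call.end}")
```

In this sketch, calls seen by only one signal are treated as likely artifacts and dropped; a real pipeline would also account for the regions of systematic bias mentioned in the abstract.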