|GAO, YAHUI - University Of Maryland|
|MA, LI - University Of Maryland|
|Liu, Ge - George|
Submitted to: Genes
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: 5/4/2022
Publication Date: 5/6/2022
Citation: Gao, Y., Ma, L., Liu, G. 2022. Initial analysis of structural variation detections in cattle using long-read sequencing methods. Genes. 13(5):828. https://doi.org/10.3390/genes13050828.
Interpretive Summary: Structural variations (SVs) are an important type of genetic variation. We detected SVs in one cattle individual using 10x Genomics linked-read sequencing and long-read sequencing methods, including Pacific Biosciences continuous long reads (CLR) and circular consensus sequencing (CCS), as well as Oxford Nanopore Technologies PromethION. We found that long reads outperformed short reads for SV detection. These results fill knowledge gaps and provide a foundation for incorporating SVs into future cattle breeding programs. Farmers, scientists, and policy planners who need to improve animal health and production through genome-enabled animal selection will benefit from this study.
Technical Abstract: Structural variations (SVs) are a major source of genetic variation and are widely distributed in the genome. SVs involve more genomic sequence than smaller variants and potentially have larger effects, but they remain poorly characterized by short-read sequencing owing to their size and their association with repeats. Improved characterization of SVs can provide deeper insight into complex traits. With the availability of long-read sequencing, it has become feasible to reveal the full extent of SVs. Here we sequenced one cattle individual using 10x Genomics (10xG) linked reads, Pacific Biosciences (PacBio) continuous long reads (CLR) and circular consensus sequencing (CCS), as well as Oxford Nanopore Technologies (ONT) PromethION. We then evaluated the ability of various methods to detect SVs. We identified 21,164 SVs, which amount to 186 Mb and cover 7.07% of the whole genome. More SVs were inferred from long-read data than from short-read data. PacBio CLR identified the most large SVs and covered the largest share of the genome. SVs called with PacBio CCS and ONT data showed high concordance. Among the long-read call sets, PacBio CCS overlapped most with the results obtained from short-read data. Together, we found that long reads outperformed short reads for SV detection.
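The abstract reports two kinds of summary: the share of the genome covered by SVs and the concordance between call sets. The sketch below illustrates how such figures could be computed from simple interval lists. It is not the authors' pipeline: the interval representation, the function names, the toy coordinates, and the 50% reciprocal-overlap threshold are all assumptions chosen for illustration.

```python
from typing import List, Tuple

# Hypothetical SV representation: (chromosome, start, end), half-open coordinates.
Interval = Tuple[str, int, int]


def merged_length(svs: List[Interval]) -> int:
    """Total bases covered by SVs, counting overlapping calls only once."""
    total = 0
    current_chrom, current_end = None, 0
    for chrom, start, end in sorted(svs):
        if chrom != current_chrom:
            current_chrom, current_end = chrom, 0
        start = max(start, current_end)  # skip bases already counted
        if end > start:
            total += end - start
            current_end = end
    return total


def covered_fraction(svs: List[Interval], genome_length: int) -> float:
    """Fraction of the genome covered by the merged SV intervals."""
    return merged_length(svs) / genome_length


def reciprocal_overlap(a: Interval, b: Interval, min_frac: float = 0.5) -> bool:
    """True if the shared span is at least min_frac of BOTH intervals.

    The 0.5 default is an assumed, commonly used convention, not a value
    taken from the study.
    """
    if a[0] != b[0]:
        return False
    shared = min(a[2], b[2]) - max(a[1], b[1])
    if shared <= 0:
        return False
    return shared >= min_frac * (a[2] - a[1]) and shared >= min_frac * (b[2] - b[1])


def concordant_fraction(calls: List[Interval], reference: List[Interval],
                        min_frac: float = 0.5) -> float:
    """Share of `calls` matched by at least one interval in `reference`."""
    if not calls:
        return 0.0
    hits = sum(
        any(reciprocal_overlap(c, r, min_frac) for r in reference)
        for c in calls
    )
    return hits / len(calls)


# Toy call sets (invented coordinates, for illustration only).
ccs_calls = [("chr1", 100, 200), ("chr1", 500, 900), ("chr2", 50, 150)]
ont_calls = [("chr1", 110, 210), ("chr1", 800, 1000), ("chr2", 40, 160)]

print(merged_length(ccs_calls))
print(concordant_fraction(ccs_calls, ont_calls))
```

The all-pairs comparison in `concordant_fraction` is quadratic and only suitable for small examples; for call sets of tens of thousands of SVs, an interval tree or a sorted sweep would be the more practical choice.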