|ZHENG, YI - Boyce Thompson Institute|
|GAO, SHAN - Boyce Thompson Institute|
|GALVEZ, MARCO - International Potato Center|
|GUTIERREZ, DINA - International Potato Center|
|FUENTES, SEGUNDO - International Potato Center|
|KREUZE, JAN - International Potato Center|
|FEI, ZHANGJUN - Boyce Thompson Institute|
Submitted to: Virology
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: 10/20/2016
Publication Date: 10/10/2016
Citation: Zheng, Y., Gao, S., Padmanabhan, C., Li, R., Galvez, M., Gutierrez, D., Fuentes, S., Ling, K., Kreuze, J., Fei, Z. 2016. VirusDetect: An automated pipeline for efficient virus discovery using deep sequencing of small RNAs. Virology. 500(2017)130-138.
Interpretive Summary: Virus infections are universal and are recognized as a significant threat to agricultural production and human health. Efficient and accurate detection of viruses in plants and animals is essential for developing effective strategies to manage the spread and impact of viral diseases. Conventional virus detection methods, such as enzyme-linked immunosorbent assay, polymerase chain reaction, nucleic acid hybridization, and microarrays, are useful, but they require prior knowledge or sequence information of the potential pathogens and are therefore inefficient at detecting novel viruses or virus variants. Recently, a new approach to virus discovery, based on high-throughput sequencing and assembly of total small ribonucleic acids (sRNAs), has proven highly efficient for detecting plant and animal viruses. However, no computational tool had been specifically designed for identifying known viruses and discovering novel viruses from sRNAs. In the present study, ARS scientists, collaborating with others at the Boyce Thompson Institute and the International Potato Center, developed a bioinformatics pipeline called VirusDetect to efficiently and effectively identify known and unknown viruses. This universal virus discovery tool is of great interest to scientists working in research on, or clinical diagnosis of, viral diseases of plants, animals, and humans.
Technical Abstract: Accurate detection of viruses in plants and animals is critical for agricultural production and human health. Deep sequencing and assembly of virus-derived siRNAs has proven to be a highly efficient approach for virus discovery. However, to date, no computational tool specifically designed to identify both known and novel viruses from siRNA sequences is available. Here we present VirusDetect, a novel bioinformatics pipeline that can efficiently analyze large-scale siRNA datasets for both known and novel virus identification. VirusDetect first aligns siRNA sequences to a curated virus reference database and performs reference-guided assembly. It then performs host siRNA subtraction and de novo assembly of siRNA sequences with automated parameter optimization. The assembled contigs are compared to the reference virus database for known and novel virus identification. Extensive evaluations using plant and Drosophila melanogaster siRNA datasets, and comparison with another tool, suggest that VirusDetect is highly sensitive and efficient in identifying known and novel viruses. This performance is achieved by combining reference-guided assembly of virus-derived siRNAs, using a curated and classified virus reference database, with de novo assembly of host-subtracted siRNAs under automated parameter optimization. Furthermore, VirusDetect shows good performance in virus discovery using other types of next-generation sequencing (NGS) datasets, indicating that it can be used as a universal virus discovery tool.
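The first stages described in the abstract (mapping siRNA reads to virus references, then subtracting host-derived reads before de novo assembly) can be illustrated with a toy sketch. This is not VirusDetect's code or interface: exact substring matching stands in for a real short-read aligner, the fraction of reference positions covered stands in for reference-guided assembly, and all function names and sequences below are hypothetical.

```python
from typing import Dict, List, Tuple


def matches(read: str, sequence: str) -> bool:
    """Toy stand-in for alignment: the read must occur exactly in the sequence."""
    return read in sequence


def split_reads(
    reads: List[str], virus_refs: Dict[str, str], host_seqs: List[str]
) -> Tuple[List[str], List[str]]:
    """Return (reads matching any virus reference, reads left after host subtraction)."""
    virus_mapped = [
        r for r in reads if any(matches(r, ref) for ref in virus_refs.values())
    ]
    host_subtracted = [
        r for r in reads if not any(matches(r, host) for host in host_seqs)
    ]
    return virus_mapped, host_subtracted


def covered_fraction(reads: List[str], reference: str) -> float:
    """Fraction of reference positions covered by at least one exactly matching read."""
    if not reference:
        return 0.0
    covered = [False] * len(reference)
    for read in reads:
        start = reference.find(read)
        while start != -1:
            # Mark every position spanned by this occurrence of the read.
            for i in range(start, start + len(read)):
                covered[i] = True
            start = reference.find(read, start + 1)
    return sum(covered) / len(reference)


# Hypothetical toy data: one virus reference, one host sequence, four reads.
virus_refs = {"toy_virus": "ACGTACGGTTCA"}
host_seqs = ["GGTTTTTTCC"]
reads = ["ACGTAC", "GGTTCA", "TTTTTT", "CCCCCC"]

virus_mapped, host_subtracted = split_reads(reads, virus_refs, host_seqs)
coverage = covered_fraction(virus_mapped, virus_refs["toy_virus"])
```

In this sketch the host-subtracted set still contains reads that match no known virus, which is the pool a de novo assembler would work on when looking for novel viruses. The assembly step and its parameter optimization are deliberately left out.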