Location: Children's Nutrition Research Center
Title: A computational pipeline to infer alternative poly-adenylation from 3' sequencing data
|YALAMANCHILI, HARI - CHILDREN'S NUTRITION RESEARCH CENTER (CNRC)|
|ELROD, NATHAN - UNIVERSITY OF TEXAS MEDICAL BRANCH|
|JENSEN, MADELINE - UNIVERSITY OF TEXAS MEDICAL BRANCH|
|JI, PING - UNIVERSITY OF TEXAS MEDICAL BRANCH|
|LIN, AI - UNIVERSITY OF TEXAS MEDICAL BRANCH|
|WAGNER, ERIC - UNIVERSITY OF TEXAS MEDICAL BRANCH|
|LIU, ZHANDONG - BAYLOR COLLEGE OF MEDICINE|
Submitted to: Methods in Enzymology
Publication Type: Book / Chapter
Publication Acceptance Date: 3/20/2021
Publication Date: 6/5/2021
Citation: Yalamanchili, H.K., Elrod, N.D., Jensen, M.K., Ji, P., Lin, A., Wagner, E.J., Liu, Z. 2021. A computational pipeline to infer alternative poly-adenylation from 3' sequencing data. In: Tian, B. editor. Methods in Enzymology. 1st edition. Cambridge, MA: Academic Press. p. 185-204. https://doi.org/10.1016/bs.mie.2021.04.001.
Interpretive Summary: Researchers are expanding the horizons of human health and disease research by tapping into alternative polyadenylation (APA), an under-charted mechanism that regulates gene expression. APA modifies the 3-prime end (3' end) of RNA strands that are transcribed from DNA. The growing appreciation of APA and the availability of 3' sequencing datasets have created an urgent need for novel computational tools to investigate genome-wide transcriptomic diversity. To address this need, in this chapter we describe a streamlined computational pipeline to infer alternative poly-adenylation from 3' sequencing data. PolyA-miner can effectively identify novel APA sites that would otherwise go undetected by reference-based approaches. This chapter establishes and standardizes a computational pipeline to investigate alternative poly-adenylation changes in various biological phenomena, from nutrition to neurological disorders.
Technical Abstract: An increasing number of investigations have established alternative polyadenylation (APA) as a key mechanism of gene regulation through altering the length of the 3' untranslated region (UTR) and generating distinct mRNA termini. Further, appreciation for the significance of APA in disease contexts has propelled the development of several 3' sequencing techniques. While these RNA sequencing technologies have advanced APA analysis, the intrinsic limitation of 3' read coverage and the lack of appropriate computational tools constrain precise mapping and quantification of polyadenylation sites. Notably, Poly(A)-ClickSeq (PAC-seq) overcomes limiting factors such as poly(A) enrichment and 3' linker ligation steps using click-chemistry. Here we provide an updated PolyA-miner protocol, a computational approach to analyze PAC-seq or other 3'-Seq datasets. As a key practical constraint, we also provide a detailed account of the impact of sequencing depth on the number of detected polyadenylation sites and APA changes. This protocol is also updated to handle unique molecular identifiers used to address PCR duplication potentially observed in PAC-seq.