Location: Virus and Prion Research
Title: (w)HOL(e)ISTIC gene ontology and pathway analysis of data using open source web tools
FLEMING, DAMARIUS - ORISE Fellow
Submitted to: Genomics Data
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: 12/11/2019
Publication Date: 1/8/2020
Citation: Fleming, D.S., Miller, L.C. 2020. (w)HOL(e)ISTIC gene ontology and pathway analysis of data using open source web tools. Genomics Data. https://doi.org/10.21203/rs.2.20371/v1.
Interpretive Summary: Livestock diseases can cause dramatic economic loss to US farmers. Studying how these diseases occur may provide information that helps design better strategies to control and prevent them. Investigating how an animal responds to an infectious agent such as a virus is a complex process that involves collecting samples from the infected animal and then analyzing them to discover how the immune system responds. Part of this analysis is determining how gene expression changes in response to infection. To gain insight into the problems they study, researchers often need to apply additional tests to their results. For results such as lists of expressed genes, it is helpful to gather information on how the genes in the list interact with each other. One way of gathering this information is to run tests that group similar genes together by a common theme, such as disease resistance. This process can use a variety of software packages, but some of them are very expensive or not useful for studying the immune response in livestock. This paper describes a method built on open-source, freely available software that can be used to group similar genes together in order to examine the different processes affecting the results of an experiment. This new approach may help scientists better understand how diseases affect livestock, which may lead to better control and prevention strategies.
Technical Abstract: Downstream analysis of high-throughput and next-generation sequencing (NGS) experiments often provides researchers a means of deciphering their results. These downstream analyses often elucidate clusters of genes or networks of biological interest under the condition being studied. One convention for examining gene interactions is to conduct downstream investigations based on gene ontology (GO), pathway, and network analyses of statistically significant genes of interest. Unfortunately, the software available for these types of analyses is too often expensive, not species specific, and subject to gaps in annotation. These difficulties can cause studies to omit this downstream step, limiting the utility of the data. To facilitate pathway and network analyses of candidate gene lists from high-throughput studies, a workflow termed the “(w)HOL(e)ISTIC gene ontology enrichment” approach was constructed from open-source, freely available software and genomic databases. The approach centers on the overlap of multiple open-source tools to annotate, analyze, and visualize biological networks. It is a three-stage process. Stage 1 (Annotation) is the generation of alias identifiers. Stage 2 (Analysis) is a two-part process that generates ontologies and networks with statistical inferences, and it relies on information from databases such as Reactome, KEGG, and InterPro. Stage 3 (Visualization) allows for figure creation.
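As a rough illustration of the kind of statistical inference that enrichment analysis in Stage 2 can involve, the sketch below computes a one-sided hypergeometric over-representation p-value for each annotation term. The abstract does not state which test the workflow's tools apply, so the choice of test is an assumption; the function names (`hypergeom_enrichment_p`, `enrich`) and the gene and term labels are hypothetical and do not come from the paper or from any of the named databases.

```python
from math import comb


def hypergeom_enrichment_p(population, annotated, sample, hits):
    """Return P(X >= hits) for a hypergeometric draw.

    population: number of genes in the background set
    annotated:  number of background genes carrying the term
    sample:     number of candidate genes drawn from the background
    hits:       number of candidate genes carrying the term
    """
    upper = min(annotated, sample)
    total = comb(population, sample)
    # Sum the upper tail; math.comb returns 0 when k exceeds n,
    # so impossible combinations contribute nothing.
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(hits, upper + 1)
    )
    return tail / total


def enrich(candidates, background, term_to_genes):
    """Map each term to its over-representation p-value (uncorrected)."""
    background = set(background)
    candidates = set(candidates) & background
    results = {}
    for term, genes in term_to_genes.items():
        annotated = set(genes) & background
        hits = len(candidates & annotated)
        results[term] = hypergeom_enrichment_p(
            len(background), len(annotated), len(candidates), hits
        )
    return results


# Hypothetical toy data, for illustration only.
toy_background = {"geneA", "geneB", "geneC", "geneD"}
toy_terms = {"term_1": {"geneA", "geneB"}, "term_2": {"geneC"}}
toy_result = enrich({"geneA", "geneB"}, toy_background, toy_terms)
```

In practice the p-values from a test like this would need multiple-testing correction across terms, and the background set should reflect the genes actually measurable in the experiment rather than the whole genome.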