Location: Forage Seed and Cereal Research
Title: Statistical power in plant pathology research
|Gent, David - Dave|
|Esker, Paul - Universidad de Costa Rica|
|Kriss, Alissa - Syngenta Crop Protection|
Submitted to: Phytopathology
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: 8/24/2017
Publication Date: 1/31/2018
Citation: Gent, D.H., Esker, P.D., Kriss, A.B. 2018. Statistical power in plant pathology research. Phytopathology. 108(1):15-22.
Interpretive Summary: When a treatment effect exists but is not detected in a study, this is termed a type II error. These errors are most likely to occur when studies lack sufficient statistical power to detect a treatment effect size of interest. The power of a statistical test depends on several factors, including the size of the treatment effect, the variability in the data, and the sample size. This review presents information on why analysis of power is important, how to improve power, and practical examples of how power analysis can be used to improve experimental design.
Technical Abstract: In null hypothesis testing, failure to reject a null hypothesis has two potential interpretations. One is that the treatments being evaluated indeed have no significant effect and a correct conclusion was reached in the study and analysis. Alternatively, a treatment effect may have existed but the study concluded there was no significant effect. This is termed a type II error, and such errors are most likely to occur when studies lack sufficient statistical power to detect a treatment (effect) size of interest. The power of a statistical test depends on the size of the treatment effect (the effect size), the variance, the sample size, and the significance criterion (the threshold α). Low statistical power is prevalent in the scientific literature in general, including plant pathology. Power analysis is rarely reported in the plant pathology literature, creating uncertainty in the interpretation of negative results and potentially underestimating small yet biologically significant relationships. The appropriate level of power for a study depends on the relative impact of type I versus type II errors, and no single level of power is acceptable for all purposes. Nonetheless, by convention 0.8 is often considered an acceptable threshold. Although there is no single appropriate level of power, studies with power less than 0.5 generally should not be conducted if their results are intended to be conclusive. The emphasis on power analysis should be in the planning stages of an experiment. Commonly employed strategies that increase power include increasing sample size, selecting a less stringent level of α, increasing the hypothesized effect size, focusing studies on as few treatment groups as possible, and reducing the variability of measurements through greater precision and the inclusion of relevant covariates. The net impact of a properly conducted power analysis typically is to use resources more efficiently and to better focus the research questions addressed.
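As a concrete illustration of the quantities the abstract names (effect size, sample size, and the significance criterion α), the sketch below approximates the power of a two-sided, two-sample comparison under a normal approximation. This is not a method from the paper itself: the function name and the example numbers are illustrative, and for small samples an exact t-based calculation (e.g., via statistical software) would be more appropriate.

```python
from statistics import NormalDist

def power_two_sample(effect_size: float, n_per_group: int,
                     alpha: float = 0.05) -> float:
    """Approximate power of a two-sided, two-sample z-test.

    effect_size is Cohen's d (difference in means / pooled SD).
    Uses the normal approximation, so it slightly overstates power
    for small samples relative to an exact t-based calculation.
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)              # two-sided critical value
    ncp = effect_size * (n_per_group / 2) ** 0.5   # shift of the test statistic
    # probability the test statistic falls beyond either critical value
    return z.cdf(ncp - z_crit) + z.cdf(-ncp - z_crit)

# A "medium" effect (d = 0.5) with 64 experimental units per treatment
# lands just above the conventional 0.8 threshold the abstract mentions:
print(power_two_sample(0.5, 64))
```

Running the calculation for a grid of candidate sample sizes at the planning stage, as the abstract recommends, shows directly how power rises with more replication, a larger hypothesized effect, or a less stringent α.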