
A systematic review of statistical power in software engineering experiments

Title: A systematic review of statistical power in software engineering experiments
Author(s): Dybå T, Kampenes V B & Sjøberg D I K
Details: Article: Information & Software Technology, 48(8), 745-755, 2006
Abstract: Statistical power is an inherent part of empirical studies that employ significance testing and is essential for the planning of studies, for the interpretation of study results, and for the validity of study conclusions. This paper reports a quantitative assessment of the statistical power of empirical software engineering research based on the 103 papers on controlled experiments (of a total of 5,453 papers) published in nine major software engineering journals and three conference proceedings in the decade 1993–2002. The results show that the statistical power of software engineering experiments falls substantially below accepted norms as well as the levels found in the related discipline of information systems research. Given this study’s findings, additional attention must be directed to the adequacy of sample sizes and research designs to ensure acceptable levels of statistical power. Furthermore, the current reporting of significance tests should be enhanced by also reporting effect sizes and confidence intervals.
DOI: https://doi.org/10.1016/j.infsof.2005.08.009
BibTeX:
@article{dyba-power-2006,
abstract = {Statistical power is an inherent part of empirical studies that employ significance testing and is essential for the planning of studies, for the interpretation of study results, and for the validity of study conclusions. This paper reports a quantitative assessment of the statistical power of empirical software engineering research based on the 103 papers on controlled experiments (of a total of 5,453 papers) published in nine major software engineering journals and three conference proceedings in the decade 1993–2002. The results show that the statistical power of software engineering experiments falls substantially below accepted norms as well as the levels found in the related discipline of information systems research. Given this study’s findings, additional attention must be directed to the adequacy of sample sizes and research designs to ensure acceptable levels of statistical power. Furthermore, the current reporting of significance tests should be enhanced by also reporting effect sizes and confidence intervals.},
author = {Tore Dyb{\aa} and Vigdis By Kampenes and Dag I.K. Sj{\o}berg},
doi = {10.1016/j.infsof.2005.08.009},
issn = {0950-5849},
journal = {Information and Software Technology},
keywords = {Empirical software engineering, Controlled experiment, Systematic review, Statistical power, Effect size},
number = {8},
pages = {745-755},
title = {A systematic review of statistical power in software engineering experiments},
url = {https://www.sciencedirect.com/science/article/pii/S0950584905001333},
volume = {48},
year = {2006},
bdsk-url-1 = {https://www.sciencedirect.com/science/article/pii/S0950584905001333},
bdsk-url-2 = {https://doi.org/10.1016/j.infsof.2005.08.009}}
Topics: empirical software engineering, statistical power, controlled experiments
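The abstract's central concern — that experiments are planned with sample sizes too small to reach accepted power norms — can be illustrated with a minimal sketch. The function below is not from the paper; it is a standard normal-approximation power calculation for a two-sided, two-sample t-test, assuming Cohen's standardized effect size d and equal group sizes.

```python
from math import sqrt
from statistics import NormalDist

def two_sample_power(d, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-sample t-test via the
    normal approximation: power ~= Phi(|d| * sqrt(n/2) - z_{1-alpha/2}).
    d is Cohen's standardized mean difference (illustrative helper,
    not a function from the reviewed paper)."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)          # critical value, e.g. ~1.96 at alpha=0.05
    return z.cdf(abs(d) * sqrt(n_per_group / 2) - z_crit)

# Cohen's conventional "medium" effect (d = 0.5) with 20 subjects per group
# yields power well below the customary 0.80 norm.
print(round(two_sample_power(0.5, 20), 2))
```

Running the sketch shows that a typical small experiment (20 subjects per group, medium effect) has power around 0.35, far below the 0.80 convention — the kind of shortfall the review documents across software engineering experiments.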