Statistical power is the ability of a statistical test to detect an effect of a certain size, if the effect really exists. In other words, power is the probability of correctly rejecting the null hypothesis when it should be rejected. So while statistical significance deals with Type I errors (false positives, with probability α), power analysis deals with Type II errors (false negatives, with probability β), which means power is 1 - β.
Cohen (1988) recommends that research studies be designed with a significance level (α) of .05 or stricter, and if we use Cohen’s rule of .2 for β, then 1 - β = .80 (an 80% chance of detecting an effect that really exists).
We often employ power analysis during the sampling phase (a priori) to determine the sample size required to meet the .05 significance and .80 power cut-offs. Alternatively, or additionally, we determine power when evaluating the data (post hoc) to gauge the likelihood of a Type II error, but if you find insufficient power at that point, it’s a bit too late to get back to the sampling phase!
If we calculate a power of .80 or more, we can be reasonably confident that the study had sufficient power to detect an effect of the assumed size, if it exists. If power is below .80, we may want to replicate the study with a larger sample (if post hoc) or plan for a larger sample size (if a priori).
Note that power increases with sample size, all else being equal. Also, smaller effect sizes normally require larger sample sizes to achieve the desired power. As can be expected, the more stringent the significance level we choose (e.g. .0001 rather than .01), the larger the sample size required to achieve the desired power.
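To make these three relationships concrete, here is a minimal sketch using only the Python standard library. It assumes a two-sided comparison of two equal-sized groups and uses the normal approximation rather than the exact noncentral t calculation that a tool like G*Power performs, so the numbers are slightly optimistic for small samples. The function name approx_power and its parameters are my own illustrative choices.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal distribution (mean 0, sd 1)


def approx_power(d, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-group comparison of means.

    d           -- assumed Cohen's d (standardized mean difference)
    n_per_group -- number of observations in each group
    alpha       -- Type I error probability

    Assumption: normal approximation, equal group sizes.
    """
    z_crit = _STD_NORMAL.inv_cdf(1 - alpha / 2)   # two-sided critical value
    shift = abs(d) * (n_per_group / 2) ** 0.5     # expected test statistic
    # Probability of landing in either rejection region under the alternative.
    return _STD_NORMAL.cdf(shift - z_crit) + _STD_NORMAL.cdf(-shift - z_crit)


# Larger samples give more power (d and alpha held fixed).
for n in (20, 50, 100):
    print(f"d=0.5, alpha=.05, n={n:>3}: power={approx_power(0.5, n):.2f}")

# Smaller effects give less power (n and alpha held fixed).
for d in (0.2, 0.5, 0.8):
    print(f"d={d}, alpha=.05, n= 50: power={approx_power(d, 50):.2f}")

# Stricter significance levels give less power (d and n held fixed).
for alpha in (0.01, 0.0001):
    print(f"d=0.5, alpha={alpha}, n= 50: power={approx_power(0.5, 50, alpha):.2f}")
```

Running the loops shows power rising with n, falling as d shrinks, and falling as α becomes stricter, which is why each of those changes pushes the required sample size up.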
Much has been written about power analysis, so I will not offer further details here.
To calculate power, download the small G*Power calculator which is available for both PC and Mac.
Because we know the Type I error probability (α), the conventional Type II error probability (β), and the Cohen’s d (effect size) we want to detect, we can calculate the minimum sample size. For a post hoc power evaluation, we instead enter the known sample size and solve for the unknown power.
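If you want to sanity-check the calculator’s output, both directions of the calculation can be sketched in a few lines of standard-library Python. This is a rough approximation under stated assumptions, not what G*Power does internally: it assumes a two-sided test on two equal-sized groups and uses the normal approximation, so an exact t-based calculation will typically ask for a slightly larger sample. The function names are hypothetical.

```python
import math
from statistics import NormalDist

_Z = NormalDist()  # standard normal distribution


def required_n_per_group(d, alpha=0.05, power=0.80):
    """A priori: smallest n per group for the target power.

    Assumption: normal approximation, two-sided test, equal group sizes.
    """
    z_alpha = _Z.inv_cdf(1 - alpha / 2)  # critical value for alpha
    z_power = _Z.inv_cdf(power)          # quantile matching 1 - beta
    return math.ceil(2 * ((z_alpha + z_power) / d) ** 2)


def post_hoc_power(d, n_per_group, alpha=0.05):
    """Post hoc: approximate power given the sample size actually used."""
    z_alpha = _Z.inv_cdf(1 - alpha / 2)
    shift = abs(d) * math.sqrt(n_per_group / 2)
    return _Z.cdf(shift - z_alpha) + _Z.cdf(-shift - z_alpha)


# A priori: medium effect (d = 0.5), alpha = .05, power = .80.
n = required_n_per_group(0.5, alpha=0.05, power=0.80)
print("required n per group:", n)

# Post hoc: the same design evaluated at the sample size just computed.
print("power at that n:", round(post_hoc_power(0.5, n), 3))
```

For d = 0.5, α = .05, and power = .80, this approximation asks for 63 participants per group, and evaluating the post hoc function at that sample size returns a power just above .80, as it should.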
Once we have the critical elements of statistical significance, practical significance (effect size), and statistical power, then we can make a statement as follows:
“The difference is significant at p < .05, with an effect size of .69, which is a moderate (and typical) effect, and the power is .90 (well above Cohen’s suggested .80), so the probability that I have made either type of error is low. So I am very confident in saying there is a significant, moderate, and correctly identified difference between my two test groups.”
Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). Hillsdale, NJ: Erlbaum.