# Test statistics and significance

Test statistics such as the F statistic, the t statistic, and the χ² statistic all compare, in broad terms, the variance explained by our model (the effect) with the variance our model does not explain (the error), expressed as a ratio. Our model can be as basic as a mean score, calculated as the sum of the observed scores divided by the number of observations. If this ratio is greater than 1, the explained variance (effect) is larger than the unexplained variance (error). The higher the ratio, the better our model accounts for the data.
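As a rough illustration of this ratio, the sketch below computes a one-way ANOVA F statistic by hand: the between-group mean square (explained variance) divided by the within-group mean square (unexplained variance). The three groups of scores and the function name `f_ratio` are made up for the example.

```python
from statistics import mean

def f_ratio(groups):
    """One-way ANOVA F: between-group mean square / within-group mean square."""
    all_scores = [x for g in groups for x in g]
    grand_mean = mean(all_scores)
    # Explained variation: how far each group mean sits from the grand mean
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Unexplained variation: how far each score sits from its own group mean
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)

# Hypothetical scores for three groups (not real data)
groups = [[4, 5, 6], [7, 8, 9], [10, 11, 12]]
f_value = f_ratio(groups)
print(f_value)
```

A value well above 1 means the group means differ by much more than the spread inside each group would suggest.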

Let's say the ratio is 5 (rather than 1), so the explained variance (effect) is 5 times larger than the unexplained variance (error). At a 95% confidence level, if the probability of getting a test statistic at least this large when there is no real effect is lower than 5% (p < .05), then a value this high is unlikely to arise by chance alone. We therefore conclude that there very likely is a real effect, and we reject the null hypothesis that there is no effect (no difference or association).
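To make the decision rule concrete, here is a minimal sketch that turns a test statistic into a p-value and compares it with the 5% threshold. It uses a z statistic and the standard normal distribution as a stand-in, only because Python's standard library ships that distribution; an F or t statistic would need its own distribution. The function name `two_sided_p` is an assumption of this example.

```python
from statistics import NormalDist

def two_sided_p(z):
    """Probability of a z statistic at least this extreme if the null is true."""
    return 2 * (1 - NormalDist().cdf(abs(z)))

alpha = 0.05  # corresponds to a 95% confidence level
for z in (1.0, 1.96, 5.0):
    p = two_sided_p(z)
    print(f"z = {z}: p = {p:.6f}, reject null: {p < alpha}")
```

The larger the statistic, the smaller the probability of seeing it by chance, and the stronger the case for rejecting the null hypothesis.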

If the test statistic is low, it is more likely to have arisen by chance and may therefore show as not significant. However, keep in mind that significance depends on sample size: larger samples make it more likely that a test comes out significant, so even a small (or very small) effect may be flagged as significant although it has little practical importance. This is not the same as a Type I error, which means rejecting a null hypothesis that is actually true.
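The dependence on sample size can be sketched with a simple two-group comparison. The example assumes a z-test with a known standard deviation and equal group sizes, and the standardized difference of 0.05 is a hypothetical, deliberately tiny effect; `p_for_difference` is a name invented for this sketch.

```python
from math import sqrt
from statistics import NormalDist

def p_for_difference(d, n_per_group):
    """Two-sided p-value for a standardized mean difference d between two
    equal-sized groups, assuming a known standard deviation (z-test)."""
    z = d * sqrt(n_per_group / 2)
    return 2 * (1 - NormalDist().cdf(abs(z)))

d = 0.05  # hypothetical, very small effect
for n in (100, 1_000, 10_000):
    print(n, round(p_for_difference(d, n), 4))
```

The effect never changes, yet the p-value shrinks as the groups grow, which is why a significant result on its own says little about how large the effect is.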

Therefore we need to calculate the *effect size* to determine the magnitude of the effect (its practical significance). See Cohen's conventions for what counts as a small, medium, or large effect. We should also ask how likely it is that we made a Type II error: saying there is no difference (failing to reject the null hypothesis) when in fact there is one, so that we fail to detect a real difference. To limit the risk of a Type II error, we need to determine the statistical power. Both effect size and statistical power can be calculated with the G*Power calculator.
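For readers who want to see the arithmetic, the sketch below computes Cohen's d for two groups and a rough statistical power figure. The power calculation is a normal approximation, so it will differ slightly from what a dedicated tool based on the t distribution reports. The data and the function names `cohens_d` and `approx_power` are assumptions of this example.

```python
from math import sqrt
from statistics import NormalDist, mean, variance

def cohens_d(a, b):
    """Standardized mean difference using the pooled standard deviation."""
    n_a, n_b = len(a), len(b)
    pooled_var = ((n_a - 1) * variance(a) + (n_b - 1) * variance(b)) / (n_a + n_b - 2)
    return (mean(a) - mean(b)) / sqrt(pooled_var)

def approx_power(d, n_per_group, alpha=0.05):
    """Normal-approximation power of a two-sided, two-sample test."""
    norm = NormalDist()
    z_crit = norm.inv_cdf(1 - alpha / 2)
    shift = abs(d) * sqrt(n_per_group / 2)
    # Chance that the statistic lands in either rejection region
    return norm.cdf(shift - z_crit) + norm.cdf(-shift - z_crit)

# Hypothetical scores for two groups (not real data)
group_a = [5, 6, 7, 8, 9]
group_b = [3, 4, 5, 6, 7]
d = cohens_d(group_a, group_b)
power = approx_power(0.5, 64)  # medium effect, 64 observations per group
print(round(d, 2), round(power, 2))
```

Cohen's commonly quoted benchmarks treat d of about 0.2 as small, 0.5 as medium, and 0.8 as large. Power is 1 minus the probability of a Type II error, so a power near 0.8 means roughly a 20% chance of missing an effect of that size.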

**Further Reading:**

Ziliak, S. T. and McCloskey, D. N., “The Cult of Statistical Significance: How the Standard Error Costs Us Jobs, Justice, and Lives”, The University of Michigan Press, Ann Arbor, 2008.

Ioannidis, J. P. A. (2005) Why Most Published Research Findings Are False, PLoS Medicine, 2(8), e124.

Sterne, J. A. C. and Smith, G. D. (2001) Sifting the evidence—what’s wrong with significance tests? BMJ

Cohen, J. (1988) Statistical Power Analysis for the Behavioral Sciences, 2nd ed., Lawrence Erlbaum Associates, Hillsdale, NJ.

_________________________________________

/zza88