# Is my Likert-scale data fit for parametric statistical procedures?

We’re all very familiar with the “Likert-scale”, but do we know that a true Likert-scale consists not of a single item but of several items? Under the right conditions, i.e. once the items have been assessed for reliability (e.g. intercorrelations between all pairs of items) and validity (e.g. convergent, discriminant, construct), they can be summed into a single score. The Likert-scale is a unidimensional scaling method (so it measures a one-dimensional construct), is bipolar, and in its purest form consists of only 5 scale points, though we often refer to a 7-point or even a 9- or 10-point scale as a Likert-scale.
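To make the "several items summed into one score" idea concrete, here is a minimal sketch in plain Python. The item names and responses are made up for illustration. Pearson's correlation is used for the inter-item check only because it is the simplest to write out; a rank-based coefficient could be swapped in.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical data: one list per item, one entry per respondent (1-5).
items = {
    "item_1": [4, 5, 2, 3, 1, 4],
    "item_2": [5, 5, 1, 3, 2, 4],
    "item_3": [4, 4, 2, 2, 1, 5],
}

# Inter-item correlation for every pair of items (a basic reliability check).
inter_item = {
    (a, b): pearson(items[a], items[b]) for a, b in combinations(items, 2)
}

# Summed scale score per respondent: add up that respondent's item responses.
scale_scores = [sum(answers) for answers in zip(*items.values())]

for pair, r in inter_item.items():
    print(pair, round(r, 2))
print(scale_scores)
```

Low or negative inter-item correlations would suggest that the items do not tap the same construct and should not simply be added together.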

There’s a difference between a true “Likert-scale” (a series of unidimensional items, each measured on a 5-point scale) and a “Likert-type scale” (which could be anything close to the true Likert-scale, such as a single 10-point scale). Make sure to differentiate between these types in your reporting. Don’t write about a “single 7-point Likert-scale”, because a single item is not a Likert-scale. Call it a “Likert-type scale” instead.

Furthermore, the Likert-scale consists of a set of ordered categories, which produces ordinal data (and not interval-scaled data, as we generally believe). Technically, such data should be analysed with non-parametric tests such as the Mann-Whitney U test, the Kruskal-Wallis test, and Spearman’s rank correlation, and not with t-tests, ANOVA, or Pearson’s product-moment correlation.

Does this mean that all along we have used the wrong tests with Likert-scale data? Probably many times, but not always. How do we judge? Well, it’s not really the scale that matters as much as the underlying construct we are measuring, which should be continuous. If respondents perceive the differences between adjacent levels (or labels) on the scale as equal (equidistant), then we can safely use parametric tests. Also, if conditions on skewness and the number of categories are met (i.e. the distribution is not heavily skewed and the scale has enough categories), our p- and F-values can be perfectly valid. However, if you combine some of the categories (e.g. agree vs neutral vs disagree), it’s best to test for differences with Kruskal-Wallis (which compares mean ranks) or other non-parametric tests. In regression, check specifically for normality and equal variance of the residuals (i.e. homoscedasticity).
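As a rough illustration of checking skewness and the number of categories in use, the sketch below computes a simple moment-based skewness coefficient in plain Python. The responses are made up, and what counts as "too skewed" or "too few categories" is left to the analyst; no cut-off is implied here.

```python
from collections import Counter


def sample_skewness(values):
    """Moment-based skewness g1 = m3 / m2**1.5 (0 for a symmetric sample)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    return m3 / m2 ** 1.5


# Hypothetical responses on a 5-point scale.
responses = [1, 2, 2, 3, 3, 3, 4, 4, 5]

counts = Counter(responses)      # how often each category was chosen
categories_used = len(counts)    # number of distinct categories in use
skew = sample_skewness(responses)

print("categories used:", categories_used)
print("skewness:", round(skew, 3))
```

A value near zero points to a roughly symmetric distribution. A strongly positive or negative value means responses pile up at one end of the scale, which is when parametric results deserve more suspicion.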

A safe bet is to use an expanded version (a 7-, 9-, 10- or 20-point scale) and compare the results of the non-parametric test with those of its parametric counterpart. A serious discrepancy could indicate a violation of parametric assumptions. Ensure that the labels describing the points on the scale suggest more or less equal intervals between points, and accompany the labels with visuals (such as happy-sad faces) to encourage respondents to think along a continuum.
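One way to sketch this side-by-side comparison is with Pearson's correlation (parametric) and Spearman's rank correlation (non-parametric) on the same pair of variables. The example below uses only the standard library and made-up 7-point responses. The variable names are hypothetical, and how large a gap counts as "serious" is a judgment call, not something the code decides.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # sorted positions i..j share this rank
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical paired responses on a 7-point scale.
satisfaction = [1, 2, 2, 3, 4, 5, 6, 7, 7, 7]
loyalty = [2, 1, 3, 3, 4, 4, 6, 6, 7, 7]

r = pearson(satisfaction, loyalty)
rho = spearman(satisfaction, loyalty)
gap = abs(r - rho)

print("Pearson r:", round(r, 3))
print("Spearman rho:", round(rho, 3))
print("absolute gap:", round(gap, 3))
```

If the two coefficients tell much the same story, the parametric result is probably not being driven by unequal spacing between scale points. A large gap is a cue to look more closely at the distribution before trusting the parametric figure.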

A more stringent alpha level, such as .01 or even .001, is a safer bet with Likert-scales than the less conservative level of .05.

_________________________________________

**Further Reading:**

Carifio, J. and Perla, R. J. (2007). Ten Common Misunderstandings, Misconceptions, Persistent Myths and Urban Legends about Likert Scales and Likert Response Formats and their Antidotes. *Journal of Social Sciences*, 3 (3), 106-116. Accessed on Sep 16, 2012 at: thescipub.com/pdf/10.3844/jssp.2007.106.116

_________________________________________

/zza73