Way back in 1946, Stanley S. Stevens (1906–1973) of Harvard University published his classic paper “On the Theory of Scales of Measurement”, in which he described four basic empirical operations that can be applied when measuring objects, and identified the scale associated with each. He called them “determination of equality” (nominal scales), “determination of greater or less” (ordinal scales), “determination of equality of intervals or differences” (interval scales), and “determination of equality of ratios” (ratio scales).
Yes, I know we all know what nominal, ordinal, interval and ratio scales are, and yes, we always skip these chapters in research handbooks. But maybe it is time we revisit them and make sure we differentiate accurately among the terms related to measurement scales, such as “true interval scale”, “continuous”, “categorical”, “discrete”, “qualitative”, “quantitative”, “parametric”, “non-parametric”, etc.
What is striking in Stevens’s paper is that he defined interval scales as “determination of equality of intervals or differences” but admitted that psychological measurement (he was working in the “Psycho-Acoustic Lab” at Harvard) “aspires to create interval scales, and it sometimes succeeds”. Sometimes succeeds! He goes on to write that “the problem usually is to devise operations for equalising the units of the scales”.
It is 66 years since Stevens wrote that few measurements succeed in using interval scales accurately (they mostly fail to reflect equality of intervals), and it seems that as researchers we have accepted that we don’t truly comply with the theory of what an interval scale really is (or wait, maybe we are just unaware of, or have forgotten, what a real interval scale is).
Let’s quickly review the scales of measurement and their terminology, as understanding them is vital for selecting the most appropriate statistical procedures.
NOMINAL, ORDINAL, INTERVAL AND RATIO DATA
Nominal scales: If the data can take only two values (e.g. 0, 1), we refer to it as dichotomous or binary; with more than two categories it is simply called nominal, and the observations are assigned codes or labels. These categories can be counted (frequency or count data) but they can’t be ordered or measured. Examples include gender where females=0 and males=1, or four groups of colours coded as 1, 2, 3, and 4.
Ordinal scales: Stevens described this group as “isotonic or order-preserving”: the observations can be ranked (or ordered) but cannot be measured. Most of the scales we use are of this kind. On a 5-point rating scale labelled 1=“strongly dislike”, 2=“somewhat dislike”, 3=“neutral”, 4=“somewhat like” and 5=“strongly like”, a rating of 5 indicates more enjoyment (liking) than a rating of 4, and 4 more than 3, but nothing is said about the size of the differences. In fact, Stevens wrote that “most of the scales used widely and effectively by psychologists are ordinal scales” (and please count in marketing researchers). He added that, in the strictest propriety, the ordinary statistics involving means and standard deviations ought not to be used with these scales, “for these statistics imply a knowledge of something more than the relative rank-order of data.” So it’s clear that ordinal ranking provides only relative positions in an ordered series and no real measure of extent or magnitude, as there is no indication of the size of the differences between ranks. But wait! While this implies that our 5-point Likert scale is ordinal, we really should not judge by the scale alone, but by the construct under measurement. See the post entitled “Is my Likert-Scale Data Fit for Parametric Statistical Procedures?”
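To see why means are shaky on ordinal data, here is a minimal sketch in Python. The two sets of ratings are invented for illustration, and the recoding (5 becomes 10) is an arbitrary choice of mine; all that matters is that it preserves the order of the categories, which is the only thing an ordinal scale guarantees.

```python
from statistics import mean, median

# Hypothetical 5-point ratings from two groups (illustrative data only).
group_a = [1, 1, 5, 5, 5]
group_b = [4, 4, 4, 4, 4]

# An order-preserving recoding: the ranks stay 1 < 2 < 3 < 4 < 5,
# but the top category is now numerically further away.
recode = {1: 1, 2: 2, 3: 3, 4: 4, 5: 10}
group_a_recoded = [recode[r] for r in group_a]
group_b_recoded = [recode[r] for r in group_b]

# The comparison of means flips under the recoding.
print(mean(group_a), mean(group_b))                  # A below B
print(mean(group_a_recoded), mean(group_b_recoded))  # A above B

# The comparison of medians does not change.
print(median(group_a), median(group_b))
print(median(group_a_recoded), median(group_b_recoded))
```

Because both codings are equally valid for ordinal data, a conclusion that depends on which one you picked (the comparison of means) is not supported by the data, while one that survives any order-preserving recoding (the comparison of medians) is.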
Interval scales: Where the distance between any two adjacent units of measurement (or “intervals”) is clearly implied to be the same, even though the zero point is arbitrary or absent, we can talk about an interval scale. Examples include temperature scales such as Celsius and Fahrenheit, which have an arbitrary zero point, so 40°C or 40°F is not twice as hot as 20°C or 20°F. In contrast, the Kelvin temperature scale has an absolute zero point (the temperature at which all thermal motion ceases in the classical description of thermodynamics), and is thus a true ratio scale. Scores on an interval scale can be added and subtracted (and you can calculate changes in terms of scale values), but they cannot be meaningfully multiplied or divided: with no true zero point of reference we cannot, and should not, calculate percentages or ratios. Interval scales are most commonly used in marketing research to measure human attitudes, perceptions, feelings, preferences, likes, etc. It is important to be very careful when making statements about the relationships among mean scores or ratings, whether between two or more groups or within a group. As we don’t have a true zero reference point, we should never say that a rating score of 10/10 is twice as good as 5/10. Statements like that can only be made for ratio scales.
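The temperature example can be checked with a few lines of Python. This is only a sketch: the helper names are my own, and the conversion formulas are the standard ones. It shows that the ratio of two readings depends on the scale you happen to use, whereas the ratio of two differences does not.

```python
def celsius_to_fahrenheit(c):
    """Standard conversion: F = C * 9/5 + 32."""
    return c * 9 / 5 + 32

def celsius_to_kelvin(c):
    """Standard conversion: K = C + 273.15."""
    return c + 273.15

hot_c, mild_c, cool_c = 40.0, 30.0, 20.0

# Ratios of readings change with the (arbitrary) zero point.
ratio_celsius = hot_c / cool_c
ratio_fahrenheit = celsius_to_fahrenheit(hot_c) / celsius_to_fahrenheit(cool_c)
ratio_kelvin = celsius_to_kelvin(hot_c) / celsius_to_kelvin(cool_c)
print(ratio_celsius, ratio_fahrenheit, ratio_kelvin)

# Ratios of differences are the same on both interval scales.
diff_ratio_celsius = (hot_c - cool_c) / (mild_c - cool_c)
diff_ratio_fahrenheit = (
    (celsius_to_fahrenheit(hot_c) - celsius_to_fahrenheit(cool_c))
    / (celsius_to_fahrenheit(mild_c) - celsius_to_fahrenheit(cool_c))
)
print(diff_ratio_celsius, diff_ratio_fahrenheit)
```

Only the Kelvin ratio (about 1.07) says anything physical; the “twice as hot” suggested by the Celsius figures is an artefact of where zero was placed.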
Ratio scales: Stevens described ratio-scaled data as possible only when there exist operations for determining all four relations: equality (nominal), rank-order (ordinal), equality of intervals (interval), and equality of ratios (ratio). Ratio scales have a true zero point, so we can calculate changes both in scale values and in percentages. Examples include income, years of experience, and firmographics such as gross dollar sales.
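Since each scale adds one relation to those of the scale before it, the statistics that make sense accumulate in the same way. The sketch below is a simplified illustration of that idea, not a reproduction of Stevens’s full table; the function name and the short lists of statistics are my own shorthand.

```python
# Scales ordered from the fewest to the most relations supported.
SCALES = ["nominal", "ordinal", "interval", "ratio"]

# Statistics that first become meaningful at each level (simplified lists).
NEW_STATISTICS = {
    "nominal": ["count", "mode"],
    "ordinal": ["median", "percentile"],
    "interval": ["mean", "standard deviation"],
    "ratio": ["coefficient of variation", "percentage change"],
}

def permissible_statistics(scale):
    """Return the statistics for this scale and every scale below it."""
    level = SCALES.index(scale)
    allowed = []
    for name in SCALES[: level + 1]:
        allowed.extend(NEW_STATISTICS[name])
    return allowed

print(permissible_statistics("ordinal"))
print(permissible_statistics("ratio"))
```

Looking up "ordinal" returns counts, mode, median and percentiles but not the mean, which is the strict reading of the ordinal scale discussed above.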
Note that for marketing research purposes we normally just combine interval and ratio data into “interval+ data” or just “interval data”. Also, in marketing research we too often treat ordinal scales as interval. Regardless of what we normally do, be cognizant of what real interval scales are.