Getting the hang of z-scores

If we draw a random sample from a normally distributed population, we can assume that our sample distribution is also approximately normal (provided the sample size is more than 30). If the data have a mean of zero and a standard deviation (SD) of 1, we can calculate the probability of obtaining a particular score. To centre our data around a mean of zero, we subtract the overall mean from each individual score, then divide the result by the standard deviation. This is the process of standardising raw data into z-scores.
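The subtract-the-mean, divide-by-the-SD step can be sketched in a few lines of Python. This is a minimal illustration using only the standard library; the function name `z_scores` and the sample data are hypothetical.

```python
import statistics

def z_scores(raw):
    """Standardise raw scores: subtract the mean, then divide by the SD."""
    m = statistics.mean(raw)
    sd = statistics.stdev(raw)  # sample standard deviation
    return [(x - m) / sd for x in raw]

scores = [4, 5, 6, 7, 8]  # hypothetical raw scores
print(z_scores(scores))   # the middle score equals the mean, so its z is 0
```

After the transformation the new variable has a mean of zero and an SD of 1, whatever the original scale was.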
This transforms our data onto the “standard normal distribution”; each transformed value is referred to as a “standard score” or a “z-score”, which is scale-free and has a mean equal to zero and a variance (and SD) equal to 1. Scale-free scores mean we can easily compare data collected on differently calibrated scales. Standardised scores are also useful in further analysis, such as in interaction terms in multiple regression and in comparing samples from different populations. In fact, effect size is essentially a kind of z-score.
Most standard statistics software can transform raw scores into z-scores, saved as a new variable. Inspecting the newly created z-scores: if a score has a value of zero, it is equal to the variable’s group mean; if positive, it is above the mean; and if negative, it is below the mean. A z-score of ±1 is 1 standard deviation above/below the mean, ±2 is 2 standard deviations above/below the mean, and ±3 is 3 standard deviations above/below the mean. In the assessment of univariate outliers we may choose to eliminate cases with z-scores of ±2 or beyond (in a normal distribution, only about 5% of cases fall outside ±1.96). The same applies when we detect bivariate or multivariate outliers and use the z-scores of the residuals.
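Screening for univariate outliers by z-score can be sketched as below. The function name `flag_outliers` and the example z-scores are hypothetical; the ±2 cutoff follows the rule of thumb above.

```python
def flag_outliers(z_scores, cutoff=2.0):
    """Return the indices of cases whose |z| meets or exceeds the cutoff."""
    return [i for i, z in enumerate(z_scores) if abs(z) >= cutoff]

# Hypothetical z-scores for six cases; the last case is extreme.
zs = [0.3, -0.8, 1.1, -0.2, 0.4, 2.7]
print(flag_outliers(zs))  # → [5]
```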
What else can we do with z-scores? We can convert them to a percentile rank. By looking up a z-score in a z-table (or an online calculator) we can find what percentage of the area under the normal curve falls below it; for example, 93.32% of scores fall below z = +1.5 (1.5 standard deviations above the mean). Note that you should be confident that your distribution of interest is normal (or very close to it)!
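Instead of a z-table, the standard normal cumulative probability can be computed directly from the error function, which Python’s `math` module provides. The function name `percentile_rank` is a hypothetical helper.

```python
import math

def percentile_rank(z):
    """Area under the standard normal curve below z, via the error function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

print(round(percentile_rank(1.5) * 100, 2))  # → 93.32
```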
Converting raw scores into z-scores does not change the shape of their distribution.
  • z-scores of ±1.96 cut off the top 2.5% and the bottom 2.5% of the distribution. Together they cut off 5%, so the remaining 95% of z-scores lie between -1.96 and +1.96.
  • z-scores of ±2.58 cut off 1% of scores (99% of z-scores lie between -2.58 and +2.58).
  • z-scores of ±3.29 cut off 0.1% of scores (99.9% of z-scores lie between -3.29 and +3.29).
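The three cutoffs above can be verified with the same error-function identity: the proportion of a standard normal distribution lying between -z and +z is erf(z/√2). A small check, with a hypothetical helper name `proportion_between`:

```python
import math

def proportion_between(z):
    """Proportion of a standard normal distribution between -z and +z."""
    return math.erf(z / math.sqrt(2))

for z in (1.96, 2.58, 3.29):
    print(f"±{z}: {proportion_between(z) * 100:.1f}% of scores inside")
```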
However, standard deviations remain a thorn in the side of some people, as they are not that easy to interpret. To make things easier, we can calculate the coefficient of variation (CV), a normalised measure of dispersion within a probability distribution. It is calculated as the standard deviation divided by the mean. Higher values mean higher variability (dispersion / variation), which indicates a higher level of inconsistency in the attitudes of respondents. While it is fine to report top-2-box scores, means, standard deviations, and z-scores, the coefficient of variation is the only one of these that indicates variability (a relative measure) rather than just dispersion (an absolute measure).
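The CV is a one-line calculation; a minimal sketch, with a hypothetical function name and made-up data:

```python
import statistics

def coefficient_of_variation(raw):
    """CV = standard deviation / mean (often reported as a percentage)."""
    return statistics.stdev(raw) / statistics.mean(raw)

ratings = [2, 4, 6]  # hypothetical ratings: mean 4, SD 2
print(coefficient_of_variation(ratings))  # → 0.5
```

Note that the CV is only meaningful for ratio-scale data with a positive mean; it breaks down when the mean is near zero.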

Reporting z-scores the APA way:
Example: “Standardised z-scores were computed for the raw customer satisfaction scores. For the raw score 9.2, z = 2.05, indicating that this score was well above the average customer satisfaction score.”