Correlation and covariance matrices

Many statistical procedures, such as the ANOVA family, analyses with covariates, and multivariate tests, rely on covariance matrices, correlation matrices, or both.

Several statistical assumptions and the tests used to check them, such as Levene’s test for homogeneity of variance, Box’s M test for homogeneity of variance-covariance matrices, and the assumption of sphericity, specifically address the properties of the variance-covariance matrix (also referred to as the covariance matrix or dispersion matrix).

The covariance matrix, as shown below, contains the variances of the scores on the diagonal and the covariances between pairs of variables off the diagonal.

Variance is a measure of the variability or spread in a set of data, while covariance is a measure of how much two variables change together, either in the same direction (positive covariance) or in opposite directions (negative covariance). If larger values of one variable correspond to larger values of the other variable (and smaller values to smaller values), then the variables show similar behaviour and the covariance is a positive number. However, if larger values of one variable correspond to smaller values of the other, then the covariance is negative. Therefore, the sign of the covariance shows the direction of the linear relationship between the variables.
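As a minimal sketch of these definitions, the following Python snippet computes sample variances and covariances by hand (using the n - 1 denominator) and arranges them into a covariance matrix. The data and the names hours, score, errors and sample_covariance are hypothetical and chosen only for illustration.

```python
from statistics import mean


def sample_covariance(x, y):
    """Sample covariance of two equal-length sequences (n - 1 denominator)."""
    mean_x, mean_y = mean(x), mean(y)
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (len(x) - 1)


# Hypothetical data for five students (illustrative values only).
hours = [1.0, 2.0, 3.0, 4.0, 5.0]        # hours studied
score = [52.0, 60.0, 61.0, 70.0, 77.0]   # exam score
errors = [9.0, 7.0, 8.0, 4.0, 2.0]       # errors made

# The variance of a variable is its covariance with itself.
var_hours = sample_covariance(hours, hours)

# Scores rise with hours (positive sign); errors fall with hours (negative sign).
cov_hours_score = sample_covariance(hours, score)
cov_hours_errors = sample_covariance(hours, errors)

# Covariance matrix: variances on the diagonal, covariances off the diagonal.
variables = [hours, score, errors]
cov_matrix = [[sample_covariance(a, b) for b in variables] for a in variables]

for row in cov_matrix:
    print(row)
```

The matrix is symmetric, because the covariance of hours with score is the same as the covariance of score with hours.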

However, the magnitude of a covariance is unbounded and depends on the units of measurement, so it is difficult to interpret in its raw form (as in the above matrix). It needs to be standardised to a value bounded by -1 and +1. These standardised values are called correlations, and together they form the correlation matrix (as shown in the matrix below).
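To illustrate why a raw covariance is hard to interpret, the sketch below measures the same hypothetical heights in centimetres and in metres. The relationship with weight is identical in both cases, yet the covariance differs by a factor of 100. The data and the helper sample_covariance are assumptions made for this example.

```python
from statistics import mean


def sample_covariance(x, y):
    """Sample covariance of two equal-length sequences (n - 1 denominator)."""
    mean_x, mean_y = mean(x), mean(y)
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (len(x) - 1)


# Hypothetical heights and weights for five people (illustrative values only).
height_cm = [150.0, 160.0, 170.0, 180.0, 190.0]
weight_kg = [50.0, 58.0, 66.0, 77.0, 84.0]

# The same heights expressed in metres.
height_m = [h / 100 for h in height_cm]

cov_cm = sample_covariance(height_cm, weight_kg)
cov_m = sample_covariance(height_m, weight_kg)

print("covariance with height in cm:", cov_cm)
print("covariance with height in m: ", cov_m)
print("ratio:", cov_cm / cov_m)
```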

Correlation (Pearson’s r) is the standardised form of covariance and measures the direction and strength of the linear association between two variables. It is defined as the covariance of the two variables divided by the product of their standard deviations. Correlations are therefore scale-invariant (not scale dependent), which makes comparisons easier, as all values must fall between -1 and +1. Along the diagonal of the correlation matrix (see below) is the value 1 (perfect correlation), since each diagonal entry is the correlation of a variable with itself. Off the diagonal are the coefficients of the bivariate correlations.
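The definition can be sketched in Python by dividing each covariance by the product of the two standard deviations. The variables x1, x2 and x3 and the function names are hypothetical, and the last lines check, under these assumptions, that rescaling a variable leaves the correlations unchanged.

```python
from math import sqrt
from statistics import mean


def sample_covariance(x, y):
    """Sample covariance of two equal-length sequences (n - 1 denominator)."""
    mean_x, mean_y = mean(x), mean(y)
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (len(x) - 1)


def correlation_matrix(variables):
    """Pearson correlation matrix: each covariance divided by the product of the SDs."""
    k = len(variables)
    cov = [[sample_covariance(a, b) for b in variables] for a in variables]
    # Standard deviations are the square roots of the diagonal variances.
    sd = [sqrt(cov[i][i]) for i in range(k)]
    return [[cov[i][j] / (sd[i] * sd[j]) for j in range(k)] for i in range(k)]


# Hypothetical scores on three variables for five cases (illustrative values only).
x1 = [2.0, 4.0, 6.0, 8.0, 10.0]
x2 = [1.0, 3.0, 2.0, 5.0, 4.0]
x3 = [9.0, 7.0, 8.0, 4.0, 2.0]

corr = correlation_matrix([x1, x2, x3])
for row in corr:
    print([round(r, 3) for r in row])

# Multiplying x1 by 100 changes its covariances but not its correlations.
corr_rescaled = correlation_matrix([[v * 100 for v in x1], x2, x3])
```

Every entry lies between -1 and +1, and the diagonal entries equal 1 up to floating-point rounding.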