correlation and causation

December 13, 2017

“…the difference between correlation and causation does not matter if you have enough data” is complete nonsense, but I still hear this assertion, even when the phrase “While correlation doesn’t mean causation…” is tossed in. The mix-up between correlation and causation came to my attention vividly many years ago in the context of what I would call political epidemiology. The distinction is also something statisticians learn in the classroom (or should), and it is relevant to data of all volumes, velocities and varieties. Big Data actually can be Big [READ MORE]

Outlier cases – bivariate and multivariate outliers

August 14, 2016

Following up on the post about univariate outliers, there are a few ways we can identify bivariate and multivariate outliers. First, do the univariate outlier checks and, with those findings in mind (and with no immediate remedial action), follow some or all of these bivariate or multivariate outlier identifications, depending on the type of analysis you are planning. BIVARIATE OUTLIERS: For one-way ANOVA, we can use the GLM (univariate) procedure to save standardised or studentized residuals. Then do a normal [READ MORE]
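As a rough illustration of the residual-saving step for one-way ANOVA, here is a minimal Python sketch rather than the GLM procedure itself. It assumes standardised residuals are the raw residuals divided by the root mean square error, and that studentized residuals additionally adjust for leverage (1/n for a group of size n in a one-way design). The function name, the data, and the cut-off of 3 are hypothetical choices for the example, not taken from the post.

```python
import numpy as np

def anova_residuals(values, groups):
    """Return (standardised, studentized) residuals for a one-way ANOVA."""
    values = np.asarray(values, dtype=float)
    labels, idx = np.unique(np.asarray(groups), return_inverse=True)
    counts = np.bincount(idx)                      # group sizes
    means = np.bincount(idx, weights=values) / counts
    residuals = values - means[idx]                # deviation from own group mean
    df_error = len(values) - len(labels)
    mse = np.sum(residuals ** 2) / df_error        # pooled error variance
    standardised = residuals / np.sqrt(mse)
    leverage = 1.0 / counts[idx]                   # one-way design: h = 1 / n_group
    studentized = residuals / np.sqrt(mse * (1.0 - leverage))
    return standardised, studentized

# Hypothetical data: three groups of six, with one extreme score in group "c".
scores = [10, 11, 9, 10, 12, 10,
          14, 15, 13, 14, 16, 15,
          20, 21, 19, 20, 22, 45]
group = ["a"] * 6 + ["b"] * 6 + ["c"] * 6

z, t = anova_residuals(scores, group)
cutoff = 3.0                                       # illustrative threshold only
flagged = np.flatnonzero(np.abs(z) > cutoff)
print("Cases flagged as possible outliers:", flagged.tolist())
```

The flagged cases are candidates for inspection only; what to do with them depends on the analysis being planned.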

Correlation and covariance matrices

August 30, 2015

Many statistical procedures, such as the ANOVA family, analyses with covariates, and multivariate tests, rely on covariance and/or correlation matrices. Checks of statistical assumptions, such as Levene’s test for homogeneity of variance, Box’s M test for homogeneity of variance-covariance matrices, and the assumption of sphericity, specifically address the properties of the variance-covariance matrix (also referred to as the covariance matrix, or dispersion matrix). The covariance matrix as shown below indicates the variance of the scores on the diagonal, and the covariance on the [READ MORE]
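A small Python sketch may help make the structure of these matrices concrete. It assumes NumPy and uses made-up scores for five participants on three measures. It checks that the diagonal of the covariance matrix holds the variances, and that the correlation matrix is the covariance matrix rescaled by the standard deviations.

```python
import numpy as np

# Hypothetical scores: 5 participants (rows) on 3 measures (columns).
scores = np.array([
    [2.0, 4.0, 1.0],
    [3.0, 5.0, 2.0],
    [5.0, 6.0, 2.0],
    [6.0, 9.0, 4.0],
    [9.0, 11.0, 6.0],
])

# Sample covariance matrix (n - 1 in the denominator); columns are variables.
cov = np.cov(scores, rowvar=False)

# Diagonal entries are the variances of each measure.
variances = scores.var(axis=0, ddof=1)

# Correlation matrix: each covariance divided by the product of the two SDs.
sd = np.sqrt(np.diag(cov))
corr_from_cov = cov / np.outer(sd, sd)
corr = np.corrcoef(scores, rowvar=False)

print(np.round(cov, 3))
print(np.round(corr, 3))
```

Both matrices are symmetric, so each covariance or correlation appears twice, once on either side of the diagonal.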

Data Assumption: Homogeneity of variance-covariance matrices (Multivariate Tests)

October 15, 2013

Very brief description: “Homogeneity of variance-covariance matrices” is the multivariate version of the univariate assumption of homogeneity of variance and the bivariate assumption of homoscedasticity. Refer to the post “Homogeneity of variance” for a discussion of equality of variances. In short, homogeneity of variance-covariance matrices concerns the variance-covariance matrices of the multiple dependent measures (such as in MANOVA) for each group. For example, if you have five dependent variables, it tests for five variances and ten covariances for [READ MORE]
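To make the counting concrete: a 5 x 5 covariance matrix has five variances on its diagonal and ten distinct covariances off it, and the assumption concerns whether those elements are equal across groups. The sketch below is one hedged way to compute the raw Box's M statistic in Python with NumPy, assuming the usual definition based on the log determinants of the pooled and per-group covariance matrices. The function names and simulated data are hypothetical, and the chi-square or F approximation needed for a p-value is deliberately left out.

```python
import numpy as np

def unique_element_counts(p):
    """Return (variances, distinct covariances) in a p x p covariance matrix."""
    return p, p * (p - 1) // 2

def box_m_statistic(samples):
    """Raw Box's M for a list of (n_i x p) arrays, one array per group."""
    dof = np.array([len(x) - 1 for x in samples])           # n_i - 1 per group
    covs = [np.cov(x, rowvar=False) for x in samples]       # per-group matrices
    pooled = sum(d * s for d, s in zip(dof, covs)) / dof.sum()
    log_det_pooled = np.linalg.slogdet(pooled)[1]
    log_dets = np.array([np.linalg.slogdet(s)[1] for s in covs])
    # M = (N - g) ln|S_pooled| - sum_i (n_i - 1) ln|S_i|
    return dof.sum() * log_det_pooled - np.sum(dof * log_dets)

# Simulated data: three groups measured on five dependent variables.
rng = np.random.default_rng(0)
groups = [rng.normal(size=(n, 5)) for n in (30, 40, 35)]

print(unique_element_counts(5))
m_stat = box_m_statistic(groups)
print("Box's M:", round(float(m_stat), 3))
```

M is zero when every group has exactly the same covariance matrix and grows as the matrices diverge.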