Correlation and causation

The claim that “…the difference between correlation and causation does not matter if you have enough data” is complete nonsense, yet I still hear it, sometimes from people who preface it with “While correlation doesn’t mean causation…”
The mix-up between correlation and causation came to my attention vividly many years ago in the context of what I would call political epidemiology. The distinction is also something statisticians learn (or should learn) in the classroom, and it is relevant to data of all volumes, velocities and varieties.
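To make the point about volume concrete, here is a minimal simulation sketch. It is purely illustrative: the hidden variable `z`, the observed variables `x` and `y`, and the `simulate` helper are assumptions of this example and do not come from any real data set. Here `x` never influences `y`; both merely depend on `z`. Collecting more rows only makes the estimate of the spurious correlation more precise. It does not make the correlation go away, whereas setting `x` independently of `z` (a crude stand-in for an experiment) does.

```python
import random

def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    vx = sum((a - mx) ** 2 for a in xs)
    vy = sum((b - my) ** 2 for b in ys)
    return cov / (vx * vy) ** 0.5

def simulate(n, intervene=False, seed=0):
    """Correlation between x and y when a hidden z drives both.

    x has no causal effect on y in either mode. With intervene=True,
    x is assigned at random, independently of z.
    """
    rng = random.Random(seed)
    z = [rng.gauss(0, 1) for _ in range(n)]        # hidden confounder
    if intervene:
        x = [rng.gauss(0, 1) for _ in range(n)]    # x set independently of z
    else:
        x = [zi + rng.gauss(0, 1) for zi in z]     # x driven by z
    y = [zi + rng.gauss(0, 1) for zi in z]         # y driven by z, never by x
    return pearson(x, y)

for n in (100, 10_000, 100_000):
    observed = simulate(n)
    randomized = simulate(n, intervene=True)
    print(f"n={n:>7}  observed r={observed:.3f}  randomized r={randomized:.3f}")
```

Under these assumptions the observed correlation settles near its theoretical value of 0.5 as `n` grows, while the randomized one settles near zero. More data sharpens the estimate of the association, but says nothing about whether changing `x` would change `y`.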
Big Data can also be Big Trouble. Imagine doing a cluster analysis with thousands of variables when most of the data are missing, imputed or inaccurate. Harvard statistics professor Xiao-Li Meng has given some fascinating lectures on this very topic (how Big Data can be problematic), a few of which have made it onto YouTube. -Kevin Gray