The mix-up between correlation and causation came to my attention vividly many years ago in the context of what I would call political epidemiology. The distinction is also something statisticians learn in the classroom (or should), and it is relevant to data of all volumes, velocities and varieties.
Big Data can actually be Big Trouble as well. Imagine doing a cluster analysis with thousands of variables when most of the data are missing, imputed or inaccurate. Harvard statistics professor Xiao-Li Meng has given some fascinating lectures on this very topic – how Big Data can be problematic – a few of which have made it onto YouTube. -Kevin Gray