The Research Methods Knowledge Base website reminds us researchers (and readers of research findings) of the “Two Research Fallacies”. “A fallacy is an error in reasoning, usually based on mistaken assumptions”. The two most serious research fallacies discussed in this article are the “ecological fallacy” and the “exception fallacy”. “The ecological fallacy occurs when you make conclusions about individuals based only on analyses of group data”. For example, if the average income of a group of people is $60,000, we [READ MORE]

Discussing the causes, impact, identification and remediation of outliers is a lengthy subject. I will keep it short by focussing, in this post, on only a few ways to identify univariate outliers. Also refer to the post entitled: Outlier cases – bivariate and multivariate outliers. Be reminded that with bivariate and multivariate analysis the focus should not be on univariate outliers; it is advisable to check them, but don’t take immediate remedial action. First and foremost, do the obvious by looking at a few visuals such as histograms, stem-and-leaf plots, [READ MORE]

BRIEF DESCRIPTION The two-independent-sample t-test is for continuous scaled data and compares the observed mean on the dependent variable between two groups as defined by the independent variable. For example, is the mean customer satisfaction score (the dependent variable) significantly different between men and women (the independent variable)? The t-test is a parametric procedure. SIMILAR STATISTICAL PROCEDURES Non-parametric counterparts of the two-independent-sample t-test include the (Wilcoxon) Mann-Whitney U-test, Wald-Wolfowitz Runs [READ MORE]
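As a rough illustration of the comparison described above, the sketch below computes the t statistic for two independent groups using only the Python standard library. It assumes the classic pooled-variance (equal variances) form of the test; the function name `pooled_t_statistic` and the satisfaction scores are hypothetical and chosen purely for illustration.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t_statistic(group_a, group_b):
    """Return (t, df) for a two-independent-sample t-test.

    Assumes equal population variances (pooled-variance form).
    """
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    # Pool the two sample variances, weighting each by its degrees of freedom.
    pooled_var = ((n_a - 1) * variance(group_a)
                  + (n_b - 1) * variance(group_b)) / df
    # Standard error of the difference between the two group means.
    std_err = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(group_a) - mean(group_b)) / std_err, df


# Hypothetical customer satisfaction scores (illustrative data only).
men = [7, 8, 6, 9, 7, 8]
women = [8, 9, 9, 7, 10, 9]

t_stat, df = pooled_t_statistic(men, women)
print(f"t = {t_stat:.3f}, df = {df}")
```

The resulting t value would then be compared against the t distribution with `df` degrees of freedom to judge significance; that step is left out here to keep the sketch dependency-free.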

Every day, researchers around the world collect sample data to build statistical models that are as representative of the real world as possible, so that these models can be used to predict changes and outcomes in the real world. Some models are very complex, while others are as basic as calculating a mean score by summing several observations and then dividing the sum by the number of observations. This mean score is a hypothetical value, so it is just a simple model to describe the data. The extent to which a statistical model (e.g. the mean score) represents the randomly collected [READ MORE]

I was asked to review a report in which the regression analysis showed several independent variables to have no significant relationship with the dependent variable. While I have no access to the raw data, to me it was obvious that there must be at least one significant interaction effect among the independent variables, and hence I decided to start off 2014 by writing about interaction effects in regression! This can be a very long discussion, but in line with my approach here at IntroSpective Mode, we keep things brief and concise, and leave it up to the reader to go elsewhere [READ MORE]

We’re all very familiar with the “Likert-scale”, but do we know that a true Likert-scale consists not of a single item but of several items which, under the right conditions – i.e. subjected to an assessment of their reliability (e.g. intercorrelations between all pairs of items) and validity (e.g. convergent, discriminant, construct etc.) – can be summed into a single score? The Likert-scale is a unidimensional scaling method (so it measures a one-dimensional construct), is bipolar, and in its purest form consists of only 5 scale points, though often we refer to a [READ MORE]

Many of the statistical procedures used by marketing researchers are based on “general linear models” (GLM). These can be categorised into univariate, multivariate, and repeated measures models. The underlying statistical formula is Y = Xb + e, where Y is generally referred to as the “dependent variable”, X as the “independent variable”, b as the “parameters” to be estimated, and e as the “error” or noise which is present in all models (also generally referred to as the statistical error, error terms, or residuals). Note that both [READ MORE]
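To make Y = Xb + e concrete, here is a minimal sketch of the simplest case: one predictor plus an intercept, estimated by ordinary least squares. Least squares is an assumption on my part as the estimation method, and the function name `fit_simple_linear_model` and the data are hypothetical. Here b consists of the intercept and slope, and e is the list of residuals.

```python
from statistics import mean


def fit_simple_linear_model(x, y):
    """Estimate b = (intercept, slope) in y = intercept + slope * x + e
    by ordinary least squares, and return the residuals e as well."""
    x_bar, y_bar = mean(x), mean(y)
    # Sum of cross-products and sum of squares around the means.
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    # e = Y - Xb: what the model leaves unexplained.
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    return intercept, slope, residuals


# Hypothetical data (illustrative only).
x = [1, 2, 3, 4, 5]
y = [5.1, 7.9, 11.2, 13.8, 17.0]

intercept, slope, residuals = fit_simple_linear_model(x, y)
print(f"intercept = {intercept:.2f}, slope = {slope:.2f}")
```

With an intercept in the model, the least-squares residuals sum to zero (up to floating-point rounding), which is a quick sanity check on the fit.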

In brand image studies, like most research, it’s GIGO (Garbage In – Garbage Out). For example, very general adjectives such as Cheerful, Fun, and Unique will seldom differentiate brands meaningfully. Instead, the attributes should be relevant to consumers, specific to the category and reflect the actual positionings of the brands and, in most cases, include functional and other objective characteristics. How the image data are collected is also important. Pick-any association matrices are usually the least differentiating. Lastly, how the data are analyzed is also important. [READ MORE]

“…the difference between correlation and causation does not matter if you have enough data” is complete nonsense, but I still hear this assertion, even when the phrase “While correlation doesn’t mean causation…” is tossed in. The mix-up between correlation and causation came to my attention vividly many years ago in the context of what I would call political epidemiology. The distinction is also something statisticians learn in the classroom (or should), and is relevant with data of all volumes, velocities and varieties. Big Data actually can be Big [READ MORE]

Here is a great article by my friend Andy Field about sphericity. If you are looking for a great intro to SPSS, check out this book by Andy. When it came out in 2013 I worked through it, cover to cover, and I learned a lot; it also refreshed my memory of many things I had forgotten. He writes in an easy-to-understand way! [READ MORE]

Very brief description The assumption of sphericity refers to the equality of variances of the differences between treatment levels. In Repeated Measures ANOVA it is a measure of the homogeneity of the variances of the differences between levels, so it is quite similar to the homogeneity of variance assumption for between-groups designs in univariate ANOVA. Departure from sphericity is quantified by epsilon (ε), and sphericity is sometimes referred to as “circularity”. Who cares Sphericity applies to repeated measures ANOVA and MANOVA. While technically not an assumption of Factor Analysis, “Bartlett’s test of [READ MORE]
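The idea of "variances of the differences between treatment levels" can be sketched as follows: for every pair of levels, take each subject's difference score and compute the variance of those differences. Sphericity holds when these variances are roughly equal. The function name `difference_variances`, the level names, and the scores are hypothetical; this is a descriptive check only, not a formal test.

```python
from itertools import combinations
from statistics import variance

# Hypothetical repeated-measures data: five subjects measured at three
# treatment levels (same subject order in each list).
scores = {
    "time1": [8, 7, 9, 6, 8],
    "time2": [9, 9, 10, 8, 9],
    "time3": [11, 9, 12, 9, 12],
}


def difference_variances(levels):
    """Return the sample variance of the per-subject difference scores
    for every pair of treatment levels."""
    result = {}
    for (name_a, a), (name_b, b) in combinations(levels.items(), 2):
        diffs = [x - y for x, y in zip(a, b)]
        result[(name_a, name_b)] = variance(diffs)
    return result


diff_vars = difference_variances(scores)
for pair, var in diff_vars.items():
    print(pair, round(var, 3))
```

In this made-up data the three variances differ noticeably, which is the kind of pattern that would prompt a closer look at the sphericity assumption.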

Very brief description: Linearity means that mean values of the outcome variable (dependent variable) for each increment of the predictors (independent variables) lie along a straight line (so we are modeling a straight-line relationship). Who cares The assumption of linearity is required by all multivariate techniques based on correlation measures of association, e.g. Regression, Logistic Regression, Factor Analysis, Structural Equation Modeling, Discriminant Analysis, General Linear Models, etc. Why it is important Most, if not all, of the tests of association / relationships that we [READ MORE]