Revisiting the basics of data and measurement scales (Part 1)

December 4, 2019

Way back in 1946, Stanley S. Stevens (1906–1973) at Harvard University published his classic paper entitled “On the Theory of Scales of Measurement”, in which he describes four basic empirical operations applicable to measurements made on objects and identifies a particular scale associated with each. He calls them “determination of equality” (nominal scales), “determination of greater or less” (ordinal scales), “determination of equality of intervals or differences” (interval scales), and “determination of [READ MORE]
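A minimal sketch (not from the post) of Stevens' four scales as a lookup table, pairing each scale with its defining operation and the statistics Stevens regarded as appropriate for it:

```python
# Stevens (1946): each scale of measurement, its basic empirical
# operation, and examples of permissible statistics.
SCALES = {
    "nominal":  {"operation": "determination of equality",
                 "permissible": ["mode", "frequency counts"]},
    "ordinal":  {"operation": "determination of greater or less",
                 "permissible": ["median", "percentiles"]},
    "interval": {"operation": "determination of equality of intervals",
                 "permissible": ["mean", "standard deviation"]},
    "ratio":    {"operation": "determination of equality of ratios",
                 "permissible": ["geometric mean", "coefficient of variation"]},
}

def permissible_stats(scale):
    """Return example statistics appropriate for a given scale."""
    return SCALES[scale.lower()]["permissible"]
```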

Reporting statistics in client reports – A few thoughts

September 20, 2019

A few thoughts on reporting statistical findings (and no, this is not academic jargon – it is the smart way to report findings to your clients): 1. As marketing researchers we generally follow the APA (American Psychological Association) style of reporting, so when reporting e.g. significance (p < .05), never put a ‘0’ in front of the decimal point if the number cannot be greater than 1.00. So p < .05 is correct, and p < 0.05 is not the proper reporting style. Statistical output of significance testing, effect size (e.g. Cohen’s d), [READ MORE]
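A small helper (my sketch, not from the post) that applies this APA convention – dropping the leading zero because a p-value cannot exceed 1.00, and reporting very small values as p &lt; .001:

```python
def apa_p(p):
    """Format a p-value in APA style: three decimals, no leading zero,
    and 'p < .001' for very small values."""
    if p < 0.001:
        return "p < .001"
    # Drop the leading zero: "0.043" -> ".043"
    return f"p = {p:.3f}".replace("0.", ".", 1)
```

So `apa_p(0.043)` yields the APA-correct `p = .043` rather than `p = 0.043`.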

Which Test: Chi-Square, Logistic Regression, or Log-linear analysis

August 19, 2019

In a previous post I discussed the differences between logistic regression and discriminant function analysis, but what about log-linear analysis? Which to choose – chi-square, logistic regression, or log-linear analysis – and when? Let’s briefly review each of these statistical procedures: The chi-square test (χ²) is a descriptive statistic, just as correlation is descriptive of the association between two variables. Chi-square is not a modeling technique, so in the absence of a dependent (outcome) variable, there is no prediction of either a value (such as in ordinary [READ MORE]
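To make the "descriptive, not predictive" point concrete, here is a hand-rolled Pearson chi-square for a 2×2 table (my sketch, not code from the post) – note that it returns a single index of association and predicts nothing:

```python
def chi_square_2x2(table):
    """Pearson chi-square for a 2x2 contingency table (list of two rows).
    Purely descriptive: it indexes association between the two
    categorical variables; there is no outcome variable and no prediction."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n  # independence model
            chi2 += (observed - expected) ** 2 / expected
    return chi2
```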

When the regression work-horse looks sick

July 19, 2019

Regression – in particular simple bivariate and multiple regression (and, to a much lesser extent, multivariate regression, which is a “multivariate general linear model” procedure) – is the work-horse of many researchers. For some, it is a horse exploited to the bone when other statistical (or even non-statistical) procedures would have done a better job! Also, many statistical procedures are based on linear regression models, often without us realising it – for example, ANOVA can be expressed as a simple regression model. At the core of many statistical analytics is [READ MORE]
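The ANOVA-as-regression point can be demonstrated in a few lines (a sketch of mine, not from the post): regress the outcome on a 0/1 group dummy, and the intercept reproduces group A's mean while the slope reproduces the mean difference – exactly what a two-group ANOVA tests:

```python
from statistics import mean

def two_group_regression(y_a, y_b):
    """OLS of the outcome on a 0/1 group dummy. The intercept equals
    group A's mean and the slope equals the mean difference --
    the two-group ANOVA restated as a simple regression."""
    x = [0] * len(y_a) + [1] * len(y_b)
    y = list(y_a) + list(y_b)
    mx, my = mean(x), mean(y)
    slope = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
             / sum((xi - mx) ** 2 for xi in x))
    intercept = my - slope * mx
    return intercept, slope
```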

Data Assumption: Normality of error term distribution

May 20, 2019

We often come across requirements in procedures such as General Linear Models (GLM), used for ANOVAs, ANCOVAs, etc., which state “normality of error term distribution”, “normally distributed errors” or “normality of residuals”. These all mean the same thing: residuals (errors) must be random and normally distributed with a mean of zero, so the difference between our model and the observed data should be close to zero. Not only must residuals be normally distributed overall, they should be normally distributed at every value of the dependent [READ MORE]
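A minimal sketch (mine, not the post's) of extracting residuals from a simple OLS fit. The "mean of zero" part comes for free with least squares – the residuals always sum to zero – whereas normality must be checked separately (e.g. with a Q-Q plot or a Shapiro-Wilk test):

```python
from statistics import mean

def residuals(x, y):
    """Ordinary least-squares residuals for simple bivariate regression.
    OLS guarantees these sum to zero; their *normality* is the extra
    assumption that has to be checked separately."""
    mx, my = mean(x), mean(y)
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    intercept = my - slope * mx
    return [b - (intercept + slope * a) for a, b in zip(x, y)]
```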

Which test: Predict the value (or group membership) of one variable based on the value of another based on their relationship / association

April 16, 2019

When the research objective is to use one or more predictor variables to predict the values (or group membership) of one or more outcome variables, we have a choice among different statistical procedures, depending on the following variable characteristics: Number of variables: one (or more) dependent / outcome variable(s) and one (or more) independent / predictor variable(s). Examples: To what extent can we use the values of a predictor variable to predict the values of an outcome variable? (predict the values) Which predictor variables best predict whether a respondent will be [READ MORE]

Means, sum of squares, squared differences, variance, standard deviation and standard error

March 29, 2019

I remember how confusing these terms were to me when I started learning statistics. Let me offer a brief non-technical explanation of each: When we take a random sample of observations from a population of particular interest (e.g. all our customers), we would like to do some modelling (e.g. mean or regression) so that our sample can describe and/or predict the total population of interest. The most basic model we can use is to calculate the mean score of any given variable or construct, and then conclude that it represents the population of interest. However, before we can use the [READ MORE]
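The whole chain of terms can be computed in a few lines of stdlib Python (my illustration, not from the post): squared differences from the mean, summed into the sum of squares, averaged into the variance, rooted into the standard deviation, and scaled into the standard error:

```python
import math
from statistics import mean

def summary(sample):
    """Walk the chain: mean -> squared differences -> sum of squares ->
    variance -> standard deviation -> standard error of the mean."""
    m = mean(sample)
    sq_diffs = [(x - m) ** 2 for x in sample]   # squared differences
    ss = sum(sq_diffs)                          # sum of squares
    variance = ss / (len(sample) - 1)           # sample variance (n - 1)
    sd = math.sqrt(variance)                    # standard deviation
    se = sd / math.sqrt(len(sample))            # standard error of the mean
    return {"mean": m, "ss": ss, "variance": variance, "sd": sd, "se": se}
```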

Brand image studies

February 18, 2019

In brand image studies, like most research, it’s GIGO (Garbage In – Garbage Out). For example, very general adjectives such as Cheerful, Fun, and Unique will seldom differentiate brands meaningfully. Instead, the attributes should be relevant to consumers, specific to the category and reflect the actual positionings of the brands and, in most cases, include functional and other objective characteristics. How the image data are collected is also important. Pick-any association matrices are usually the least differentiating. Lastly, how the data are analyzed is also important. [READ MORE]

Is my Likert-scale data fit for parametric statistical procedures?

December 13, 2018

We’re all very familiar with the “Likert scale”, but do we realise that a true Likert scale consists not of a single item but of several items which, under the right conditions – i.e. subjected to an assessment of reliability (e.g. intercorrelations between all pairs of items) and validity (e.g. convergent, discriminant, construct, etc.) – can be summed into a single score? The Likert scale is a unidimensional scaling method (so it measures a one-dimensional construct), is bipolar, and in its purest form consists of only 5 scale points, though often we refer to a [READ MORE]
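A common first step in that reliability assessment is Cronbach's alpha; here is a stdlib sketch of mine (not code from the post) for item-score columns, computed before summing the items into a single Likert score:

```python
from statistics import pvariance

def cronbach_alpha(items):
    """Cronbach's alpha for a list of item-score columns (one list per
    item, respondents aligned by index): a standard internal-consistency
    check before summing Likert items into one score."""
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]   # per-respondent sums
    item_var = sum(pvariance(item) for item in items)  # sum of item variances
    return k / (k - 1) * (1 - item_var / pvariance(totals))
```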

Data Assumptions: Univariate Normality

November 7, 2018

BRIEF DESCRIPTION: As one of the most basic data assumptions, much has been written about univariate, bivariate and multivariate normality. An excellent reference is Tom Burdenski (2000), Evaluating Univariate, Bivariate, and Multivariate Normality Using Graphical and Statistical Procedures. A few noteworthy comments about normality: 1. Normality can have different meanings in different contexts, i.e. sampling distribution normality and model error distribution (e.g. in regression and GLM). Be very careful which type of normality is applicable. 2. By definition, a [READ MORE]
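As a quick numerical screen to accompany the graphical checks, skewness should sit near zero for roughly normal data. A minimal moment-based sketch (my illustration, not from the post):

```python
from statistics import mean, pstdev

def skewness(sample):
    """Moment-based sample skewness. Near 0 for symmetric data;
    a rough first screen for univariate normality, used alongside
    histograms, Q-Q plots, and formal tests rather than instead of them."""
    m, s = mean(sample), pstdev(sample)
    n = len(sample)
    return sum((x - m) ** 3 for x in sample) / (n * s ** 3)
```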

One-Sample Kolmogorov-Smirnov goodness-of-fit test

October 2, 2018

BRIEF DESCRIPTION: The Kolmogorov-Smirnov (K-S) test is a goodness-of-fit measure for continuous scaled data. It tests whether the observations could reasonably have come from a specified distribution, such as the normal distribution (or the Poisson, uniform, or exponential distribution, etc.), so it is most frequently used to test the assumption of univariate normality. The categorical-data counterpart is the Chi-Square (χ²) goodness-of-fit test. The K-S test is a non-parametric procedure. SIMILAR STATISTICAL PROCEDURES: Adjusted Kolmogorov-Smirnov Lilliefors test (null [READ MORE]
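The K-S statistic itself is simple: the largest gap between the empirical CDF of the sample and the theoretical CDF of the fully specified distribution. A stdlib sketch of mine (not from the post) for the normal case:

```python
import math

def normal_cdf(x, mu, sigma):
    """CDF of the normal distribution via the error function."""
    return 0.5 * (1 + math.erf((x - mu) / (sigma * math.sqrt(2))))

def ks_statistic(sample, mu, sigma):
    """One-sample K-S statistic D against a fully specified normal
    distribution: the maximum distance between the empirical CDF and
    the theoretical CDF, checked on both sides of each step."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = normal_cdf(x, mu, sigma)
        d = max(d, abs(i / n - f), abs((i - 1) / n - f))
    return d
```

Note that when the mean and SD are estimated from the same sample, the Lilliefors correction mentioned above applies rather than the standard K-S critical values.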

Which test: Compare a single group distribution to a hypothetical / known distribution (goodness-of-fit tests)

September 19, 2018

When the research objective is to compare a single group distribution to a hypothetical / known distribution (goodness-of-fit tests), we have a choice among different statistical procedures, depending on the following variable characteristics: Number of variables: one dependent variable. Examples: Does our sample data distribution fit the binomial / normal / Poisson curve? Is our interval-measured sample distribution significantly different from a normal distribution (goodness-of-fit for normality)? Are the 10% / 20% / 20% / 30% / 20% age proportions in our [READ MORE]
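For the categorical case – such as the age-proportion example above – the chi-square goodness-of-fit test compares observed counts to counts expected under the hypothesised proportions. A minimal sketch of mine (not from the post):

```python
def chi_square_gof(observed, expected_props):
    """Chi-square goodness-of-fit: do observed category counts match
    hypothesised proportions (e.g. known census age bands)?
    `expected_props` must sum to 1."""
    n = sum(observed)
    return sum((o - n * p) ** 2 / (n * p)
               for o, p in zip(observed, expected_props))
```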