Statistical Power Analysis

July 15, 2016

(Statistical) power refers to the ability of a statistical test to detect an effect of a certain size, if the effect really exists. In other words, power is the probability of correctly rejecting the null hypothesis when it should be rejected. So while statistical significance deals with Type I (α) errors (false positives), power analysis deals with Type II (β) errors (false negatives), which means power is 1 − β. Cohen (1988) recommends that research studies be designed to achieve alpha levels of at least .05, and if we use Cohen’s rule of .2 for β, then 1 − β = 0.8 (an 80% [READ MORE]
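
As a rough illustration of the relationship power = 1 − β, the sketch below approximates the power of a two-sided, two-sample comparison using the normal distribution. The function name, the effect size of 0.5 and the 64 cases per group are assumptions chosen for illustration, not values from the post, and the normal approximation is a simplification of the exact t-based calculation.

```python
from math import sqrt
from statistics import NormalDist


def approx_power_two_sample(effect_size, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-sample z-test.

    effect_size is the standardised mean difference (Cohen's d).
    This is a normal approximation, not an exact t-test calculation.
    """
    z = NormalDist()  # standard normal distribution
    z_crit = z.inv_cdf(1 - alpha / 2)  # critical value for the chosen alpha
    shift = effect_size * sqrt(n_per_group / 2)  # expected z when the effect is real
    # Probability of landing in either rejection region when the effect exists.
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)


# Hypothetical design: medium effect (d = 0.5), 64 cases per group.
power = approx_power_two_sample(effect_size=0.5, n_per_group=64)
beta = 1 - power  # probability of a Type II error
print(f"power = {power:.3f}, beta = {beta:.3f}")
```

Under these assumed inputs the approximate power comes out close to 0.8, so β is close to .2.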

Skepticism in Social Media

June 15, 2016

I was talking this morning with someone about which blogs that review products and/or services are the most popular in my part of the world – Asia. I consulted Google Search but could not come up with an answer. I did, however, come across a recent report (June 25, 2012) by Kristen Sala, Senior Manager, Electronic Media at Cision (a public relations software and media tools firm), that lists the Top 50 independent “Product Review Blogs” in North America. Mama-B Blog is first, followed by Computer Audiophile and 48 others. Still, I could not find much information [READ MORE]

Data Assumption: Multicollinearity

May 13, 2016

Very brief description: Multicollinearity is a condition in which the independent variables are highly correlated (r=0.8 or greater) such that the effects of the independent variables on the outcome variable cannot be separated. In other words, one of the predictor variables can be nearly perfectly predicted by one of the other predictor variables. Singularity is when the independent variables are (almost) perfectly correlated (r=1), so any one of the independent variables could be regarded as a combination of one or more of the other independent variables. In practice, you should not [READ MORE]
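
A simple first check for the r=0.8 rule of thumb mentioned above is to compute the pairwise correlations between predictors and flag the high ones. The sketch below does this in plain Python. The variable names and data are hypothetical, and pairwise correlation is only a screening step (it will not catch a predictor that is a combination of several others).

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def flag_collinear_pairs(predictors, threshold=0.8):
    """Return (name_a, name_b, r) for every pair with |r| >= threshold."""
    names = list(predictors)
    flagged = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            r = pearson_r(predictors[a], predictors[b])
            if abs(r) >= threshold:
                flagged.append((a, b, round(r, 3)))
    return flagged


# Hypothetical predictors: "price" and "cost" move together, "store_visits" does not.
predictors = {
    "price": [10, 12, 14, 16, 18, 20],
    "cost": [5.1, 5.9, 7.2, 8.1, 8.8, 10.2],
    "store_visits": [3, 9, 4, 8, 2, 7],
}
flagged = flag_collinear_pairs(predictors)
print(flagged)
```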

Is my Likert-scale data fit for parametric statistical procedures?

April 8, 2016

We’re all very familiar with the “Likert-scale”, but do we know that a true Likert-scale consists not of a single item but of several items which, under the right conditions – i.e. after an assessment of their reliability (e.g. intercorrelations between all pairs of items) and validity (e.g. convergent, discriminant, construct etc.) – can be summed into a single score? The Likert-scale is a unidimensional scaling method (so it measures a one-dimensional construct), is bipolar, and in its purest form consists of only 5 scale points, though often we refer to a [READ MORE]

Variables and their many names

March 12, 2016

Many of the statistical procedures used by marketing researchers are based on “general linear models” (GLM). These can be categorised into univariate, multivariate, and repeated measures models. The underlying statistical formula is Y = Xb + e, where Y is generally referred to as the “dependent variable”, X as the “independent variable”, b as the “parameters” to be estimated, and e as the “error” or noise which is present in all models (also generally referred to as the statistical error, error terms, or residuals). Note that both [READ MORE]
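
To make the roles of Y, X, b and e in Y = Xb + e concrete, here is a minimal sketch of the simplest case: one independent variable plus an intercept, estimated by ordinary least squares. The function name and the data are hypothetical. Least squares is one common way of estimating b, and it is used here as an assumption rather than as the method the post goes on to describe.

```python
def fit_simple_ols(x, y):
    """Estimate intercept and slope by ordinary least squares.

    Returns (intercept, slope, residuals), where the residuals play
    the role of e in Y = Xb + e.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # e = observed Y minus the value predicted by the fitted parameters
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    return intercept, slope, residuals


# Hypothetical data: X is the independent variable, Y the dependent variable.
x = [0, 1, 2, 3, 4]
y = [1.1, 2.9, 5.2, 7.1, 8.7]
intercept, slope, residuals = fit_simple_ols(x, y)
print(f"b = ({intercept:.2f}, {slope:.2f})")
print("e =", [round(r, 2) for r in residuals])
```

With an intercept in the model, the least-squares residuals sum to zero (up to rounding), which is a quick sanity check on the fit.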

Variables – three key types

February 10, 2016

Now here’s an easy one: What is a variable? It is simply something that varies – either its value or its characteristic. In fact, it must vary. If it does not vary then we can’t call it a VARiable, so we call it a “constant”, such as the regression constant (the y-intercept).
In the equation of a straight line (linear relationship) Y = a + bX, where:
Y = dependent variable
X = independent variable
a = constant (the Y-axis intercept, or the value of Y when X=0)
b = coefficient (slope of the line, in other words the amount that Y increases [or [READ MORE]
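
The straight-line equation Y = a + bX can be checked with a few lines of code. The values a = 2.0 and b = 0.5 below are made up for illustration. The sketch confirms that a is the value of Y when X=0 and that b is the change in Y for a one-unit change in X.

```python
def predict_y(a, b, x):
    """Value of Y on the straight line Y = a + bX."""
    return a + b * x


# Hypothetical constant (intercept) and coefficient (slope).
a, b = 2.0, 0.5

y_at_zero = predict_y(a, b, 0)  # equals the constant a
change_per_unit = predict_y(a, b, 4) - predict_y(a, b, 3)  # equals the slope b
print(y_at_zero, change_per_unit)
```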

One-Sample Chi-square (χ²) goodness-of-fit test

January 9, 2016

BRIEF DESCRIPTION: The Chi-square (χ²) goodness-of-fit test is a univariate measure for categorically scaled data, such as dichotomous, nominal, or ordinal data. It tests whether the variable’s observed frequencies differ significantly from a set of expected frequencies. For example, is our observed sample’s age distribution of 20%, 40%, 40% significantly different from what we expect (e.g. the population age distribution) of 30%, 30%, 40%? Chi-square (χ²) is a non-parametric procedure. SIMILAR STATISTICAL PROCEDURES: Binomial goodness-of-fit (for binary data) [READ MORE]
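
The age-distribution example can be worked through as a small sketch. The test operates on counts rather than percentages, so a sample size has to be assumed: n = 200 below is hypothetical and not from the post. The p-value uses the fact that, with 2 degrees of freedom, the upper-tail probability of the chi-square distribution is exp(−χ²/2). For other degrees of freedom a statistics library would be needed.

```python
from math import exp


def chi_square_gof(observed, expected):
    """Chi-square goodness-of-fit statistic: sum of (O - E)^2 / E."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Assumed sample size of n = 200 (hypothetical), three age groups.
observed = [40, 80, 80]  # 20%, 40%, 40% of 200
expected = [60, 60, 80]  # 30%, 30%, 40% of 200

chi2 = chi_square_gof(observed, expected)
df = len(observed) - 1  # categories minus one

# Closed-form upper-tail probability, valid only for df = 2.
p_value = exp(-chi2 / 2)
print(f"chi2 = {chi2:.2f}, df = {df}, p = {p_value:.4f}")
```

With this assumed sample size the difference would be significant at the .05 level. A smaller sample with the same percentages would give a smaller χ² and might not be.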

Two research fallacies

December 9, 2015

The Research Methods Knowledge Base website reminds us researchers (and readers of research findings) of the “Two Research Fallacies”. “A fallacy is an error in reasoning, usually based on mistaken assumptions”. The two most serious research fallacies discussed in this article are the “ecological fallacy” and the “exception fallacy”. “The ecological fallacy occurs when you make conclusions about individuals based only on analyses of group data”. For example, if the average income of a group of people is $60,000, we [READ MORE]

Marketing research and theory

November 12, 2015

I was inspired by the article (or open letter) written by Terry H. Grapentine and R. Kenneth Teas entitled “From Information to Theory: it’s time for a new definition of marketing research”, which appears on the AMA’s website (accessed October 2012). The authors debate the importance of theory in marketing research and argue for the rightful place of “theory” and the “creation of knowledge” in the American Marketing Association’s definition of marketing research. They propose a new definition of marketing [READ MORE]

Which Test: Logistic Regression or Discriminant Function Analysis

October 8, 2015

Discriminant Function Analysis (DFA) and Logistic Regression (LR) are so similar, yet so different. Which one should we use when, or can either be used at any time? Let’s see… DISCRIMINANT FUNCTION ANALYSIS (DFA): Is used to model the value (exclusive group membership) of either a dichotomous or a nominal dependent variable (outcome) based on its relationship with one or more continuous scaled independent variables (predictors). A predictive model consisting of one or more discriminant functions (based on the linear combinations of the predictor variables that provide the best discrimination between the groups) [READ MORE]

So many regression procedures. Confused?

September 11, 2015

Regression is the workhorse of research analytics. It has been around for a long time and it probably will be around for a long time to come. Whether we always realise it or not, most of our analytical tools are in some way or another based on the concept of correlation and regression. Let’s look at a few regression-based procedures in the researcher’s toolbox: 1. Simple and multiple linear regression: Applicable if both the single dependent variable (outcome or response variable) and one or many independent variables (predictors) are measured on an interval scale. If we [READ MORE]

Correlation and covariance matrices

August 30, 2015

Many statistical procedures such as the ANOVA family, covariates and multivariate tests rely on covariance and/or correlation matrices. Checks of statistical assumptions such as Levene’s test for homogeneity of variance, Box’s M test for homogeneity of variance-covariance matrices, and the assumption of sphericity specifically address the properties of the variance-covariance matrix (also referred to as the covariance matrix, or dispersion matrix). The covariance matrix as shown below indicates the variance of the scores on the diagonal, and the covariance on the [READ MORE]
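
As a minimal sketch of how a variance-covariance matrix is laid out, the code below builds one for two made-up variables in plain Python. The data and the function name are hypothetical, and the sample (n − 1) denominator is assumed. The variances sit on the diagonal, the covariances sit in the off-diagonal cells, and the matrix is symmetric.

```python
def covariance_matrix(columns):
    """Sample variance-covariance matrix for a list of equal-length columns."""
    n = len(columns[0])
    k = len(columns)
    means = [sum(col) / n for col in columns]
    return [
        [
            sum(
                (columns[i][r] - means[i]) * (columns[j][r] - means[j])
                for r in range(n)
            ) / (n - 1)  # sample (n - 1) denominator
            for j in range(k)
        ]
        for i in range(k)
    ]


# Hypothetical scores on two variables for four respondents.
scores_a = [2, 4, 6, 8]
scores_b = [1, 3, 2, 6]
cov = covariance_matrix([scores_a, scores_b])

for row in cov:
    print([round(value, 3) for value in row])
# cov[0][0] and cov[1][1] are the variances; cov[0][1] == cov[1][0] is the covariance.
```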