Statistics Principles

Variables – three key types

December 10, 2022

Now here’s an easy one: What is a variable? It is simply something that varies – either its value or its characteristic. In fact, it must vary. If it does not vary, we can’t call it a VARiable; instead we call it a “constant”, such as the regression constant (the Y-intercept). In the equation of a straight line (a linear relationship), Y = a + bX, where: Y = dependent variable; X = independent variable; a = constant (the Y-axis intercept, or the value of Y when X = 0); b = coefficient (the slope of the line, in other words the amount that Y increases [READ MORE]
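As a quick illustration of the straight-line equation above, here is a minimal Python sketch; the values chosen for a and b are invented purely for the example:

```python
# A minimal sketch of the straight-line model Y = a + bX.
# The values of a and b are made up purely for illustration.
import numpy as np

a = 2.0   # constant: the Y-axis intercept (value of Y when X = 0)
b = 0.5   # coefficient: the slope (change in Y per one-unit increase in X)

X = np.array([0, 1, 2, 3, 4])
Y = a + b * X
print(Y)  # [2.  2.5 3.  3.5 4. ]
```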

Tests of statistical significance can be dangerous and misleading

September 6, 2022

Years ago we used to programme our IBM PCs to run t-tests overnight to determine whether groups of respondents differed on a series of product attributes. We then highlighted all the attributes with significant differences at the p<.05, p<.01 and p<.001 levels and proudly reported to the client which attributes were differentiating and which were not. After all these years, this practice (in many different forms) is still continued by some researchers (though now calculated in a split second), in total disregard of the validity of a [READ MORE]
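To make the danger concrete, here is a hypothetical Python sketch of the practice described above: running t-tests across many attributes when no true differences exist. The attribute count and data are invented; with 20 tests at p<.05, roughly one “significant” result is expected by chance alone:

```python
# Hypothetical sketch: t-tests across many product attributes where both
# groups are drawn from the SAME population, so any "significant" result
# is a false positive -- the danger this post warns about.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
n_attributes = 20  # invented number of product attributes

for i in range(n_attributes):
    group_a = rng.normal(loc=5.0, scale=1.0, size=100)  # same population...
    group_b = rng.normal(loc=5.0, scale=1.0, size=100)  # ...no real difference
    t, p = stats.ttest_ind(group_a, group_b)
    if p < 0.05:
        print(f"Attribute {i + 1}: 'significant' at p = {p:.3f} (false positive)")
```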

Practical significance and effect size measures

April 20, 2022

If statistical significance is found (e.g. p<.001), the next logical step should be to calculate the practical significance, i.e. the effect size (e.g. the standardised mean difference between two groups). Effect sizes are a group of statistics that measure the magnitude of differences, treatment effects, and the strength of associations. Unlike statistical significance tests, effect size indices are not affected by large sample sizes. As effect size measures are standardised (units of measurement removed), they are easy to evaluate and easy to [READ MORE]
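As a minimal sketch of one common effect size index, the code below computes Cohen’s d (the standardised mean difference, pooled-SD version) for two invented samples:

```python
# A minimal sketch of Cohen's d: the standardised mean difference between
# two groups, using the pooled standard deviation. Sample data are invented.
import numpy as np

def cohens_d(x, y):
    nx, ny = len(x), len(y)
    pooled_sd = np.sqrt(((nx - 1) * np.var(x, ddof=1) +
                         (ny - 1) * np.var(y, ddof=1)) / (nx + ny - 2))
    return (np.mean(x) - np.mean(y)) / pooled_sd

group_a = np.array([5.1, 5.4, 4.9, 5.6, 5.2])
group_b = np.array([4.6, 4.8, 4.5, 5.0, 4.7])
print(round(cohens_d(group_a, group_b), 2))  # mean difference in SD units
```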

Getting the hang of z-scores

December 4, 2020

If we have a sample of data drawn randomly from a population with a normal distribution, we can assume that our sample distribution also has a normal distribution (provided the sample size is more than 30). If the data have a mean of zero and a standard deviation (SD) of 1, then we can calculate the probability of obtaining a particular score based on the frequencies we have. To centre our data around a mean of zero, we subtract the overall mean from each individual score and then divide the result by the standard deviation. This is the process of standardising raw data into z-scores. This [READ MORE]
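A minimal Python sketch of this standardisation, z = (x − mean) / SD, using a small set of invented raw scores:

```python
# Standardising raw scores into z-scores: z = (x - mean) / SD, so the
# transformed data are centred on 0 with a standard deviation of 1.
import numpy as np

scores = np.array([62.0, 70.0, 74.0, 80.0, 94.0])  # invented raw scores
z = (scores - scores.mean()) / scores.std(ddof=1)

print(z.round(2))                                   # [-1.17 -0.5 -0.17 0.33 1.5]
print(round(z.mean(), 10), round(z.std(ddof=1), 10))  # ~0.0 and 1.0
```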

Research questions and hypotheses?

May 20, 2020

When writing proposals or client reports, we often refer to “research questions” and “research hypotheses” (sometimes used interchangeably). What is the difference? Research questions do NOT entail specific predictions (about the magnitude or direction of the outcome variable) and are therefore phrased in a question format that could include questions about descriptives, differences or associations (relationships). These assist the researcher in choosing the most appropriate statistical techniques. Let’s look at each: 1. Research questions that relate to describing [READ MORE]

Statistical Power Analysis

April 27, 2020

(Statistical) power analysis refers to the ability of a statistical test to detect an effect of a certain size if the effect really exists. In other words, power is the probability of correctly rejecting the null hypothesis when it should be rejected. So while statistical significance deals with Type I (α) errors (false positives), power analysis deals with Type II (β) errors (false negatives), which means power is 1 − β. Cohen (1988) recommends that research studies be designed to achieve an alpha level of .05, and if we use Cohen’s rule of .2 for β, then 1 − β = 0.8 (an 80% chance [READ MORE]
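As a sketch of how these quantities interact, the snippet below uses statsmodels to solve for the sample size per group needed to detect a medium effect (d = 0.5, a conventional benchmark assumed for the example) at α = .05 with power = .80:

```python
# A sketch of the power trade-off described above, using statsmodels.
# With alpha = .05 and power = .80 (so beta = .20), solve for the sample
# size needed per group to detect a medium effect (d = 0.5).
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()
n_per_group = analysis.solve_power(effect_size=0.5, alpha=0.05, power=0.8)
print(round(n_per_group))  # roughly 64 per group
```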

Measuring effect size and statistical power analysis

March 3, 2020

Effect size measures are crucial for establishing practical significance, in addition to statistical significance. Please read the post “Tests of statistical significance can be dangerous and misleading” to better appreciate the importance of practical significance. Normally we only consider differences and associations from a statistical significance point of view and report at what level (e.g. p<.001) we reject the null hypothesis (H0) and accept that there is a difference or association (note that we can never “accept the alternative hypothesis (H1)” – see the [READ MORE]

Type I and II errors – Hypothesis testing

February 10, 2020

In so many of the statistical procedures we execute, the statistical significance of findings is the basis for statements, conclusions, and important decisions. While the importance of statistical significance (compared with practical significance) should not be overestimated, it is important to understand how statistical significance relates to hypothesis testing. A hypothesis statement is designed either to be disproven or to fail to be disproven. (Note that a hypothesis can be disproven, or fail to be disproven, but can never be proven true.) Hypotheses relate to either [READ MORE]
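A small simulation can make the Type I error concrete: when the null hypothesis is true (both groups drawn from the same population), a test at p<.05 should wrongly reject H0 in roughly 5% of studies. A minimal sketch with invented data:

```python
# Simulating the Type I (alpha) error rate: with no true difference,
# about 5% of tests at p < .05 will falsely reject the null hypothesis.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_studies = 10_000

false_positives = 0
for _ in range(n_studies):
    a = rng.normal(size=50)  # both groups from the same population,
    b = rng.normal(size=50)  # so H0 is true by construction
    if stats.ttest_ind(a, b).pvalue < 0.05:
        false_positives += 1

print(false_positives / n_studies)  # close to 0.05
```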

Revisiting the basics of data and measurement scales (Part 2)

January 28, 2020

The statistical procedures we choose depend on the type of data we collected and the different types of measurement scales we employed. We should take care to understand the constructs we measure and the types of scales we employ, as this determines which statistical procedures are appropriate for analysis. This post (following up on Part 1) is partly based on what S.S. Stevens told us in 1946 (see “Further Reading” below) about data and scales. Please note that this classification is not an exact science, as there is a lingering debate about the classification system. [READ MORE]

Revisiting the basics of data and measurement scales (Part 1)

December 4, 2019

Way-way back in 1946, Stanley S. Stevens (b. 1906 – d. 1973) at Harvard University published his classic paper entitled “On the Theory of Scales of Measurement”, in which he describes four statistical operations applicable to measurements made with regard to objects, and he identified a particular scale associated with each. He calls them “determination of equality” (nominal scales), “determination of greater or less” (ordinal scales), “determination of equality of intervals or differences” (interval scales), and “determination of [READ MORE]

Means, sum of squares, squared differences, variance, standard deviation and standard error

March 29, 2019

I remember how confusing these terms were to me when I started learning statistics. Let me offer a brief non-technical explanation of each: When we take a random sample of observations from a population of particular interest (e.g. all our customers), we would like to do some modelling (e.g. mean or regression) so that our sample can describe and/or predict the total population of interest. The most basic model we can use is to calculate the mean score of any given variable or construct, and then conclude that it represents the population of interest.   However, before we can use the [READ MORE]
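A minimal walk-through of these terms, computed step by step in Python on an invented sample:

```python
# A minimal walk-through of the terms in this post, computed by hand
# with numpy; the sample values are invented for illustration.
import numpy as np

sample = np.array([4.0, 6.0, 7.0, 9.0, 14.0])
n = len(sample)

mean = sample.mean()                  # the most basic "model": the mean
squared_diffs = (sample - mean) ** 2  # squared differences from the mean
sum_of_squares = squared_diffs.sum()  # sum of squares (SS)
variance = sum_of_squares / (n - 1)   # sample variance (SS / degrees of freedom)
sd = np.sqrt(variance)                # standard deviation
se = sd / np.sqrt(n)                  # standard error of the mean

print(mean, sum_of_squares, variance, sd, se)
```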

Variables and their many names

January 12, 2018

Many of the statistical procedures used by marketing researchers are based on “general linear models” (GLM). These can be categorised into univariate, multivariate, and repeated measures models. The underlying statistical formula is Y = Xb + e, where Y is generally referred to as the “dependent variable”, X as the “independent variable”, b represents the “parameters” to be estimated, and e is the “error” or noise present in all models (also generally referred to as the statistical error, error terms, or residuals). Note that both [READ MORE]
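As a minimal sketch of Y = Xb + e, the snippet below generates invented data with known parameters and estimates b by ordinary least squares:

```python
# A minimal sketch of the general linear model Y = Xb + e: estimating the
# parameters b by ordinary least squares with numpy. Data are invented.
import numpy as np

rng = np.random.default_rng(1)
n = 100

x = rng.normal(size=n)                # independent variable
X = np.column_stack([np.ones(n), x])  # design matrix (intercept + x)
true_b = np.array([2.0, 0.5])         # "true" parameters for the simulation
e = rng.normal(scale=0.3, size=n)     # error term / residual noise
Y = X @ true_b + e                    # dependent variable

b_hat, *_ = np.linalg.lstsq(X, Y, rcond=None)
print(b_hat.round(2))  # estimates close to [2.0, 0.5]
```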