Outlier cases – bivariate and multivariate outliers

August 14, 2016

In follow-up to the post about univariate outliers, here are a few ways we can identify bivariate and multivariate outliers. First, do the univariate outlier checks and, with those findings in mind (but with no immediate remedial action), apply some or all of the following bivariate or multivariate outlier checks, depending on the type of analysis you are planning. _____________________________________________________ BIVARIATE OUTLIERS: For one-way ANOVA, we can use the GLM (univariate) procedure to save standardised or studentized residuals. Then do a normal [READ MORE]
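The residual-based check described in the teaser can also be sketched outside SPSS. A minimal Python/NumPy version is below; the function name and the rough |t| > 3 flagging threshold are illustrative choices of mine, not taken from the post:

```python
import numpy as np

def studentized_residuals(x, y):
    """Externally studentized residuals from a simple linear fit.
    Observations with |t| beyond roughly 3 are candidate bivariate outliers."""
    x = np.asarray(x, float)
    y = np.asarray(y, float)
    X = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    n, p = X.shape
    # Leverage values: diagonal of the hat matrix
    h = np.diag(X @ np.linalg.inv(X.T @ X) @ X.T)
    # Leave-one-out (external) error variance for each observation
    s2 = resid @ resid / (n - p)
    s2_i = (s2 * (n - p) - resid**2 / (1 - h)) / (n - p - 1)
    return resid / np.sqrt(s2_i * (1 - h))
```

A planted outlier in otherwise well-behaved data will dominate these residuals, which is exactly what the follow-up normality check on saved residuals is meant to reveal.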

So many regression procedures. Confused?

September 11, 2015

Regression is the work-horse of research analytics. It has been around for a long time and will probably be around for a long time to come. Whether we always realise it or not, most of our analytical tools are in some way or another based on the concepts of correlation and regression.   Let's look at a few regression-based procedures in the researcher's toolbox:   1. Simple and multiple linear regression: applicable if both the single dependent variable (outcome or response variable) and the one or more independent variables (predictors) are measured on an interval scale. If we [READ MORE]
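The first procedure in that toolbox, simple and multiple linear regression, can be sketched in a few lines of Python/NumPy; the helper name is mine, and this is the textbook least-squares fit rather than anything specific to the post:

```python
import numpy as np

def ols_fit(X, y):
    """Ordinary least squares: returns [intercept, slope(s)].
    The same call covers simple regression (one predictor column)
    and multiple regression (several predictor columns)."""
    X = np.asarray(X, float)
    y = np.asarray(y, float)
    design = np.column_stack([np.ones(len(X)), X])  # add intercept column
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return beta
```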

Interaction effects in regression

January 6, 2014

I was asked to review a report in which the regression analysis showed several independent variables having no significant relationship with the dependent variable. While I had no access to the raw data, it was obvious to me that there must be at least one significant interaction effect among the independent variables, so I decided to start off 2014 by writing about interaction effects in regression! This could be a very long discussion, but in line with my approach here at IntroSpective Mode, I keep things brief and concise and leave it up to the reader to go elsewhere [READ MORE]
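The standard way to model an interaction effect is to add the product of two predictors as an extra term. A minimal Python/NumPy sketch (function name and data are illustrative, not from the report discussed above):

```python
import numpy as np

def fit_with_interaction(x1, x2, y):
    """OLS with two main effects and an x1*x2 interaction term.
    Returns coefficients [intercept, b1, b2, b_interaction]."""
    x1 = np.asarray(x1, float)
    x2 = np.asarray(x2, float)
    X = np.column_stack([np.ones_like(x1), x1, x2, x1 * x2])
    beta, *_ = np.linalg.lstsq(X, np.asarray(y, float), rcond=None)
    return beta
```

Note how this matches the scenario in the post: a data-generating process driven purely by the interaction (b1 = b2 = 0) would show no significant main effects, yet the x1*x2 term captures the real relationship.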

When the regression work-horse looks sick

June 19, 2013

Regression, in particular simple bivariate and multiple regression (and, to a much lesser extent, multivariate regression, which is a “multivariate general linear model” procedure), is the work-horse of many researchers. For some, it is a horse exploited to the bone when other statistical (or even non-statistical) procedures would have done a better job! Also, many statistical procedures are based on linear regression models, often without us realising it: the ANOVA, for example, can be explained as a simple regression model. At the core of many statistical analytics is [READ MORE]
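The claim that ANOVA can be explained as a regression model is easy to demonstrate: dummy-code the groups and run OLS. A minimal Python/NumPy sketch (the function name is mine):

```python
import numpy as np

def anova_as_regression(groups):
    """One-way ANOVA expressed as OLS on dummy variables.
    Returns coefficients: the intercept is the mean of the first
    (reference) group; each slope is that group's mean minus it."""
    y = np.concatenate([np.asarray(g, float) for g in groups])
    labels = np.concatenate([np.full(len(g), i) for i, g in enumerate(groups)])
    # One dummy column per non-reference group
    X = np.column_stack(
        [np.ones(len(y))]
        + [(labels == k).astype(float) for k in range(1, len(groups))]
    )
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta
```

The fitted values of this regression are exactly the group means, which is why the two procedures give identical F tests.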

Data Assumptions: It’s about the residuals, and not the variables’ raw data

June 3, 2013

Normality, or the normal distribution, is a very familiar term, but what does it really mean and what does it refer to?   In linear models such as ANOVA and regression (or any regression-based statistical procedure), an important assumption is “normality”. The question is whether it refers to the outcome (dependent variable “Y”) or the predictor (independent variable “X”). We should remember that the true answer is “none of the above”.    In linear models where we look at the relationship between dependent and independent variables, our [READ MORE]
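The point that the normality assumption targets the residuals, not the raw Y or X, can be illustrated with a short Python/NumPy sketch (the function name is mine): the raw outcome can be wildly non-normal simply because X spreads it out, while the residuals isolate the error term whose distribution we actually examine.

```python
import numpy as np

def model_residuals(x, y):
    """Residuals from a simple linear fit: what normality checks
    in regression should actually be run on."""
    x = np.asarray(x, float)
    y = np.asarray(y, float)
    X = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return y - X @ beta
```

By construction, OLS residuals sum to zero and are uncorrelated with the predictor, so any non-normality they show reflects the error term, not the trend.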

Building statistical models: Linear (OLS) regression

October 17, 2012

Every day, researchers around the world collect sample data to build statistical models that are as representative of the real world as possible, so these models can be used to predict changes and outcomes in the real world. Some models are very complex, while others are as basic as calculating a mean score by summing several observations and then dividing by the number of observations. This mean score is a hypothetical value, so it is just a simple model to describe the data.   The extent to which a statistical model (e.g. the mean score) represents the randomly collected [READ MORE]
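The mean-as-a-model idea above can be made concrete in a few lines of plain Python (the function name is mine): the mean is the single constant that minimises the sum of squared errors, and that sum measures how well this simplest of models fits the data.

```python
def mean_model(y):
    """The mean as the simplest statistical model: the constant c
    minimising sum((y_i - c)**2). Returns (c, sse), where sse is
    that minimised sum of squared errors (the model's misfit)."""
    c = sum(y) / len(y)
    sse = sum((v - c) ** 2 for v in y)
    return c, sse
```

Any other constant gives a larger sum of squared errors, which is one way to see why fit statistics for more elaborate models are often benchmarked against the mean-only model.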