Outliers

Outlier cases – univariate outliers

June 26, 2018

The causes, impact, identification and remediation of outliers make for a lengthy subject. I will keep this post short by focussing only on a few ways to identify univariate outliers. Also refer to the post entitled: Outlier cases – bivariate and multivariate outliers. Remember that in bivariate and multivariate analysis the focus should not be on univariate outliers; it is still advisable to check them, but do not take immediate remedial action. First and foremost, do the obvious by looking at a few visuals such as histograms, stem-and-leaf plots, [READ MORE]

Getting the hang of z-scores

January 4, 2017

If we have a sample of data drawn randomly from a population with a normal distribution, we can assume that our sample is also approximately normally distributed (provided the sample size is more than 30). If the distribution has a mean of zero and a standard deviation (SD) of 1, then we can calculate the probability of getting a particular score based on the frequencies we have. To centre our data around a mean of zero, we subtract the overall mean from each individual score, then divide the result by the standard deviation. This is the process of standardising raw data into z-scores. This [READ MORE]
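As a rough illustration of the standardisation step described in this excerpt, here is a minimal Python sketch. The function name `z_scores` and the sample values are made up for the example, and using the sample SD (n - 1 denominator) is an assumption; the post itself may use a different convention or a statistics package.

```python
from statistics import mean, stdev


def z_scores(scores):
    """Standardise raw scores: subtract the mean, then divide by the SD."""
    m = mean(scores)
    sd = stdev(scores)  # sample SD (n - 1 denominator); an assumption
    if sd == 0:
        raise ValueError("all scores are identical; z-scores are undefined")
    return [(x - m) / sd for x in scores]


# Hypothetical raw scores, for illustration only
raw = [2, 4, 4, 4, 5, 5, 7, 9]
z = z_scores(raw)
```

After standardisation the z-scores have a mean of zero and an SD of 1, so a score above the mean gets a positive z-score and a score below the mean gets a negative one.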

Outlier cases – bivariate and multivariate outliers

August 14, 2016

As a follow-up to the post about univariate outliers, there are a few ways we can identify bivariate and multivariate outliers. First, do the univariate outlier checks and, with those findings in mind (and with no immediate remedial action), follow some or all of these bivariate or multivariate outlier identification steps, depending on the type of analysis you are planning. BIVARIATE OUTLIERS: For one-way ANOVA, we can use the GLM (univariate) procedure to save standardised or studentized residuals. Then do a normal [READ MORE]