Discriminant Function Analysis (DFA) and Logistic Regression (LR) are so similar, yet so different. Which one when, or either at any time? Let's see… DISCRIMINANT FUNCTION ANALYSIS (DFA): Is used to model the value (exclusive group membership) of either a dichotomous or a nominal dependent variable (outcome) based on its relationship with one or more continuously scaled independent variables (predictors). The predictive model consists of one or more discriminant functions (based on the linear combinations of the predictor variables that provide the best discrimination between the groups) [READ MORE]
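As a minimal sketch of the idea (not the full article's example), Fisher's two-group linear discriminant can be computed directly with NumPy; the synthetic data, seeds, and variable names below are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
# Two groups measured on two continuous predictors, with different means.
group_a = rng.normal([0.0, 0.0], 1.0, size=(50, 2))
group_b = rng.normal([2.0, 2.0], 1.0, size=(50, 2))

# Fisher's linear discriminant direction: w = S_pooled^-1 (mean_b - mean_a)
mean_a, mean_b = group_a.mean(axis=0), group_b.mean(axis=0)
s_pooled = (np.cov(group_a.T) * 49 + np.cov(group_b.T) * 49) / 98
w = np.linalg.solve(s_pooled, mean_b - mean_a)

# Classify by projecting each case onto w and splitting at the midpoint.
threshold = w @ (mean_a + mean_b) / 2
scores = np.vstack([group_a, group_b]) @ w
pred = (scores > threshold).astype(int)
truth = np.array([0] * 50 + [1] * 50)
accuracy = (pred == truth).mean()
print(f"discriminant weights: {w}, accuracy: {accuracy:.2f}")
```

The discriminant function here is exactly the "linear combination of the predictors that best discriminates between the groups" described above.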

Regression is the work-horse of research analytics. It has been around for a long time and it probably will be around for a long time to come. Whether we always realise it or not, most of our analytical tools are in some way or another based on the concepts of correlation and regression. Let's look at a few regression-based procedures in the researchers' toolbox: 1. Simple and multiple linear regression: Applicable if both the single dependent variable (outcome or response variable) and the one or many independent variables (predictors) are measured on an interval scale. If we [READ MORE]
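A quick illustrative sketch of case 1 above, simple linear regression, using NumPy on synthetic interval-scaled data (the true slope and intercept below are assumptions chosen for the demonstration):

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.uniform(0, 10, 100)                  # interval-scaled predictor
y = 2.5 * x + 1.0 + rng.normal(0, 1, 100)    # true slope 2.5, intercept 1.0

# Least-squares fit of y on x; r^2 is the share of variance explained.
slope, intercept = np.polyfit(x, y, deg=1)
r = np.corrcoef(x, y)[0, 1]
print(f"slope={slope:.2f}, intercept={intercept:.2f}, r^2={r**2:.3f}")
```

With low noise the fitted slope and intercept recover the true values closely, which is the correlation-and-regression machinery most other tools build on.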

Many statistical procedures, such as the ANOVA family, covariates, and multivariate tests, rely on covariance and/or correlation matrices. Tests of statistical assumptions, such as Levene's test for homogeneity of variance, Box's M test for homogeneity of variance-covariance matrices, and the assumption of sphericity, specifically address the properties of the variance-covariance matrix (also referred to as the covariance matrix, or dispersion matrix). The covariance matrix as shown below indicates the variances of the scores on the diagonal, and the covariances on the [READ MORE]
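The diagonal-variance property is easy to verify directly; a small sketch with NumPy on made-up scores (dimensions and seed are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(2)
scores = rng.normal(size=(200, 3))   # 200 respondents, 3 variables

# Variance-covariance (dispersion) matrix of the three variables.
cov = np.cov(scores, rowvar=False)

# The diagonal holds each variable's variance; off-diagonals the covariances.
assert np.allclose(np.diag(cov), scores.var(axis=0, ddof=1))

# The correlation matrix is its standardised counterpart (diagonal of 1s).
corr = np.corrcoef(scores, rowvar=False)
print(cov.round(3))
print(corr.round(3))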

The statistical procedures we choose depend on the type of data we collected with the different types of measurement scales we employed. We should be careful to understand the constructs we measure and the types of scales we employ, as this will determine which statistical procedures are appropriate for analysis. This post (a follow-up to part 1) is partly based on what S.S. Stevens told us in 1946 (see "Further Reading" below) about data and scales. Please note that this classification is not settled science, as there is a lingering debate about the classification [READ MORE]

Way, way back in 1946, Stanley S. Stevens (b. 1906 – d. 1973) at Harvard University published his classic paper entitled "On the Theory of Scales of Measurement", in which he described four statistical operations applicable to measurements made with regard to objects, and identified a particular scale associated with each. He called them "determination of equality" (nominal scales), "determination of greater or less" (ordinal scales), "determination of equality of intervals or differences" (interval scales), and "determination [READ MORE]

If statistical significance is found (e.g. p<.001), the next logical step should be to calculate the practical significance, i.e. the effect size (e.g. the standardised mean difference between two groups). Effect sizes are a group of statistics that measure the magnitude of differences, treatment effects, and the strength of associations. Unlike statistical significance tests, effect size indices are not inflated by large sample sizes. As effect size measures are standardised (units of measurement removed), they are easy to evaluate and easy to [READ MORE]
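As a sketch of the standardised mean difference mentioned above, Cohen's d with a pooled standard deviation can be computed in a few lines of NumPy (the group sizes and true means are assumptions for the demonstration):

```python
import numpy as np

def cohens_d(a, b):
    """Standardised mean difference using the pooled standard deviation."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * a.var(ddof=1) + (nb - 1) * b.var(ddof=1)) / (na + nb - 2)
    return (a.mean() - b.mean()) / np.sqrt(pooled_var)

rng = np.random.default_rng(3)
treatment = rng.normal(0.5, 1.0, 1000)   # true standardised difference: 0.5
control = rng.normal(0.0, 1.0, 1000)
print(f"Cohen's d = {cohens_d(treatment, control):.2f}")
```

Because the units are divided out, the same d is comparable across studies and is not driven upward by a large n, unlike a p-value.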

BRIEF DESCRIPTION The Analysis of Covariance (ANCOVA) follows the same procedures as the ANOVA except for the addition of an exogenous variable (referred to as a covariate) as an independent variable. The ANCOVA procedure is quite straightforward: it uses regression to determine whether the covariate can predict the dependent variable, and then does a test of differences (ANOVA) of the residuals among the groups. If a significant difference among the groups remains, it signifies a significant relationship between the dependent variable and the predictors after the effect of the [READ MORE]
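The two steps described above can be sketched with SciPy; this is the textbook regress-then-ANOVA-on-residuals illustration of the idea (a full ANCOVA fits covariate and groups jointly), and the data, group effects, and seed are assumptions:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
covariate = rng.normal(size=90)
group = np.repeat([0, 1, 2], 30)
# Outcome depends on the covariate plus a genuine group effect.
outcome = 2.0 * covariate + np.array([0.0, 1.0, 2.0])[group] + rng.normal(size=90)

# Step 1: regress the dependent variable on the covariate.
slope, intercept, *_ = stats.linregress(covariate, outcome)
residuals = outcome - (intercept + slope * covariate)

# Step 2: one-way ANOVA on the residuals across the three groups.
f_stat, p_value = stats.f_oneway(*(residuals[group == g] for g in (0, 1, 2)))
print(f"F={f_stat:.2f}, p={p_value:.4g}")
```

Removing the covariate's share of the variance first makes the group test more sensitive to the remaining group differences.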

Let's have a quick review of how we measure importance, e.g. of attributes in purchase decisions or in customer satisfaction. Traditionally we looked at stated importance, but generally we give preference to derived importance. So we've been taught. Stated importance can be divided into constrained methods (e.g. a 5-point rating scale, constant sum methods, Q-sort, and rank order) and unconstrained methods, which are unbounded rating scales and open-ended questions. On the other (better) hand, derived importance can be established via correlation-based methods such as [READ MORE]
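A minimal sketch of a correlation-based derived-importance calculation: each attribute's importance is taken as its correlation with the overall outcome. The attributes ("quality", "price"), their true weights, and the satisfaction model below are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(5)
n = 300
# Hypothetical attribute ratings; "quality" drives satisfaction more than "price".
quality = rng.normal(size=n)
price = rng.normal(size=n)
satisfaction = 0.8 * quality + 0.2 * price + rng.normal(0, 0.5, n)

# Derived importance: correlation of each attribute with overall satisfaction.
derived = {
    "quality": np.corrcoef(quality, satisfaction)[0, 1],
    "price": np.corrcoef(price, satisfaction)[0, 1],
}
print({k: round(v, 2) for k, v in derived.items()})
```

Respondents never rate importance directly; it is inferred from how strongly each attribute tracks the outcome, which is the appeal of the derived approach.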

In so many of the statistical procedures we execute, the statistical significance of findings is the basis of statements, conclusions, and important decisions. While the importance of statistical significance (compared with practical significance) should never be overestimated, it is important to understand how statistical significance relates to hypothesis testing. A hypothesis statement is designed either to be disproven or to fail to be disproven. (Note that a hypothesis can be disproven, or fail to be disproven, but can never be proven true.) Hypotheses relate to either [READ MORE]

BRIEF DESCRIPTION The Kolmogorov-Smirnov (K-S) test is a goodness-of-fit measure for continuously scaled data. It tests whether the observations could reasonably have come from the specified distribution, such as the normal distribution (or the Poisson, uniform, or exponential distribution, etc.), so it is most frequently used to test the assumption of univariate normality. The categorical-data counterpart is the Chi-Square (χ²) goodness-of-fit test. The K-S test is a non-parametric procedure. SIMILAR STATISTICAL PROCEDURES: Adjusted Kolmogorov-Smirnov Lilliefors test (null [READ MORE]
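A small sketch of the one-sample K-S test using `scipy.stats.kstest`, comparing a normal and a clearly non-normal sample against the standard normal distribution (samples and seed are illustrative; note that when the distribution's parameters are estimated from the sample itself, the Lilliefors adjustment mentioned above applies):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(6)
normal_sample = rng.normal(loc=0, scale=1, size=200)
skewed_sample = rng.exponential(scale=1.0, size=200)

# One-sample K-S test of each sample against the standard normal CDF.
stat_n, p_n = stats.kstest(normal_sample, "norm")
stat_s, p_s = stats.kstest(skewed_sample, "norm")
print(f"normal sample: D={stat_n:.3f}; skewed sample: D={stat_s:.3f}, p={p_s:.3g}")
```

The exponential sample produces a large K-S statistic and a tiny p-value, rejecting normality, while the genuinely normal sample does not.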

Very brief description Normality is one of the most basic data assumptions, and much has been written about univariate, bivariate and multivariate normality. An excellent reference is Tom Burdenski's (2000) Evaluating Univariate, Bivariate, and Multivariate Normality Using Graphical and Statistical Procedures. A few noteworthy comments about normality: 1. Normality can have different meanings in different contexts, i.e. sampling distribution normality and model error distribution normality (e.g. in Regression and GLM). Be very careful about which type of normality is applicable. 2. By definition, a [READ MORE]
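One common way to screen univariate normality in practice, sketched with SciPy's Shapiro-Wilk test alongside the sample's skewness and kurtosis (this test is a stand-in for the graphical and statistical procedures the reference surveys; the sample itself is made up):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
sample = rng.normal(loc=50, scale=10, size=150)

# Shapiro-Wilk test of univariate normality, plus the moment-based checks.
w_stat, p_value = stats.shapiro(sample)
print(f"Shapiro-Wilk W={w_stat:.3f}, p={p_value:.3f}")
print(f"skewness={stats.skew(sample):.2f}, excess kurtosis={stats.kurtosis(sample):.2f}")
```

For a truly normal sample, W is close to 1 and both moment statistics hover near zero; marked departures in either direction are the warning sign.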

A few thoughts on reporting statistical findings (and no, this is not academic jargon – it is the smart way to report the findings to your clients): 1. As marketing researchers we generally follow the APA (American Psychological Association) style of reporting, so when reporting e.g. significance (p < .05), never put a '0' in front of the decimal if the number cannot be greater than 1.00. So p < .05 is correct, and p < 0.05 is not the proper reporting style. Statistical output of significance testing, effect size (e.g. Cohen's d), [READ MORE]