# Which test: Predict the value (or group membership) of one variable from the value of another, based on their relationship / association

When the research objective is to use one or more predictor variables to **predict the values (or group membership)** of one or more outcome variables, we have a choice among different statistical procedures, depending on the following variable characteristics:

**Number of variables**:

One (or more) dependent / outcome variable(s) and one (or more) independent / predictor variable(s)

**Examples:**

- To what extent can we use the values of a predictor variable to predict the values of an outcome variable? (predict the values)
- Which predictor variables best predict whether a respondent will be a buyer or a non-buyer? (predict group membership)
- Based on the values of the predictor variables, can we estimate the probability that a respondent will be a buyer or a non-buyer? (predict group membership)

**When the dependent variable is BINOMIAL / BINARY / DICHOTOMOUS**

- Binary Discriminant Analysis (predicts group membership)
- Simple Binary Logistic Regression (predicts group membership)
- Multiple Binary Logistic Regression (predicts group membership)
- Note that logistic regression is generally preferred over discriminant analysis for a binary outcome variable because it makes fewer assumptions. However, if the data assumptions of discriminant analysis (such as multivariate normality of the predictors and equal covariance matrices across groups) are met, discriminant analysis is preferred.
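To make the "estimate the probability of group membership" idea concrete, here is a minimal sketch of simple binary logistic regression fitted by gradient ascent on the log-likelihood. It uses only the Python standard library; the function names (`fit_simple_logistic`, `predict_proba`), the learning rate, the number of iterations, and the data are all hypothetical choices made for illustration. In practice a statistical package would be used, since it also reports standard errors and fit statistics.

```python
import math

def fit_simple_logistic(x, y, lr=0.1, epochs=5000):
    """Fit P(y=1 | x) = 1 / (1 + exp(-(b0 + b1*x))) by gradient ascent.

    lr and epochs are illustrative defaults, not tuned values.
    """
    b0, b1 = 0.0, 0.0
    n = len(x)
    for _ in range(epochs):
        g0 = g1 = 0.0
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            g0 += yi - p            # gradient w.r.t. the intercept
            g1 += (yi - p) * xi     # gradient w.r.t. the slope
        b0 += lr * g0 / n
        b1 += lr * g1 / n
    return b0, b1

def predict_proba(b0, b1, x):
    """Estimated probability of belonging to the group coded 1."""
    return 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))

# Hypothetical data: hours of product exposure and buyer (1) / non-buyer (0).
hours = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0]
buyer = [0, 0, 0, 0, 1, 0, 1, 1, 1, 1]

b0, b1 = fit_simple_logistic(hours, buyer)
p_low = predict_proba(b0, b1, 1.0)
p_high = predict_proba(b0, b1, 4.5)
# Group membership follows from a cut-off on the probability (0.5 assumed here).
predicted_group = "buyer" if p_high >= 0.5 else "non-buyer"
print(round(p_low, 3), round(p_high, 3), predicted_group)
```

The 0.5 cut-off is itself an assumption; a different threshold may be more appropriate when the two kinds of misclassification carry different costs.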

**When the dependent variable is NOMINAL**

- Multiple Discriminant Analysis (predicts group membership)
- Multinomial Logistic Regression (predicts group membership)
- Discriminant Analysis is preferred over Logistic Regression for a nominally scaled outcome variable with more than 2 levels.
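As a rough illustration of how discriminant analysis assigns a case to one of several groups, the sketch below implements linear discriminant analysis for a single predictor, assuming normally distributed values with a common (pooled) within-group variance. The function names (`fit_lda_1d`, `classify`), the segment labels, and the data are hypothetical; a real analysis would normally use several predictors and a statistical package.

```python
import math
from statistics import mean

def fit_lda_1d(x, groups):
    """Estimate group means, prior proportions, and the pooled within-group variance."""
    labels = sorted(set(groups))
    by_group = {g: [xi for xi, gi in zip(x, groups) if gi == g] for g in labels}
    means = {g: mean(v) for g, v in by_group.items()}
    priors = {g: len(v) / len(x) for g, v in by_group.items()}
    ss_within = sum((xi - means[g]) ** 2 for g, v in by_group.items() for xi in v)
    pooled_var = ss_within / (len(x) - len(labels))
    return means, priors, pooled_var

def classify(x_new, means, priors, pooled_var):
    """Assign x_new to the group with the largest linear discriminant score."""
    def score(g):
        m = means[g]
        return x_new * m / pooled_var - m * m / (2 * pooled_var) + math.log(priors[g])
    return max(means, key=score)

# Hypothetical data: monthly spend and a three-level nominal customer segment.
spend = [10, 12, 14, 30, 32, 34, 50, 52, 54]
segment = ["budget"] * 3 + ["mid"] * 3 + ["premium"] * 3

model = fit_lda_1d(spend, segment)
print(classify(13, *model), classify(31, *model), classify(45, *model))
```

With equal group sizes the prior term is the same for every group, so each new case simply goes to the group whose mean is closest.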

**When the dependent variable is ORDINAL / RANK-DATA**

- Ordinal Logistic Regression (predicts group membership)
- Categorical Regression (CATREG) (predicts values along a categorical outcome variable)

**When the dependent variable is INTERVAL and passed the assumption of normality (parametric data)**

- Simple Linear Regression (single predictor of values)
- Multiple Linear Regression (more than one predictor of values)
- Multivariate Multiple Linear Regression (more than one outcome and more than one predictor of values)
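The simplest case in this list, simple linear regression, can be sketched with ordinary least squares in a few lines. The example below is an illustration only: the function names (`fit_simple_linear`, `r_squared`) and the data are hypothetical, and it leaves out the significance tests and residual checks that a full analysis would include.

```python
from statistics import mean

def fit_simple_linear(x, y):
    """Ordinary least squares estimates for y = intercept + slope * x."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope

def r_squared(x, y, intercept, slope):
    """Proportion of variance in y explained by the fitted line."""
    y_bar = mean(y)
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    return 1 - ss_res / ss_tot

# Hypothetical data: advertising spend (predictor) and sales (outcome).
ad_spend = [1, 2, 3, 4, 5]
sales = [2.9, 5.2, 6.8, 9.1, 11.0]

intercept, slope = fit_simple_linear(ad_spend, sales)
r2 = r_squared(ad_spend, sales, intercept, slope)
predicted_sales = intercept + slope * 6  # predicted value for a new case
print(intercept, slope, r2, predicted_sales)
```

The R² value speaks to the "to what extent" question in the examples: the closer it is to 1, the better the predictor accounts for the outcome values.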

**When the dependent variable is INTERVAL but failed the assumption of normality (non-parametric data)**

- Non-parametric Regression, e.g. kernel estimation, local polynomial regression and smoothing splines (predicts values)
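Kernel estimation, the first technique named above, can be sketched as a Nadaraya-Watson estimator: the predicted value at a new point is a weighted average of the observed outcomes, with weights that shrink as the training points get farther away. The Gaussian kernel, the bandwidth values, the function name `kernel_regression`, and the data are assumptions made for this illustration.

```python
import math

def kernel_regression(x_train, y_train, x_new, bandwidth=1.0):
    """Nadaraya-Watson estimate: Gaussian-weighted average of y_train around x_new."""
    weights = [math.exp(-0.5 * ((x_new - xi) / bandwidth) ** 2) for xi in x_train]
    return sum(w * yi for w, yi in zip(weights, y_train)) / sum(weights)

# Hypothetical data with a clearly non-linear relationship.
x = [0, 1, 2, 3, 4, 5, 6, 7, 8]
y = [0.1, 0.9, 3.8, 9.2, 15.9, 9.1, 4.2, 1.1, 0.0]

smooth = kernel_regression(x, y, 2.5, bandwidth=1.0)  # borrows from many neighbours
local = kernel_regression(x, y, 3.0, bandwidth=0.1)   # nearly reproduces y at x = 3
print(smooth, local)
```

The bandwidth controls the trade-off: a small value follows the data closely, while a large value yields a smoother curve. It is usually chosen by cross-validation rather than fixed in advance.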
