Regression is a statistical tool used mainly to study the association between:
- one or more predictor/explanatory/independent variables (x; plotted on x-axis) and
- an outcome/response/dependent variable (y; plotted on y-axis).
Uses:
- Prediction: Use of 1 or more independent variables to estimate the value of a dependent variable
- Controlling for confounders: To study the association between the independent variable(s) and the outcome of interest while adjusting for the effect of one or more other variables (confounders)
Types of regression based on outcome variable (y):
Remember the “NOIR” mnemonic for the types of variable: Nominal, Ordinal, Interval, Ratio.
- Continuous: Linear regression
  - Reported as: regression coefficient (β)
- Binary (nominal, two categories): Logistic regression
  - Reported as: odds ratio (OR)
- Time to event: Cox regression (survival analysis)
  - Reported as: hazard ratio (HR)
- Counts and rates: Poisson regression
  - Reported as: incidence rate ratio (IRR)
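The reported measures above relate to the fitted coefficients in a simple way: a linear regression coefficient is reported as it is, whereas logistic, Cox and Poisson models estimate coefficients on a log scale, so the OR, HR and IRR are obtained by exponentiating them. A minimal sketch of that conversion follows; the names `EFFECT_MEASURES` and `report_estimate` and the example coefficients are hypothetical, chosen only for illustration.

```python
import math

# Effect measure conventionally reported for each model family.
# The boolean says whether the raw coefficient must be exponentiated.
EFFECT_MEASURES = {
    "linear": ("coefficient", False),
    "logistic": ("odds ratio (OR)", True),
    "cox": ("hazard ratio (HR)", True),
    "poisson": ("incidence rate ratio (IRR)", True),
}

def report_estimate(model, beta):
    """Return (label, value) for a raw coefficient `beta` from `model`."""
    label, exponentiate = EFFECT_MEASURES[model]
    value = math.exp(beta) if exponentiate else beta
    return label, value

# Hypothetical coefficients, not taken from any real study.
for model, beta in [("linear", 0.5), ("logistic", math.log(2)), ("cox", 0.0)]:
    label, value = report_estimate(model, beta)
    print(f"{model}: {label} = {value:.2f}")
```

A coefficient of 0 on the log scale corresponds to a ratio of 1, which is why "no association" means OR, HR or IRR = 1 but coefficient = 0 in linear regression.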
Types of regression based on number of independent variables (x):
- Simple: 1 x
  - Resulting estimates: called crude or unadjusted
- Multiple: >1 x
  - Resulting estimates: called adjusted
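The difference between a crude and an adjusted estimate can be sketched with a small made-up dataset. The data below are invented for illustration and built so that y = x + 3z exactly, with z acting as a confounder that is correlated with x. The helper names (`cross_dev`, `crude_slope`, `adjusted_slope`) are hypothetical, and the adjusted slope uses the standard closed-form least-squares solution for two predictors rather than any particular statistics package.

```python
def cross_dev(a, b):
    """Sum of products of deviations from the mean (S_ab)."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b))

def crude_slope(x, y):
    """Simple linear regression of y on x alone (unadjusted estimate)."""
    return cross_dev(x, y) / cross_dev(x, x)

def adjusted_slope(x, z, y):
    """Coefficient of x in a least-squares fit of y on both x and z."""
    sxx, szz, sxz = cross_dev(x, x), cross_dev(z, z), cross_dev(x, z)
    sxy, szy = cross_dev(x, y), cross_dev(z, y)
    return (szz * sxy - sxz * szy) / (sxx * szz - sxz ** 2)

# Invented data: z is a confounder correlated with x, and y = x + 3z.
z = [0, 0, 1, 1, 2, 2]
x = [0, 1, 1, 2, 2, 3]
y = [xi + 3 * zi for xi, zi in zip(x, z)]

crude = crude_slope(x, y)
adjusted = adjusted_slope(x, z, y)
print(f"crude slope for x:    {crude:.2f}")
print(f"adjusted slope for x: {adjusted:.2f}")
```

In this constructed example the crude estimate overstates the effect of x because it absorbs part of the effect of z; adding z to the model recovers the coefficient of 1 that was built into the data.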
Regression vs Correlation:
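Correlation measures the strength and direction of the association between variables.
Regression quantifies the association; in linear regression, for example, it estimates how much y changes for each unit change in x. It should only be used if one of the variables is thought to precede or cause the other.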
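The contrast can be illustrated with a short sketch on invented data: Pearson's r is a unitless measure of strength, while the regression slope is expressed in units of y per unit of x and can be used for prediction. The function names and data values below are hypothetical, for illustration only.

```python
import math

def cross_dev(a, b):
    """Sum of products of deviations from the mean (S_ab)."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b))

def pearson_r(x, y):
    """Correlation coefficient: strength and direction, no units."""
    return cross_dev(x, y) / math.sqrt(cross_dev(x, x) * cross_dev(y, y))

def fit_line(x, y):
    """Simple linear regression: returns (slope, intercept) of y on x."""
    slope = cross_dev(x, y) / cross_dev(x, x)
    intercept = sum(y) / len(y) - slope * sum(x) / len(x)
    return slope, intercept

# Invented data, not from any real study.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

r = pearson_r(x, y)
slope, intercept = fit_line(x, y)
print(f"r = {r:.3f}, slope = {slope:.3f}, intercept = {intercept:.3f}")

# Changing the units of y (here, multiplying by 10) rescales the slope
# but leaves the correlation coefficient unchanged.
y_scaled = [10 * v for v in y]
r_scaled = pearson_r(x, y_scaled)
slope_scaled, _ = fit_line(x, y_scaled)
print(f"after rescaling y: r = {r_scaled:.3f}, slope = {slope_scaled:.3f}")

# Prediction from the fitted line at a new x value.
predicted = intercept + slope * 6
print(f"predicted y at x = 6: {predicted:.2f}")
```

Note also that correlation is symmetric (swapping x and y gives the same r), whereas the regression of y on x generally differs from the regression of x on y, which is why the choice of outcome variable matters for regression.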
References:
- Short notes in medical statistics for medical examinations – Dr. Mohamed Elsherif
- Medical statistics made easy – M. Harris and G. Taylor