"In statistical modeling, regression analysis is a set of statistical processes for estimating the relationships between a dependent variable... and one or more independent variables..."
The main techniques covered are linear regression analysis, logistic regression analysis, and other regression methods.
Basic concepts: This includes understanding what regression analysis is, the types of variables involved in regression analysis, and the assumptions underlying regression analysis.
Simple linear regression: In simple linear regression, we examine the relationship between one outcome variable and one predictor variable. We also learn about the assumptions of simple linear regression and how to interpret the results (a minimal fitting sketch follows this topic list).
Multiple linear regression: This is similar to simple linear regression but involves more than one predictor variable. We learn how to interpret the results of a multiple linear regression model and how to assess the assumptions of the model.
Logistic regression: Logistic regression is used when the outcome variable is binary; it models the relationship between a binary outcome variable and one or more predictor variables (see the logistic-regression sketch after this topic list).
Correlation and causation: We learn about the difference between correlation and causation, and the conditions under which regression analysis can support conclusions about causal relationships.
Modeling techniques: We learn about different modeling techniques used in regression analysis, such as stepwise regression, hierarchical regression, and Bayesian regression.
Model selection: We learn about different methods of model selection, such as the Akaike information criterion (AIC) and the Bayesian information criterion (BIC). These methods help us select the best model from a set of competing models (a small AIC/BIC comparison sketch follows this topic list).
Model diagnostics: We learn how to check the assumptions of a regression model and how to diagnose problems with the model. We also learn about different methods for handling outliers and influential observations.
Nonlinear regression: Nonlinear regression is used when the relationship between the outcome variable and predictor variable(s) is not linear. We learn about different types of nonlinear regression models, such as exponential, power, and logistic models.
Time series regression: Time series regression is used when the outcome variable is a time series. We learn about different methods for modeling time series data, such as autoregressive models and moving average models.
Survival analysis: Survival analysis is used when the outcome variable is time to an event, such as death or disease. We learn about different methods for analyzing survival data, such as Kaplan-Meier curves and Cox proportional hazards models.
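As a concrete companion to the simple and multiple linear regression topics above, here is a minimal OLS fitting sketch in Python. It assumes the statsmodels library and uses synthetic data; the variable names and the true coefficients (intercept 2.0, slope 0.5) are purely illustrative.

```python
import numpy as np
import statsmodels.api as sm

# Synthetic data for illustration: y depends linearly on x plus noise.
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=100)
y = 2.0 + 0.5 * x + rng.normal(scale=1.0, size=100)

# Simple linear regression by ordinary least squares.
X = sm.add_constant(x)        # adds the intercept column
fit = sm.OLS(y, X).fit()
print(fit.params)             # estimated intercept and slope
print(fit.summary())          # coefficients, standard errors, R-squared
```

Stacking additional predictor columns into X (for example with np.column_stack) turns the same call into a multiple linear regression.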
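The logistic-regression sketch below follows the same pattern for a binary outcome; again the data are synthetic and statsmodels is an assumed choice of library.

```python
import numpy as np
import statsmodels.api as sm

# Synthetic binary-outcome data generated from a known logistic relationship.
rng = np.random.default_rng(1)
x = rng.normal(size=200)
p = 1.0 / (1.0 + np.exp(-(-0.5 + 1.2 * x)))   # true probability that y = 1
y = rng.binomial(1, p)

# Logistic regression: the log-odds of y = 1 are modeled as linear in x.
X = sm.add_constant(x)
fit = sm.Logit(y, X).fit()
print(fit.params)             # coefficients on the log-odds scale
print(np.exp(fit.params))     # the same coefficients expressed as odds ratios
```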
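For the model selection topic, the small comparison sketch below fits two competing OLS models and reads off their AIC and BIC (lower is better); the candidate models and the data are hypothetical.

```python
import numpy as np
import statsmodels.api as sm

# Hypothetical data: y depends on x1 but not on x2.
rng = np.random.default_rng(2)
n = 200
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 1.0 + 0.8 * x1 + rng.normal(scale=0.5, size=n)

# Compare competing models by their information criteria.
candidates = {
    "x1 only":   sm.add_constant(np.column_stack([x1])),
    "x1 and x2": sm.add_constant(np.column_stack([x1, x2])),
}
for name, X in candidates.items():
    fit = sm.OLS(y, X).fit()
    print(f"{name}: AIC = {fit.aic:.1f}, BIC = {fit.bic:.1f}")
```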
Simple Linear Regression: Used to model the relationship between one independent variable and one dependent variable through a linear equation.
Multiple Linear Regression: Used when there is more than one independent variable to model the relationship with a dependent variable.
Polynomial Regression: Used when the relationship between the independent and dependent variables is nonlinear, adding polynomial terms to account for curvature (see the polynomial-fit sketch after this list).
Logistic Regression: Used when the dependent variable is dichotomous, modeling the probability of an event occurring given certain predictors.
Cox Regression: Used when the dependent variable is time until an event occurs, modeling the hazard rate as a function of the predictors.
Poisson Regression: Used when the dependent variable is a count or rate, modeling the number of events occurring in a given period based on predictors.
Negative Binomial Regression: Used when count data show overdispersion (variance larger than the mean) that the Poisson model cannot accommodate, allowing for more flexibility (see the count-model sketch after this list).
Ridge Regression: Used when multicollinearity is present in the independent variables, adding a penalty term to shrink the coefficients towards zero.
Lasso Regression: Similar to ridge regression, but its L1 penalty can shrink some coefficients exactly to zero, effectively performing variable selection (see the regularization sketch after this list).
Elastic Net Regression: A combination of ridge and lasso regression, used when there is both multicollinearity and a large number of predictors.
Time-Series Regression: Used when the data is collected sequentially over time, allowing for modeling the temporal relationship between variables.
Bayesian Regression: Used to estimate parameters based on prior knowledge or beliefs in addition to the observed data.
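For polynomial regression, a quadratic fit can be written with plain NumPy; the degree-2 choice and the synthetic curved data below are illustrative assumptions.

```python
import numpy as np

# Synthetic data with a curved (quadratic) relationship.
rng = np.random.default_rng(3)
x = rng.uniform(-3, 3, size=150)
y = 1.0 - 2.0 * x + 0.5 * x**2 + rng.normal(scale=0.5, size=150)

# Polynomial regression is still linear in its coefficients:
# regress y on 1, x, and x^2 by least squares.
coeffs = np.polyfit(x, y, deg=2)     # highest-degree coefficient first
print(np.round(coeffs, 3))           # roughly [0.5, -2.0, 1.0]
y_hat = np.polyval(coeffs, x)        # fitted values on the estimated curve
```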
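The count-model sketch below fits Poisson and negative binomial regressions to the same synthetic counts using statsmodels GLMs; the log-link data-generating step is an assumption made for illustration.

```python
import numpy as np
import statsmodels.api as sm

# Synthetic count data: the log of the expected count is linear in x.
rng = np.random.default_rng(4)
n = 300
x = rng.normal(size=n)
mu = np.exp(0.2 + 0.6 * x)
y = rng.poisson(mu)

X = sm.add_constant(x)

# Poisson regression assumes the variance equals the mean;
# the negative binomial family relaxes that assumption (overdispersion).
poisson_fit = sm.GLM(y, X, family=sm.families.Poisson()).fit()
negbin_fit = sm.GLM(y, X, family=sm.families.NegativeBinomial()).fit()
print("Poisson:          ", np.round(poisson_fit.params, 3))
print("Negative binomial:", np.round(negbin_fit.params, 3))
```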
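The regularization sketch below contrasts ridge, lasso, and elastic net on predictors that are deliberately near-collinear; scikit-learn is an assumed library choice and the penalty strengths are arbitrary, untuned values.

```python
import numpy as np
from sklearn.linear_model import Ridge, Lasso, ElasticNet

# Synthetic predictors with built-in multicollinearity (x2 is nearly x1).
rng = np.random.default_rng(5)
n = 200
x1 = rng.normal(size=n)
x2 = x1 + rng.normal(scale=0.1, size=n)
x3 = rng.normal(size=n)
X = np.column_stack([x1, x2, x3])
y = 3.0 * x1 + rng.normal(size=n)     # only x1 truly matters

# Ridge shrinks all coefficients; lasso can zero some out entirely;
# elastic net mixes the two penalties.
for model in (Ridge(alpha=1.0), Lasso(alpha=0.1), ElasticNet(alpha=0.1, l1_ratio=0.5)):
    fit = model.fit(X, y)
    print(type(model).__name__, np.round(fit.coef_, 3))
```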
"Regression analysis is primarily used for two conceptually distinct purposes. First, regression analysis is widely used for prediction and forecasting... Second, in some situations regression analysis can be used to infer causal relationships between the independent and dependent variables."
"The most common form of regression analysis is linear regression..."
"For example, the method of ordinary least squares computes the unique line (or hyperplane) that minimizes the sum of squared differences between the true data and that line (or hyperplane)."
"... this allows the researcher to estimate the conditional expectation (or population average value) of the dependent variable when the independent variables take on a given set of values."
"Less common forms of regression use slightly different procedures to estimate alternative location parameters (e.g., quantile regression or Necessary Condition Analysis) or estimate the conditional expectation across a broader collection of non-linear models (e.g., nonparametric regression)."
"...where its use has substantial overlap with the field of machine learning."
"Regressions by themselves only reveal relationships between a dependent variable and a collection of independent variables in a fixed dataset."
"To use regressions for prediction... a researcher must carefully justify why existing relationships have predictive power for a new context."
"The latter is especially important when researchers hope to estimate causal relationships using observational data."
"...often called the 'outcome' or 'response' variable, or a 'label' in machine learning parlance."
"...often called 'predictors', 'covariates', 'explanatory variables' or 'features'."
"...computes the unique line (or hyperplane) that minimizes the sum of squared differences between the true data and that line (or hyperplane)."
"Less common forms of regression use slightly different procedures to estimate alternative location parameters (e.g., quantile regression or Necessary Condition Analysis)..."
"Yes, nonparametric regression can be used to estimate the conditional expectation across a broader collection of non-linear models."
"A researcher must carefully justify why existing relationships have predictive power for a new context."
"The latter is especially important when researchers hope to estimate causal relationships using observational data."
"...this allows the researcher to estimate the conditional expectation (or population average value) of the dependent variable..."
"... regression analysis is widely used for prediction and forecasting..."
"For example, the method of ordinary least squares computes the unique line (or hyperplane) that minimizes the sum of squared differences between the true data and that line (or hyperplane)."