# Regression

## Overview

### Linear Regression Modeling

In linear regression, each sample's response is modeled as a linear function of its features plus an error term:

$\begin{split} y_i = \beta_0 + \beta_1x_{i,1} + \beta_2x_{i,2} + \ldots + \beta_px_{i,p} + \epsilon_i \hspace{5pt} \forall \hspace{5pt} i \in \{1, \ldots, n\}\\ \end{split}$

where $$p$$ is the number of features, $$n$$ is the number of samples, and $$\epsilon_i$$ is an error term with zero mean and finite variance. In vector notation, the same model is written as:

$\mathbf{y} = \mathbf{X}\mathbf{\beta} + \mathbf{\epsilon}$

where $$\mathbf{y}$$ is the response vector $$[y_1, y_2, \ldots, y_n]^\mathbf{T}$$ of length $$n$$, $$\mathbf{X}$$ is an $$n \times (p + 1)$$ design matrix of features $$[\mathbf{1}, \mathbf{x_1}, \mathbf{x_2}, \ldots, \mathbf{x_p}]$$, $$\mathbf{\beta}$$ is the coefficient vector $$[\beta_0, \beta_1, \beta_2, \ldots, \beta_p]^\mathbf{T}$$ of length $$(p+1)$$ with $$\beta_0$$ the intercept term, and $$\mathbf{\epsilon}$$ is the vector of error terms $$[\epsilon_1, \epsilon_2, \ldots, \epsilon_n]^\mathbf{T}$$. The intercept is included in the model by augmenting the design matrix with a column of ones, $$\mathbf{1}$$. When an intercept is not sought, this column is omitted, so $$\mathbf{X}$$ is $$n \times p$$ and $$\mathbf{\beta}$$ has length $$p$$.
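The design-matrix construction described above can be sketched in a few lines of NumPy. This is a minimal illustration, not this library's API: the synthetic data, the variable names (`X_raw`, `beta_true`, `beta_hat`), and the choice of ordinary least squares via `np.linalg.lstsq` as the fitting method are all assumptions made for the example, since the text above defines the model but does not say how the coefficients are estimated.

```python
import numpy as np

# Synthetic data: n samples, p features (values are illustrative only).
rng = np.random.default_rng(0)
n, p = 200, 3
X_raw = rng.normal(size=(n, p))                # n x p feature matrix, no intercept column

# Assumed "true" coefficients [beta_0, beta_1, ..., beta_p]; beta_0 is the intercept.
beta_true = np.array([1.5, -2.0, 0.5, 3.0])

# Augment the features with a leading column of ones to form the n x (p + 1) design matrix.
X = np.column_stack([np.ones(n), X_raw])

# Generate responses y = X beta + epsilon with zero-mean, finite-variance noise.
epsilon = rng.normal(scale=0.1, size=n)
y = X @ beta_true + epsilon

# Estimate beta by ordinary least squares (an assumed estimator for this sketch).
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)

# Without an intercept, the column of ones is omitted and beta has length p.
beta_no_intercept, *_ = np.linalg.lstsq(X_raw, y, rcond=None)

print("design matrix shape:", X.shape)
print("estimated coefficients (with intercept):", beta_hat)
print("estimated coefficients (no intercept):", beta_no_intercept)
```

With the intercept column present, `beta_hat[0]` estimates $$\beta_0$$ and the remaining entries estimate $$\beta_1, \ldots, \beta_p$$. Dropping the column forces the fitted function through the origin, which is only appropriate when the data are known to have no offset.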

## Sources

Contributions made by our wonderful GitHub Contributors: