Ordinary least squares

In statistics, ordinary least squares (OLS) is a type of linear least squares method for choosing the unknown parameters in a linear regression model (with fixed level-one effects of a linear function of a set of explanatory variables) by the principle of least squares: minimizing the sum of the squares of the differences between the observed values of the dependent variable in the input dataset and the values predicted by the linear function of the explanatory variables. Some sources treat OLS as synonymous with linear regression.

Geometrically, this is seen as the sum of the squared distances, parallel to the axis of the dependent variable, between each data point in the set and the corresponding point on the regression surface—the smaller the differences, the better the model fits the data. The resulting estimator can be expressed by a simple formula, especially in the case of a simple linear regression, in which there is a single regressor on the right side of the regression equation.
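For the single-regressor case, the closed form is the familiar slope-and-intercept pair. The following is a minimal sketch in plain Python, assuming a model with an intercept; the function names `fit_simple_ols` and `sum_squared_residuals` and the sample data are illustrative and do not come from the source.

```python
def fit_simple_ols(x, y):
    """Return (intercept, slope) minimizing the sum of squared vertical residuals."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length, at least 2")
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Centered cross-product and centered sum of squares of the regressor.
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    if s_xx == 0:
        raise ValueError("x has no variation; the slope is not identified")
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def sum_squared_residuals(x, y, intercept, slope):
    """Sum of squared distances, measured parallel to the y axis."""
    return sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))


# Illustrative data (made up for this sketch).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
intercept, slope = fit_simple_ols(x, y)
ssr = sum_squared_residuals(x, y, intercept, slope)
```

Any other choice of intercept or slope gives a sum of squared residuals at least as large as `ssr`, which is what "least squares" refers to.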

The OLS estimator is consistent for the level-one fixed effects when the regressors are exogenous and there is no perfect multicollinearity (the rank condition). It is consistent for the variance estimate of the residuals when the regressors have finite fourth moments and, by the Gauss–Markov theorem, optimal in the class of linear unbiased estimators when the errors are homoscedastic and serially uncorrelated. Under these conditions, the method of OLS provides minimum-variance mean-unbiased estimation when the errors have finite variances. Under the additional assumption that the errors are normally distributed with zero mean, OLS is the maximum likelihood estimator that outperforms any non-linear unbiased estimator.
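The rank condition can be checked numerically: the regressor matrix must have full column rank, which fails when one column is an exact linear combination of the others. The sketch below assumes NumPy is available; the helper `has_full_column_rank` and both matrices are hypothetical examples, not part of the source.

```python
import numpy as np

# Illustrative regressor matrix: the third column is exactly 2 * the second,
# so the columns are perfectly collinear (data made up for this sketch).
X_collinear = np.array([
    [1.0, 1.0, 2.0],
    [1.0, 2.0, 4.0],
    [1.0, 3.0, 6.0],
    [1.0, 4.0, 8.0],
])

# Replacing the redundant column with one that is not a linear combination
# of the others restores full column rank.
X_full = X_collinear.copy()
X_full[:, 2] = [1.0, 0.0, 2.0, 1.0]


def has_full_column_rank(X):
    """True when no column of X is a linear combination of the others."""
    return np.linalg.matrix_rank(X) == X.shape[1]


rank_collinear = np.linalg.matrix_rank(X_collinear)
rank_full = np.linalg.matrix_rank(X_full)
```

With perfectly collinear columns the least-squares problem has infinitely many minimizers, so the individual coefficients are not identified.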

Suppose the data consist of $n$ observations $\left\{\mathbf {x} _{i},y_{i}\right\}_{i=1}^{n}$. Each observation $i$ includes a scalar response $y_{i}$ and a column vector $\mathbf {x} _{i}$ of $p$ regressors, i.e., $\mathbf {x} _{i}=\left[x_{i1},x_{i2},\dots ,x_{ip}\right]^{\operatorname {T} }$. In a linear regression model, the response variable, $y_{i}$, is a linear function of the regressors:

$$y_{i}=\beta _{1}x_{i1}+\beta _{2}x_{i2}+\cdots +\beta _{p}x_{ip}+\varepsilon _{i},$$

or in vector form,

$$y_{i}=\mathbf {x} _{i}^{\operatorname {T} }{\boldsymbol {\beta }}+\varepsilon _{i},$$

where $\mathbf {x} _{i}$, as introduced previously, is a column vector of the $i$-th observation of all the explanatory variables; ${\boldsymbol {\beta }}$ is a $p\times 1$ vector of unknown parameters; and the scalar $\varepsilon _{i}$ represents the unobserved random variable (error) of the $i$-th observation. $\varepsilon _{i}$ accounts for the influences upon the response $y_{i}$ from sources other than the explanatory variables $\mathbf {x} _{i}$. This model can also be written in matrix notation as

$$\mathbf {y} =\mathbf {X} {\boldsymbol {\beta }}+{\boldsymbol {\varepsilon }},$$

where $\mathbf {y}$ and ${\boldsymbol {\varepsilon }}$ are $n\times 1$ vectors of the response variables and the errors of the $n$ observations, and $\mathbf {X}$ is an $n\times p$ matrix of regressors, also sometimes called the design matrix, whose row $i$ is $\mathbf {x} _{i}^{\operatorname {T} }$ and contains the $i$-th observations on all the explanatory variables.
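To make the matrix form concrete, the sketch below simulates data from a linear model with an $n\times p$ design matrix and then recovers the parameter vector by least squares. It assumes NumPy; the sizes, the seed, the noise scale, and the "true" coefficients `beta_true` are all made up for illustration, and `np.linalg.lstsq` is used here as one convenient solver rather than as the method prescribed by the source.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is reproducible

n, p = 200, 3
beta_true = np.array([1.0, -2.0, 0.5])  # hypothetical p x 1 parameter vector

# Design matrix X (n x p): row i is x_i^T. The first column is a constant.
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])

eps = rng.normal(scale=0.1, size=n)  # unobserved errors, one per observation
y = X @ beta_true + eps              # response vector of length n

# Least-squares estimate: the b that minimizes ||y - X b||^2.
beta_hat, _, rank, _ = np.linalg.lstsq(X, y, rcond=None)
residuals = y - X @ beta_hat
```

At the minimizer the residual vector is orthogonal to every column of `X`, so `X.T @ residuals` is zero up to floating-point error.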

Typically, a constant term is included in the set of regressors $\mathbf {X}$, say, by taking $x_{i1}=1$ for all $i=1,\dots ,n$. The coefficient $\beta _{1}$ corresponding to this regressor is called the intercept. Without the intercept, the fitted line is forced to pass through the origin: the fitted value is zero when $x_{i}={\vec {0}}$.
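The effect of the constant regressor can be seen by fitting the same data with and without a column of ones. This is a small sketch assuming NumPy; the data lie exactly on a made-up line with intercept 3 and slope 2, and the variable names are illustrative.

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0])
y = 3.0 + 2.0 * x  # exact line with intercept 3 and slope 2 (illustrative)

# With a constant term: the first regressor equals 1 for every observation.
X_with = np.column_stack([np.ones_like(x), x])
# Without a constant term: the only regressor is x itself.
X_without = x.reshape(-1, 1)

beta_with, *_ = np.linalg.lstsq(X_with, y, rcond=None)
beta_without, *_ = np.linalg.lstsq(X_without, y, rcond=None)

# Fitted values at x = 0 under each specification.
fit_at_zero_with = np.array([1.0, 0.0]) @ beta_with    # equals the intercept
fit_at_zero_without = np.array([0.0]) @ beta_without   # zero by construction
```

Here the model with the constant recovers the line exactly, while the model without it compensates with a steeper slope because its fitted line must pass through the origin.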