Hubbry Logo
search button
Sign in
Residual sum of squares
Residual sum of squares
Comunity Hub
History
arrow-down
starMore
arrow-down
bob

Bob

Have a question related to this hub?

bob

Alice

Got something to say related to this hub?
Share it here.

#general is a chat channel to discuss anything related to the hub.
Hubbry Logo
search button
Sign in
Residual sum of squares
Community hub for the Wikipedia article
logoWikipedian hub
Welcome to the community hub built on top of the Residual sum of squares Wikipedia article. Here, you can discuss, collect, and organize anything related to Residual sum of squares. The purpose of the hub...
Add your contribution
Residual sum of squares

In statistics, the residual sum of squares (RSS), also known as the sum of squared residuals (SSR) or the sum of squared estimate of errors (SSE), is the sum of the squares of residuals (deviations predicted from actual empirical values of data). It is a measure of the discrepancy between the data and an estimation model, such as a linear regression. A small RSS indicates a tight fit of the model to the data. It is used as an optimality criterion in parameter selection and model selection.

In general, total sum of squares = explained sum of squares + residual sum of squares. For a proof of this in the multivariate ordinary least squares (OLS) case, see partitioning in the general OLS model.

One explanatory variable

[edit]

In a model with a single explanatory variable, RSS is given by:[1]

where yi is the ith value of the variable to be predicted, xi is the ith value of the explanatory variable, and is the predicted value of yi (also termed ). In a standard linear simple regression model, , where and are coefficients, y and x are the regressand and the regressor, respectively, and ε is the error term. The sum of squares of residuals is the sum of squares of ; that is

where is the estimated value of the constant term and is the estimated value of the slope coefficient .

Matrix expression for the OLS residual sum of squares

[edit]

The general regression model with n observations and k explanators, the first of which is a constant unit vector whose coefficient is the regression intercept, is

where y is an n × 1 vector of dependent variable observations, each column of the n × k matrix X is a vector of observations on one of the k explanators, is a k × 1 vector of true coefficients, and e is an n× 1 vector of the true underlying errors. The ordinary least squares estimator for is

The residual vector ; so the residual sum of squares is:

(equivalent to the square of the norm of residuals). In full:

where H is the hat matrix, or the projection matrix in linear regression.

Relation with Pearson's product-moment correlation

[edit]

The least-squares regression line is given by

where and , where and

Therefore,

where

The Pearson product-moment correlation is given by therefore,

See also

[edit]

References

[edit]
  1. ^ Archdeacon, Thomas J. (1994). Correlation and regression analysis : a historian's guide. University of Wisconsin Press. pp. 161–162. ISBN 0-299-13650-7. OCLC 27266095.
  • Draper, N.R.; Smith, H. (1998). Applied Regression Analysis (3rd ed.). John Wiley. ISBN 0-471-17082-8.