Degrees of freedom:

In statistics, the number of degrees of freedom is the number of values in the final calculation of a statistic that are free to vary.

Estimates of statistical parameters can be based upon different amounts of information or data. The number of independent pieces of information that go into the estimate of a parameter is called the degrees of freedom (df).

In general, the degrees of freedom of an estimate equal the number of independent scores that go into the estimate minus the number of parameters estimated as intermediate steps in estimating the parameter itself. For the sample variance, the number subtracted is one, since the sample mean is the only intermediate step; a sample of n scores therefore leaves n - 1 degrees of freedom.
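As a small illustration of the sample-variance case, the sketch below uses made-up data and hypothetical helper names (`residuals_from_mean`, `sample_variance`). It shows that once the sample mean has been estimated, the deviations from it must sum to zero, so the last deviation is fixed by the others and only n - 1 of them are free.

```python
import statistics


def residuals_from_mean(values):
    """Deviations of each score from the sample mean (the intermediate estimate)."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def sample_variance(values):
    """Sum of squared deviations divided by the degrees of freedom, n - 1."""
    deviations = residuals_from_mean(values)
    df = len(values) - 1  # one parameter (the mean) was estimated first
    return sum(d * d for d in deviations) / df


data = [4.0, 7.0, 1.0, 8.0]           # illustrative scores, mean = 5.0
deviations = residuals_from_mean(data)  # [-1.0, 2.0, -4.0, 3.0]

# The deviations sum to zero, so the last one is determined by the first n - 1.
recovered_last = -sum(deviations[:-1])

print(recovered_last, deviations[-1])
print(sample_variance(data), statistics.variance(data))  # 30 / 3 = 10.0 in both cases
```

The standard-library `statistics.variance` uses the same n - 1 divisor, which is why the two printed variances agree.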

Mathematically, degrees of freedom are the dimension of the domain of a random vector, or essentially the number of ‘free’ components: how many components need to be known before the vector is fully determined.

The term is most often used in the context of linear models (linear regression, analysis of variance), where certain random vectors are constrained to lie in linear subspaces, and the number of degrees of freedom is the dimension of the subspace. Degrees of freedom are also commonly associated with the squared lengths (or “sums of squares”) of such vectors, and with the parameters of chi-squared and other distributions that arise in the associated statistical testing problems.
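To make the subspace idea concrete, here is a minimal sketch for a straight-line least-squares fit, with made-up data and a hypothetical helper `fit_line`. Fitting an intercept and a slope imposes two linear constraints on the residual vector (it is orthogonal to the constant vector and to the x values), so with n observations the residuals lie in a subspace of dimension n - 2.

```python
def fit_line(xs, ys):
    """Ordinary least-squares intercept and slope for y = a + b * x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


xs = [1.0, 2.0, 3.0, 4.0, 5.0]   # illustrative data only
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

intercept, slope = fit_line(xs, ys)
residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]

# Two linear constraints hold (up to rounding error) for the residual vector.
constraint_sum = sum(residuals)
constraint_x = sum(x * r for x, r in zip(xs, residuals))

# Dimension of the subspace that contains the residual vector.
residual_df = len(xs) - 2

# The residual sum of squares is the squared length of that vector.
rss = sum(r * r for r in residuals)
residual_variance = rss / residual_df

print(constraint_sum, constraint_x, residual_df, residual_variance)
```

Dividing the residual sum of squares by `residual_df` rather than by n is the usual way the dimension of the subspace enters the variance estimate.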

Degrees of freedom parameters in probability distributions:

Several commonly encountered statistical distributions (Student’s t, chi-squared, F) have parameters that are commonly referred to as degrees of freedom. This terminology simply reflects that in many applications where these distributions occur, the parameter corresponds to the degrees of freedom of an underlying random vector, as in the linear-model setting above. Another simple example: if $X_i$, $i = 1, \ldots, n$, are independent normal $(\mu, \sigma^2)$ random variables, the statistic

$$\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{\sigma^2}$$

follows a chi-squared distribution with $n - 1$ degrees of freedom. Here, the degrees of freedom arise from the residual sum of squares in the numerator, and in turn from the $n - 1$ degrees of freedom of the underlying residual vector $(X_i - \bar{X})$.
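The chi-squared example can be checked informally by simulation. The sketch below is only a sanity check under assumed values (n = 5, mean 10, standard deviation 2, a fixed seed), with hypothetical function names. It computes the residual sum of squares divided by the known variance for many simulated normal samples and averages the results. Because a chi-squared distribution with k degrees of freedom has mean k, the average should land close to n - 1 = 4 rather than n = 5.

```python
import random


def scaled_residual_ss(sample, sigma):
    """Residual sum of squares about the sample mean, divided by sigma squared."""
    mean = sum(sample) / len(sample)
    return sum((x - mean) ** 2 for x in sample) / sigma ** 2


def simulate_mean_statistic(n, mu, sigma, reps, seed=0):
    """Average the statistic over many independent normal samples of size n."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    total = 0.0
    for _ in range(reps):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        total += scaled_residual_ss(sample, sigma)
    return total / reps


average = simulate_mean_statistic(n=5, mu=10.0, sigma=2.0, reps=20000)
print(average)  # expected to be near n - 1 = 4, not n = 5
```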

In the application of these distributions to linear models, the degrees of freedom parameters can take only integer values. The underlying families of distributions, however, also allow fractional values for the degrees of freedom parameters, which can arise in more sophisticated uses.

One set of examples consists of problems where chi-squared approximations based on effective degrees of freedom are used. In other applications, such as modelling heavy-tailed data, a t or F distribution may be used as an empirical model. In these cases, there is no particular degrees of freedom interpretation of the distribution parameters, even though the terminology may continue to be used.
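One well-known instance of effective degrees of freedom, not named in the text above and offered here only as an illustration, is the Welch-Satterthwaite approximation used when comparing two sample means with unequal variances. The sketch below, with a hypothetical function name and made-up inputs, shows how the approximation typically yields a fractional value.

```python
def welch_satterthwaite_df(var1, n1, var2, n2):
    """Effective degrees of freedom for the difference of two sample means.

    var1, var2 are sample variances; n1, n2 are sample sizes (each at least 2).
    """
    a = var1 / n1
    b = var2 / n2
    return (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))


# Illustrative inputs: unequal variances and unequal sample sizes.
effective_df = welch_satterthwaite_df(var1=4.0, n1=10, var2=9.0, n2=15)

# With equal variances and equal sample sizes the formula gives n1 + n2 - 2.
balanced_df = welch_satterthwaite_df(var1=1.0, n1=10, var2=1.0, n2=10)

print(effective_df)  # a non-integer between min(n1, n2) - 1 and n1 + n2 - 2
print(balanced_df)
```

The resulting value is then used as the degrees of freedom parameter of a t distribution, which is one way a non-integer parameter comes up in practice.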