Statistical model

A statistical model is a mathematical model that embodies a set of statistical assumptions concerning the generation of sample data (and similar data from a larger population). A statistical model represents, often in considerably idealized form, the data-generating process. When referring specifically to probabilities, the corresponding term is probabilistic model. All statistical hypothesis tests and all statistical estimators are derived via statistical models. More generally, statistical models are part of the foundation of statistical inference. A statistical model is usually specified as a mathematical relationship between one or more random variables and other non-random variables. As such, a statistical model is "a formal representation of a theory" (Herman Adèr quoting Kenneth Bollen).

Informally, a statistical model can be thought of as a statistical assumption (or set of statistical assumptions) with a certain property: that the assumption allows us to calculate the probability of any event. As an example, consider a pair of ordinary six-sided dice. We will study two different statistical assumptions about the dice.

The first statistical assumption is this: for each of the dice, the probability of each face (1, 2, 3, 4, 5, and 6) coming up is 1/6. From that assumption, we can calculate the probability of both dice coming up 5: 1/6 × 1/6 = 1/36. More generally, we can calculate the probability of any event: e.g. (1 and 2) or (3 and 3) or (5 and 6).

The alternative statistical assumption is this: for each of the dice, the probability of the face 5 coming up is 1/8 (because the dice are weighted). From that assumption, we can calculate the probability of both dice coming up 5: 1/8 × 1/8 = 1/64. We cannot, however, calculate the probability of any other nontrivial event, as the probabilities of the other faces are unknown.

The first statistical assumption constitutes a statistical model, because with the assumption alone we can calculate the probability of any event. The alternative statistical assumption does not constitute a statistical model, because with the assumption alone we cannot calculate the probability of every event. In the example above, with the first assumption, calculating the probability of an event is easy. With some other examples, though, the calculation can be difficult, or even impractical (e.g. it might require millions of years of computation). For an assumption to constitute a statistical model, such difficulty is acceptable: the calculation does not need to be practicable, just theoretically possible.
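The dice example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two dice are taken to be independent (which the multiplication 1/6 × 1/6 already presupposes), and the names `face_prob`, `outcome_prob`, and `event_probability` are hypothetical helpers, not part of any library.

```python
from fractions import Fraction
from itertools import product

# First assumption: each die is fair. Independence of the two dice is
# an extra assumption made explicit here.
FACES = range(1, 7)
face_prob = {face: Fraction(1, 6) for face in FACES}

# The sample space is every ordered pair (first die, second die).
outcome_prob = {
    (a, b): face_prob[a] * face_prob[b] for a, b in product(FACES, repeat=2)
}


def event_probability(event):
    """Sum the probabilities of the outcomes for which event(a, b) is true."""
    return sum(p for (a, b), p in outcome_prob.items() if event(a, b))


# Both dice come up 5.
both_five = event_probability(lambda a, b: a == 5 and b == 5)

# (1 and 2) or (3 and 3) or (5 and 6), ignoring which die shows which face.
mixed = event_probability(lambda a, b: {a, b} in ({1, 2}, {3}, {5, 6}))

print(both_five)  # 1/36
print(mixed)      # 5/36
```

Under the alternative assumption, only the entry for face 5 of `face_prob` would be known, so the table `outcome_prob` could not be filled in. That is the sense in which the alternative assumption fails to be a statistical model.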

In mathematical terms, a statistical model is a pair $(S, \mathcal{P})$, where $S$ is the set of possible observations, i.e. the sample space, and $\mathcal{P}$ is a set of probability distributions on $S$. The set $\mathcal{P}$ represents all of the models that are considered possible. This set is typically parameterized: $\mathcal{P} = \{F_{\theta} : \theta \in \Theta\}$. The set $\Theta$ defines the parameters of the model. If a parameterization is such that distinct parameter values give rise to distinct distributions, i.e. $F_{\theta_1} = F_{\theta_2} \Rightarrow \theta_1 = \theta_2$ (in other words, the mapping $\theta \mapsto F_{\theta}$ is injective), it is said to be identifiable.
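As an illustration of identifiability (an example chosen here, not one from the text above), take $S = \mathbb{R}$ and the Gaussian family

$$
\mathcal{P} = \{\, N(\mu, \sigma^{2}) : (\mu, \sigma) \in \Theta \,\}.
$$

With $\Theta = \mathbb{R} \times (0, \infty)$ the parameterization is identifiable, since the mean and variance of a Gaussian distribution determine $(\mu, \sigma)$ uniquely. If instead $\Theta = \mathbb{R} \times (\mathbb{R} \setminus \{0\})$, then $(\mu, \sigma)$ and $(\mu, -\sigma)$ give the same distribution $N(\mu, \sigma^{2})$, so the mapping $\theta \mapsto F_{\theta}$ is not injective and the parameterization is not identifiable.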

In some cases, the model can be more complex.

Suppose that we have a population of children, with the ages of the children distributed uniformly in the population. The height of a child will be stochastically related to the age: e.g. when we know that a child is of age 7, this influences the chance of the child being 1.5 meters tall. We could formalize that relationship in a linear regression model, like this: height_i = b₀ + b₁ age_i + ε_i, where b₀ is the intercept, b₁ is the coefficient that age is multiplied by to obtain a prediction of height, ε_i is the error term, and i identifies the child. This implies that height is predicted by age, with some error.

An admissible model must be consistent with all the data points. Thus, a straight line (height_i = b₀ + b₁ age_i) cannot be admissible for a model of the data, unless it exactly fits all the data points, i.e. all the data points lie perfectly on the line. The error term, ε_i, must be included in the equation, so that the model is consistent with all the data points. To do statistical inference, we would first need to assume some probability distributions for the ε_i. For instance, we might assume that the ε_i are i.i.d. Gaussian, with zero mean. In this instance, the model would have 3 parameters: b₀, b₁, and the variance σ² of the Gaussian distribution. We can formally specify the model in the form $(S, \mathcal{P})$ as follows. The sample space, $S$, of our model comprises the set of all possible pairs (age, height). Each possible value of $\theta$ = (b₀, b₁, σ²) determines a distribution on $S$; denote that distribution by $F_{\theta}$. If $\Theta$ is the set of all possible values of $\theta$, then $\mathcal{P} = \{F_{\theta} : \theta \in \Theta\}$. (The parameterization is identifiable, and this is easy to check.)
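The regression model above can be made concrete with a small simulation. The sketch below draws data from one member $F_{\theta}$ of the family and then recovers (b₀, b₁, σ²) by ordinary least squares. Everything numerical in it is an assumption for illustration only: the parameter values, the age range of 2 to 12 years, and the helper names `simulate` and `fit_least_squares` do not come from the text.

```python
import random


def simulate(n, b0, b1, sigma, seed=0):
    """Draw n (age, height) pairs from height = b0 + b1 * age + eps."""
    rng = random.Random(seed)
    # Ages uniform on an assumed range of 2 to 12 years.
    ages = [rng.uniform(2.0, 12.0) for _ in range(n)]
    # Errors are i.i.d. Gaussian with zero mean and standard deviation sigma.
    heights = [b0 + b1 * age + rng.gauss(0.0, sigma) for age in ages]
    return ages, heights


def fit_least_squares(ages, heights):
    """Closed-form ordinary least squares for a single predictor."""
    n = len(ages)
    mean_age = sum(ages) / n
    mean_height = sum(heights) / n
    sxx = sum((a - mean_age) ** 2 for a in ages)
    sxy = sum((a - mean_age) * (h - mean_height) for a, h in zip(ages, heights))
    b1_hat = sxy / sxx
    b0_hat = mean_height - b1_hat * mean_age
    residuals = [h - (b0_hat + b1_hat * a) for a, h in zip(ages, heights)]
    # Unbiased variance estimate: two degrees of freedom are used by b0 and b1.
    sigma2_hat = sum(r * r for r in residuals) / (n - 2)
    return b0_hat, b1_hat, sigma2_hat


# Illustrative "true" parameter value theta = (b0, b1, sigma^2), heights in meters.
ages, heights = simulate(n=5000, b0=0.75, b1=0.06, sigma=0.05)
b0_hat, b1_hat, sigma2_hat = fit_least_squares(ages, heights)
print(b0_hat, b1_hat, sigma2_hat)
```

With a few thousand simulated children the estimates should land close to the values used to generate the data, which is one informal way to see that distinct values of $\theta$ produce distinguishable distributions.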
