Laplace's approximation

Laplace's approximation provides an analytical expression for a posterior probability distribution by fitting a Gaussian distribution with a mean equal to the MAP solution and precision equal to the observed Fisher information. The approximation is justified by the Bernstein–von Mises theorem, which states that, under regularity conditions, the error of the approximation tends to 0 as the number of data points tends to infinity.
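To make the asymptotic claim concrete, the following is a minimal sketch under assumptions that are not part of the article: a coin-flip (Bernoulli) model with a uniform prior, for which the exact posterior after $k$ successes in $n$ trials is $\mathrm{Beta}(k+1,n-k+1)$. The helper names (`log_beta`, `normal_cdf`, `tv_distance_laplace_vs_exact`), the grid size, and the choice of 70% successes are illustrative only. The sketch measures the total variation distance between the exact posterior and its Gaussian approximation by numerical integration.

```python
import math


def log_beta(a, b):
    """Log of the Beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def tv_distance_laplace_vs_exact(n, k, grid=20000):
    """Total variation distance between the exact posterior Beta(k+1, n-k+1)
    and its Laplace (Gaussian) approximation. Assumes 0 < k < n."""
    theta_hat = k / n  # posterior mode (MAP) under a uniform prior
    # Second derivative of the negative log joint at the mode (the precision).
    precision = k / theta_hat**2 + (n - k) / (1.0 - theta_hat)**2
    sd = 1.0 / math.sqrt(precision)

    a, b = k + 1, n - k + 1
    log_norm = log_beta(a, b)
    h = 1.0 / grid
    total = 0.0
    for i in range(grid):  # midpoint rule on (0, 1)
        t = (i + 0.5) * h
        exact = math.exp((a - 1) * math.log(t) + (b - 1) * math.log(1.0 - t) - log_norm)
        gauss = math.exp(-0.5 * ((t - theta_hat) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))
        total += abs(exact - gauss) * h

    # Gaussian mass outside (0, 1), where the exact posterior is zero.
    outside = normal_cdf((0.0 - theta_hat) / sd) + (1.0 - normal_cdf((1.0 - theta_hat) / sd))
    return 0.5 * (total + outside)


for n in (10, 100, 1000):
    print(n, tv_distance_laplace_vs_exact(n, k=7 * n // 10))
```

Under these assumptions the printed distance shrinks as $n$ grows, which is the behaviour the theorem describes; the sketch is an illustration for one conjugate model, not a proof.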

For example, consider a regression or classification model with data set $\{x_{n},y_{n}\}_{n=1,\ldots ,N}$ comprising inputs $x$ and outputs $y$, with (unknown) parameter vector $\theta$ of length $D$. The likelihood is denoted $p({\bf {y}}|{\bf {x}},\theta )$ and the parameter prior $p(\theta )$. Suppose one wants to approximate the joint density of outputs and parameters $p({\bf {y}},\theta |{\bf {x}})$. Bayes' formula reads:

$$p({\bf {y}},\theta |{\bf {x}})=p({\bf {y}}|{\bf {x}},\theta )\,p(\theta )=p({\bf {y}}|{\bf {x}})\,p(\theta |{\bf {y}},{\bf {x}}).$$

The joint is equal to the product of the likelihood and the prior and, by Bayes' rule, also equal to the product of the marginal likelihood $p({\bf {y}}|{\bf {x}})$ and the posterior $p(\theta |{\bf {y}},{\bf {x}})$. Seen as a function of $\theta$, the joint is an un-normalised density.

In Laplace's approximation, we approximate the joint by an un-normalised Gaussian ${\tilde {q}}(\theta )=Zq(\theta )$, where $q$ denotes the approximate density, ${\tilde {q}}$ the un-normalised density, and $Z$ the normalisation constant of ${\tilde {q}}$ (independent of $\theta$). Since the marginal likelihood $p({\bf {y}}|{\bf {x}})$ does not depend on the parameter $\theta$ and the posterior $p(\theta |{\bf {y}},{\bf {x}})$ normalises over $\theta$, we can immediately identify them with $Z$ and $q(\theta )$ of our approximation, respectively.

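Laplace's approximation is

$$p({\bf {y}},\theta |{\bf {x}})\simeq p({\bf {y}},{\hat {\theta }}|{\bf {x}})\exp \left(-{\tfrac {1}{2}}(\theta -{\hat {\theta }})^{\top }S^{-1}(\theta -{\hat {\theta }})\right)={\tilde {q}}(\theta ),$$

where we have defined

$${\hat {\theta }}=\operatorname {argmax} _{\theta }\log p({\bf {y}},\theta |{\bf {x}}),\qquad S^{-1}=-\left.\nabla _{\theta }\nabla _{\theta }\log p({\bf {y}},\theta |{\bf {x}})\right|_{\theta ={\hat {\theta }}}.$$

Here ${\hat {\theta }}$ is the location of a mode of the joint target density, also known as the maximum a posteriori or MAP point, and $S^{-1}$ is the $D\times D$ positive definite matrix of second derivatives of the negative log joint target density at the mode $\theta ={\hat {\theta }}$. Thus, the Gaussian approximation matches the value and the log-curvature of the un-normalised target density at the mode. The value of ${\hat {\theta }}$ is usually found using a gradient-based method.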
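As a sketch of why the approximation takes this Gaussian form (a standard argument, assuming the mode lies in the interior of the parameter space so that the gradient vanishes there), expand the log joint to second order around ${\hat {\theta }}$:

$$\log p({\bf {y}},\theta |{\bf {x}})\approx \log p({\bf {y}},{\hat {\theta }}|{\bf {x}})+\underbrace {\left.\nabla _{\theta }\log p({\bf {y}},\theta |{\bf {x}})\right|_{\theta ={\hat {\theta }}}^{\top }}_{=\,0}(\theta -{\hat {\theta }})-{\tfrac {1}{2}}(\theta -{\hat {\theta }})^{\top }S^{-1}(\theta -{\hat {\theta }}).$$

Exponentiating both sides gives the un-normalised Gaussian

$${\tilde {q}}(\theta )=p({\bf {y}},{\hat {\theta }}|{\bf {x}})\exp \left(-{\tfrac {1}{2}}(\theta -{\hat {\theta }})^{\top }S^{-1}(\theta -{\hat {\theta }})\right),$$

whose peak height is the value of the joint at the mode and whose covariance is $S$.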

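In summary, we have

$$q(\theta )={\mathcal {N}}(\theta \mid \mu ={\hat {\theta }},\Sigma =S),\qquad \log Z=\log p({\bf {y}},{\hat {\theta }}|{\bf {x}})+{\tfrac {1}{2}}\log |S|+{\tfrac {D}{2}}\log(2\pi )$$

for the approximate posterior over $\theta$ and the approximate log marginal likelihood, respectively.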
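The recipe (find the mode, take the curvature there, read off $q$ and $\log Z$) can be sketched for a one-parameter case ($D=1$, so $S$ is a scalar). The model below is an assumption chosen for illustration, not the article's regression example: a Bernoulli likelihood for one specific sequence with $k$ successes in $n$ trials and a $\mathrm{Beta}(a,b)$ prior with $a,b>1$, for which the exact marginal likelihood is known in closed form. The function names and default values are hypothetical, and the Newton iteration is unsafeguarded, which is adequate here only because the log joint is concave on $(0,1)$.

```python
import math


def log_beta(p, q):
    """Log of the Beta function B(p, q)."""
    return math.lgamma(p) + math.lgamma(q) - math.lgamma(p + q)


def laplace_bernoulli_beta(n, k, a=2.0, b=2.0, steps=50):
    """Return (theta_hat, S, log_Z) of the Laplace approximation.

    Assumes a > 1 and b > 1 so that the mode lies strictly inside (0, 1).
    """
    alpha = k + a - 1.0      # exponent of theta in the joint
    beta = n - k + b - 1.0   # exponent of (1 - theta) in the joint

    def log_joint(t):
        return alpha * math.log(t) + beta * math.log(1.0 - t) - log_beta(a, b)

    # Newton's method on the log joint to locate the mode (MAP point).
    theta = 0.5
    for _ in range(steps):
        grad = alpha / theta - beta / (1.0 - theta)
        hess = -alpha / theta**2 - beta / (1.0 - theta)**2
        step = grad / hess
        theta -= step
        if abs(step) < 1e-12:
            break

    # S^{-1}: second derivative of the negative log joint at the mode.
    s_inv = alpha / theta**2 + beta / (1.0 - theta)**2
    s = 1.0 / s_inv
    d = 1  # number of parameters
    log_z = log_joint(theta) + 0.5 * math.log(s) + 0.5 * d * math.log(2.0 * math.pi)
    return theta, s, log_z


def exact_log_evidence(n, k, a=2.0, b=2.0):
    """Exact log marginal likelihood for the same model (Beta-Bernoulli)."""
    return log_beta(k + a, n - k + b) - log_beta(a, b)


theta_hat, s, log_z = laplace_bernoulli_beta(n=20, k=14)
print("MAP:", theta_hat, "variance S:", s)
print("Laplace log Z:", log_z, "exact:", exact_log_evidence(20, 14))
```

For multi-parameter models the same steps apply with a gradient vector, a $D\times D$ Hessian, and $\log |S|$ computed from the matrix determinant; that extension is not shown here.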
