Evidence lower bound

In variational Bayesian methods, the evidence lower bound (often abbreviated ELBO, also sometimes called the variational lower bound or negative variational free energy) is a useful lower bound on the log-likelihood of some observed data.

The ELBO is useful because it guarantees a worst-case value for the log-likelihood of a distribution (e.g. $p(X)$ ) that models a set of data. The actual log-likelihood may be higher (indicating an even better fit to the data) because the ELBO includes a Kullback-Leibler divergence (KL divergence) term, which lowers the ELBO whenever an internal part of the model is inaccurate, even if the model as a whole fits well. Improving the ELBO therefore means improving the likelihood of the model $p(X)$ , the fit of a component internal to the model, or both. This makes the ELBO a good training objective, e.g. for a deep neural network, where its negative is commonly minimized as the loss function to improve both the model overall and the internal component. (The internal component is $q_{\phi }(\cdot |x)$ , defined in detail later in this article.)

Let $X$ and $Z$ be random variables, jointly distributed with distribution $p_{\theta }$ . For example, $p_{\theta }(X)$ is the marginal distribution of $X$ , and $p_{\theta }(Z\mid X)$ is the conditional distribution of $Z$ given $X$ . Then, for a sample $x\sim p_{\text{data}}$ , and any distribution $q_{\phi }$ , the ELBO is defined as $L(\phi ,\theta ;x):=\mathbb {E} _{z\sim q_{\phi }(\cdot |x)}\left[\ln {\frac {p_{\theta }(x,z)}{q_{\phi }(z|x)}}\right].$ The ELBO can equivalently be written as

${\begin{aligned}L(\phi ,\theta ;x)&=\mathbb {E} _{z\sim q_{\phi }(\cdot |x)}\left[\ln p_{\theta }(x,z)\right]+H[q_{\phi }(z|x)]\\&=\ln p_{\theta }(x)-D_{KL}(q_{\phi }(z|x)||p_{\theta }(z|x)).\end{aligned}}$
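Both forms follow from the definition in a few steps. The sketch below uses only the symbols already introduced, and assumes that $q_{\phi }(z|x)>0$ wherever $p_{\theta }(x,z)>0$ so that every logarithm is finite. Splitting the logarithm of the ratio gives the entropy form:

${\begin{aligned}L(\phi ,\theta ;x)&=\mathbb {E} _{z\sim q_{\phi }(\cdot |x)}\left[\ln p_{\theta }(x,z)\right]-\mathbb {E} _{z\sim q_{\phi }(\cdot |x)}\left[\ln q_{\phi }(z|x)\right]\\&=\mathbb {E} _{z\sim q_{\phi }(\cdot |x)}\left[\ln p_{\theta }(x,z)\right]+H[q_{\phi }(z|x)].\end{aligned}}$

Factoring the joint as $p_{\theta }(x,z)=p_{\theta }(z|x)\,p_{\theta }(x)$ instead, and noting that $\ln p_{\theta }(x)$ does not depend on $z$ , gives the divergence form:

${\begin{aligned}L(\phi ,\theta ;x)&=\mathbb {E} _{z\sim q_{\phi }(\cdot |x)}\left[\ln p_{\theta }(x)+\ln {\frac {p_{\theta }(z|x)}{q_{\phi }(z|x)}}\right]\\&=\ln p_{\theta }(x)-\mathbb {E} _{z\sim q_{\phi }(\cdot |x)}\left[\ln {\frac {q_{\phi }(z|x)}{p_{\theta }(z|x)}}\right]\\&=\ln p_{\theta }(x)-D_{KL}(q_{\phi }(z|x)||p_{\theta }(z|x)).\end{aligned}}$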

In the first line, $H[q_{\phi }(z|x)]$ is the entropy of $q_{\phi }$ , which relates the ELBO to the Helmholtz free energy. In the second line, $\ln p_{\theta }(x)$ is called the evidence for $x$ , and $D_{KL}(q_{\phi }(z|x)||p_{\theta }(z|x))$ is the Kullback-Leibler divergence between the approximating distribution $q_{\phi }(\cdot |x)$ and the posterior $p_{\theta }(\cdot |x)$ . Since the Kullback-Leibler divergence is non-negative, $L(\phi ,\theta ;x)$ forms a lower bound on the evidence (ELBO inequality) $\ln p_{\theta }(x)\geq \mathbb {E} _{z\sim q_{\phi }(\cdot |x)}\left[\ln {\frac {p_{\theta }(x,z)}{q_{\phi }(z|x)}}\right],$ with equality exactly when $q_{\phi }(\cdot |x)$ coincides with the posterior $p_{\theta }(\cdot |x)$ .
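The identities and the inequality can be checked numerically on a toy case. The following is a minimal sketch in Python with NumPy, assuming a latent variable $Z$ with three states and a single fixed observation $x$ ; the probability values and the function names are hypothetical, chosen only for illustration.

```python
import numpy as np

# Hypothetical values of p_theta(x, z=k) for one fixed x and k = 0, 1, 2.
# Their sum is the evidence p_theta(x).
joint = np.array([0.10, 0.05, 0.25])
# An arbitrary (hypothetical) approximating distribution q_phi(z | x).
q = np.array([0.5, 0.3, 0.2])


def elbo_definition(q, joint):
    """E_q[ln p(x, z) - ln q(z | x)], computed exactly by summing over z."""
    return np.sum(q * (np.log(joint) - np.log(q)))


def elbo_entropy_form(q, joint):
    """E_q[ln p(x, z)] + H[q]."""
    entropy = -np.sum(q * np.log(q))
    return np.sum(q * np.log(joint)) + entropy


def elbo_kl_form(q, joint):
    """ln p(x) - KL(q(z | x) || p(z | x))."""
    evidence = joint.sum()
    posterior = joint / evidence
    kl = np.sum(q * (np.log(q) - np.log(posterior)))
    return np.log(evidence) - kl


log_evidence = np.log(joint.sum())
posterior = joint / joint.sum()

print("ELBO (definition):  ", elbo_definition(q, joint))
print("ELBO (entropy form):", elbo_entropy_form(q, joint))
print("ELBO (KL form):     ", elbo_kl_form(q, joint))
print("log evidence:       ", log_evidence)
print("ELBO at q=posterior:", elbo_definition(posterior, joint))
```

The three ELBO values agree up to floating-point error and lie below the log evidence, and setting `q` to the posterior closes the gap, since the KL term then vanishes.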

Suppose we have an observable random variable $X$ , and we want to find its true distribution $p^{*}$ . This would allow us to generate data by sampling, and estimate probabilities of future events. In general, it is impossible to find $p^{*}$ exactly, forcing us to search for a good approximation.

That is, we define a sufficiently large parametric family $\{p_{\theta }\}_{\theta \in \Theta }$ of distributions, then solve for $\min _{\theta }L(p_{\theta },p^{*})$ for some loss function $L$ . One possible way to solve this is to consider a small variation from $p_{\theta }$ to $p_{\theta +\delta \theta }$ , and solve for $L(p_{\theta },p^{*})-L(p_{\theta +\delta \theta },p^{*})=0$ . This is a problem in the calculus of variations, thus it is called the variational method.
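As an illustration of this search over a parametric family, here is a minimal sketch in Python with NumPy. Everything specific in it is an assumption made for the example: the target $p^{*}$ is a known three-outcome distribution (in practice it is unknown), the family $p_{\theta }$ is categorical with a softmax parametrization, the loss is taken to be $D_{KL}(p^{*}||p_{\theta })$ , and the small variation $\delta \theta$ is used as a finite-difference step inside plain gradient descent.

```python
import numpy as np


def softmax(theta):
    """Map unconstrained parameters theta to a categorical distribution p_theta."""
    e = np.exp(theta - theta.max())
    return e / e.sum()


def loss(theta, p_star):
    """Assumed loss L(p_theta, p*) = KL(p* || p_theta)."""
    p_theta = softmax(theta)
    return np.sum(p_star * (np.log(p_star) - np.log(p_theta)))


def numerical_gradient(theta, p_star, delta=1e-5):
    """Central differences: how the loss changes under a small variation of theta."""
    grad = np.zeros_like(theta)
    for i in range(theta.size):
        step = np.zeros_like(theta)
        step[i] = delta
        grad[i] = (loss(theta + step, p_star) - loss(theta - step, p_star)) / (2 * delta)
    return grad


p_star = np.array([0.2, 0.5, 0.3])  # hypothetical target distribution
theta = np.zeros(3)                 # start from the uniform distribution

# Move theta against the gradient until the loss stops changing.
for _ in range(300):
    theta -= 1.0 * numerical_gradient(theta, p_star)

p_fit = softmax(theta)
final_loss = loss(theta, p_star)
print("fitted p_theta:", p_fit)
print("final loss:    ", final_loss)
```

At the end of the loop the loss no longer changes under small variations of $\theta$ , which is the stationarity condition described above, and `p_fit` matches `p_star` closely because this small family can represent the target exactly.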

Since there are not many explicitly parametrized distribution families (all the classical distribution families, such as the normal distribution, the Gumbel distribution, etc., are far too simplistic to model the true distribution), we consider implicitly parametrized probability distributions:
