Occam learning


Wikipedia

In computational learning theory, Occam learning is a model of algorithmic learning where the objective of the learner is to output a succinct representation of the training data it receives. This is closely related to probably approximately correct (PAC) learning, where the learner is evaluated on its predictive power on a test set.

Occam learnability implies PAC learning, and for a wide variety of concept classes, the converse is also true: PAC learnability implies Occam learnability.

Occam learning is named after Occam's razor, the principle that, all other things being equal, a shorter explanation for observed data should be favored over a lengthier one. The theory of Occam learning is a formal and mathematical justification for this principle. It was first shown by Blumer, et al. that Occam learning implies PAC learning, which is the standard model of learning in computational learning theory. In other words, parsimony (of the output hypothesis) implies predictive power.

The succinctness of a concept $c$ in concept class ${\mathcal {C}}$ can be expressed by the length $size(c)$ of the shortest bit string that can represent $c$ in ${\mathcal {C}}$ . Occam learning connects the succinctness of a learning algorithm's output to its predictive power on unseen data.

Let ${\mathcal {C}}$ and ${\mathcal {H}}$ be concept classes containing target concepts and hypotheses respectively. Then, for constants $\alpha \geq 0$ and $0\leq \beta <1$ , a learning algorithm $L$ is an $(\alpha ,\beta )$ -Occam algorithm for ${\mathcal {C}}$ using ${\mathcal {H}}$ iff, given a set $S=\{x_{1},\dots ,x_{m}\}$ of $m$ samples labeled according to a concept $c\in {\mathcal {C}}$ , $L$ outputs a hypothesis $h\in {\mathcal {H}}$ such that

$h$ is consistent with $c$ on $S$ (that is, $h(x)=c(x)$ for all $x\in S$ ), and $size(h)\leq (n\cdot size(c))^{\alpha }m^{\beta }$ ,

where $n$ is the maximum length of any sample $x\in S$ . An Occam algorithm is called efficient if it runs in time polynomial in $n$ , $m$ , and $size(c).$ We say a concept class ${\mathcal {C}}$ is Occam learnable with respect to a hypothesis class ${\mathcal {H}}$ if there exists an efficient Occam algorithm for ${\mathcal {C}}$ using ${\mathcal {H}}.$
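The two requirements on the output of an $(\alpha ,\beta )$ -Occam algorithm, consistency with the labeled sample and the size bound $size(h)\leq (n\cdot size(c))^{\alpha }m^{\beta }$ , can be checked mechanically. The following Python sketch illustrates this under some assumptions that are not part of the definition: samples are tuples of bits, a hypothesis is any callable returning a label, and the sizes `size_h` and `size_c` are supplied by the caller, since how a concept is encoded as a bit string depends on the representation chosen. All function names are hypothetical.

```python
from typing import Callable, Sequence, Tuple

# Illustrative types (assumptions): a sample is a tuple of bits, a label is 0 or 1.
Sample = Tuple[int, ...]
Labeled = Sequence[Tuple[Sample, int]]


def is_consistent(h: Callable[[Sample], int], labeled: Labeled) -> bool:
    """True if h(x) equals the given label for every sample x in S."""
    return all(h(x) == y for x, y in labeled)


def occam_size_bound(n: int, size_c: int, m: int, alpha: float, beta: float) -> float:
    """Right-hand side of the size condition: (n * size(c))^alpha * m^beta."""
    return (n * size_c) ** alpha * m ** beta


def satisfies_occam_conditions(
    h: Callable[[Sample], int],
    size_h: int,
    labeled: Labeled,
    size_c: int,
    alpha: float,
    beta: float,
) -> bool:
    """Check both conditions for one hypothesis on one labeled sample set."""
    n = max(len(x) for x, _ in labeled)  # maximum sample length in S
    m = len(labeled)                     # number of samples
    return is_consistent(h, labeled) and size_h <= occam_size_bound(
        n, size_c, m, alpha, beta
    )


# Toy usage: the target labels a 2-bit sample by its first bit.
S = [((1, 0), 1), ((0, 0), 0), ((1, 1), 1)]
print(satisfies_occam_conditions(lambda x: x[0], 2, S, size_c=2, alpha=1, beta=0.5))
```

Note that this only checks a single output on a single sample set. Being an Occam algorithm is a property that must hold for every sample set labeled by every concept in ${\mathcal {C}}$ , so a check like this can refute the property but not establish it.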

Occam learnability implies PAC learnability, as the following theorem of Blumer, et al. shows:

Let $L$ be an efficient $(\alpha ,\beta )$ -Occam algorithm for ${\mathcal {C}}$ using ${\mathcal {H}}$ . Then there exists a constant $a>0$ such that for any $0<\epsilon ,\delta <1$ and any distribution ${\mathcal {D}}$ , given $m\geq a\left({\frac {1}{\epsilon }}\log {\frac {1}{\delta }}+\left({\frac {(n\cdot size(c))^{\alpha }}{\epsilon }}\right)^{\frac {1}{1-\beta }}\right)$ samples of $n$ bits each, drawn from ${\mathcal {D}}$ and labelled according to a concept $c\in {\mathcal {C}}$ , the algorithm $L$ will output a hypothesis $h\in {\mathcal {H}}$ such that $error(h)\leq \epsilon$ with probability at least $1-\delta$ .
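To get a feel for how the bound scales, the expression inside the theorem can be evaluated numerically. The sketch below is illustrative only: the theorem guarantees that some constant $a>0$ exists but does not give its value, so `a` is a caller-supplied parameter defaulting to 1 (an assumption), and the logarithm is taken as the natural logarithm (also an assumption; a change of base is absorbed into $a$ ). The function name is hypothetical.

```python
import math


def occam_sample_bound(
    epsilon: float,
    delta: float,
    n: int,
    size_c: int,
    alpha: float,
    beta: float,
    a: float = 1.0,  # unknown constant from the theorem; 1.0 is a placeholder
) -> int:
    """Evaluate a * ((1/eps) * log(1/delta) + ((n*size(c))^alpha / eps)^(1/(1-beta)))."""
    if not (0 < epsilon < 1 and 0 < delta < 1):
        raise ValueError("epsilon and delta must lie strictly between 0 and 1")
    if alpha < 0 or not (0 <= beta < 1):
        raise ValueError("requires alpha >= 0 and 0 <= beta < 1")
    confidence_term = (1 / epsilon) * math.log(1 / delta)
    complexity_term = ((n * size_c) ** alpha / epsilon) ** (1 / (1 - beta))
    return math.ceil(a * (confidence_term + complexity_term))


# The exponent 1/(1 - beta) makes the bound grow quickly as beta approaches 1.
for beta in (0.0, 0.5):
    m = occam_sample_bound(epsilon=0.1, delta=0.05, n=10, size_c=5, alpha=1, beta=beta)
    print(f"beta={beta}: m >= {m} (with a=1)")
```

The numbers produced this way are only meaningful up to the unknown factor $a$ ; the useful information is the dependence on $\epsilon$ , $\delta$ , $n$ , $size(c)$ , $\alpha$ and $\beta$ .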
