Hyperplane separation theorem
from Wikipedia

In geometry, the hyperplane separation theorem is a theorem about disjoint convex sets in n-dimensional Euclidean space. There are several rather similar versions. In one version of the theorem, if both these sets are closed and at least one of them is compact, then there is a hyperplane in between them and even two parallel hyperplanes in between them separated by a gap. In another version, if both disjoint convex sets are open, then there is a hyperplane in between them, but not necessarily any gap. An axis which is orthogonal to a separating hyperplane is a separating axis, because the orthogonal projections of the convex bodies onto the axis are disjoint.

Key Information

The hyperplane separation theorem is due to Hermann Minkowski. The Hahn–Banach separation theorem generalizes the result to topological vector spaces.

A related result is the supporting hyperplane theorem.

In the context of support-vector machines, the optimally separating hyperplane or maximum-margin hyperplane is a hyperplane which separates two sets of points and maximizes its distance to both sets.[1][2][3]

Statements and proof


Hyperplane separation theorem[4]: Let $A$ and $B$ be two disjoint nonempty convex subsets of $\mathbb{R}^n$. Then there exist a nonzero vector $v$ and a real number $c$ such that

$$\langle x, v \rangle \ge c \quad \text{and} \quad \langle y, v \rangle \le c$$

for all $x$ in $A$ and $y$ in $B$; i.e., the hyperplane $\langle \cdot, v \rangle = c$, with $v$ the normal vector, separates $A$ and $B$.

If both sets are closed, and at least one of them is compact, then the separation can be strict, that is, $\langle x, v \rangle > c_1$ and $\langle y, v \rangle < c_2$ for some $c_1 > c_2$.

In all cases, assume $A$ and $B$ to be disjoint, nonempty, and convex subsets of $\mathbb{R}^n$. The summary of the results is as follows:

| $A$ | $B$ | $\langle x, v \rangle$ | $\langle y, v \rangle$ |
|---|---|---|---|
| closed compact | closed | $> c_1$ | $< c_2$, with $c_2 < c_1$ |
| closed | closed compact | $> c_1$ | $< c_2$, with $c_2 < c_1$ |
| open | | $> c$ | $\le c$ |
| open | open | $> c$ | $< c$ |

The number of dimensions must be finite. In infinite-dimensional spaces there are examples of two closed, convex, disjoint sets which cannot be separated by a closed hyperplane (a hyperplane where a continuous linear functional equals some constant) even in the weak sense where the inequalities are not strict.[5]

Here, the compactness in the hypothesis cannot be relaxed; see an example in the section Counterexamples and uniqueness. This version of the separation theorem does generalize to infinite dimensions; the generalization is more commonly known as the Hahn–Banach separation theorem.

The proof is based on the following lemma:

Lemma: Let $A$ and $B$ be two disjoint closed subsets of $\mathbb{R}^n$, and assume $A$ is compact. Then there exist points $a_0 \in A$ and $b_0 \in B$ minimizing the distance $\|a - b\|$ over $a \in A$ and $b \in B$.

Proof of lemma

Let $a \in A$ and $b \in B$ be any pair of points, and let $r_1 = \|b - a\|$. Since $A$ is compact, it is contained in some ball centered on $a$; let the radius of this ball be $r_2$. Let $S = B \cap \overline{B}_{r_1 + r_2}(a)$ be the intersection of $B$ with a closed ball of radius $r_1 + r_2$ around $a$. Then $S$ is compact and nonempty because it contains $b$. Since the distance function is continuous, there exist points $a_0$ and $b_0$ whose distance $\|a_0 - b_0\|$ is the minimum over all pairs of points in $A \times S$. It remains to show that $a_0$ and $b_0$ in fact have the minimum distance over all pairs of points in $A \times B$. Suppose for contradiction that there exist points $a'$ and $b'$ such that $\|a' - b'\| < \|a_0 - b_0\|$. Then in particular, $\|a' - b'\| < r_1$, and by the triangle inequality, $\|a - b'\| \le \|a - a'\| + \|a' - b'\| < r_2 + r_1$. Therefore $b'$ is contained in $S$, which contradicts the fact that $a_0$ and $b_0$ had minimum distance over $A \times S$.
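The closest-pair construction in the lemma can be illustrated numerically. The following is a minimal sketch using alternating projections between two disjoint closed disks, for which the projection has a closed form; the helper name `project_to_disk` and the choice of disks are ours:

```python
import numpy as np

def project_to_disk(p, center, radius):
    """Euclidean projection of point p onto the closed disk B(center, radius)."""
    d = p - center
    n = np.linalg.norm(d)
    if n <= radius:
        return p
    return center + radius * d / n

# Two disjoint closed disks: A centered at (0,0), B centered at (4,0), both radius 1.
cA, rA = np.array([0.0, 0.0]), 1.0
cB, rB = np.array([4.0, 0.0]), 1.0

# Alternating projections: project back and forth until the pair stabilizes.
a = cA.copy()
for _ in range(100):
    b = project_to_disk(a, cB, rB)   # closest point of B to the current a
    a = project_to_disk(b, cA, rA)   # closest point of A to the current b

print(a, b, np.linalg.norm(a - b))  # closest pair (1,0) and (3,0), distance 2
```

For these disks the closest pair is reached after one round trip; in general, alternating projections converge to a minimum-distance pair when the sets are closed, convex, and one is compact, mirroring the hypotheses of the lemma.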


Proof illustration.
Proof of theorem

We first prove the second case. (See the diagram.)

WLOG, $A$ is compact. By the lemma, there exist points $a_0 \in A$ and $b_0 \in B$ of minimum distance to each other. Since $A$ and $B$ are disjoint, we have $a_0 \neq b_0$. Now, construct two hyperplanes $L_A, L_B$ perpendicular to the line segment $[a_0, b_0]$, with $L_A$ across $a_0$ and $L_B$ across $b_0$. We claim that neither $A$ nor $B$ enters the space between $L_A$ and $L_B$, and thus the hyperplanes perpendicular to $[a_0, b_0]$ satisfy the requirement of the theorem.

Algebraically, the hyperplanes $L_A, L_B$ are defined by the vector $v := b_0 - a_0$ and two constants $c_A := \langle v, a_0 \rangle < c_B := \langle v, b_0 \rangle$, such that $L_A = \{x : \langle v, x \rangle = c_A\}$ and $L_B = \{x : \langle v, x \rangle = c_B\}$. Our claim is that $\forall a \in A, \langle v, a \rangle \le c_A$ and $\forall b \in B, \langle v, b \rangle \ge c_B$.

Suppose there is some $a \in A$ such that $\langle v, a \rangle > c_A$; then let $a'$ be the foot of the perpendicular from $b_0$ to the line segment $[a_0, a]$. Since $A$ is convex, $a'$ is inside $A$, and by planar geometry, $a'$ is closer to $b_0$ than $a_0$ is, a contradiction. A similar argument applies to $B$.

Now for the first case.

Approach both $A$ and $B$ from the inside by sequences $A_1 \subseteq A_2 \subseteq \cdots \subseteq A$ and $B_1 \subseteq B_2 \subseteq \cdots \subseteq B$, such that each $A_k, B_k$ is closed and compact, and the unions are the relative interiors $\operatorname{relint}(A), \operatorname{relint}(B)$. (See the relative interior page for details.)

Now by the second case, for each pair $A_k, B_k$ there exist some unit vector $v_k$ and real number $c_k$ such that $\langle v_k, A_k \rangle < c_k < \langle v_k, B_k \rangle$.

Since the unit sphere is compact, we can take a convergent subsequence, so that $v_k \to v$. Let $c_A := \sup_{a \in A} \langle v, a \rangle$ and $c_B := \inf_{b \in B} \langle v, b \rangle$. We claim that $c_A \le c_B$, thus separating $A$ and $B$.

Assume not; then there exist some $a \in \operatorname{relint}(A)$ and $b \in \operatorname{relint}(B)$ such that $\langle v, a \rangle > \langle v, b \rangle$. Since $a \in A_k$ and $b \in B_k$ for all large enough $k$, and $v_k \to v$, for large enough $k$ we have $\langle v_k, a \rangle > \langle v_k, b \rangle$, contradicting $\langle v_k, a \rangle < c_k < \langle v_k, b \rangle$.


Since a separating hyperplane cannot intersect the interiors of open convex sets, we have a corollary:

Separation theorem I: Let $A$ and $B$ be two disjoint nonempty convex sets. If $A$ is open, then there exist a nonzero vector $v$ and real number $c$ such that

$$\langle x, v \rangle < c \le \langle y, v \rangle$$

for all $x$ in $A$ and $y$ in $B$. If both sets are open, then there exist a nonzero vector $v$ and real number $c$ such that

$$\langle x, v \rangle < c < \langle y, v \rangle$$

for all $x$ in $A$ and $y$ in $B$.

Case with possible intersections


If the sets have possible intersections, but their relative interiors are disjoint, then the proof of the first case still applies with no change, thus yielding:

Separation theorem II: Let $A$ and $B$ be two nonempty convex subsets of $\mathbb{R}^n$ with disjoint relative interiors. Then there exist a nonzero vector $v$ and a real number $c$ such that

$$\langle x, v \rangle \le c \le \langle y, v \rangle$$

for all $x$ in $A$ and $y$ in $B$; in particular, we have the supporting hyperplane theorem.

Supporting hyperplane theorem: If $A$ is a convex set in $\mathbb{R}^n$ and $a_0$ is a point on the boundary of $A$, then there exists a supporting hyperplane of $A$ containing $a_0$.

Proof

If the affine span of $A$ is not all of $\mathbb{R}^n$, then extend the affine span to a supporting hyperplane. Otherwise, $\operatorname{relint}(A) = \operatorname{int}(A)$ is disjoint from $\{a_0\}$, so apply the above theorem.

Converse of theorem


Note that the existence of a hyperplane that only "separates" two convex sets in the weak sense of both inequalities being non-strict obviously does not imply that the two sets are disjoint. Both sets could have points located on the hyperplane.

Counterexamples and uniqueness

The theorem does not apply if one of the bodies is not convex.

If one of $A$ or $B$ is not convex, then there are many possible counterexamples. For example, $A$ and $B$ could be concentric circles. A more subtle counterexample is one in which $A$ and $B$ are both closed but neither one is compact. For example, if $A$ is a closed half plane and $B$ is bounded by one arm of a hyperbola, then there is no strictly separating hyperplane:

$$A = \{(x, y) : x \le 0\}, \qquad B = \{(x, y) : x > 0,\ y \ge 1/x\}.$$

(Although, by an instance of the second theorem, there is a hyperplane that separates their interiors.) Another type of counterexample has A compact and B open. For example, A can be a closed square and B can be an open square that touches A.

In the first version of the theorem, evidently the separating hyperplane is never unique. In the second version, it may or may not be unique. Technically a separating axis is never unique because it can be translated; in the second version of the theorem, a separating axis can be unique up to translation.

The horn angle provides a good counterexample to many hyperplane separations. For example, in $\mathbb{R}^2$, the closed unit disk $D = \{(x, y) : x^2 + y^2 \le 1\}$ is disjoint from the open interval $I = \{(1, y) : 0 < y < 1\}$, but the only line separating them, $x = 1$, contains the entirety of $I$. This shows that if $A$ is closed and $B$ is relatively open, then there does not necessarily exist a separation that is strict for $B$. However, if $A$ is a closed convex polyhedron then such a separation exists.[6]

More variants


Farkas' lemma and related results can be understood as hyperplane separation theorems when the convex bodies are defined by finitely many linear inequalities.

More results may be found.[6]

Use in collision detection


In collision detection, the hyperplane separation theorem is usually used in the following form:

Separating axis theoremTwo closed convex objects are disjoint if there exists a line ("separating axis") onto which the two objects' projections are disjoint.

Regardless of dimensionality, the separating axis is always a line. For example, in 3D, the space is separated by planes, but the separating axis is perpendicular to the separating plane.

The separating axis theorem can be applied for fast collision detection between polygon meshes. Each face's normal or other feature direction is used as a separating axis. Note that this yields possible separating axes, not separating lines/planes.

In 3D, using face normals alone will fail to separate some edge-on-edge non-colliding cases. Additional axes, consisting of the cross-products of pairs of edges, one taken from each object, are required.[7]

For increased efficiency, parallel axes may be calculated as a single axis.
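The edge-normal form of the test can be sketched concretely. The following assumes 2D convex polygons given as vertex arrays; the function names (`edge_normals`, `separated_on_axis`, `sat_disjoint`) and the test shapes are our illustrative choices:

```python
import numpy as np

def edge_normals(poly):
    """Normals of a polygon's edges. Orientation does not matter here,
    since we compare full projection intervals, which is symmetric."""
    edges = np.roll(poly, -1, axis=0) - poly
    return np.stack([-edges[:, 1], edges[:, 0]], axis=1)

def separated_on_axis(a, b, axis):
    """True if the projections of polygons a and b onto axis are disjoint intervals."""
    pa, pb = a @ axis, b @ axis
    return pa.max() < pb.min() or pb.max() < pa.min()

def sat_disjoint(a, b):
    """Separating axis test for two convex polygons: report them disjoint
    if some candidate axis (an edge normal of either polygon) separates them."""
    for axis in np.vstack([edge_normals(a), edge_normals(b)]):
        if separated_on_axis(a, b, axis):
            return True
    return False

square = np.array([[0, 0], [1, 0], [1, 1], [0, 1]], dtype=float)
print(sat_disjoint(square, square + [3.0, 0.0]))    # True: translated copy is disjoint
print(sat_disjoint(square, square + [0.5, 0.5]))    # False: overlapping squares
```

In 2D, edge normals alone are sufficient candidate axes; as noted above, the 3D case additionally requires cross-products of edge pairs.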

from Grokipedia
The hyperplane separation theorem is a fundamental result in convex geometry stating that, for any two nonempty disjoint convex sets $C$ and $D$ in $\mathbb{R}^n$, there exist a nonzero vector $a \in \mathbb{R}^n$ and a scalar $b \in \mathbb{R}$ such that $a^T x \le b$ for all $x \in C$ and $a^T y \ge b$ for all $y \in D$, meaning the sets lie in opposite closed half-spaces defined by the hyperplane $\{z \mid a^T z = b\}$. Originally proved by Hermann Minkowski in the early 20th century as part of his work on convex bodies, the theorem provides a geometric foundation for separating points from convex sets and has been generalized through the Hahn-Banach theorem to infinite-dimensional topological vector spaces.

Variants of the theorem address stricter conditions: for instance, if one set is compact and the other closed, strict separation is possible, with strict inequalities ($a^T x < b < a^T y$) ensuring the hyperplane does not touch either set. The theorem underpins key concepts in convex analysis, such as supporting hyperplanes for convex sets at boundary points, and plays a central role in optimization theory, including duality in linear and convex programming, where it justifies the existence of separating functionals for feasible regions.

Fundamentals

Statement of the theorem

The hyperplane separation theorem concerns the separation of disjoint convex sets in Euclidean space. Euclidean space $\mathbb{R}^n$ is equipped with the standard inner product $\langle \mathbf{a}, \mathbf{x} \rangle = \sum_{i=1}^n a_i x_i$ for vectors $\mathbf{a}, \mathbf{x} \in \mathbb{R}^n$. A set $C \subseteq \mathbb{R}^n$ is convex if, for any $\mathbf{x}, \mathbf{y} \in C$ and $\lambda \in [0, 1]$, the point $\lambda \mathbf{x} + (1 - \lambda)\mathbf{y} \in C$. Two convex sets $A, B \subseteq \mathbb{R}^n$ are disjoint if $A \cap B = \emptyset$. A hyperplane in $\mathbb{R}^n$ is an affine subspace of dimension $n - 1$, defined as $H = \{\mathbf{x} \in \mathbb{R}^n \mid \langle \mathbf{a}, \mathbf{x} \rangle = b\}$ for some nonzero normal vector $\mathbf{a} \in \mathbb{R}^n \setminus \{\mathbf{0}\}$ and scalar $b \in \mathbb{R}$.

The theorem, which originated with Minkowski in the early 20th century, provides conditions under which disjoint convex sets can be separated by a hyperplane. In finite dimensions, for any two nonempty disjoint convex sets $A$ and $B$ in $\mathbb{R}^n$, there exists a nonzero vector $\mathbf{a} \in \mathbb{R}^n \setminus \{\mathbf{0}\}$ such that
$$\sup_{\mathbf{x} \in A} \langle \mathbf{a}, \mathbf{x} \rangle \le \inf_{\mathbf{y} \in B} \langle \mathbf{a}, \mathbf{y} \rangle.$$
This weak separation guarantees a separating hyperplane $H = \{\mathbf{z} \in \mathbb{R}^n \mid \langle \mathbf{a}, \mathbf{z} \rangle = b\}$ for some $b$ with $\sup_A \le b \le \inf_B$, placing $A$ in one closed half-space and $B$ in the other (possibly touching the hyperplane). If additionally $A$ is compact and $B$ is closed (or vice versa), then strict separation is possible:
$$\sup_{\mathbf{x} \in A} \langle \mathbf{a}, \mathbf{x} \rangle < \inf_{\mathbf{y} \in B} \langle \mathbf{a}, \mathbf{y} \rangle,$$
so the hyperplane can be chosen not to touch either set.

This finite-dimensional result generalizes via the Hahn-Banach theorem to separating disjoint convex sets in more abstract topological vector spaces.
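For intuition, the strict inequality can be checked numerically on two disjoint compact convex sets. In this sketch the boxes $A = [0,1]^2$ and $B = [3,4]^2$, the direction $\mathbf{a} = (1, 1)$, and the random sampling are our illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(0)

# A = [0,1]^2 and B = [3,4]^2 are disjoint compact convex sets.
A = rng.uniform(0.0, 1.0, size=(1000, 2))   # sample points of A
B = rng.uniform(3.0, 4.0, size=(1000, 2))   # sample points of B

a = np.array([1.0, 1.0])                    # candidate normal vector
sup_A = (A @ a).max()                       # approximates sup over A (true value 2)
inf_B = (B @ a).min()                       # approximates inf over B (true value 6)

print(sup_A < inf_B)   # True: a strictly separating hyperplane a.z = b exists
b = (sup_A + inf_B) / 2                     # any b strictly between the two works
```

Sampling only approximates the sup and inf from inside, but since the true values satisfy $2 < 6$, the strict gap survives.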

Proof of the theorem

The proof of the hyperplane separation theorem for the strict case relies on the assumption that the two disjoint convex sets $A$ and $B$ in $\mathbb{R}^n$ satisfy $A$ compact and $B$ closed. These conditions guarantee that the distance between $A$ and $B$ is positive and that points achieving this minimum distance exist.

Geometric Proof Sketch

The geometric proof constructs the separating hyperplane using the closest points between the sets. Let $d = \inf_{x \in A, y \in B} \|x - y\| > 0$, which is achieved at some $x^* \in A$ and $y^* \in B$ due to the compactness of $A$ and closedness of $B$. The vector $v = y^* - x^*$ points from $A$ to $B$ along the shortest direction. The midpoint $m = (x^* + y^*)/2$ lies on the segment between them. The hyperplane is defined to be perpendicular to $v$ and to pass through $m$, ensuring that it strictly separates $A$ and $B$. This hyperplane sits between the sets without touching either, and convexity extends the separation from the closest points to the entire sets.

Derivation of the Separating Hyperplane Equation

Normalize the direction as $\mathbf{a} = v/\|v\|$ with $\|\mathbf{a}\| = 1$, though the unnormalized form suffices for separation. The hyperplane equation is $\mathbf{a} \cdot \mathbf{z} = \alpha$, where $\alpha = \mathbf{a} \cdot m$. Substituting, $\alpha = \mathbf{a} \cdot (x^* + y^*)/2 = (\mathbf{a} \cdot x^* + \mathbf{a} \cdot y^*)/2$. With the unnormalized choice $\mathbf{a} = y^* - x^*$, this becomes $\alpha = (\|y^*\|^2 - \|x^*\|^2)/2$. The separation condition is $\mathbf{a} \cdot x < \alpha$ for all $x \in A$ and $\mathbf{a} \cdot y > \alpha$ for all $y \in B$. Since $\alpha - \mathbf{a} \cdot x^* = \|v\|^2/2 > 0$, it suffices to show that $\mathbf{a} \cdot x \le \mathbf{a} \cdot x^*$ for all $x \in A$ (and, symmetrically, $\mathbf{a} \cdot y \ge \mathbf{a} \cdot y^*$ for all $y \in B$).

To verify this, suppose for contradiction that some $x \in A$ satisfies $\mathbf{a} \cdot x > \mathbf{a} \cdot x^*$, i.e., $\langle x - x^*, y^* - x^* \rangle > 0$. For small $\epsilon > 0$, the point $x_\epsilon = (1 - \epsilon) x^* + \epsilon x$ lies in $A$ by convexity, and
$$\|x_\epsilon - y^*\|^2 = \|x^* - y^*\|^2 - 2\epsilon \langle x - x^*, y^* - x^* \rangle + \epsilon^2 \|x - x^*\|^2.$$
The linear term is negative, so for small enough $\epsilon$, $\|x_\epsilon - y^*\| < \|x^* - y^*\| = d$, contradicting the minimality of $d$. A symmetric argument applies if a point in $B$ violates its inequality. This confirms strict separation.
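A quick numerical sanity check of this construction, using two disks whose closest points are known in closed form (the disks, the sampling, and the helper `sample_disk` are our illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(1)

# Closest points between disk A (center (0,0), r=1) and disk B (center (4,0), r=1).
x_star, y_star = np.array([1.0, 0.0]), np.array([3.0, 0.0])
a = y_star - x_star                   # unnormalized normal, a = (2, 0)
alpha = a @ (x_star + y_star) / 2     # hyperplane a.z = alpha through the midpoint

def sample_disk(center, radius, n):
    """Random samples inside a disk via polar coordinates."""
    r = radius * np.sqrt(rng.uniform(size=n))
    t = rng.uniform(0, 2 * np.pi, size=n)
    return center + np.stack([r * np.cos(t), r * np.sin(t)], axis=1)

A = sample_disk(np.array([0.0, 0.0]), 1.0, 2000)
B = sample_disk(np.array([4.0, 0.0]), 1.0, 2000)

print((A @ a < alpha).all(), (B @ a > alpha).all())  # True True
```

Here $\alpha = 4$, and every sample of $A$ projects below it while every sample of $B$ projects above it, as the derivation predicts.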

Analytic Proof via Contradiction

For the general weak case without compactness (relying on finite dimensionality), consider the Minkowski difference $C = A - B = \{a - b \mid a \in A, b \in B\}$, which is convex and does not contain $0$ since $A \cap B = \emptyset$. Assume for contradiction that no separating hyperplane exists for $A$ and $B$. This implies that no nonzero $\mathbf{a} \in \mathbb{R}^n$ satisfies $\mathbf{a} \cdot c \ge 0$ for all $c \in C$. In finite dimensions, the absence of such a supporting hyperplane at $0$ would mean $0 \in \operatorname{conv}(C) = C$. Thus there exist $\lambda_i > 0$ with $\sum_i \lambda_i = 1$ and $c_i = a_i - b_i \in C$ such that $\sum_i \lambda_i c_i = 0$, or $\sum_i \lambda_i a_i = \sum_i \lambda_i b_i =: p$. Then $p \in \operatorname{conv}(A) = A$ and $p \in \operatorname{conv}(B) = B$, so $A \cap B \neq \emptyset$, contradicting disjointness. Strict inequality is possible under compactness, as the minimum distance is then positive.

Extensions and Variants

Strict and weak separation

The hyperplane separation theorem distinguishes between weak and strict forms of separation for disjoint convex sets. In weak separation, two nonempty disjoint convex sets $A$ and $B$ in $\mathbb{R}^n$ can be separated by a hyperplane defined by a nonzero vector $a \in \mathbb{R}^n$ and scalar $b \in \mathbb{R}$ such that $a \cdot x \le b$ for all $x \in A$ and $a \cdot y \ge b$ for all $y \in B$, allowing points from $A$ and $B$ to lie on the hyperplane itself. This form holds under the basic assumptions of convexity and disjointness without additional topological conditions.

Strict separation strengthens this condition by requiring the sets to lie in opposite open half-spaces, so $a \cdot x < b$ for all $x \in A$ and $a \cdot y > b$ for all $y \in B$, ensuring that no points from either set touch the hyperplane. Strict separation is guaranteed if one set is compact and the other is closed, or if both sets are open, provided they remain disjoint and convex. In the compact-closed case, the positive distance between the sets enables this stricter inequality.

To obtain strict separation involving a compact set $A$ and a closed set $B$, consider the positive distance $\delta = \inf\{\|x - y\| : x \in A, y \in B\} > 0$, which exists due to the compactness of $A$. Let $c \in A$ and $d \in B$ be closest points achieving this minimum, and define the separating direction $a = d - c$ with $\|a\| = \delta$. The strictly separating hyperplane uses $b = a \cdot c + \frac{\|a\|^2}{2}$, yielding $a \cdot x < b$ for $x \in A$ and $a \cdot y > b$ for $y \in B$: the hyperplane lies midway between $c$ and $d$, and convexity together with disjointness keeps the sets in the open half-spaces, with gaps of at least $\frac{\delta}{2}$ in signed distance. For both sets open and disjoint, a similar construction applies by considering their closures and using the openness to avoid boundary contact.

An illustrative example of strict separation involves two disjoint open balls in $\mathbb{R}^n$, such as $A = \{x : \|x - u\| < r\}$ and $B = \{y : \|y - v\| < s\}$ where $\|u - v\| > r + s$. The line connecting the centers $u$ and $v$ provides a direction $a = v - u$, and a hyperplane perpendicular to $a$ midway between the balls strictly separates them, as the openness ensures the sets lie entirely within the open half-spaces.

Separation with possible intersections

In the context of convex sets in $\mathbb{R}^n$, a more general form of the hyperplane separation theorem addresses cases where the sets may intersect but have disjoint relative interiors. Specifically, for two nonempty convex sets $A$ and $B$ in $\mathbb{R}^n$, if the relative interior of $A$ (denoted $\operatorname{ri} A$) and the relative interior of $B$ (denoted $\operatorname{ri} B$) are disjoint, then there exist a nonzero vector $p \in \mathbb{R}^n$ and a scalar $\alpha \in \mathbb{R}$ such that
$$p^\top x \ge \alpha \quad \forall x \in A, \qquad p^\top y \le \alpha \quad \forall y \in B,$$
with strict inequality holding for at least one point of $A$ or $B$. This is known as proper separation, which permits boundary overlap between $A$ and $B$ (i.e., $A \cap B \neq \emptyset$ is possible) but ensures no overlap in their relative interiors, distinguishing it from the basic theorem requiring full disjointness of the sets themselves.

This variant connects directly to the supporting hyperplane theorem, which states that if a point lies on the boundary of a convex set $C$, there exists a hyperplane that supports $C$ at that point, containing $C$ in one closed half-space. In the separation context, when $\operatorname{ri} A \cap \operatorname{ri} B = \emptyset$, the separating hyperplane supports the difference set $C = A - B$ at the origin, touching the boundary at potential intersection points without crossing into the relative interiors. This supporting hyperplane bounds each set without fully containing both, even if they touch.

To outline the proof, consider the convex difference set $C = A - B = \{a - b \mid a \in A, b \in B\}$. Since $\operatorname{ri} A \cap \operatorname{ri} B = \emptyset$, it follows that $0 \notin \operatorname{ri} C$ (as $\operatorname{ri} C = \operatorname{ri} A - \operatorname{ri} B$). By the finite-dimensional supporting hyperplane theorem, there exists a nonzero $p \in \mathbb{R}^n$ such that $p^\top x \ge 0$ for all $x \in C$, with $p^\top y > 0$ for some $y \in C$ (proper support at $0$). Translating appropriately yields the separating inequalities for $A$ and $B$, with the strict inequality ensuring properness. No full disjointness of the sets is required.

A simple example illustrates this: consider $A = \{(x, y) \in \mathbb{R}^2 \mid y \ge 0\}$ (the closed upper half-plane) and $B = \{(x, y) \in \mathbb{R}^2 \mid y \le 0\}$ (the closed lower half-plane). The sets intersect along the x-axis ($y = 0$), but $\operatorname{ri} A = \{(x, y) \mid y > 0\}$ and $\operatorname{ri} B = \{(x, y) \mid y < 0\}$ are disjoint. The hyperplane defined by $p = (0, 1)^\top$ and $\alpha = 0$ (i.e., the line $y = 0$) properly separates them, supporting both at the boundary while keeping the relative interiors in opposite open half-spaces.

Additional variants

The Hahn-Banach separation theorem generalizes the hyperplane separation theorem to topological vector spaces, allowing the separation of disjoint convex sets using continuous linear functionals rather than relying solely on the Euclidean structure. Specifically, in a real normed vector space $V$, for nonempty disjoint convex subsets $A$ and $B$ with $A$ having nonempty interior, there exist a continuous linear functional $F : V \to \mathbb{R}$ and a real number $c$ such that $F(a) \ge c \ge F(b)$ for all $a \in A$ and $b \in B$, placing the sets in opposite closed half-spaces defined by the hyperplane $\{x \in V \mid F(x) = c\}$. This extension is achieved via the Hahn-Banach theorem, which ensures the existence of such functionals by extending bounded linear ones from subspaces, and it applies broadly to locally convex topological vector spaces, where separation requires continuity to handle the topology.

Farkas' lemma provides an algebraic variant of the separation theorem tailored to polyhedral sets defined by linear inequalities, establishing an equivalence between the feasibility of a system of inequalities and the non-existence of a separating hyperplane. In its standard form, for a matrix $A \in \mathbb{R}^{m \times n}$ and vector $b \in \mathbb{R}^m$, exactly one of the following holds: either there exists $x \ge 0$ such that $Ax = b$, or there exists $y$ such that $A^T y \le 0$ and $b^T y > 0$. The second alternative corresponds to a hyperplane that strictly separates the point $b$ from the polyhedral cone generated by the columns of $A$: the hyperplane $\{z \mid y^T z = 0\}$ places $b$ on the positive side ($y^T b > 0$) and the cone on the non-positive side ($y^T (Ax) \le 0$ for $x \ge 0$). This formulation is particularly useful for finitely generated cones, which are polyhedral, and it predates many geometric interpretations of separation by emphasizing feasibility over direct geometric constructs.

In finite dimensions, the hyperplane separation theorem yields a characterization of membership in a convex hull. A point $x \in \mathbb{R}^n$ belongs to $\operatorname{conv}(S)$ for a set $S$ if and only if there is no hyperplane strictly separating $x$ from $\operatorname{conv}(S)$; that is, if there exist a normal vector $a \neq 0$ and offset $c$ with $a^T x > c \ge a^T s$ for all $s \in S$, then $x \notin \operatorname{conv}(S)$. This equivalence underpins algorithmic tests for convex hull membership that search for a separating hyperplane.

For unbounded convex sets, a variant of the hyperplane separation theorem incorporates recession directions to ensure proper separation, addressing cases where sets extend infinitely without compact support. In $\mathbb{R}^n$, two nonempty convex sets $C_1$ and $C_2$ with disjoint relative interiors can be properly separated by a hyperplane if their recession cones do not overlap in a way that prevents asymptotic separation; specifically, there exist a nonzero $a \in \mathbb{R}^n$ and scalar $\alpha$ such that $\sup_{x \in C_1} a^T x \le \alpha \le \inf_{x \in C_2} a^T x$, with the recession directions (vectors $d$ such that $x + \lambda d \in C_i$ for all $\lambda \ge 0$ and $x \in C_i$) satisfying $a^T d = 0$ to avoid directional overlap. This proper separation extends the basic version by using the recession cone $0^+ C_i = \{d \mid x + \lambda d \in C_i \ \forall \lambda \ge 0, x \in C_i\}$ to verify that the sets do not "recede" together, ensuring that the hyperplane bounds them without intersection at infinity; for polyhedral sets, this reduces to checking lineality spaces and recession properties.

Historically, Farkas' lemma, first correctly proven in its homogeneous linear form in 1902 by Gyula Farkas in the paper "Über die Theorie der einfachen Ungleichungen" published in the Journal für die reine und angewandte Mathematik, predates many geometric formulations of the hyperplane separation theorem and laid foundational algebraic groundwork for such results in optimization and convexity.
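The two alternatives of Farkas' lemma can be checked numerically on a small instance. Here the matrix, the right-hand side, and the certificate $y$ are our illustrative choices:

```python
import numpy as np

# Columns of A generate the nonnegative quadrant in R^2.
A = np.array([[1.0, 0.0],
              [0.0, 1.0]])
b = np.array([-1.0, 0.5])   # negative first component: b lies outside the cone,
                            # so Ax = b with x >= 0 is infeasible

# Certificate for the second alternative: y with A^T y <= 0 and b^T y > 0.
y = np.array([-1.0, 0.0])

print((A.T @ y <= 0).all())  # True: the cone lies in the half-space y^T z <= 0
print(b @ y > 0)             # True: b lies strictly on the other side
```

The vector $y$ is exactly the normal of a hyperplane through the origin separating $b$ from the cone, matching the geometric reading above.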

Theoretical Properties

Converse of the theorem

The existence of a weakly separating hyperplane between two convex sets does not necessarily imply that the sets are disjoint, contrary to initial intuition, as the sets may intersect precisely on the separating hyperplane itself. For instance, consider the two closed half-spaces $C = \{x \in \mathbb{R}^n \mid a^\top x \le b\}$ and $D = \{x \in \mathbb{R}^n \mid a^\top x \ge b\}$ for some $a \neq 0$ and scalar $b$; these sets are weakly separated by the hyperplane $\{x \mid a^\top x = b\}$, yet their intersection is exactly that hyperplane, which is nonempty.

In contrast, the converse does hold for strict separation: if two nonempty convex sets $C$ and $D$ in $\mathbb{R}^n$ admit a strictly separating hyperplane, meaning there exist $a \neq 0$ and $b$ such that $a^\top x < b$ for all $x \in C$ and $a^\top y > b$ for all $y \in D$, then $C \cap D = \emptyset$. To see this, suppose for contradiction that there exists $y \in C \cap D$; then $a^\top y < b$ and $a^\top y > b$, which is impossible.

For weak separation, a refined converse applies involving relative interiors: if two nonempty convex sets $C$ and $D$ in $\mathbb{R}^n$ are properly separated by a hyperplane, then the relative interiors $\operatorname{ri} C$ and $\operatorname{ri} D$ are disjoint. The proof proceeds by contradiction: assume there exists $y \in \operatorname{ri} C \cap \operatorname{ri} D$. Since $y$ lies in the relative interior of $C$, there is a ball around $y$ (relative to the affine hull of $C$) contained in $C$, and similarly for $D$. But the separation requires $a^\top x \le b$ for all $x \in C$ and $a^\top x \ge b$ for all $x \in D$, so $a^\top y = b$. The local balls around $y$ would then contain points with $a^\top x < b$ in $D$ (violating the inequality for $D$) and points with $a^\top x > b$ in $C$ (violating the inequality for $C$), a contradiction.

However, there is no converse establishing that weak separation implies full disjointness of the sets, as demonstrated by the half-space example above, where the intersection occurs solely on the separating hyperplane.

Uniqueness and counterexamples

The separating hyperplane in the hyperplane separation theorem is not necessarily unique; in general, there may be infinitely many such hyperplanes for two disjoint convex sets. For example, in $\mathbb{R}^2$, consider the two closed half-planes $C = \{(x, y) \mid y \ge 1\}$ and $D = \{(x, y) \mid y \le -1\}$. Any horizontal line with equation $y = c$ where $-1 < c < 1$ separates $C$ and $D$, yielding infinitely many separating hyperplanes.

Uniqueness of the separating hyperplane holds under additional conditions on the sets. If one set is a singleton $\{p\}$ and the other convex set $C$ has nonempty interior and is strictly convex (meaning its boundary contains no line segments), then the closest point in $C$ to $p$ is unique, and the separating hyperplane, perpendicular to the line connecting $p$ to this projection, is also unique. This follows from the uniqueness of projections onto strictly convex sets under a strictly convex norm, such as the Euclidean norm.

The theorem requires both sets to be convex; without convexity, even disjoint sets may admit no separating hyperplane. A representative counterexample in $\mathbb{R}^2$ involves the non-convex sets $A = \{(0,0), (2,2)\}$ and $B = \{(0,2), (2,0)\}$, which are disjoint but whose convex hulls are line segments that cross at $(1,1)$. Any hyperplane attempting to place both points of $A$ on one side and both points of $B$ on the other must fail, as the points interleave in all directions.

Even for convex sets, strict separation can fail without compactness assumptions: the theorem does not give strict separation of unbounded closed convex sets. For instance, let $C = \{(x, y) \in \mathbb{R}^2 \mid y \le 0\}$ (the closed lower half-plane) and $D = \{(x, y) \in \mathbb{R}^2 \mid x \ge 0,\ y \ge e^{-x}\}$ (a closed convex set approaching the x-axis asymptotically as $x \to \infty$). These sets are disjoint, but the infimum of the distance between them is zero because of the asymptotic behavior, so no strictly separating hyperplane exists: any separating hyperplane would have to come arbitrarily close to both sets at infinity, violating the strict inequality. Weak separation (by the x-axis itself) is possible, but strict separation fails without one set being compact.
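The vanishing gap in this counterexample can be seen numerically. A minimal sketch, using the asymptotic set above (the sample abscissas are our choice): the distance from the boundary point $(x, e^{-x})$ of $D$ to the lower half-plane $C = \{y \le 0\}$ is just its height $e^{-x}$.

```python
import numpy as np

# Height of the boundary of D above the half-plane C at several abscissas.
xs = np.array([0.0, 1.0, 5.0, 10.0, 20.0])
gaps = np.exp(-xs)

print(gaps)  # strictly decreasing toward 0: no positive separation margin exists
```

Since the gap can be made smaller than any fixed margin, no hyperplane can maintain strict inequalities against both sets.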

Applications

In optimization and duality

The hyperplane separation theorem plays a central role in convex optimization by providing certificates for infeasibility and optimality, as well as underpinning duality results that relate primal and dual problems. In this context, the theorem certifies when a point lies outside a convex feasible set by constructing a separating hyperplane, which serves as a dual witness to the primal problem's properties. This separation mechanism is essential for proving weak and strong duality in convex programs, where the dual variables correspond to the normal vector of the separating hyperplane. In Lagrange duality, the separating hyperplane acts as a dual certificate for primal infeasibility or unboundedness. For a convex optimization problem of the form minf0(x)\min f_0(x) subject to fi(x)0f_i(x) \leq 0 for i=1,,mi=1,\dots,m and Ax=bAx = b, the Lagrange dual function g(λ,ν)=infxL(x,λ,ν)g(\lambda, \nu) = \inf_x \mathcal{L}(x, \lambda, \nu) yields lower bounds on the primal optimal value. If the primal is infeasible, there exist dual multipliers λ0\lambda \geq 0 and ν\nu such that g(λ,ν)=+g(\lambda, \nu) = +\infty, certifying infeasibility via duality theory and the separating hyperplane theorem applied to the relevant convex sets. The theorem is equivalent to Farkas' lemma, a key alternative theorem in linear inequalities that detects infeasibility in linear systems. Farkas' lemma states that for a matrix ARm×nA \in \mathbb{R}^{m \times n} and vector bRmb \in \mathbb{R}^m, the system AxbAx \leq b has a solution if and only if there is no y0y \geq 0 such that ATy=0A^T y = 0 and bTy<0b^T y < 0. Geometrically, this corresponds to separating the convex cone generated by the columns of AA from the point b-b using a hyperplane defined by yy, ensuring no intersection if the alternative holds. In linear programming, the separation theorem detects infeasible regions by constructing hyperplanes that bound polyhedral feasible sets. 
For an LP \min c^T x subject to Ax \geq b, if the feasible set is empty, then by Farkas' lemma there exists y \geq 0 with A^T y = 0 and b^T y > 0, proving that no solution exists. This is used in phase I methods to certify infeasibility or to find an initial feasible point.

Strong duality in convex programs, where the primal and dual optimal values coincide, is established using the separation theorem when Slater's condition holds. Slater's condition requires a strictly feasible point in the relative interior of the domain. The standard proof separates the convex set of achievable constraint and objective values, \mathcal{A} = \{(u, v, t) \mid \exists x:\ f_i(x) \leq u_i,\ Ax - b = v,\ f_0(x) \leq t\}, from the disjoint convex set of strictly better values \mathcal{B} = \{(0, 0, s) \mid s < p^*\}; Slater's condition guarantees that the separating hyperplane is non-vertical, so its normal yields dual multipliers achieving d^* = p^*, i.e., zero duality gap.

A representative example is the linear program \min c^T x subject to Ax \geq b. Infeasibility is certified by the existence of y \geq 0 with A^T y = 0 and b^T y > 0, via Farkas' lemma. Unboundedness below, assuming feasibility, occurs exactly when there is a direction d with Ad \geq 0 and c^T d < 0; the absence of such a d means c^T d \geq 0 for every d in the recession cone \{d \mid Ad \geq 0\}, which by Farkas' lemma is certified by some y \geq 0 with A^T y = c.
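A Farkas-type certificate can be checked mechanically. The following sketch uses illustrative data: the system encodes x \leq 0 and x \geq 1, which is plainly infeasible, and a multiplier vector y exhibits the separation:

```python
import numpy as np

# Sketch of a Farkas-type infeasibility certificate (illustrative data).
# The system Ax <= b encodes x <= 0 and -x <= -1 (i.e. x >= 1),
# which is clearly infeasible.
A = np.array([[1.0], [-1.0]])
b = np.array([0.0, -1.0])

# Certificate: y >= 0 with A^T y = 0 and b^T y < 0 proves infeasibility,
# since y^T(Ax) = 0 for every x while y^T b = -1 < 0 contradicts Ax <= b.
y = np.array([1.0, 1.0])
assert np.all(y >= 0)
assert np.allclose(A.T @ y, 0)  # rows combine to the zero functional...
assert b @ y < 0                # ...yet the right-hand sides combine to -1
# Geometrically, y is the normal of the separating hyperplane behind
# Farkas' lemma, so no x can satisfy both inequalities.
```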

In machine learning

The hyperplane separation theorem underpins the formulation of support vector machines (SVMs), a class of supervised learning algorithms designed for binary classification by identifying a hyperplane that maximizes the margin between two classes. In SVMs, the classes are represented by their convex hulls, and the theorem ensures that if these hulls are disjoint, a separating hyperplane exists; the SVM then selects the hyperplane that achieves the largest possible margin to the nearest points from each class, which improves generalization. This maximal-margin approach was originally proposed by Vapnik and Lerner in their 1963 work on pattern recognition, which laid the groundwork for modern SVMs through the concept of optimal separating hyperplanes. For the hard-margin SVM, applicable when the training data are linearly separable (meaning the convex hulls of the two classes do not intersect), the theorem directly guarantees the existence of such a hyperplane, and the SVM solves a quadratic optimization problem to find the one with the widest margin, determined by the support vectors closest to the boundary. This formulation assumes perfect separability, relying on the strict separation variant of the theorem, in which the hyperplane maintains a positive distance from both sets. The approach was formalized and extended in the seminal work by Boser, Guyon, and Vapnik in 1992, which introduced efficient training methods for the optimal hyperplane. When data are not linearly separable, the soft-margin SVM extends the hard-margin case by incorporating slack variables that permit controlled violations of the margin, effectively corresponding to the weak separation variant of the theorem, in which the hyperplane may intersect the convex hulls while the overlap is minimized. This regularization balances margin maximization against classification errors, controlled by a penalty parameter, and was introduced by Cortes and Vapnik in 1995 to handle real-world noisy datasets.
The kernel trick further leverages the theorem by mapping data into a higher-dimensional feature space where linear separability may hold even when it fails in the original space; this implicit transformation relies on the Hahn–Banach separation theorem, the generalization to infinite-dimensional spaces, to ensure that a separating hyperplane exists in the enlarged feature space without computing explicit coordinates. For instance, consider two classes of 2D points, such as red points clustered around (0,0) and blue points around (2,2); the SVM identifies the optimal hyperplane, here a line midway between the convex hulls of the two clusters, maximizing the distance to the nearest points on either side. This technique, building on Vapnik's foundational geometric ideas developed from the 1960s to the 1990s, has made SVMs robust for nonlinear classification tasks.
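The (0,0)-versus-(2,2) example can be sketched numerically. The code below is a minimal illustration, not the SVM's actual quadratic program: it trains a linear classifier by batch subgradient descent on the regularized hinge loss, with toy data and hyperparameters chosen for the example:

```python
import numpy as np

# Minimal sketch (not a production SVM): a linear classifier trained by
# batch subgradient descent on the regularized hinge loss, for toy
# clusters near (0,0) and (2,2). Data and hyperparameters are
# illustrative assumptions.
X = np.array([[0.0, 0.0], [0.5, 0.2], [0.2, 0.5], [0.4, 0.4],   # class -1
              [2.0, 2.0], [1.6, 2.1], [2.1, 1.6], [1.8, 1.9]])  # class +1
y = np.array([-1, -1, -1, -1, 1, 1, 1, 1])

w, b, lr, lam = np.zeros(2), 0.0, 0.1, 0.01
for _ in range(2000):
    margins = y * (X @ w + b)
    viol = margins < 1  # points inside the margin contribute hinge loss
    if viol.any():
        grad_w = lam * w - (y[viol][:, None] * X[viol]).mean(axis=0)
        grad_b = -y[viol].mean()
    else:
        grad_w, grad_b = lam * w, 0.0
    w -= lr * grad_w
    b -= lr * grad_b

# The learned hyperplane w.x + b = 0 separates the two clusters.
assert np.all(np.sign(X @ w + b) == y)
```

The separating line ends up roughly midway between the clusters, perpendicular to the direction joining them, as the convex-hull picture in the text suggests.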

In collision detection

The hyperplane separation theorem finds application in collision detection through the separating axis theorem (SAT), a method for detecting collisions between convex shapes by checking for non-overlapping projections along specific axes. For two convex polygons in 2D, SAT states that the polygons do not intersect if and only if there exists at least one separating axis, found among the normals to the edges of either polygon, onto which the projections of the two shapes do not overlap. This follows directly from the hyperplane separation theorem: projection onto a line reduces the problem to a 1D separation of the convex hulls, and non-overlapping intervals imply the existence of a separating hyperplane perpendicular to that axis. The SAT algorithm proceeds by identifying candidate separating axes, which for convex polygons are the edge normals of both shapes (up to 2n axes for two n-edged polygons), and projecting all vertices of each shape onto each axis to compute min–max intervals. If the intervals on some axis are disjoint, no collision occurs; if the projections overlap on every axis, the shapes intersect. The procedure runs in O(n^2) time in 2D for polygons with n edges in total, making it suitable for real-time applications. A simple example is SAT for axis-aligned rectangles: since alignment makes the coordinate axes the only candidates, collision is detected exactly when the projections overlap along both the x-axis and the y-axis. For instance, given rectangle A with x-interval [x1, x2] and y-interval [y1, y2], and rectangle B with [x3, x4] and [y3, y4], they collide if max(x1, x3) ≤ min(x2, x4) and max(y1, y3) ≤ min(y2, y4). In 3D, SAT extends to convex polyhedra by considering separating planes whose normals come from the face normals of both polyhedra and from cross products of their edge directions, but this can become computationally intensive, with up to O(n^2) candidate axes.
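The 2D SAT procedure described above can be sketched in a few lines; this is an illustrative implementation assuming polygons given as arrays of vertices in order:

```python
import numpy as np

# Sketch of the 2D separating axis test for convex polygons, given as
# counter-clockwise vertex arrays (illustrative implementation).
def project(poly, axis):
    dots = poly @ axis
    return dots.min(), dots.max()   # min-max interval of the projection

def edge_normals(poly):
    edges = np.roll(poly, -1, axis=0) - poly
    return np.stack([-edges[:, 1], edges[:, 0]], axis=1)  # perpendiculars

def sat_collide(p, q):
    # The polygons intersect iff the projections overlap on every
    # candidate axis (the edge normals of both polygons).
    for axis in np.vstack([edge_normals(p), edge_normals(q)]):
        pmin, pmax = project(p, axis)
        qmin, qmax = project(q, axis)
        if pmax < qmin or qmax < pmin:
            return False  # disjoint intervals: separating axis found
    return True

a = np.array([[0, 0], [2, 0], [2, 2], [0, 2]], float)  # 2x2 square at origin
b = np.array([[1, 1], [3, 1], [3, 3], [1, 3]], float)  # overlaps a
c = np.array([[5, 5], [6, 5], [6, 6], [5, 6]], float)  # far from a
assert sat_collide(a, b) and not sat_collide(a, c)
```

Note that touching shapes (projections sharing an endpoint) count as colliding here; whether contact counts as collision is a design choice.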
To address this, collision detection often employs the Minkowski difference of the two shapes, transforming the problem into checking whether the origin lies inside the difference set; if a hyperplane separates the origin from the Minkowski difference, no collision exists. This approach underpins algorithms such as the Gilbert–Johnson–Keerthi (GJK) method, which iteratively finds the minimum distance using support functions without enumerating all candidate axes. SAT and its variants are widely adopted in physics engines for efficient broad- and narrow-phase collision detection, for example in 2D simulations of rigid bodies.
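The Minkowski-difference reformulation can be illustrated with support functions. The sketch below (illustrative data and helper name) uses the identity h_{A-B}(d) = \max_{a \in A} \langle d, a \rangle - \min_{b \in B} \langle d, b \rangle: a direction d with h_{A-B}(d) < 0 is the normal of a hyperplane separating the origin from A - B, hence a witness that the shapes do not collide:

```python
import numpy as np

# Sketch: for convex polytopes given by vertex sets, the support function
# of the Minkowski difference A - B in direction d is
#   h_{A-B}(d) = max_{a in A} <d, a> - min_{b in B} <d, b>.
# The origin lies outside A - B (no collision) iff h_{A-B}(d) < 0 for
# some direction d; such a d is a separating hyperplane normal.
def support_diff(A, B, d):
    return (A @ d).max() - (B @ d).min()

A = np.array([[0, 0], [1, 0], [0, 1]], float)  # triangle near the origin
B = np.array([[3, 3], [4, 3], [3, 4]], float)  # triangle far away

d = np.array([1.0, 1.0])          # candidate direction pointing from A to B
assert support_diff(A, B, d) < 0  # negative support value: separated
# If the shapes overlapped, h_{A-B}(d) would be >= 0 for every d;
# GJK searches for such directions iteratively via support queries.
```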

References
