Homography (computer vision)
from Wikipedia
Geometrical setup for homography: stereo cameras O1 and O2 both pointed at X in epipolar geometry. Drawing from Neue Konstruktionen der Perspektive und Photogrammetrie by Hermann Guido Hauck (1845 — 1905)

In the field of computer vision, any two images of the same planar surface in space are related by a homography (assuming a pinhole camera model). This has many practical applications, such as image rectification, image registration, or the computation of camera motion (rotation and translation) between two images. Once camera resectioning has been done from an estimated homography matrix, this information may be used for navigation, or to insert models of 3D objects into an image or video, so that they are rendered with the correct perspective and appear to have been part of the original scene (see Augmented reality).

3D plane to plane equation

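We have two cameras a and b, looking at points $P_i$ in a plane. Passing from the projection ${}^b p_i = ({}^b u_i;\ {}^b v_i;\ 1)$ of $P_i$ in b to the projection ${}^a p_i = ({}^a u_i;\ {}^a v_i;\ 1)$ of $P_i$ in a:

$${}^a p_i = \frac{{}^b z_i}{{}^a z_i} \, K_a \cdot H_{ab} \cdot K_b^{-1} \cdot {}^b p_i$$

where ${}^a z_i$ and ${}^b z_i$ are the z coordinates of P in each camera frame and where the homography matrix $H_{ab}$ is given by

$$H_{ab} = R - \frac{t n^T}{d}.$$

$R$ is the rotation matrix by which b is rotated in relation to a; $t$ is the translation vector from a to b; $n$ and $d$ are the normal vector of the plane and the distance from origin to the plane respectively. $K_a$ and $K_b$ are the cameras' intrinsic parameter matrices.

The figure shows camera b looking at the plane at distance $d$. Note: From above figure, assuming $n^T P_i + d = 0$ as plane model, $n^T P_i$ is the projection of vector $P_i$ along $n$, and equal to $-d$. So $t = t \cdot 1 = t \left( -\frac{n^T P_i}{d} \right)$. And we have $H_{ab} P_i = R P_i + t$ where $H_{ab} = R - \frac{t n^T}{d}$.

This formula is only valid if camera b has no rotation and no translation. In the general case where $R_a, R_b$ and $t_a, t_b$ are the respective rotations and translations of camera a and b, $R = R_a R_b^T$ and the homography matrix $H_{ab}$ becomes

$$H_{ab} = R_a R_b^T - \frac{(-R_a R_b^T t_b + t_a) \, n^T}{d}$$

where $d$ is the distance of the camera b to the plane.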
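
As a rough numerical check of the plane-to-plane relation above, the following sketch builds the matrix $R - t n^T / d$ and transfers one pixel from camera b to camera a. It assumes NumPy, and every numeric value (rotation angle, translation, intrinsics, the plane z = 5) is a hypothetical choice made only for illustration; the function name is likewise not taken from any library.

```python
import numpy as np

def plane_induced_homography(R, t, n, d):
    """Return R - t n^T / d for the plane n^T P + d = 0 (P in frame b)."""
    t = np.asarray(t, dtype=float).reshape(3, 1)
    n = np.asarray(n, dtype=float).reshape(1, 3)
    return np.asarray(R, dtype=float) - (t @ n) / d

# Hypothetical setup: the plane z = 5 in frame b, i.e. n = (0, 0, -1), d = 5.
theta = np.deg2rad(10.0)
R = np.array([[np.cos(theta), 0.0, np.sin(theta)],
              [0.0, 1.0, 0.0],
              [-np.sin(theta), 0.0, np.cos(theta)]])
t = np.array([0.3, -0.1, 0.2])
n = np.array([0.0, 0.0, -1.0])
d = 5.0
K_a = K_b = np.array([[800.0, 0.0, 320.0],
                      [0.0, 800.0, 240.0],
                      [0.0, 0.0, 1.0]])

H_ab = plane_induced_homography(R, t, n, d)

P_b = np.array([1.0, -2.0, 5.0])   # on the plane: n @ P_b + d == 0
P_a = R @ P_b + t                  # the same point expressed in frame a

p_b = K_b @ P_b / P_b[2]           # pixel (u, v, 1) seen by camera b
p_a = K_a @ P_a / P_a[2]           # pixel (u, v, 1) seen by camera a

# Transfer from b to a, including the depth ratio z_b / z_a.
p_a_from_b = (P_b[2] / P_a[2]) * K_a @ H_ab @ np.linalg.inv(K_b) @ p_b
```

Under these assumptions `p_a_from_b` should agree with `p_a` up to floating-point error. In practice the depth ratio is usually unknown, and the result is instead normalized by dividing by its third component.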

Affine homography

When the image region in which the homography is computed is small or the image has been acquired with a large focal length, an affine homography is a more appropriate model of image displacements. An affine homography is a special type of a general homography whose last row is fixed to $h_{31} = h_{32} = 0,\ h_{33} = 1$.
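
To illustrate the affine case, the sketch below applies a matrix whose last row is (0, 0, 1) to two parallel segments and compares their image directions. The matrix entries and helper names are hypothetical, chosen only for this example.

```python
def apply_h(H, point):
    """Map a 2D point through a 3x3 matrix given as nested lists."""
    x, y = point
    u, v, w = (row[0] * x + row[1] * y + row[2] for row in H)
    return (u / w, v / w)

# Hypothetical affine homography: the last row is (0, 0, 1), so w is always 1.
A = [[1.2, 0.1, 5.0],
     [-0.2, 0.9, -3.0],
     [0.0, 0.0, 1.0]]

def direction(H, p, q):
    """Direction vector of the image of the segment from p to q."""
    (x0, y0), (x1, y1) = apply_h(H, p), apply_h(H, q)
    return (x1 - x0, y1 - y0)

d1 = direction(A, (0.0, 0.0), (1.0, 0.0))   # segment on the line y = 0
d2 = direction(A, (0.0, 4.0), (1.0, 4.0))   # parallel segment on y = 4

# A zero 2D cross product means the two image directions are parallel.
cross = d1[0] * d2[1] - d1[1] * d2[0]
```

Because the third homogeneous coordinate stays equal to 1, no perspective division takes place and `cross` comes out as zero up to rounding. This is why the affine model cannot represent converging parallel lines.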

from Grokipedia
In computer vision, a homography is a projective transformation that relates two images of the same planar surface in space, assuming a pinhole camera model, by mapping points from one view to the other while preserving straight lines and incidence relations. It is mathematically represented as a 3×3 matrix $H$ acting on homogeneous coordinates, such that for a point $\mathbf{x}$ in the first image, the corresponding point $\mathbf{x}'$ in the second image satisfies $\mathbf{x}' \sim H \mathbf{x}$, where $\sim$ denotes equality up to scale; this formulation has 8 degrees of freedom due to the scale invariance of projective transformations. Homographies emerge in scenarios involving camera rotation without translation or when the observed scene lies on a single plane, enabling the modeling of perspective distortions between views.

Key properties of homographies include their ability to map lines to lines and preserve incidence relations, but they do not conserve angles, lengths, or parallelism unless the transformation is affine or Euclidean. In practice, homographies are estimated from point correspondences between images using algorithms like the direct linear transformation (DLT), which solves a linear system built from at least four point pairs in general position (no three collinear), often combined with robust methods such as RANSAC to handle outliers.

These transformations underpin numerous applications, including image stitching for panoramic mosaics, augmented reality overlay, perspective rectification in document scanning, and video stabilization. Advancements in homography estimation have incorporated deep learning techniques, such as convolutional neural networks, to improve accuracy in challenging conditions like low-texture scenes or large viewpoint changes, often outperforming traditional feature-based methods like SIFT while demonstrating robustness to domain shifts. Overall, the homography serves as a foundational tool in multi-view geometry, bridging 2D image analysis with 3D scene understanding.
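
A minimal sketch of DLT-style estimation is given below, assuming NumPy. It is the basic, unnormalized form: it omits the coordinate normalization and the RANSAC outlier handling that a practical pipeline would typically add. The function name and the sample coordinates are hypothetical.

```python
import numpy as np

def estimate_homography_dlt(src, dst):
    """Estimate H with dst ~ H @ src from N >= 4 point pairs (basic DLT)."""
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    if src.shape != dst.shape or src.ndim != 2 or src.shape[1] != 2 or len(src) < 4:
        raise ValueError("need two (N, 2) arrays with N >= 4")
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        # Each correspondence contributes two linear equations in the 9 entries of H.
        rows.append([-x, -y, -1.0, 0.0, 0.0, 0.0, u * x, u * y, u])
        rows.append([0.0, 0.0, 0.0, -x, -y, -1.0, v * x, v * y, v])
    A = np.array(rows)
    # The solution is the right singular vector for the smallest singular value.
    _, _, vt = np.linalg.svd(A)
    H = vt[-1].reshape(3, 3)
    return H / H[2, 2]  # fix the free scale; assumes H[2, 2] != 0

# Hypothetical correspondences: unit square corners to an arbitrary quadrilateral.
src = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0]])
dst = np.array([[10.0, 20.0], [110.0, 30.0], [120.0, 140.0], [5.0, 120.0]])

H = estimate_homography_dlt(src, dst)

# Re-project the source points to check the fit.
homog = np.hstack([src, np.ones((len(src), 1))]) @ H.T
mapped = homog[:, :2] / homog[:, 2:3]
```

With exactly four pairs the system has a one-dimensional null space, so `mapped` should reproduce `dst` up to rounding. With more pairs the same code returns a least-squares solution in the algebraic sense.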

Fundamentals

Definition

In computer vision, a homography is a projective transformation that maps points between two images of the same planar surface in space, assuming a pinhole camera model. This mapping preserves collinearity and incidence relations (straight lines remain straight, and points lying on lines continue to do so), but it does not preserve distances, angles, or parallelism, often resulting in perspective distortions where parallel lines converge at a vanishing point. Unlike the general two-view transformation encoded by the fundamental matrix, which has 7 degrees of freedom for arbitrary camera motions and non-planar scene structures, a homography applies specifically to scenarios where the points of interest are coplanar or the camera undergoes pure rotation around its optical center, simplifying the relation to a 2D projective mapping with 8 degrees of freedom. These representations often rely on homogeneous coordinates to handle points at infinity and projective equivalences.

The term "homography" originates from projective geometry, where it denotes collineations (bijective mappings preserving projective structure) between planes. In computer vision, it was adapted and popularized in the 1990s for practical applications like image alignment and mosaicing, building on foundational work in multiple-view geometry. A representative example is the correction of perspective distortion, where a rectangle on a planar surface, viewed obliquely, appears as a general quadrilateral; applying the homography warps it back to rectangular form.
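
The remark that parallel lines converge at a vanishing point can be checked numerically. The sketch below uses a hypothetical matrix `H` with a non-trivial last row, maps two parallel source lines, and intersects their images with homogeneous cross products. All helper names and values are illustrative assumptions.

```python
def mat_vec(H, v):
    """Multiply a 3x3 matrix (nested lists) by a 3-vector."""
    return [sum(h * c for h, c in zip(row, v)) for row in H]

def cross(a, b):
    """Cross product: the line through two points, or the point on two lines."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

# Hypothetical homography with perspective terms in the last row.
H = [[1.0, 0.0, 0.0],
     [0.0, 1.0, 0.0],
     [0.002, 0.001, 1.0]]

def image_line(y):
    """Image of the horizontal source line at height y, as a homogeneous line."""
    return cross(mat_vec(H, [0.0, y, 1.0]), mat_vec(H, [1.0, y, 1.0]))

# Intersect the images of two parallel source lines (y = 0 and y = 1).
meet = cross(image_line(0.0), image_line(1.0))
vanishing = (meet[0] / meet[2], meet[1] / meet[2])

# The same point is the image of the point at infinity in direction (1, 0).
vp = mat_vec(H, [1.0, 0.0, 0.0])
vp = (vp[0] / vp[2], vp[1] / vp[2])
```

Both computations should give the same finite image point, which shows how a direction that has no finite location in the source plane acquires one after a general homography.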

Role in Projective Geometry

In projective geometry, the projective plane $\mathbb{P}^2$ extends the Euclidean plane by incorporating points at infinity, ensuring that every pair of distinct lines intersects at exactly one point and every pair of distinct points determines a unique line. This structure arises from the quotient space of nonzero vectors in $\mathbb{R}^3$ under scalar multiplication, represented using homogeneous coordinates $(x : y : z)$, where points at infinity are captured when the last coordinate is zero, allowing parallel lines to meet on the line at infinity. A key feature of the projective plane is the duality between points and lines: the incidence relation is symmetric, meaning statements about points lying on lines have dual counterparts about lines passing through points, which underpins transformations that preserve geometric incidences.

A homography is a collineation in projective space, defined as a bijective mapping that preserves collinearity by sending lines to lines and points to points while maintaining the projective structure. In the context of $\mathbb{P}^2$, such transformations are induced by invertible linear maps on the underlying vector space, ensuring that the image of a line (as the span of two points) remains a line. Homographies also preserve the cross-ratio of four collinear points, defined as $(A,B;C,D) = \frac{(C-A)/(D-A)}{(C-B)/(D-B)}$ in one dimension and extended to the plane, which is the fundamental projective invariant unaffected by perspective distortions. This preservation enables consistent geometric reasoning across views in computer vision tasks involving projective transformations.

In multi-view geometry, the homography relates to the fundamental matrix, which encodes the epipolar geometry between two images; specifically, a homography emerges in special cases such as pure camera rotation (degenerate epipolar geometry, where $F = 0$) or a planar scene where all points lie on a single plane. In the planar case, the fundamental matrix $F$ relates to the homography $H$ via $F \sim [\mathbf{e}']_\times H$, where $\mathbf{e}'$ is the epipole in the second image. Under these conditions, the epipolar constraint $\mathbf{x}'^T F \mathbf{x} = 0$ is satisfied, but the direct point mapping $\mathbf{x}' \sim H \mathbf{x}$ simplifies correspondence, with $H$ having 8 degrees of freedom compared to 7 for the general $F$.

Homographies also play a crucial role in normalizing images to a canonical view, transforming perspective-distorted observations into a standardized fronto-parallel representation that enhances feature matching by aligning corresponding points and reducing viewpoint variance. This rectification process, often applied to planar facades or documents, warps one image via the inverse homography to overlay with a reference view, thereby improving descriptor invariance and matching accuracy in algorithms like SIFT.
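
The cross-ratio invariance described in this section can be illustrated with a short sketch. It maps four collinear points through a hypothetical homography and compares the cross-ratio before and after, measuring each point by its signed position along the line. The matrix values and helper names are assumptions made for this example.

```python
def apply_h(H, point):
    """Map a 2D point through a 3x3 homography given as nested lists."""
    x, y = point
    u, v, w = (row[0] * x + row[1] * y + row[2] for row in H)
    return (u / w, v / w)

def cross_ratio(a, b, c, d):
    """Cross-ratio (A,B;C,D) of four scalar positions on a line."""
    return ((c - a) / (d - a)) / ((c - b) / (d - b))

def line_positions(points):
    """Signed (unnormalized) positions of collinear points along their line."""
    (x0, y0), (x1, y1) = points[0], points[-1]
    dx, dy = x1 - x0, y1 - y0
    return [(x - x0) * dx + (y - y0) * dy for x, y in points]

# Hypothetical homography with non-zero perspective terms.
H = [[1.1, 0.2, 3.0],
     [0.1, 0.9, -2.0],
     [0.001, 0.002, 1.0]]

source = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0), (4.0, 4.0)]  # collinear on y = x
mapped = [apply_h(H, p) for p in source]

before = cross_ratio(*line_positions(source))
after = cross_ratio(*line_positions(mapped))
```

The positions are left unnormalized because the cross-ratio is unchanged by shifting or uniformly scaling the line parameter. The individual spacings between the mapped points change, yet `before` and `after` should agree up to rounding.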

Mathematical Formulation

Homography Matrix

In computer vision, a homography is algebraically represented by a 3×3 matrix $H$ that induces a projective transformation between two image planes. This matrix maps a point $\mathbf{x}$ in the source plane to a corresponding point $\mathbf{x}'$ in the target plane using homogeneous coordinates, where $\mathbf{x} = (x, y, 1)^T$.