Interpolation

from Wikipedia

In the mathematical field of numerical analysis, interpolation is a type of estimation, a method of constructing (finding) new data points based on the range of a discrete set of known data points.[1][2]

In engineering and science, one often has a number of data points, obtained by sampling or experimentation, which represent the values of a function for a limited number of values of the independent variable. It is often required to interpolate; that is, estimate the value of that function for an intermediate value of the independent variable.

A closely related problem is the approximation of a complicated function by a simple function. Suppose the formula for some given function is known, but too complicated to evaluate efficiently. A few data points from the original function can be interpolated to produce a simpler function which is still fairly close to the original. The resulting gain in simplicity may outweigh the loss from interpolation error and give better performance in calculation process.

An interpolation of a finite set of points on an epitrochoid. The points in red are connected by blue interpolated spline curves deduced only from the red points. The interpolated curves have polynomial formulas much simpler than that of the original epitrochoid curve.

Example


This table gives some values of an unknown function f(x).

Plot of the data points as given in the table

x    f(x)
0    0
1    0.8415
2    0.9093
3    0.1411
4    −0.7568
5    −0.9589
6    −0.2794

Interpolation provides a means of estimating the function at intermediate points, such as x = 2.5.

We describe some methods of interpolation, which differ in properties such as accuracy, cost, number of data points needed, and smoothness of the resulting interpolant.

Piecewise constant interpolation

Piecewise constant interpolation, or nearest-neighbor interpolation

The simplest interpolation method is to locate the nearest data point and assign its value. In simple problems, this method is unlikely to be used, as linear interpolation (see below) is almost as easy, but in higher-dimensional multivariate interpolation, it can be a favourable choice for its speed and simplicity.
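A minimal sketch of this idea, assuming the tabulated values above and using NumPy for the distance search:

```python
import numpy as np

# Values of the unknown function from the table above.
x_known = np.array([0, 1, 2, 3, 4, 5, 6], dtype=float)
y_known = np.array([0.0, 0.8415, 0.9093, 0.1411, -0.7568, -0.9589, -0.2794])

def nearest_neighbor(x, xs, ys):
    """Piecewise constant interpolation: return the value of the closest node."""
    return ys[np.argmin(np.abs(xs - x))]

# At x = 2.5 the nodes 2 and 3 are equidistant; argmin breaks the tie toward
# the lower index, so the result is f(2) = 0.9093.
print(nearest_neighbor(2.5, x_known, y_known))
```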

Linear interpolation

Plot of the data with linear interpolation superimposed

One of the simplest methods is linear interpolation (sometimes known as lerp). Consider the above example of estimating f(2.5). Since 2.5 is midway between 2 and 3, it is reasonable to take f(2.5) midway between f(2) = 0.9093 and f(3) = 0.1411, which yields 0.5252.

Generally, linear interpolation takes two data points, say (xa, ya) and (xb, yb), and the interpolant is given by:

y = y_a + (y_b - y_a) \frac{x - x_a}{x_b - x_a}

at the point (x, y).
This equation states that the slope of the new line between (xa, ya) and (x, y) is the same as the slope of the line between (xa, ya) and (xb, yb).

Linear interpolation is quick and easy, but it is not very precise. Another disadvantage is that the interpolant is not differentiable at the point xk.

The following error estimate shows that linear interpolation is not very precise. Denote the function which we want to interpolate by g, and suppose that x lies between xa and xb and that g is twice continuously differentiable. Then the linear interpolation error is

|f(x) - g(x)| \le C (x_b - x_a)^2 \quad \text{where} \quad C = \frac{1}{8} \max_{r \in [x_a, x_b]} |g''(r)|.
In words, the error is proportional to the square of the distance between the data points. The error in some other methods, including polynomial interpolation and spline interpolation (described below), is proportional to higher powers of the distance between the data points. These methods also produce smoother interpolants.
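A minimal sketch of the two-point formula, checked against the f(2.5) = 0.5252 value above; NumPy's np.interp performs the same piecewise linear interpolation over the whole table:

```python
import numpy as np

x_known = np.array([0, 1, 2, 3, 4, 5, 6], dtype=float)
y_known = np.array([0.0, 0.8415, 0.9093, 0.1411, -0.7568, -0.9589, -0.2794])

def lerp(x, xa, ya, xb, yb):
    """Linear interpolant through (xa, ya) and (xb, yb), evaluated at x."""
    return ya + (yb - ya) * (x - xa) / (xb - xa)

print(lerp(2.5, 2.0, 0.9093, 3.0, 0.1411))  # 0.5252, as in the example above
print(np.interp(2.5, x_known, y_known))     # same value computed from the full table
```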

Polynomial interpolation

Plot of the data with polynomial interpolation applied

Polynomial interpolation is a generalization of linear interpolation. Note that the linear interpolant is a linear function. We now replace this interpolant with a polynomial of higher degree.

Consider again the problem given above. The following sixth-degree polynomial goes through all seven points:

f(x) = -0.0001521 x^6 - 0.003130 x^5 + 0.07321 x^4 - 0.3577 x^3 + 0.2255 x^2 + 0.9038 x

Substituting x = 2.5, we find that f(2.5) ≈ 0.59678.
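Since seven points determine a unique polynomial of degree at most six, the interpolant can also be recovered numerically; a sketch using NumPy's polyfit/polyval (the least-squares fit is exact here because the degree matches the number of points minus one):

```python
import numpy as np

x_known = np.arange(7, dtype=float)
y_known = np.array([0.0, 0.8415, 0.9093, 0.1411, -0.7568, -0.9589, -0.2794])

# Degree 6 through 7 points: the fit passes exactly through every data point.
coeffs = np.polyfit(x_known, y_known, deg=6)
print(np.polyval(coeffs, 2.5))  # ~0.59678, matching the value quoted above
```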

Generally, if we have n data points, there is exactly one polynomial of degree at most n−1 going through all the data points. The interpolation error is proportional to the distance between the data points to the power n. Furthermore, the interpolant is a polynomial and thus infinitely differentiable. So, we see that polynomial interpolation overcomes most of the problems of linear interpolation.

However, polynomial interpolation also has some disadvantages. Calculating the interpolating polynomial is computationally expensive (see computational complexity) compared to linear interpolation. Furthermore, polynomial interpolation may exhibit oscillatory artifacts, especially at the end points (see Runge's phenomenon).

Polynomial interpolation can estimate local maxima and minima that are outside the range of the samples, unlike linear interpolation. For example, the interpolant above has a local maximum at x ≈ 1.566, f(x) ≈ 1.003 and a local minimum at x ≈ 4.708, f(x) ≈ −1.003. However, these maxima and minima may exceed the theoretical range of the function; for example, a function that is always positive may have an interpolant with negative values, and whose inverse therefore contains false vertical asymptotes.

More generally, the shape of the resulting curve, especially for very high or low values of the independent variable, may be contrary to common sense; that is, to what is known about the experimental system which has generated the data points. These disadvantages can be reduced by using spline interpolation or restricting attention to Chebyshev polynomials.

Spline interpolation

Plot of the data with spline interpolation applied

Linear interpolation uses a linear function on each of the intervals [xk, xk+1]. Spline interpolation uses low-degree polynomials in each of the intervals, and chooses the polynomial pieces such that they fit smoothly together. The resulting function is called a spline.

For instance, the natural cubic spline is piecewise cubic and twice continuously differentiable; furthermore, its second derivative is zero at the end points. For the natural cubic spline interpolating the points in the table above, we get f(2.5) = 0.5972.

Like polynomial interpolation, spline interpolation incurs a smaller error than linear interpolation, while the interpolant is smoother and easier to evaluate than the high-degree polynomials used in polynomial interpolation. However, the global nature of the basis functions leads to ill-conditioning. This is completely mitigated by using splines of compact support, such as are implemented in Boost.Math and discussed in Kress.[3]
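A sketch of the same computation with SciPy's CubicSpline, using natural boundary conditions (second derivative zero at the end points) so that it reproduces the f(2.5) = 0.5972 value above:

```python
import numpy as np
from scipy.interpolate import CubicSpline

x_known = np.arange(7, dtype=float)
y_known = np.array([0.0, 0.8415, 0.9093, 0.1411, -0.7568, -0.9589, -0.2794])

# 'natural' boundary conditions: s''(x0) = s''(x6) = 0.
spline = CubicSpline(x_known, y_known, bc_type='natural')
print(spline(2.5))  # ~0.5972, as quoted above
```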

Mimetic interpolation


Depending on the underlying discretisation of fields, different interpolants may be required. In contrast to other interpolation methods, which estimate functions on target points, mimetic interpolation evaluates the integral of fields on target lines, areas or volumes, depending on the type of field (scalar, vector, pseudo-vector or pseudo-scalar).

A key feature of mimetic interpolation is that vector calculus identities are satisfied, including Stokes' theorem and the divergence theorem. As a result, mimetic interpolation conserves line, area and volume integrals.[4] Conservation of line integrals might be desirable when interpolating the electric field, for instance, since the line integral gives the electric potential difference at the endpoints of the integration path.[5] Mimetic interpolation ensures that the error of estimating the line integral of an electric field is the same as the error obtained by interpolating the potential at the end points of the integration path, regardless of the length of the integration path.

Linear, bilinear and trilinear interpolation are also considered mimetic, even if it is the field values that are conserved (not the integral of the field). Apart from linear interpolation, area weighted interpolation can be considered one of the first mimetic interpolation methods to have been developed.[6]

Functional interpolation


The Theory of Functional Connections (TFC) is a mathematical framework specifically developed for functional interpolation. Given any interpolant that satisfies a set of constraints, TFC derives a functional that represents the entire family of interpolants satisfying those constraints, including those that are discontinuous or partially defined. These functionals identify the subspace of functions where the solution to a constrained optimization problem resides. Consequently, TFC transforms constrained optimization problems into equivalent unconstrained formulations. This transformation has proven highly effective in the solution of differential equations. TFC achieves this by constructing a constrained functional (a function of a free function), that inherently satisfies given constraints regardless of the expression of the free function. This simplifies solving various types of equations and significantly improves the efficiency and accuracy of methods like Physics-Informed Neural Networks (PINNs). TFC offers advantages over traditional methods like Lagrange multipliers and spectral methods by directly addressing constraints analytically and avoiding iterative procedures, although it cannot currently handle inequality constraints.

Function approximation


Interpolation is a common way to approximate functions. Given a function f : [a, b] \to \mathbb{R} and a set of points x_1, x_2, \dots, x_n \in [a, b], one can form a function s : [a, b] \to \mathbb{R} such that s(x_i) = f(x_i) for i = 1, \dots, n (that is, s interpolates f at these points). In general, an interpolant need not be a good approximation, but there are well-known and often reasonable conditions where it will be. For example, if f \in C^4([a, b]) (four times continuously differentiable), then cubic spline interpolation has an error bound given by

\|f - s\|_\infty \le C \|f^{(4)}\|_\infty h^4,

where h = \max_i |x_{i+1} - x_i| and C is a constant.[7]

Via Gaussian processes


A Gaussian process is a powerful non-linear interpolation tool. Many popular interpolation tools are in fact equivalent to particular Gaussian processes. Gaussian processes can be used not only for fitting an interpolant that passes exactly through the given data points but also for regression; that is, for fitting a curve through noisy data. In the geostatistics community, Gaussian process regression is also known as Kriging.
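A minimal sketch using scikit-learn's GaussianProcessRegressor with an RBF kernel; the kernel choice, length scale, and near-zero noise level (alpha) are illustrative assumptions, and with a tiny alpha the posterior mean essentially interpolates the table values:

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

x_known = np.arange(7, dtype=float).reshape(-1, 1)
y_known = np.array([0.0, 0.8415, 0.9093, 0.1411, -0.7568, -0.9589, -0.2794])

# Small alpha -> near-exact interpolation; larger alpha -> regression through noisy data.
gp = GaussianProcessRegressor(kernel=RBF(length_scale=1.0), alpha=1e-10)
gp.fit(x_known, y_known)

mean, std = gp.predict(np.array([[2.5]]), return_std=True)
print(mean[0], std[0])  # posterior mean and uncertainty at x = 2.5
```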

Inverse Distance Weighting


Inverse Distance Weighting (IDW) is a spatial interpolation method that estimates values based on nearby data points, with closer points having more influence.[8] It uses an inverse power law for weighting, where higher power values emphasize local effects, while lower values create a smoother surface. IDW is widely used in GIS, meteorology, and environmental modeling for its simplicity but may produce artifacts in clustered or uneven data.[9]

Other forms


Other forms of interpolation can be constructed by picking a different class of interpolants. For instance, rational interpolation is interpolation by rational functions using Padé approximant, and trigonometric interpolation is interpolation by trigonometric polynomials using Fourier series. Another possibility is to use wavelets.

The Whittaker–Shannon interpolation formula can be used if the number of data points is infinite or if the function to be interpolated has compact support.

Sometimes, we know not only the value of the function that we want to interpolate at some points, but also its derivative. This leads to Hermite interpolation problems.

When each data point is itself a function, it can be useful to see the interpolation problem as a partial advection problem between each data point. This idea leads to the displacement interpolation problem used in transportation theory.

In higher dimensions

Comparison of some 1- and 2-dimensional interpolations.
Black and red/yellow/green/blue dots correspond to the interpolated point and neighbouring samples, respectively.
Their heights above the ground correspond to their values.

Multivariate interpolation is the interpolation of functions of more than one variable. Methods include nearest-neighbor interpolation, bilinear interpolation and bicubic interpolation in two dimensions, and trilinear interpolation in three dimensions. They can be applied to gridded or scattered data. Mimetic interpolation generalizes to n-dimensional spaces where n > 3.[10][11]

In digital signal processing


In the domain of digital signal processing, the term interpolation refers to the process of converting a sampled digital signal (such as a sampled audio signal) to a higher sampling rate (upsampling) using various digital filtering techniques (for example, convolution with a frequency-limited impulse signal). In this application there is a specific requirement that the harmonic content of the original signal be preserved without creating aliased harmonic content above the original Nyquist limit of the signal (that is, above fs/2 of the original sample rate). An early and fairly elementary discussion of this subject can be found in Rabiner and Crochiere's book Multirate Digital Signal Processing.[12]
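A sketch of upsampling by an integer factor with SciPy's resample_poly, which zero-stuffs the signal and applies a low-pass FIR filter so that no aliased content appears above the original Nyquist limit; the sample rate and test tone are illustrative assumptions:

```python
import numpy as np
from scipy.signal import resample_poly

fs = 8_000                        # assumed original sample rate in Hz
t = np.arange(0, 0.01, 1 / fs)
x = np.sin(2 * np.pi * 440 * t)   # a 440 Hz test tone sampled at 8 kHz

# Upsample by 2 (to 16 kHz): zero insertion followed by anti-imaging low-pass filtering.
y = resample_poly(x, up=2, down=1)
print(len(x), len(y))             # the upsampled signal has twice as many samples
```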

Related concepts

The term extrapolation refers to estimating data points outside the range of known data points.

In curve fitting problems, the constraint that the interpolant has to go exactly through the data points is relaxed. It is only required to approach the data points as closely as possible (within some other constraints). This requires parameterizing the potential interpolants and having some way of measuring the error. In the simplest case this leads to least squares approximation.

Approximation theory studies how to find the best approximation to a given function by another function from some predetermined class, and how good this approximation is. This clearly yields a bound on how well the interpolant can approximate the unknown function.

Generalization


If we consider x as a variable in a topological space, and the function f(x) mapping to a Banach space, then the problem is treated as "interpolation of operators".[13] The classical results about interpolation of operators are the Riesz–Thorin theorem and the Marcinkiewicz theorem. There are also many other subsequent results.

from Grokipedia
Interpolation is a fundamental method in numerical analysis and approximation theory for estimating unknown values within a range of known discrete data points by constructing a function that exactly passes through those points. This process, often involving polynomials or piecewise functions, enables the reconstruction of smooth curves from sampled data, distinguishing it from broader approximation techniques that may not require exact fits at the given points.

The practice of interpolation has ancient origins, with Babylonian astronomers employing it around 300 BC for tabular computations, followed by the Greek scholar Hipparchus around 150 BC in astronomical predictions. The term "interpolation" first appeared in 1655 in a Latin text by John Wallis, but systematic development began in the 17th century with Thomas Harriot's work in 1611 and Isaac Newton's foundational contributions starting in 1675, including finite difference methods for polynomial interpolation. In the 18th century, Edward Waring discovered the Lagrange interpolation formula in 1779, which was independently published by Joseph-Louis Lagrange in 1795, providing an explicit basis for polynomial forms.

Key methods of interpolation include polynomial interpolation, where a single polynomial of degree at most n is fitted to n+1 distinct points, ensuring uniqueness via the unisolvence theorem. The Lagrange form expresses the interpolant as a weighted sum of basis polynomials, each centered at a data point, while the Newton form uses divided differences for efficient computation and extension to more points. For improved stability and to avoid the oscillations of high-degree polynomials (known as Runge's phenomenon), piecewise interpolation methods like splines divide the domain into subintervals, fitting low-degree polynomials (e.g., linear or cubic) on each while enforcing continuity conditions.

Interpolation finds wide applications in computer graphics for curve rendering, in signal and image processing for resampling, and in numerical methods such as quadrature (integration) and differentiation, where it derives rules from fitted polynomials. Error estimation is essential, with bounds typically involving a higher derivative of the underlying function and the distribution of the interpolation points, guiding the choice of method for accuracy and computational efficiency.

Fundamentals

Definition

In numerical analysis, interpolation is the process of constructing a function that passes exactly through a given set of discrete data points in order to estimate values at intermediate locations. Specifically, given a set of n+1 distinct points (x_0, y_0), (x_1, y_1), \dots, (x_n, y_n), where the x_i are the nodes and y_i = f(x_i) are the corresponding function values, interpolation seeks a function f such that f(x_i) = y_i for i = 0, 1, \dots, n, allowing evaluation of f(x) for x not coinciding with any x_i. This approach assumes basic knowledge of functions and emphasizes the role of data points as nodes where the interpolant must fit exactly.

A key distinction from approximation is that interpolation requires an exact fit at the specified nodes, whereas approximation methods, such as least-squares fitting, seek to minimize overall error without necessarily passing through every point, often when the number of data points exceeds the degree of the approximating function. For instance, in polynomial interpolation of degree at most n through n+1 points, the solution is unique, ensuring precise reproduction at the nodes.

Common formulations include the Lagrange interpolating polynomial, expressed as

p(x) = \sum_{i=0}^n y_i \ell_i(x),

where

\ell_i(x) = \prod_{j=0, j \neq i}^n \frac{x - x_j}{x_i - x_j}

are the basis polynomials, or the Newton form, which builds the interpolant incrementally using divided differences, without deriving the full expressions here. These provide introductory ways to represent the interpolating function while maintaining the exact fit requirement.

Motivation and History

Interpolation serves as a fundamental tool in numerical analysis for estimating unknown values within discrete datasets, enabling the approximation of continuous functions from sampled points in fields such as scientific measurement, engineering design, and data visualization. This need arises because real-world data is often collected at irregular or finite intervals, requiring interpolation to fill gaps, smooth curves, or predict intermediate values for practical applications like modeling physical phenomena or generating graphical representations.

The practice of interpolation dates back to ancient astronomy, where Claudius Ptolemy employed interpolation techniques in the 2nd century AD to construct tables of chord functions in his Almagest, facilitating the prediction of planetary positions from discrete observations. In the 17th century, Isaac Newton laid foundational work on finite differences and interpolation in a 1675 letter, establishing methods for approximation that influenced classical interpolation theory. Joseph-Louis Lagrange advanced this in 1795 with his explicit formula for polynomial interpolation, providing a systematic way to construct unique polynomials passing through given points. By the early 20th century, Carl Runge highlighted limitations in 1901, demonstrating through examples that high-degree polynomial interpolation on equispaced points could lead to oscillatory errors, known as Runge's phenomenon, which underscored the need for careful node selection in approximations.

Key advancements in the mid-20th century included the mathematical formalization of splines by I. J. Schoenberg in 1946, inspired by the flexible wooden or metal strips used in shipbuilding to draw smooth hull curves, leading to piecewise methods that avoid global oscillations. In geostatistics, D. G. Krige developed early statistical interpolation techniques in 1951 for estimating ore grades in South African gold mines, which Georges Matheron formalized as kriging in 1960, introducing optimal unbiased prediction under spatial correlation assumptions. The advent of digital computing in the 1950s propelled interpolation into numerical methods for solving differential equations and data processing on early electronic computers, enabling efficient implementation of algorithms for engineering simulations. In the modern era, interpolation has integrated with artificial intelligence, particularly post-2020, for data imputation in large-scale machine learning datasets, where methods like Gaussian process regression variants enhance missing value estimation while preserving statistical properties in high-dimensional data.

Univariate Interpolation Methods

Nearest-Neighbor Interpolation

Nearest-neighbor interpolation, also known as piecewise constant interpolation or zero-order hold, is the simplest method for estimating values between known data points in univariate interpolation. It assigns to a query point x the value y_i of the closest data point x_i from a given set of pairs (x_j, y_j) for j = 1 to n, without any blending or smoothing. This approach is particularly suited for categorical data or scenarios where smoothness is not required, producing a step-like function with constant values in intervals defined by midpoints between neighboring points. The mathematical formulation is given by:

f(x) = y_i \quad \text{where} \quad i = \arg\min_{j=1,\dots,n} |x - x_j|

In cases of ties (equal distances to multiple points), a convention such as selecting the leftmost or rightmost point, or rounding toward even indices, is typically applied to ensure determinism.

The algorithm involves, for a query point x, computing the Euclidean distance (absolute difference in one dimension) to each x_j and selecting the y_j with the minimum distance; this naive implementation runs in O(n) time per query. With preprocessing, such as sorting the x_j or using a search structure like a binary search tree, the time complexity can be reduced to O(\log n), or to O(1) for uniform grids.

Consider a simple example with data points (0, 1), (1, 3), and (2, 2). For a query at x = 0.6, the distances are |0.6 - 0| = 0.6, |0.6 - 1| = 0.4, and |0.6 - 2| = 1.4, so the closest point is (1, 3) and f(0.6) = 3. This results in a discontinuous step function: constant at 1 for x \in [-0.5, 0.5), 3 for x \in [0.5, 1.5), and 2 for x \in [1.5, 2.5), visualizing as a staircase with jumps at the decision boundaries.

Nearest-neighbor interpolation offers significant computational efficiency, requiring minimal operations beyond distance comparisons, making it ideal for real-time applications or large datasets where speed trumps accuracy. It also preserves the exact range of input values, avoiding extrapolation beyond the data. However, it produces non-differentiable, discontinuous outputs that poorly approximate smooth underlying functions, leading to artifacts such as blockiness in images or staircase distortion in signals.
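A short sketch reproducing the worked example, with NumPy's argmin standing in for the distance search (ties resolve to the lower index):

```python
import numpy as np

xs = np.array([0.0, 1.0, 2.0])
ys = np.array([1.0, 3.0, 2.0])

def nn_interp(x, xs, ys):
    """Nearest-neighbour interpolation in one dimension."""
    return ys[np.argmin(np.abs(xs - x))]

print(nn_interp(0.6, xs, ys))  # 3.0: the node at x = 1 is closest, as computed above
```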

Linear Interpolation

Linear interpolation is a fundamental method in numerical analysis for estimating values between known data points by constructing a piecewise linear function that connects consecutive points with straight line segments, assuming the data points are ordered by their independent variable values x_i. This approach provides a simple, first-order approximation that is computationally efficient and preserves monotonicity within each interval.

The interpolated value f(x) for a query point x lying within the interval [x_i, x_{i+1}], where the known points are (x_i, y_i) and (x_{i+1}, y_{i+1}) with x_i < x_{i+1}, is given by the formula:

f(x) = y_i + (y_{i+1} - y_i) \frac{x - x_i}{x_{i+1} - x_i}

This expression ensures exact reproduction of the data points at the endpoints. The formula derives from the concept of a weighted average, where f(x) is a convex combination of y_i and y_{i+1}. The weight for y_{i+1} is the relative distance \frac{x - x_i}{x_{i+1} - x_i}, which ranges from 0 to 1 across the interval, and the weight for y_i is the complement 1 - \frac{x - x_i}{x_{i+1} - x_i}. This linear weighting follows directly from the equation of a straight line passing through the two points, parameterized by the slope m = \frac{y_{i+1} - y_i}{x_{i+1} - x_i}.

To illustrate, consider the data points (0, 0), (1, 1), and (2, 0). The piecewise linear interpolant consists of two segments: from x = 0 to x = 1, and from x = 1 to x = 2. The following table shows interpolated values at selected points within these intervals:
x      Interval           f(x)
0.0    [0, 1]             0.0
0.25   [0, 1]             0.25
0.5    [0, 1]             0.5
0.75   [0, 1]             0.75
1.0    [0, 1] or [1, 2]   1.0
1.25   [1, 2]             0.75
1.5    [1, 2]             0.5
1.75   [1, 2]             0.25
2.0    [1, 2]             0.0
These values form straight lines between the points, resulting in a continuous "tent" shape that peaks at x = 1.

Key properties of linear interpolation include C^0 continuity, meaning the function is continuous across all points but not necessarily differentiable at the junctions x_i, where slopes may change abruptly. It also features local support, as the value at any x depends only on the two nearest data points enclosing it, enabling efficient computation without global influence. The approximation error is bounded by O(h^2), where h is the maximum interval length; specifically,

\|f - p\|_\infty \leq \frac{h^2}{8} \|f''\|_\infty

for twice-differentiable functions f, assuming uniform spacing for the bound. Linear interpolation finds practical use in basic data plotting, where it connects discrete points to form smooth visual representations, and in computer animation for generating intermediate frames between key poses via straightforward tweening.
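The tent-shaped interpolant in the table can be checked with NumPy's np.interp, which evaluates exactly this piecewise linear construction:

```python
import numpy as np

xs = np.array([0.0, 1.0, 2.0])
ys = np.array([0.0, 1.0, 0.0])

queries = np.array([0.0, 0.25, 0.5, 0.75, 1.0, 1.25, 1.5, 1.75, 2.0])
print(np.interp(queries, xs, ys))  # [0.  0.25 0.5 0.75 1.  0.75 0.5 0.25 0. ]
```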

Polynomial Interpolation

Polynomial interpolation constructs a unique polynomial p(x) of degree at most n-1 that passes exactly through n given distinct points (x_i, y_i) for i = 0, 1, \dots, n-1, where y_i = f(x_i) for some underlying function f. This global method applies the same polynomial across the entire domain, making it suitable for exact fitting but prone to instability for high degrees or ill-conditioned points.

One common representation is the Lagrange form, which expresses p(x) directly in terms of the data points without solving a system of equations:

p(x) = \sum_{i=0}^{n-1} y_i \ell_i(x),

where the Lagrange basis polynomials are

\ell_i(x) = \prod_{\substack{j=0 \\ j \neq i}}^{n-1} \frac{x - x_j}{x_i - x_j}.

Each \ell_i(x) is 1 at x = x_i and 0 at the other points x_j (j \neq i), ensuring the interpolation conditions are satisfied. This form is intuitive for theoretical analysis but computationally inefficient for large n due to the product evaluations.

An alternative is the Newton form, which builds the polynomial incrementally using divided differences and is more efficient for adding points or evaluating at multiple locations:

p(x) = f[x_0] + f[x_0, x_1](x - x_0) + f[x_0, x_1, x_2](x - x_0)(x - x_1) + \cdots + f[x_0, \dots, x_{n-1}] \prod_{k=0}^{n-2} (x - x_k),

where the coefficients are the divided differences defined recursively: f[x_i] = y_i, and for k \geq 1,

f[x_i, \dots, x_{i+k}] = \frac{f[x_{i+1}, \dots, x_{i+k}] - f[x_i, \dots, x_{i+k-1}]}{x_{i+k} - x_i}.

These can be computed via a divided-difference table, facilitating numerical stability and error estimation. For equispaced points, forward differences simplify the process, but the general form handles arbitrary spacing.

Example: Cubic Interpolation for Rocket Velocity

Consider interpolating the upward velocity v(t) of a rocket at times t = 0, 2, 4, 6 seconds, with data:
t (s)   v(t) (m/s)
0       0
2       227
4       362
6       517
The divided-difference table is:
t_i   f[t_i]   First-order   Second-order   Third-order
0     0
               113.5
2     227                    -11.5
               67.5                          2.333
4     362                    2.5
               77.5
6     517
The first-order differences are (227 - 0)/(2 - 0) = 113.5, (362 - 227)/(4 - 2) = 67.5, and (517 - 362)/(6 - 4) = 77.5. The second-order differences are (67.5 - 113.5)/(4 - 0) = -11.5 and (77.5 - 67.5)/(6 - 2) = 2.5. The third-order difference is (2.5 - (-11.5))/(6 - 0) = 2.333. The cubic Newton polynomial is

p(t) = 0 + 113.5 t - 11.5 t (t - 2) + 2.333 t (t - 2)(t - 4).

This fits the four points exactly and can be used to estimate velocity between them, such as at t = 3: p(3) ≈ 299 m/s.

The interpolation error for a sufficiently smooth function f is

f(x) - p(x) = \frac{f^{(n)}(\xi)}{n!} \omega(x),

where \xi lies between the minimum and maximum of x, x_0, \dots, x_{n-1}, and \omega(x) = \prod_{i=0}^{n-1} (x - x_i) is the nodal polynomial. This bound highlights that the error depends on the nth derivative and the point distribution; clustered points near x can reduce \omega(x). For an exact fit, the error is zero at the nodes.

A key limitation is Runge's phenomenon, where high-degree polynomials with equispaced nodes produce large oscillations near the interval endpoints, diverging from the true function even as n increases. This arises because the Lebesgue constant, which bounds the maximum error amplification, grows only as O(\log n) for Chebyshev nodes but exponentially for equispaced points. For instance, interpolating f(x) = 1/(1 + 25x^2) on [-1, 1] with a degree-10 polynomial through 11 equispaced points shows pronounced overshoots near x = \pm 1. Using clustered nodes like Chebyshev points mitigates this instability.

Linear interpolation is the special case of polynomial interpolation with n = 2 and degree 1.
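A compact sketch of the divided-difference computation and Newton-form evaluation for the rocket data; the in-place update and nested evaluation are one standard way to organize it:

```python
import numpy as np

t = np.array([0.0, 2.0, 4.0, 6.0])
v = np.array([0.0, 227.0, 362.0, 517.0])

def divided_differences(x, y):
    """Return the Newton coefficients f[x0], f[x0,x1], ..., f[x0,...,x_{n-1}]."""
    coef = y.astype(float)
    for k in range(1, len(x)):
        # Each pass overwrites positions k.. with the next-order divided differences.
        coef[k:] = (coef[k:] - coef[k - 1:-1]) / (x[k:] - x[:-k])
    return coef

def newton_eval(x, coef, xq):
    """Evaluate the Newton-form polynomial at xq by Horner-like nesting."""
    result = coef[-1]
    for k in range(len(coef) - 2, -1, -1):
        result = result * (xq - x[k]) + coef[k]
    return result

coef = divided_differences(t, v)
print(coef)                       # [0.0, 113.5, -11.5, 2.333...], the table entries above
print(newton_eval(t, coef, 3.0))  # 299.0 m/s, as quoted above
```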

Spline Interpolation

Spline interpolation constructs a function composed of piecewise polynomials of degree k that interpolate given data points at knots x_i, ensuring C^{k-1} continuity of derivatives at the interior knots to achieve smoothness. Common boundary conditions include natural splines, where higher-order derivatives vanish at the endpoints; clamped splines, which specify endpoint derivatives; and complete splines, which incorporate additional constraints for higher smoothness.

Cubic spline interpolation, with k = 3, uses piecewise cubic polynomials that are continuous along with their first and second derivatives at the knots, providing a balance of flexibility and smoothness suitable for most practical applications. For data points (x_i, y_i) where x_0 < x_1 < \cdots < x_n, the spline s(x) on each interval [x_i, x_{i+1}] takes the form

s_i(x) = a_i + b_i (x - x_i) + c_i (x - x_i)^2 + d_i (x - x_i)^3,

with a_i = y_i. The coefficients b_i, c_i, and d_i are determined by enforcing interpolation at the knots, s_i(x_{i+1}) = y_{i+1}, and continuity of the first and second derivatives at interior knots, s_i'(x_{i+1}) = s_{i+1}'(x_{i+1}) and s_i''(x_{i+1}) = s_{i+1}''(x_{i+1}). These conditions yield a system of linear equations for the second derivatives (or moments) at the knots, which forms a tridiagonal matrix that can be efficiently solved using algorithms like the Thomas algorithm. For a natural cubic spline, the second derivatives at the endpoints are set to zero, simplifying the boundary conditions.

Consider fitting five data points; the natural cubic spline produces a curve that closely follows the points with minimal overshoot, in contrast to a global quartic polynomial, which may exhibit unwanted wiggles due to sensitivity to endpoint values. Key advantages of spline interpolation include local control, where modifications to a single data point influence only adjacent segments, reducing propagation of errors. Unlike high-degree global polynomials, splines avoid the Runge phenomenon, preventing large oscillations near the boundaries. Computationally, the B-spline basis representation enhances stability and efficiency, as each basis function has compact support limited to a few intervals.

In computer-aided design (CAD) software, spline methods underpin Non-Uniform Rational B-Splines (NURBS), which extend polynomial splines to rational forms for exact representation of conics and have evolved post-2000 with advanced implementations for freeform surface modeling in industries like automotive and aerospace.
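The moment equations form a diagonally dominant tridiagonal system; a minimal sketch of the Thomas algorithm (forward elimination plus back substitution) for such a system, with illustrative coefficients rather than the full spline setup:

```python
import numpy as np

def thomas(a, b, c, d):
    """Solve a tridiagonal system: a = sub-diagonal, b = diagonal, c = super-diagonal,
    d = right-hand side. a[0] and c[-1] are ignored. Runs in O(n)."""
    n = len(b)
    cp = np.zeros(n)
    dp = np.zeros(n)
    cp[0] = c[0] / b[0]
    dp[0] = d[0] / b[0]
    for i in range(1, n):
        denom = b[i] - a[i] * cp[i - 1]
        cp[i] = c[i] / denom if i < n - 1 else 0.0
        dp[i] = (d[i] - a[i] * dp[i - 1]) / denom
    x = np.zeros(n)
    x[-1] = dp[-1]
    for i in range(n - 2, -1, -1):
        x[i] = dp[i] - cp[i] * x[i + 1]
    return x

# An illustrative system of the kind that arises for the interior spline moments.
a = np.array([0.0, 1.0, 1.0, 1.0])   # sub-diagonal
b = np.array([4.0, 4.0, 4.0, 4.0])   # main diagonal
c = np.array([1.0, 1.0, 1.0, 0.0])   # super-diagonal
d = np.array([6.0, 12.0, 18.0, 24.0])
print(thomas(a, b, c, d))            # agrees with np.linalg.solve on the dense matrix
```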

Multivariate and Higher-Dimensional Interpolation

Bilinear and Higher-Order Interpolation

Bilinear interpolation extends univariate linear interpolation to two dimensions, particularly for functions defined on rectangular grids. It constructs a function f(x, y) that is linear in each variable separately, typically expressed as f(x, y) = a + b x + c y + d x y, where the coefficients a, b, c, d are determined by fitting the interpolant to the known values at the four corners of a rectangular cell in the grid. This method ensures the interpolant passes exactly through the grid points and provides a smooth approximation within each cell.

The explicit formula for bilinear interpolation on a unit square with vertices at (0, 0), (1, 0), (0, 1), and (1, 1) is given by

f(u, v) = (1 - u)(1 - v) f(0, 0) + u(1 - v) f(1, 0) + (1 - u)v f(0, 1) + u v f(1, 1),

where (u, v) are the normalized coordinates within the cell, scaled from the actual position (x, y) relative to the cell boundaries. This can be computed efficiently via two successive one-dimensional linear interpolations: first along one axis to find intermediate values, then along the other axis.

Higher-order extensions, such as trilinear interpolation in three dimensions, follow a tensor-product structure on cubic cells. The formula incorporates an additional parameter w for the third dimension, yielding

f(u, v, w) = f(u, v, 0)(1 - w) + f(u, v, 1) w,

where f(u, v, 0) and f(u, v, 1) are bilinear interpolants on the bottom and top faces of the cube, respectively, ensuring exact reproduction at the eight vertices. This approach generalizes to higher dimensions but increases computational cost due to the growing number of terms (2^d for dimension d).

For irregular or unstructured grids, such as triangular meshes, a simplicial alternative uses barycentric coordinates to perform linear interpolation over simplices (triangles in 2D, tetrahedra in 3D). Given a point P inside a triangle with vertices A, B, C and values f(A), f(B), f(C), the interpolant is f(P) = \lambda_A f(A) + \lambda_B f(B) + \lambda_C f(C), where \lambda_A, \lambda_B, \lambda_C are the barycentric coordinates satisfying \lambda_A + \lambda_B + \lambda_C = 1 and \lambda_i \geq 0, computed as area ratios of the sub-triangles formed by P. This method is affine-invariant and suitable for triangulated domains without requiring a regular grid.

A common application is in image resizing, where bilinear interpolation estimates pixel values on a new grid to scale an image. For instance, upscaling a low-resolution image using nearest-neighbor interpolation produces blocky artifacts, while bilinear interpolation yields smoother results by blending neighboring pixels, though it may introduce slight blurring around edges compared to higher-order methods.

Bilinear and simplicial interpolants are C^0 continuous across cell boundaries, meaning they are continuous but not necessarily differentiable, and possess affine invariance, preserving linear functions exactly. The local truncation error for smooth functions is O(h^2) in each dimension, where h is the grid spacing, analogous to the error in univariate linear interpolation. On non-uniform grids, bilinear interpolation can exhibit anisotropy, where the approximation quality varies directionally due to stretched or distorted cells, potentially degrading the O(h^2) error bound if the grid aspect ratio becomes large.
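A minimal sketch of the unit-square formula, with corner values chosen purely for illustration:

```python
def bilinear(u, v, f00, f10, f01, f11):
    """Bilinear interpolation on the unit square; (u, v) are normalized cell coordinates."""
    return ((1 - u) * (1 - v) * f00 + u * (1 - v) * f10
            + (1 - u) * v * f01 + u * v * f11)

print(bilinear(0.0, 0.0, 1.0, 3.0, 2.0, 4.0))    # 1.0: reproduces the corner value exactly
print(bilinear(0.5, 0.5, 1.0, 3.0, 2.0, 4.0))    # 2.5: the average of the four corners
print(bilinear(0.25, 0.75, 1.0, 3.0, 2.0, 4.0))  # a blend weighted toward the (0, 1) corner
```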

Inverse Distance Weighting

Inverse Distance Weighting (IDW) is a deterministic spatial interpolation technique that estimates values at unsampled locations using a weighted average of known data points, where the weights are inversely proportional to the distances from the estimation point. This method assumes that points closer to the prediction location have greater influence on the interpolated value. The core formulation, known as Shepard's method, is given by

f(\mathbf{x}) = \frac{\sum_{i=1}^{n} w_i y_i}{\sum_{i=1}^{n} w_i},

where y_i are the observed values at known locations \mathbf{x}_i, w_i = 1 / \|\mathbf{x} - \mathbf{x}_i\|^p are the weights, n is the number of neighboring points considered, and p is the power parameter that controls the rate of weight decay with distance. The method was originally proposed for irregularly spaced data in two dimensions and has since been extended to higher dimensions.

The power parameter p, typically set to 2, determines the smoothness of the interpolated surface: lower values of p (e.g., 1) produce smoother surfaces by giving more influence to distant points, while higher values (e.g., 3 or more) create sharper, more localized interpolations that emphasize nearby points. Another key parameter is the search radius or the number of nearest neighbors k, which limits the points used in the weighting to improve computational efficiency and focus on local structure; using all points can lead to over-smoothing in large datasets. In the limit as p approaches infinity, IDW behaves like nearest-neighbor interpolation by assigning full weight to the closest point.

The algorithm for IDW proceeds as follows: for a query point \mathbf{x}, identify the relevant neighboring points (all within a search radius, or the k nearest); compute the Euclidean distances d_i = \|\mathbf{x} - \mathbf{x}_i\| to each; calculate unnormalized weights w_i = 1 / d_i^p (with a small constant added to avoid division by zero at data points); and finally, compute the weighted average normalized by the sum of weights. This process is repeated for each prediction location, making IDW suitable for scattered data without requiring an underlying grid.

A common application of IDW is in estimating terrain elevation from scattered elevation measurements in two dimensions, such as generating digital elevation models (DEMs) from field surveys; for instance, with p = 2, closer survey points contribute more heavily to the interpolated height map, producing a surface that reflects local topography while smoothing over gaps. IDW is an exact interpolator, meaning it reproduces known data values precisely at the input locations, but the resulting surface is generally smooth yet prone to artifacts like bull's-eye patterns (concentric rings of similar values around isolated data points), especially with high p or sparse data, as the method lacks inherent smoothness guarantees beyond continuity.

Variants of IDW include the modified Shepard's method, which incorporates barriers (permeable or absolute) to prevent interpolation across obstacles like rivers or faults, and approaches with a variable p that adapt the power locally based on data density for improved accuracy. In geographic information systems (GIS) software, IDW is implemented with customizable search neighborhoods and barrier support, facilitating its use in environmental modeling and resource estimation.
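A minimal sketch of Shepard's method for a single query point, with scattered 2-D sample values chosen for illustration; the power p and the handling of coincident points follow the description above:

```python
import numpy as np

def idw(query, points, values, p=2.0, eps=1e-12):
    """Inverse distance weighting (Shepard's method) at one query location."""
    d = np.linalg.norm(points - query, axis=1)
    if np.any(d < eps):                  # exact interpolator: return the coincident sample
        return values[np.argmin(d)]
    w = 1.0 / d**p
    return np.sum(w * values) / np.sum(w)

# Scattered elevation samples (metres), purely illustrative.
pts = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
z = np.array([10.0, 12.0, 11.0, 15.0])

print(idw(np.array([0.25, 0.25]), pts, z, p=2.0))  # dominated by the nearby (0, 0) sample
print(idw(np.array([0.0, 0.0]), pts, z))           # 10.0: reproduces the data point exactly
```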

Radial Basis Functions

Radial basis function (RBF) interpolation provides a flexible, meshfree approach to constructing smooth approximations from scattered data points in arbitrary dimensions, particularly suited for multivariate settings where structured grids are unavailable. The interpolant takes the form

s(\mathbf{x}) = \sum_{i=1}^N \lambda_i \phi(\|\mathbf{x} - \mathbf{x}_i\|) + p(\mathbf{x}),

where \phi: [0, \infty) \to \mathbb{R} is a univariate radial kernel function, \mathbf{x}_i \in \mathbb{R}^d are the scattered data sites, \lambda_i \in \mathbb{R} are coefficients to be determined, y_i = s(\mathbf{x}_i) are the data values, and p(\mathbf{x}) is an optional low-degree polynomial (often of degree at most m-1) included to enhance stability and enable polynomial reproduction for kernels that are only conditionally positive definite.

The coefficients \lambda = (\lambda_1, \dots, \lambda_N)^T and the parameters of p are solved via the interpolation conditions, yielding the symmetric linear system \Phi \lambda = \mathbf{y}, where \mathbf{y} = (y_1, \dots, y_N)^T and the RBF matrix \Phi has entries \Phi_{ij} = \phi(\|\mathbf{x}_i - \mathbf{x}_j\|). For conditionally positive definite kernels, the system is augmented with side conditions \sum_{i=1}^N \lambda_i q_k(\mathbf{x}_i) = 0 for a basis \{q_k\} of the polynomial space to ensure solvability and uniqueness. Direct solution via Gaussian elimination costs O(N^3), which is prohibitive for large N, and the matrix \Phi often becomes severely ill-conditioned, especially as N grows or the shape parameter varies, leading to numerical instability in floating-point arithmetic. Preconditioning and iterative methods, such as conjugate gradients, mitigate these issues by exploiting the matrix's positive definiteness properties.

Common radial kernels include the infinitely differentiable Gaussian \phi(r) = e^{-(\epsilon r)^2}, which is strictly positive definite on \mathbb{R}^d for any \epsilon > 0 and dimension d, guaranteeing a unique interpolant without polynomials; the multiquadric \phi(r) = \sqrt{1 + (\epsilon r)^2}