Iterative proportional fitting

from Wikipedia

The iterative proportional fitting procedure (IPF or IPFP, also known as biproportional fitting or biproportion in statistics or economics (input-output analysis, etc.), RAS algorithm[1] in economics, raking in survey statistics, and matrix scaling in computer science) is the operation of finding the fitted matrix X which is the closest to an initial matrix Z but with the row and column totals of a target matrix Y (which provides the constraints of the problem; the interior of Y is unknown). The fitted matrix is of the form X = PZQ, where P and Q are diagonal matrices such that X has the margins (row and column sums) of Y. Several criteria and algorithms can be chosen to perform biproportion: entropy maximization,[2][3] information loss minimization (or cross-entropy),[4] or RAS, which consists of factoring the matrix rows to match the specified row totals, then factoring its columns to match the specified column totals; each step usually disturbs the previous step's match, so these steps are repeated in cycles, re-adjusting the rows and columns in turn, until all specified marginal totals are satisfactorily approximated. However, all algorithms give the same solution.[5] In three- or more-dimensional cases, adjustment steps are applied for the marginals of each dimension in turn, the steps likewise repeated in cycles.

History

IPF has been "re-invented" many times, the earliest by Kruithof in 1937[6] in relation to telephone traffic ("Kruithof's double factor method"), by Deming and Stephan in 1940[7] for adjusting census crosstabulations, and by G.V. Sheleikhovskii for traffic, as reported by Bregman.[8] (Deming and Stephan proposed IPFP as an algorithm leading to a minimizer of the Pearson X-squared statistic, which Stephan later reported it does not.)[9] Early proofs of uniqueness and convergence came from Sinkhorn (1964),[10] Bacharach (1965),[11] Bishop (1967),[12] and Fienberg (1970).[13] Bishop's proof that IPFP finds the maximum likelihood estimator for any number of dimensions extended a 1959 proof by Brown for 2×2×2… cases. Fienberg's proof by differential geometry exploits the method's constant crossproduct ratios for strictly positive tables. Csiszár (1975)[14] found necessary and sufficient conditions for general tables having zero entries. Pukelsheim and Simeone (2009)[15] give further results on convergence and error behavior.

An exhaustive treatment of the algorithm and its mathematical foundations can be found in the book of Bishop et al. (1975).[16] Idel (2016)[17] gives a more recent survey.

Other general algorithms can be modified to yield the same limit as the IPFP, for instance the Newton–Raphson method and the EM algorithm. In most cases, IPFP is preferred due to its computational speed, low storage requirements, numerical stability and algebraic simplicity.

Applications of IPFP have grown to include trip distribution models (Fratar or Furness) and other applications in transportation planning (Lamond and Stewart), survey weighting, synthesis of cross-classified demographic data, adjusting input–output models in economics, estimating expected quasi-independent contingency tables, biproportional apportionment systems of political representation, and as a preconditioner in linear algebra.[18]

Biproportion

Biproportion, whatever the algorithm used to solve it, is the following concept: matrix Z and matrix Y are known real nonnegative matrices of dimension m × n; the interior of Y is unknown, and a matrix X is searched such that X has the same margins as Y, i.e., Xs = Ys and X′s = Y′s (s being the sum vector), and such that X is close to Z according to a given criterion, the fitted matrix being of the form X = PZQ, where P and Q are diagonal matrices.

The information-loss (cross-entropy) criterion leads to the program

min_X f(X) = Σ_{i,j} x_{ij} ln(x_{ij} / z_{ij})

s.t. Σ_j x_{ij} = u_i, ∀i, and Σ_i x_{ij} = v_j, ∀j. The Lagrangian is

L = Σ_{i,j} x_{ij} ln(x_{ij} / z_{ij}) − Σ_i λ_i (Σ_j x_{ij} − u_i) − Σ_j μ_j (Σ_i x_{ij} − v_j).

Thus ∂L/∂x_{ij} = ln(x_{ij} / z_{ij}) + 1 − λ_i − μ_j = 0, ∀i, j,

which, after posing p_i = exp(λ_i − 1/2) and q_j = exp(μ_j − 1/2), yields

x_{ij} = p_i z_{ij} q_j, ∀i, j, i.e., X = PZQ,

with p_i = u_i / Σ_j z_{ij} q_j, ∀i, and q_j = v_j / Σ_i z_{ij} p_i, ∀j. These two conditions form a system that can be solved iteratively:

p_i^{(k+1)} = u_i / Σ_j z_{ij} q_j^{(k)}, ∀i, and q_j^{(k+1)} = v_j / Σ_i z_{ij} p_i^{(k+1)}, ∀j.

The solution is independent of the initialization chosen (i.e., we can begin with q_j = 1, ∀j, or with p_i = 1, ∀i). If the matrix Z is "indecomposable", then this process has a unique fixed point, because it is deduced from a program in which the function is convex and continuously differentiable over a compact set. In some cases the solution may not exist: see de Mesnard's example cited by Miller and Blair (Miller R.E. & Blair P.D. (2009), Input-Output Analysis: Foundations and Extensions, 2nd ed., Cambridge (UK): Cambridge University Press, pp. 335–336 (freely available)).

Some properties (see de Mesnard (1994)); in what follows, K(Z, Y) denotes the biproportional fit of Z to the margins of Y:

Lack of information: if Z brings no information, i.e., z_{ij} = z, ∀i, j, then the fit reduces to the independence table determined by the margins, x_{ij} = u_i v_j / N with N = Σ_i u_i = Σ_j v_j.

Idempotency: K(Z, Y) = Z if Z has the same margins as Y.

Composition of biproportions: K(K(Z, Y1), Y2) = K(Z, Y2); fitting in two stages to successive margins gives the same result as fitting directly to the final margins.

Zeros: a zero in Z is projected as a zero in X. Thus, a block-diagonal matrix is projected as a block-diagonal matrix and a triangular matrix is projected as a triangular matrix.

Theorem of separable modifications: if Z is premultiplied by a diagonal matrix and/or postmultiplied by a diagonal matrix, then the solution is unchanged.

Theorem of "unicity": If is any non-specified algorithm, with , and being unknown, then and can always be changed into the standard form of and . The demonstrations calls some above properties, particularly the Theorem of separable modifications and the composition of biproportions.

Algorithm 1 (classical IPF)

Given a two-way (I × J)-table of counts (m_{ij}), we wish to estimate a new table (x_{ij}) = (a_i b_j m_{ij}) for all i and j such that the marginals satisfy Σ_j x_{ij} = u_i and Σ_i x_{ij} = v_j.

Choose initial values x_{ij}^{(0)} := m_{ij}, and for η ≥ 0 set

x_{ij}^{(2η+1)} = x_{ij}^{(2η)} · u_i / Σ_k x_{ik}^{(2η)},

x_{ij}^{(2η+2)} = x_{ij}^{(2η+1)} · v_j / Σ_k x_{kj}^{(2η+1)}.

Repeat these steps until row and column totals are sufficiently close to u and v.

Notes:

  • For the RAS form of the algorithm, define the diagonalization operator diag: ℝ^k → ℝ^{k×k}, which produces a (diagonal) matrix with its input vector on the main diagonal and zeros elsewhere. Then, for each row adjustment, let R^{(η)} = diag(u_i / Σ_j x_{ij}^{(2η)}), from which X^{(2η+1)} = R^{(η)} X^{(2η)}. Similarly, each column adjustment's S^{(η)} = diag(v_j / Σ_i x_{ij}^{(2η+1)}), from which X^{(2η+2)} = X^{(2η+1)} S^{(η)}. Reducing the operations to the necessary ones, it can easily be seen that RAS does the same as classical IPF. In practice, one would not implement actual matrix multiplication with the whole R and S matrices; the RAS form is more a notational than a computational convenience.
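In code, the update rules above amount to a few lines of NumPy. The following is a minimal sketch, not from the article itself; the function name, tolerance, and iteration cap are illustrative choices:

```python
import numpy as np

def ipf_classical(m, u, v, tol=1e-8, max_iter=1000):
    """Classical IPF: alternately rescale the rows and columns of the
    seed matrix m until its margins match u (rows) and v (columns)."""
    x = m.astype(float).copy()
    for _ in range(max_iter):
        x *= (u / x.sum(axis=1))[:, None]   # row fitting step
        x *= (v / x.sum(axis=0))[None, :]   # column fitting step
        if (np.abs(x.sum(axis=1) - u).max() < tol and
                np.abs(x.sum(axis=0) - v).max() < tol):
            break
    return x
```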

Algorithm 2 (factor estimation)

Assume the same setting as in the classical IPFP. Alternatively, we can estimate the row and column factors separately: choose initial values b_j^{(0)} := 1, and for η ≥ 0 set

a_i^{(η+1)} = u_i / Σ_j m_{ij} b_j^{(η)},

b_j^{(η+1)} = v_j / Σ_i m_{ij} a_i^{(η+1)}.

Repeat these steps until successive changes of a and b are sufficiently negligible (indicating that the resulting row and column sums are close to u and v).

Finally, the result matrix is x_{ij} = a_i b_j m_{ij}.

Notes:

  • The two variants of the algorithm are mathematically equivalent, as can be seen by formal induction. With factor estimation, it is not necessary to actually compute each cycle's table x_{ij}^{(η)}.
  • The factorization is not unique, since x_{ij} = a_i b_j m_{ij} = (γ a_i)(b_j / γ) m_{ij} for all γ > 0.
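A matching sketch of the factor-estimation variant, under the same illustrative assumptions as the classical sketch above:

```python
import numpy as np

def ipf_factors(m, u, v, tol=1e-8, max_iter=1000):
    """Factor estimation: iterate only on the row factors a and column
    factors b; the fitted table a_i * b_j * m_ij is formed once at the end."""
    m = m.astype(float)
    a, b = np.ones(len(u)), np.ones(len(v))
    for _ in range(max_iter):
        a_new = u / (m @ b)        # row factors from the current b
        b_new = v / (m.T @ a_new)  # column factors from the updated a
        done = (np.abs(a_new - a).max() < tol and
                np.abs(b_new - b).max() < tol)
        a, b = a_new, b_new
        if done:
            break
    return a[:, None] * m * b[None, :]
```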

Discussion

The vaguely demanded "similarity" between M and X can be explained as follows: IPFP (and thus RAS) maintains the crossproduct ratios, i.e.,

x_{i1 j1} x_{i2 j2} / (x_{i1 j2} x_{i2 j1}) = m_{i1 j1} m_{i2 j2} / (m_{i1 j2} m_{i2 j1}) for all i1 ≠ i2 and j1 ≠ j2,

since x_{ij} = a_i b_j m_{ij}, so the factors a and b cancel from the ratio.

This property is sometimes called structure conservation and directly leads to the geometrical interpretation of contingency tables and the proof of convergence in the seminal paper of Fienberg (1970).
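A quick numerical check of this structure conservation, reusing the hypothetical ipf_classical sketch from the notes to algorithm 1:

```python
import numpy as np

m = np.array([[40., 30.],
              [35., 50.]])
x = ipf_classical(m, u=np.array([100., 80.]), v=np.array([90., 90.]))

# the crossproduct ratio of the seed equals that of the fitted table
r_seed = (m[0, 0] * m[1, 1]) / (m[0, 1] * m[1, 0])
r_fit = (x[0, 0] * x[1, 1]) / (x[0, 1] * x[1, 0])
print(r_seed, r_fit)  # both ≈ 1.905
```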

Direct factor estimation (algorithm 2) is generally the more efficient way to solve IPF: whereas a naive form of the classical IPFP needs

IJ(2 + J) + IJ(2 + I) = IJ(4 + I + J)

elementary operations in each iteration step (including a row and a column fitting step), factor estimation needs only

I(1 + J) + J(1 + I) = 2IJ + I + J

operations, being at least one order of magnitude faster than classical IPFP.

IPFP can be used to estimate expected quasi-independent (incomplete) contingency tables, with u_i = x_{i+}, v_j = x_{+j}, and m_{ij} = 1 for included cells and m_{ij} = 0 for excluded cells. For fully independent (complete) contingency tables, estimation with IPFP concludes exactly in one cycle.

Comparison with the NM-method

Similar to the IPF, the NM-method is also an operation of finding a matrix X which is the "closest" to a given matrix Z while its row totals and column totals are identical to those of a target matrix Y.

However, there are differences between the NM-method and the IPF. For instance, the NM-method defines closeness of matrices of the same size differently from the IPF.[19] Also, the NM-method was developed to solve for matrix X in problems where matrix Z is not a sample from the population characterized by the row totals and column totals of matrix Y, but represents another population.[19] In contrast, matrix Z is a sample from this population in problems where the IPF is applied as the maximum likelihood estimator.

Macdonald (2023)[20] endorses the conclusion by Naszodi (2023)[21] that the IPF is suitable for sampling-correction tasks, but not for the generation of counterfactuals. Similarly to Naszodi, Macdonald also questions whether the row- and column-proportional transformations of the IPF preserve the structure of association within a contingency table that allows us to study social mobility.

Existence and uniqueness of MLEs

Necessary and sufficient conditions for the existence and uniqueness of MLEs are complicated in the general case (see[22]), but sufficient conditions for 2-dimensional tables are simple:

  • the marginals of the observed table do not vanish (that is, x_{i+} > 0 and x_{+j} > 0 for all i and j), and
  • the observed table is inseparable (i.e. the table does not permute to a block-diagonal shape).

If unique MLEs exist, IPFP exhibits linear convergence in the worst case (Fienberg 1970), but exponential convergence has also been observed (Pukelsheim and Simeone 2009). If a direct estimator (i.e., a closed form of x_{ij}) exists, IPFP converges after 2 iterations. If unique MLEs do not exist, IPFP converges toward the so-called extended MLEs by design (Haberman 1974), but convergence may be arbitrarily slow and often computationally infeasible.

If all observed values are strictly positive, existence and uniqueness of MLEs and therefore convergence is ensured.

Example

Consider the following table, given with the row- and column-sums and targets.

1 2 3 4 TOTAL TARGET
1 40 30 20 10 100 150
2 35 50 100 75 260 300
3 30 80 70 120 300 400
4 20 30 40 50 140 150
TOTAL 125 190 230 255 800
TARGET 200 300 400 100 1000

For executing the classical IPFP, we first adjust the rows:

1 2 3 4 TOTAL TARGET
1 60.00 45.00 30.00 15.00 150.00 150
2 40.38 57.69 115.38 86.54 300.00 300
3 40.00 106.67 93.33 160.00 400.00 400
4 21.43 32.14 42.86 53.57 150.00 150
TOTAL 161.81 241.50 281.58 315.11 1000.00
TARGET 200 300 400 100 1000

The first step exactly matched row sums, but not the column sums. Next we adjust the columns:

1 2 3 4 TOTAL TARGET
1 74.16 55.90 42.62 4.76 177.44 150
2 49.92 71.67 163.91 27.46 312.96 300
3 49.44 132.50 132.59 50.78 365.31 400
4 26.49 39.93 60.88 17.00 144.30 150
TOTAL 200.00 300.00 400.00 100.00 1000.00
TARGET 200 300 400 100 1000

Now the column sums exactly match their targets, but the row sums no longer match theirs. After completing three cycles, each with a row adjustment and a column adjustment, we get a closer approximation:

1 2 3 4 TOTAL TARGET
1 64.61 46.28 35.42 3.83 150.13 150
2 49.95 68.15 156.49 25.37 299.96 300
3 56.70 144.40 145.06 53.76 399.92 400
4 28.74 41.18 63.03 17.03 149.99 150
TOTAL 200.00 300.00 400.00 100.00 1000.00
TARGET 200 300 400 100 1000
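The three cycles above can be reproduced with a short self-contained NumPy loop (a sketch; variable names are illustrative):

```python
import numpy as np

m = np.array([[40., 30., 20., 10.],
              [35., 50., 100., 75.],
              [30., 80., 70., 120.],
              [20., 30., 40., 50.]])
u = np.array([150., 300., 400., 150.])  # row targets
v = np.array([200., 300., 400., 100.])  # column targets

x = m.copy()
for _ in range(3):                      # three full cycles
    x *= (u / x.sum(axis=1))[:, None]   # row adjustment
    x *= (v / x.sum(axis=0))[None, :]   # column adjustment

print(np.round(x, 2))  # matches the last table above to two decimals
```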

Implementation

The R package mipfp (currently in version 3.2) provides a multi-dimensional implementation of the traditional iterative proportional fitting procedure.[23] The package allows the updating of an N-dimensional array with respect to given target marginal distributions (which, in turn, can be multi-dimensional).

Python has an equivalent package, ipfn,[24][25] which can be installed via pip. The package supports numpy and pandas input objects.
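Usage along the lines of the ipfn documentation looks roughly as follows; treat the exact constructor signature as an assumption to verify against the package's README:

```python
import numpy as np
from ipfn import ipfn

m = np.array([[40., 30., 20., 10.],
              [35., 50., 100., 75.],
              [30., 80., 70., 120.],
              [20., 30., 40., 50.]])
u = np.array([150., 300., 400., 150.])  # target row marginals
v = np.array([200., 300., 400., 100.])  # target column marginals

# aggregates lists the target marginals; dimensions names the axes they bind to
IPF = ipfn.ipfn(m, [u, v], [[0], [1]])
fitted = IPF.iteration()
```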

References

from Grokipedia
Iterative proportional fitting (IPF), also known as biproportional fitting or the RAS method, is an iterative algorithm that adjusts the entries of an initial non-negative matrix—typically a contingency table—to match specified row and column marginal totals while preserving the relative internal structure of the data as closely as possible. Introduced in 1940 by W. Edwards Deming and Frederick F. Stephan to reconcile discrepancies between observed and expected frequency totals in sampled tables, the procedure alternates between scaling rows and columns proportionally until the marginal constraints are satisfied within a tolerance, often converging to the unique solution under mild conditions such as positive marginals. IPF yields maximum likelihood estimates for cell probabilities under a multinomial model with fixed margins, equivalent to fitting an independence model, and is widely applied in survey weighting (as "raking"), economic input-output analysis, population synthesis for microsimulation, and multidimensional extensions for higher-order tables. Despite its empirical reliability and computational simplicity, convergence is not guaranteed in all cases without positivity assumptions, and alternatives like Sinkhorn-Knopp scaling address related transport problems.

Overview

Definition and Purpose

Iterative proportional fitting (IPF) is an iterative algorithm for adjusting the entries of an initial matrix to satisfy given row and column marginal totals, while preserving the relative proportions of the original matrix as closely as possible. The method alternates between scaling all rows to match their prescribed totals and then scaling all columns to match theirs, repeating until convergence or a specified tolerance is achieved. This process ensures the adjusted matrix reproduces the exact marginal constraints multiplicatively. The core purpose of IPF is to generate or refine multidimensional contingency tables from partial information, such as one-dimensional marginal distributions and a seed matrix capturing approximate cell dependencies. It addresses the problem of estimating probabilities or frequencies when direct observations are unavailable but margins are known from sources like censuses or surveys. For instance, in demographic modeling, IPF constructs cross-classified tables (e.g., households by size and income) by fitting an initial uniform or prior-based matrix to separate univariate totals, enabling consistent population projections. In economic applications, such as input-output analysis, IPF updates coefficient matrices under the RAS procedure to reconcile margins amid structural changes, maintaining balance without assuming independence. The algorithm's multiplicative nature yields the maximum entropy solution or maximum likelihood estimates under log-linear models with Poisson-distributed cell counts, providing a statistically principled adjustment that avoids negative values and respects data nonnegativity. Limitations include potential non-convergence for incompatible margins or highly sparse seeds, though it remains computationally efficient for moderate dimensions.

Relation to Biproportional Fitting

Biproportional fitting refers to the adjustment of an initial matrix Z to a target matrix X such that X_{ij} = p_i z_{ij} q_j for scaling factors p_i > 0 and q_j > 0, ensuring the row sums of X match prescribed totals u and the column sums match prescribed totals v. This formulation preserves the cross-product ratios of the initial matrix while achieving marginal consistency, a property central to applications in estimation and input-output analysis. Iterative proportional fitting (IPF) provides the computational procedure to obtain this biproportional solution by alternately normalizing rows and columns of an intermediate matrix until convergence to the unique solution satisfying the margins, assuming compatibility conditions hold (i.e., the total sum of row margins equals that of column margins). The equivalence arises because each IPF step multiplies row i by u_i / (current row sum i) and column j by v_j / (current column sum j), accumulating the diagonal matrices P = diag(p_1, …, p_m) and Q = diag(q_1, …, q_n) such that X = PZQ. Under the assumption of positive entries in Z and compatible margins, the fixed point of IPF coincides with the unique biproportional fit, which also maximizes the likelihood under a multinomial model for the cell probabilities proportional to z_{ij}. This relation underscores IPF's role as the iterative solver for the biproportional problem, with historical usage of "biproportional fitting" predating widespread adoption of IPF terminology, as noted in economic literature from the 1960s.

Historical Development

Origins and Early Applications

The iterative proportional fitting procedure originated in 1937 with J. Kruithof's work on telephone traffic networks, where it served as a method to adjust an initial matrix of observed call volumes to match specified row and column totals representing origin-destination demands and capacities. Kruithof's approach, known as the "double factor method," iteratively scaled rows and columns alternately until the marginal constraints were satisfied, addressing imbalances in traffic data estimation without assuming independence between factors. This application arose from practical needs in telecommunications engineering to reconcile incomplete or inconsistent datasets with known aggregates, demonstrating the method's utility in balancing multi-way arrays under additive constraints. In 1940, W. Edwards Deming and Frederick F. Stephan independently adapted and formalized the procedure for statistical adjustment of contingency tables, applying it to adjust sample frequency tables from surveys or censuses to align with known population marginal totals. Their paper outlined an iterative process to minimize discrepancies while preserving relative cell proportions, motivated by challenges in raking survey data where direct marginal controls were available but internal crosstabulations were under-sampled or erroneous. This statistical framing emphasized least-squares adjustments under Poisson sampling assumptions, establishing IPF as a tool for unbiased estimation in incomplete tabular data, particularly in demographic and social surveys. Early applications extended to fields requiring reconciliation of inconsistent matrices, such as early input-output economic modeling, where the method balanced supply-demand tables against fixed totals. By the mid-1940s, it gained traction in U.S. census adjustments and sample validation, though convergence properties remained empirically observed rather than theoretically proven until later analyses. These initial uses highlighted IPF's robustness for multi-dimensional problems but also its sensitivity to starting values and marginal consistency, as inconsistent constraints could prevent convergence.

Key Advancements Post-1940

In the decades following its initial formulation, iterative proportional fitting (IPF) saw significant theoretical refinements, particularly regarding convergence and its statistical interpretation. Sinkhorn (1964) provided an early proof of convergence for the procedure when applied to scaling matrices to doubly stochastic form, establishing conditions under which the iterations stabilize. This work laid groundwork for broader applications, though it initially focused on permutation-invariant cases. Bacharach (1965) extended the analysis to biproportional fitting in economic contexts, demonstrating uniqueness under compatible marginal constraints. By the late 1960s, connections to log-linear models emerged as a pivotal advancement. Darroch (1962) implicitly employed IPF to compute maximum likelihood estimates under independence-type models, bridging the algorithm to probabilistic inference in contingency tables. Fienberg (1970) formalized its role in deriving maximum likelihood estimates for hierarchical log-linear models, as proposed by Birch (1963), enabling IPF's use beyond simple margin matching to fit complex interaction structures. Bishop (1967) contributed proofs of existence and uniqueness for solutions in multi-way tables, addressing scalability to higher dimensions. Ireland and Kullback (1968) advanced the theoretical foundation by proving monotonic convergence to the unique solution minimizing Kullback-Leibler divergence from the initial matrix, subject to marginal constraints, thus linking IPF to information theory and entropy maximization principles. This interpretation justified IPF's use in scenarios requiring minimal alteration to prior estimates, such as survey weighting or input-output table balancing via the RAS variant, which gained traction in economic modeling during the 1960s for updating matrices with new sector totals. Subsequent extensions in the 1970s and beyond included algorithmic optimizations for computational efficiency and handling of incompatible margins, with Fienberg and others emphasizing IPF's equivalence to maximum likelihood estimation under log-linear parametrization. These developments solidified IPF's status as a robust tool for empirical adjustment in statistics and economics, despite occasional non-convergence in sparse or high-dimensional settings without additional regularization.

Mathematical Formulation

Core Problem Setup

Iterative proportional fitting addresses the biproportional scaling problem of adjusting an initial non-negative matrix to prescribed row and column marginals via separable multiplicative factors. Given an n × m matrix X = (x_{ij}) ≥ 0, a positive row marginal vector r = (r_1, …, r_n)^T with r_i > 0, and a positive column marginal vector c = (c_1, …, c_m)^T with c_j > 0 satisfying the compatibility condition Σ_{i=1}^n r_i = Σ_{j=1}^m c_j, the task is to determine positive diagonal matrices P = diag(p) and Q = diag(q), with p ∈ ℝ^n_{++} and q ∈ ℝ^m_{++}, such that the scaled matrix Y = PXQ fulfills the marginal constraints Y 1_m = r and 1_n^T Y = c^T, where 1_k is the k-dimensional all-ones vector. This formulation ensures that, for entries where x_{ij} > 0, the adjustment ratios satisfy y_{ij} / x_{ij} = p_i q_j, maintaining proportionality through independent row and column scaling factors. The problem originates from the need to reconcile a sampled or prior frequency table with known marginal totals, as posed by Deming and Stephan in 1940, who framed it as a least-squares adjustment minimizing discrepancies relative to the initial estimates while enforcing the marginals. Under standard assumptions of positive initial entries in the support of the solution and marginal compatibility, a unique solution exists within the biproportional family.

Connection to Maximum Likelihood Estimation

Iterative proportional fitting (IPF) computes the maximum likelihood estimates (MLEs) for the expected cell frequencies in a contingency table under a log-linear Poisson model where the observed marginal totals are treated as fixed sufficient statistics. Specifically, assuming the cell counts X_{ij} are independent Poisson random variables with means μ_{ij}, the model posits log μ_{ij} = u + u_i^{(1)} + u_j^{(2)} to incorporate row and column effects, ensuring the estimated margins match the observed ones. The likelihood is maximized subject to these margin constraints, and IPF iteratively scales an initial estimate (often the observed table or a uniform matrix) by row and column factors until convergence, yielding the unique MLE under standard regularity conditions. This equivalence arises because the estimating equations for the MLE in such models reduce to matching the observed margins, as the Poisson log-likelihood's score equations depend only on the differences between expected and observed marginals. Each IPF iteration corresponds to a conditional maximization step in the expectation-conditional maximization (ECM) algorithm, a generalization of the expectation-maximization (EM) algorithm, confirming its optimality for the Poisson parameterization. For sparse tables or higher-dimensional arrays, IPF remains computationally efficient for MLE, though direct Newton-Raphson methods may be used for parameter estimation in unsaturated models. Equivalently, under the multinomial sampling framework with fixed total count, IPF provides the MLE for cell probabilities by conditioning on the margins, preserving the same iterative adjustment process. This connection extends to relational models and curved exponential families, where generalized IPF variants compute MLEs for parameters beyond simple margins, such as in network inference from marginals. Empirical studies confirm IPF's MLE properties hold under the Poisson assumption, with deviations only if margins are not sufficient statistics for the model.
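The equivalence can be checked numerically against a generic Poisson GLM fit; a small sketch assuming statsmodels is available (the dummy-coded design matrix is one of several equivalent encodings of the independence model):

```python
import numpy as np
import statsmodels.api as sm

obs = np.array([[10., 20., 30.],
                [15., 25., 20.]])
counts = obs.ravel()

# design for log mu_ij = u + u_i^(1) + u_j^(2), baselines row 0 / column 0
rows, cols = np.indices(obs.shape)
X = np.column_stack([
    np.ones(counts.size),
    (rows.ravel() == 1).astype(float),
    (cols.ravel() == 1).astype(float),
    (cols.ravel() == 2).astype(float),
])
mu = sm.GLM(counts, X, family=sm.families.Poisson()).fit().mu.reshape(obs.shape)

# the fitted margins reproduce the observed margins, exactly as IPF does
print(mu.sum(axis=1), obs.sum(axis=1))
print(mu.sum(axis=0), obs.sum(axis=0))
```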

Algorithms

Classical Iterative Proportional Fitting

The classical iterative proportional fitting (IPF) procedure, also known as the RAS method in some economic contexts, is an alternating scaling algorithm designed to adjust an initial matrix to satisfy specified row and column marginal totals while preserving the relative structure of the input. Introduced by Deming and Stephan in 1940 for estimating cell probabilities in sampled contingency tables under known marginal constraints, it operates through successive multiplicative adjustments to rows and columns. The method assumes the marginal totals are compatible, meaning the sum of row margins equals the sum of column margins, ensuring a feasible solution exists. The algorithm begins with an initial matrix X^{(0)}, typically derived from observed data, a uniform prior, or an independence assumption (e.g., the outer product of the marginals divided by the grand total). In each iteration k = 0, 1, 2, …:
  1. Compute row scaling factors r_i^{(k+1)} = u_i / Σ_j X_{ij}^{(k)}, where u_i is the target i-th row marginal.
  2. Update to the intermediate matrix X_{ij}^{(k+1/2)} = r_i^{(k+1)} · X_{ij}^{(k)}.
  3. Compute column scaling factors c_j^{(k+1)} = v_j / Σ_i X_{ij}^{(k+1/2)}, where v_j is the target j-th column marginal.
  4. Update to X_{ij}^{(k+1)} = c_j^{(k+1)} · X_{ij}^{(k+1/2)}.
Iterations continue until the adjusted marginals approximate the targets within a tolerance ε, such as max(‖X^{(k)} 1 − u‖_∞, ‖1^T X^{(k)} − v‖_∞) < ε, often monitored via chi-squared divergence or relative error; a sketch of this check appears below. Under positivity assumptions (all entries of X^{(0)} and margins positive) and compatibility, the procedure converges linearly to the unique nonnegative solution minimizing the Kullback-Leibler divergence from X^{(0)} subject to the constraints, equivalent to maximum entropy adjustment. Convergence rates depend on the input matrix's structure; for sparse or ill-conditioned cases, it may require hundreds of iterations, prompting accelerations like over-relaxation or Newton-based variants. The classical form alternates row-then-column adjustments, though symmetric variants exist for balanced computation in higher dimensions. Practical implementations, as in SAS's IPF subroutine, apply it to frequency adjustments in contingency tables.
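The stopping rule referenced above can be written compactly; a minimal sketch with an illustrative epsilon:

```python
import numpy as np

def margins_converged(x, u, v, eps=1e-6):
    """True when both row and column sums are within eps of their targets."""
    row_err = np.abs(x.sum(axis=1) - u).max()
    col_err = np.abs(x.sum(axis=0) - v).max()
    return max(row_err, col_err) < eps
```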

Factor Estimation Approach

The factor estimation approach to iterative proportional fitting computes scaling factors for each dimension directly, rather than iteratively adjusting the full contingency table as in the classical method. For a two-way table, it seeks diagonal matrices P (row factors) and Q (column factors) such that the adjusted table X = PZQ matches the target row marginals r and column marginals c, where Z is the seed table. This formulation minimizes divergence from Z subject to the marginal constraints, equivalent to maximum likelihood under a multinomial model assuming independence. The algorithm initializes column factors q_j = 1 for all j, then alternates updates: row factors p_i = r_i / Σ_j z_{ij} q_j for each i, followed by column factors q_j = c_j / Σ_i z_{ij} p_i for each j. These steps repeat until the marginal discrepancies fall below a tolerance threshold, such as an L2-norm of differences less than 10^{-4}. The final adjusted table is then x_{ij} = z_{ij} p_i q_j. This avoids full matrix multiplications per iteration, reducing computational cost from O(IJ) per step in classical IPF to O(I + J) for the factor updates, where I and J are the dimensions, making it more efficient for large sparse tables. For multi-way tables, the approach extends to multiple factor sets, one per dimension, with cyclic updates over marginal constraints; implementations handle n dimensions via tensor operations. Convergence holds under the same conditions as classical IPF—positive seed entries and consistent margins—yielding the unique solution minimizing Kullback-Leibler divergence. Practical software, such as Julia's ProportionalFitting package, employs this for n-dimensional arrays, confirming its scalability.

Theoretical Properties

Existence and Uniqueness of Solutions

A solution to the biproportional fitting problem underlying iterative proportional fitting (IPF) exists if and only if the prescribed row marginal vector u and column marginal vector v (with Σ u_i = Σ v_j) are compatible with the support of the initial nonnegative matrix Z, meaning that u and v lie within the transportation polytope defined by the zero pattern of Z. This compatibility ensures the feasibility of scaling factors p_i > 0 and q_j > 0 such that X = diag(p) Z diag(q) achieves exactly the target margins, without violating structural zeros in Z. For instance, if Z has full support (all entries positive) and u, v > 0, existence is guaranteed as the polytope has nonempty interior. When a solution exists, it is unique. This uniqueness stems from the fact that distinct diagonal scaling matrices diag(p) and diag(q) producing the same margins would imply identical row and column scaling factors up to the biproportional structure, as the mapping from scalings to margins is strictly monotone and invertible on the feasible set. Formally, if two biproportional scalings B and C of Z share the same row sums b_{i+} = c_{i+} and column sums b_{+j} = c_{+j}, then the row scaling factors coincide (b_{i+} / z_{i+} = c_{i+} / z_{i+} for all i) and similarly for columns, yielding B = C. This property holds even for matrices with zeros, provided the solution respects the support. In the context of maximum likelihood estimation for contingency tables under independent Poisson sampling, the IPF solution corresponds to the unique maximum-likelihood estimator (MLE) when it exists, as the likelihood is strictly log-concave on the feasible set, ensuring a unique global maximum. Absence of a solution manifests as IPF failing to converge to exact margins, often diverging or stabilizing at infeasible approximations; uniqueness then applies to any limit point within the feasible set. Empirical verification in applications, such as demographic projections, confirms that violations of existence conditions (e.g., overly discrepant marginals relative to Z's support) lead to non-convergence, underscoring the theorem's practical import.

Convergence Conditions

The iterative proportional fitting (IPF) procedure converges if and only if a biproportional fit to the prescribed row margins r and column margins s exists for the initial matrix A, which requires the total marginal sums to be equal (r_+ = s_+) and the target margins to satisfy the support constraints of A's zero pattern. Specifically, for every subset I of rows, the partial row sum must not exceed the partial column sum over the columns J_A(I) that receive support from I in A, i.e., r_I ≤ s_{J_A(I)}. These conditions ensure feasibility within the transportation polytope defined by A's support, preserving zeros throughout iterations and yielding a unique limit matrix when a solution exists. The L1-error between the current row and column sums and their targets decreases monotonically during IPF iterations, bounded above by the maximum discrepancy in partial sums across row subsets, max_I (r_I − s_{J_A(I)} + s_{J_A(I′)} − r_{I′}), where primes denote complements; convergence to zero error follows when this bound vanishes, confirming the procedure's success. When marginals and initial entries are strictly positive, these conditions hold generically, leading to rapid convergence to the maximum entropy solution matching the margins. In contrast, infeasible margins—such as inconsistent totals or violations of subset inequalities—result in non-convergence, often manifesting as oscillation or persistent bias in the fitted sums. Extensions to general information projection problems, including multivariate densities or constraints, guarantee exponential convergence under additional regularity assumptions such as bounded log-densities (uniformly by some R > 0) and closure of the sum of constraint subspaces in L^2(μ). The contraction rate ρ < 1 depends on properties of the constraint operator and geometric factors such as the Friedrichs angle between subspaces, with larger angles yielding faster rates; these results quantify IPF's behavior beyond classical matrix scaling.

Applications

In Contingency Tables and Demography

Iterative proportional fitting (IPF), also known as the RAS method in certain contexts, serves as a standard procedure for estimating expected cell frequencies in multi-dimensional contingency tables while ensuring consistency with specified marginal totals. Originally developed by Deming and Stephan in 1940 for two-way tables, it was generalized by Fienberg in 1970 to handle higher-dimensional arrays by iteratively adjusting an initial matrix through alternating proportional scalings of rows and columns (or higher-order margins) until the marginal constraints are satisfied within a tolerance. This yields maximum likelihood estimates under log-linear models assuming cell counts follow a Poisson distribution, equivalent to multinomial sampling with fixed totals, under the hypothesis of independence or specified conditional independences. In contingency tables, IPF addresses the problem of incomplete data or under-sampling by "completing" the table: starting from an initial estimate (e.g., observed frequencies or uniform priors), it scales subsets of cells proportionally to match observed or target marginals, preserving cross-classification structure without assuming a full parametric form beyond the margins. For instance, in a three-way table of variables such as age, sex, and occupation, IPF can fit one-way, two-way, or all three-way marginals simultaneously, producing estimates that are the unique solution under non-negativity and the specified constraints when a solution exists. The method's iterative nature ensures convergence to the MLE for hierarchical log-linear models, as proven geometrically and algebraically in the literature. Demographic applications of IPF center on population synthesis and projection, where it generates synthetic microdata or joint distributions matching aggregate marginal constraints from censuses, surveys, or administrative records. For example, to create household-level datasets for small areas, IPF adjusts an initial seed table of joint attribute frequencies (e.g., household size by income and location) to align with known univariate totals such as total population by age-sex or regional counts, enabling agent-based simulations without microdata disclosure risks. In multi-regional demographic models, it refits migration or transition matrices to updated marginals, such as age-specific rates, facilitating forecasts; a 2016 review notes its widespread use in transportation and health research as a cost-effective alternative to full microsimulation when detailed joint distributions are unavailable. Empirical evaluations, such as those on U.S. synthetic datasets, confirm IPF's effectiveness in reproducing marginals accurately, though it may underperform in capturing rare events or correlations beyond the fitted margins without extensions. In census updating, IPF has been applied since the 1970s to estimate small-area statistics by reallocating totals proportionally, as in subnational projections.

In Transportation and Spatial Microsimulation

In transportation planning, iterative proportional fitting (IPF) estimates origin-destination (OD) matrices by scaling an initial seed matrix—often derived from surveys or prior models—to match observed marginal totals for trip productions (row sums) and attractions (column sums). This doubly constrained approach underpins trip distribution in gravity-based models, enabling the reproduction of aggregate flows while preserving compatibility with supply-side constraints like network capacities. For example, in transit systems, IPF processes automatic passenger count data from boarding and alighting sensors to infer route-level OD patterns, as demonstrated in applications to bus networks where matrices are iteratively adjusted until residuals fall below predefined thresholds. A 2010 analysis applied IPF to derive passenger OD flows on urban bus routes, achieving convergence in under 20 iterations for matrices up to 50x50 in size when starting from uniform priors. IPF's utility extends to dynamic OD estimation in rail and highway contexts, where it fuses data sources—such as ticket sales and traffic counts—to update matrices iteratively, supporting short-term forecasting for congestion management. In one railway network study, IPF integrated heterogeneous inputs to estimate peak-hour OD demands, yielding matrices with mean absolute percentage errors below 5% relative to ground-truth validations from dedicated surveys. The method's computational efficiency, requiring only matrix multiplications and normalizations per cycle, makes it suitable for large-scale applications, though sensitivity to seed matrix choice necessitates robustness checks against perturbations. In spatial microsimulation, IPF synthesizes disaggregate populations by adjusting a benchmark microdataset (e.g., survey records with attributes such as age and income) to conform to aggregate small-area constraints from censuses or administrative data. The process alternates row and column scalings across multi-way contingency tables, converging to a joint distribution that matches univariate and bivariate marginals simultaneously. This enables policy simulations, such as projecting household-level energy demands or health risks in under-sampled regions. For instance, county-level estimation has employed IPF to expand national survey microdata, aligning outputs with local vital statistics and reducing biases in small-area inferences. Performance assessments confirm IPF's reliability for modest contingency dimensions (e.g., 10-20 categories per margin), with convergence typically achieved in 10-50 iterations under positive initial values and consistent marginals. However, it produces fractional cell weights, often addressed via post-hoc integerization like truncate-replicate-sampling to generate discrete micro-units for agent-based modeling. Empirical tests across small areas showed IPF outperforming uniform priors in replicating observed attribute correlations, with chi-squared goodness-of-fit statistics improving by factors of 2-5 over naive benchmarks. Applications in demographic forecasting highlight its role in overcoming data sparsity, though failures occur with zero cells or incompatible constraints, prompting hybrid uses with other synthesis methods.

Practical Considerations

Illustrative Examples

A common application of iterative proportional fitting (IPF) involves adjusting an initial estimate of a two-way contingency table to match specified row and column marginal totals while preserving the relative structure of the initial matrix. This process iteratively scales rows and columns until convergence. Consider a survey sample of 300 individuals classified by sex (female, male) and ethnicity (Black, White, Asian, Native American, Other), initially weighted by 10 to estimate a population of 3000. The initial adjusted table, derived from sample proportions, is as follows:
Sex/Ethnicity  Black  White  Asian  Native  Other  Total
Female         300    1200   60     30      30     1620
Male           150    1080   90     30      30     1380
Total          450    2280   150    60      60     3000
The target population marginals are row totals of 1510 for females and 1490 for males, and column totals of 600 (Black), 2120 (White), 150 (Asian), 100 (Native), and 30 (Other). In the first iteration, scale each row by the ratio of target to current row total: females by 1510/1620 ≈ 0.932, males by 1490/1380 ≈ 1.080. This yields:
Sex/Ethnicity  Black   White    Asian   Native  Other  Total
Female         279.63  1118.52  55.93   27.96   27.96  1510
Male           161.96  1166.09  97.17   32.39   32.39  1490
Total          441.59  2284.61  153.10  60.35   60.35  3000
Next, scale columns to match targets: e.g., Black by 600/441.59 ≈ 1.359, White by 2120/2284.61 ≈ 0.928. This produces:
Sex/Ethnicity  Black   White    Asian   Native  Other  Total
Female         379.94  1037.93  54.79   46.33   13.90  1532.89
Male           220.06  1082.07  95.21   53.67   16.10  1467.11
Total          600.00  2120.00  150.00  100.00  30.00  3000
Subsequent iterations alternate row and column scaling until marginal deviations are below a tolerance (e.g., <1% of targets). Convergence occurs after two full iterations, with the final table approximately:
Sex/Ethnicity  Black   White    Asian   Native  Other  Total
Female         375.67  1021.76  53.74   45.57   13.67  1510.41
Male           224.33  1098.24  96.26   54.43   16.33  1489.59
Total          600.00  2120.00  150.00  100.00  30.00  3000
This example demonstrates IPF's ability to calibrate estimates efficiently, with residuals dropping to near zero (e.g., ~0.000002 after five iterations). For a minimal two-way case, IPF on a seed matrix of uniform values (e.g., all 1s in a 2x2 table) under independence assumptions converges rapidly to expected frequencies matching marginals, often in one or two cycles per dimension, as row-column adjustments compound multiplicatively.
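The raking steps above can be verified with a short NumPy loop (a sketch; five cycles are more than enough for the tolerances quoted):

```python
import numpy as np

seed = np.array([[300., 1200., 60., 30., 30.],    # female
                 [150., 1080., 90., 30., 30.]])   # male
row_targets = np.array([1510., 1490.])
col_targets = np.array([600., 2120., 150., 100., 30.])

x = seed.copy()
for _ in range(5):
    x *= (row_targets / x.sum(axis=1))[:, None]   # scale rows to targets
    x *= (col_targets / x.sum(axis=0))[None, :]   # scale columns to targets

print(np.round(x, 2))  # ≈ the final table above
```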

Implementation in Software

Iterative proportional fitting (IPF) is commonly implemented in statistical programming environments through dedicated packages or custom procedures that automate the iterative adjustment of an initial matrix to prescribed marginals while minimizing divergence from the seed. These implementations typically incorporate convergence checks based on relative changes in cell values or marginal discrepancies, with user-specified tolerances (e.g., 1e-5) and maximum iterations (e.g., 1000) to prevent non-termination. In R, the mipfp package supports multidimensional IPF for arrays up to several dimensions, applying the classical algorithm via successive row and column (or higher-order margin) scalings, with options for estimation under log-linear models and handling of structural zeros. It extends the procedure to simulate multivariate Bernoulli or multinomial data by iteratively fitting to target marginal distributions, as detailed in its documentation for applications such as population estimation. The surveysd package offers an ipf() function tailored for survey calibration, adjusting weights to match multiple constraints via cyclic proportional updates, and has been employed in official statistics such as Austrian labor force surveys since at least 2010. Custom R scripts for IPF, often shared in spatial microsimulation contexts, replicate the core loop—alternating marginal fits until residuals fall below a threshold—but may lack robustness for high dimensions without vectorized operations. Python implementations include the ipfn library, which generalizes IPF to N-dimensional arrays using NumPy for efficient matrix operations, applying iterative scaling factors computed as target marginals divided by current sums, suitable for economic modeling and data reconciliation. This package handles sparse data via array operations and supports convergence monitoring through logged diagnostics, though it requires users to supply initial arrays without built-in structural zero enforcement. Discussions within the Python community have proposed integrating IPF into core libraries for broader accessibility, highlighting existing ad-hoc solutions' limitations in maintenance and performance for large datasets. In commercial systems like SAS, IPF is realized through PROC IML, which enables matrix-based programming for the biproportional adjustment loop, computing scaling vectors as marginal ratios and iterating until the chi-square statistic or maximum relative error stabilizes, as demonstrated in procedures for n-way table balancing. Cross-language tools, such as the humanleague C++ library with R and Python bindings, optimize IPF for microsynthesis by leveraging compiled code for fractional population generation, achieving faster convergence on high-dimensional problems compared to pure interpreted implementations. Across platforms, implementations emphasize stability via damping factors or Dinkelbach acceleration variants to mitigate oscillation in sparse or ill-conditioned cases, though users must validate outputs against theoretical guarantees for specific marginal sets.

Limitations and Criticisms

Practical Challenges and Failure Modes

One major practical challenge in applying iterative proportional fitting (IPF) arises from the presence of zero cells or weights in the seed matrix, as these cannot be adjusted during iterations, potentially preventing the algorithm from achieving the target marginals if positive values are required in those positions. Sampling zeros, common in sparse datasets like census microdata, further distort goodness-of-fit measures and can lead to inaccurate estimates, with impacts intensifying in smaller samples or finer geographic units. Inconsistent marginal totals across dimensions represent another failure mode: where the overall sums of row and column targets do not match, exact solutions are impossible, causing non-convergence or requiring a rescaling that alters the seed structure. High-dimensional or large sparse matrices exacerbate convergence problems, with IPF exhibiting slow iteration rates due to increased computational demands and difficulties in preserving multivariate dependencies, often requiring hundreds or thousands of cycles to reach tolerance thresholds. Numerical instability can emerge in implementations, particularly with extreme weights assigned to under-sampled categories, leading to overflow risks or disproportionate influence from rare observations; constraints like minimum/maximum bounds mitigate this but may prolong runtime or induce failure if overly restrictive. Limited data availability, such as sampled inputs, degrades performance, with fits improving monotonically only under specific conditions like larger samples, while geographic or structural variations (e.g., ecological fallacy effects) can yield poor local accuracy despite global convergence.

Empirical Performance Issues

In empirical applications, such as spatial microsimulation for small-area population estimation, the iterative proportional fitting (IPF) procedure often exhibits degraded performance due to empty cells in the seed table, which hinder convergence and inflate error metrics like root mean square error (RMSE); in one reported scenario, baseline RMSE reached 25.5, dropping to 1.24 after removing empty cells. Additional constraints, such as urban-rural categorizations, can further exacerbate inaccuracies, with RMSE rising to 283 in the same scenario, indicating IPF's sensitivity to over-constraining marginals beyond basic row and column totals. Convergence speed represents another persistent issue, particularly in high-dimensional or large contingency tables, where IPF is noted to proceed slowly despite eventual attainment of the maximum likelihood estimate under log-linear models; this slowness persists even as storage efficiency favors IPF over alternatives such as Newton-Raphson for tables with hundreds of parameters, such as a 10×10×10 structure requiring only minimal memory. In survey raking contexts akin to IPF, incorporating a large number of variables—often exceeding 10—leads to inefficient weighting, as the procedure struggles to balance numerous marginals without violating consistency requirements, such as mismatched aggregate totals across constraints. Integerization of IPF outputs for discrete applications, like demographic microsimulation, introduces further discrepancies, reducing overall fit quality as continuous adjustments are rounded, with empirical tests showing persistent deviations in total absolute error (TAE) and standardized errors even after 10+ iterations. These issues manifest in real-world census-like scenarios, where multidimensional data sparsity amplifies failure modes, prompting recommendations for preprocessing (e.g., zero removal) or hybrid implementations in languages like C for computational speedup, achieving 10- to 51-fold efficiency gains over base R versions.

Alternatives and Comparisons

Comparison with NM-Method

The NM-method, or Naszodi–Mendonca method, is a matrix adjustment procedure introduced in 2021 that transforms a given contingency table to match specified marginal totals while preserving a log-linear (LL) indicator, distinct from the odds-ratio preservation in iterative proportional fitting (IPF). Unlike IPF, which minimizes Kullback-Leibler (KL) divergence to estimate a full table from incomplete sample data, the NM-method minimizes a cumulative KL-divergence measure to construct counterfactual tables for comparing distinct populations under adjusted margins. IPF addresses problems of completing population tables, such as scaling a sample (e.g., object distributions by shape and material) to population marginals while assuming maximum likelihood under independence of margins, leading to convergence via iterative row and column scalings that maintain internal cell ratios. In contrast, the NM-method targets counterfactual construction, such as aligning tables from different time periods or groups (e.g., educational inter-marriages across generations) to enable causal-like comparisons by preserving natural rankings in the LL-indicator, which aligns with survey-based evidence on preferences rather than assuming sample-to-population scaling. Theoretically, IPF guarantees odds-ratio invariance and is optimal for likelihood-based estimation but can distort trend interpretations in comparisons; the NM-method ensures LL-preservation and faster empirical convergence in multi-way tables for counterfactuals, though it lacks IPF's established maximum likelihood justification for direct estimation tasks. Empirically, applications to U.S. Census Bureau data on educational inter-marriages (1980–2010) reveal divergent outcomes: IPF produces monotonous declines in social cohesion measures (e.g., reduced homogamy for late boomers in 1990 versus early boomers in 1980), implying steady weakening, whereas the NM-method yields hump-shaped patterns (e.g., increased cohesion for late boomers in 1990 but declines for late Generation X in 2010), corroborated by Pew Research Center's 2010 survey on marital preferences showing non-monotonic trends. The NM-method thus offers advantages in interpretability for policy or sociological comparisons by avoiding IPF's over-smoothing of generational shifts, but IPF remains preferable for pure estimation from samples due to its alignment with probabilistic models; neither method handles zero cells robustly without modifications, and NM-method implementations are less integrated in standard software compared to IPF. Selection between them depends on whether the goal is estimation (favoring IPF) or controlled comparison (favoring NM-method), with the latter providing empirically validated realism in non-monotonic dynamics.

Other Fitting Procedures

The maximum likelihood method for fitting contingency tables to prescribed margins involves estimating parameters of a log-linear model that maximizes the likelihood under the observed cell frequencies and marginal constraints, often solved via Newton-Raphson iteration or generalized EM algorithms. This approach assumes a Poisson or multinomial sampling model and yields estimates asymptotically equivalent to those from IPF under independence of the margins, but diverges when margins are dependent, providing a distinct solution that accounts for sampling variability. Minimum chi-square methods minimize the Pearson chi-square statistic between the fitted and seed table subject to margin constraints, using quadratic programming or iterative scaling variants; the standard version penalizes deviations quadratically, while the modified chi-square adjusts weights to emphasize relative errors in sparse cells. These procedures produce solutions closer to the seed table than IPF in cases of high sparsity or structural zeros, as they explicitly optimize a goodness-of-fit measure rather than relying on multiplicative adjustments. Empirical comparisons in multidimensional settings show minimum chi-square outperforming IPF in convergence speed for ill-conditioned tables, though it requires solving nonlinear optimizations that scale poorly beyond three dimensions without specialized software. Simulated annealing, a stochastic optimization technique, iteratively perturbs cell values to match margins while minimizing an energy function (e.g., deviation from seed proportions), accepting worse moves probabilistically to escape local minima and enforce integer outputs. Unlike deterministic methods like IPF, it handles discrete constraints natively, making it suitable for population synthesis in spatial microsimulation where fractional households are invalid; however, it demands tuning of cooling schedules and initial conditions, with runtime increasing exponentially in table dimensionality. Studies in transportation modeling report simulated annealing yielding unbiased marginal fits with lower variance than integerized IPF for small samples, but at higher computational cost.
