Recent from talks
Nothing was collected or created yet.
Multinomial theorem
View on WikipediaThis article needs additional citations for verification. (December 2022) |
In mathematics, the multinomial theorem describes how to expand a power of a sum in terms of powers of the terms in that sum. It is the generalization of the binomial theorem from binomials to multinomials.
Theorem
[edit]For any positive integer m and any non-negative integer n, the multinomial theorem describes how a sum with m terms expands when raised to the nth power: where is a multinomial coefficient.[1] The sum is taken over all combinations of nonnegative integer indices k1 through km such that the sum of all ki is n. That is, for each term in the expansion, the exponents of the xi must add up to n.[2][a]
In the case m = 2, this statement reduces to that of the binomial theorem.[2]
Example
[edit]The third power of the trinomial a + b + c is given by This can be computed by hand using the distributive property of multiplication over addition and combining like terms, but it can also be done (perhaps more easily) with the multinomial theorem. It is possible to "read off" the multinomial coefficients from the terms by using the multinomial coefficient formula. For example, the term has coefficient , the term has coefficient , and so on.
Alternate expression
[edit]The statement of the theorem can be written concisely using multiindices:
where
and
Proof
[edit]This proof of the multinomial theorem uses the binomial theorem and induction on m.
First, for m = 1, both sides equal x1n since there is only one term k1 = n in the sum. For the induction step, suppose the multinomial theorem holds for m. Then
by the induction hypothesis. Applying the binomial theorem to the last factor,
which completes the induction. The last step follows because
as can easily be seen by writing the three coefficients using factorials as follows:
Multinomial coefficients
[edit]The numbers
appearing in the theorem are the multinomial coefficients. They can be expressed in numerous ways, including as a product of binomial coefficients or of factorials:
Sum of all multinomial coefficients
[edit]The substitution of xi = 1 for all i into the multinomial theorem
gives immediately that
Number of multinomial coefficients
[edit]The number of terms in a multinomial sum, #n,m, is equal to the number of monomials of degree n on the variables x1, …, xm:
The count can be performed easily using the method of stars and bars.
Valuation of multinomial coefficients
[edit]The largest power of a prime p that divides a multinomial coefficient may be computed using a generalization of Kummer's theorem.
Asymptotics
[edit]By Stirling's approximation, or equivalently the log-gamma function's asymptotic expansion, so for example,
Interpretations
[edit]Ways to put objects into bins
[edit]The multinomial coefficients have a direct combinatorial interpretation, as the number of ways of depositing n distinct objects into m distinct bins, with k1 objects in the first bin, k2 objects in the second bin, and so on.[3]
Number of ways to select according to a distribution
[edit]In statistical mechanics and combinatorics, if one has a number distribution of labels, then the multinomial coefficients naturally arise from the binomial coefficients. Given a number distribution {ni} on a set of N total items, ni represents the number of items to be given the label i. (In statistical mechanics i is the label of the energy state.)
The number of arrangements is found by
- Choosing n1 of the total N to be labeled 1. This can be done ways.
- From the remaining N − n1 items choose n2 to label 2. This can be done ways.
- From the remaining N − n1 − n2 items choose n3 to label 3. Again, this can be done ways.
Multiplying the number of choices at each step results in:
Cancellation results in the formula given above.
Number of unique permutations of words
[edit]
The multinomial coefficient
is also the number of distinct ways to permute a multiset of n elements, where ki is the multiplicity of each of the ith element. For example, the number of distinct permutations of the letters of the word MISSISSIPPI, which has 1 M, 4 Is, 4 Ss, and 2 Ps, is
Generalized Pascal's triangle
[edit]One can use the multinomial theorem to generalize Pascal's triangle or Pascal's pyramid to Pascal's simplex. This provides a quick way to generate a lookup table for multinomial coefficients.
A related structure is the multinomial triangle, or generalized Pascal triangle of order m, which may be constructed using the recurrence relation: from which Pascal's rule is recovered when . These multinomial coefficients can be written as closed-form expressions with bounded integer compositions:
and without:[4] (sequence A008287 in the OEIS)
See also
[edit]References
[edit]- ^ As with the binomial theorem, quantities of the form x0 that appear are taken to equal 1, even when x equals zero.
- ^ Aigner, Martin (1997), Combinatorial Theory, Springer, p. 77
- ^ a b Stanley, Richard (2012), Enumerative Combinatorics, vol. 1 (2 ed.), Cambridge University Press, §1.2
- ^ National Institute of Standards and Technology (May 11, 2010). "NIST Digital Library of Mathematical Functions". Section 26.4. Retrieved August 30, 2010.
- ^ Belbachir, H.; Bouroubi, S.; Khelladi, A. (2008), "Connection between ordinary multinomials, Fibonacci numbers, Bell polynomials and discrete uniform distribution", Annales Mathematicae et Informaticae, 35: 24 https://arxiv.org/abs/0708.2195
Multinomial theorem
View on GrokipediaStatement and Coefficients
Theorem Statement
The multinomial theorem generalizes the binomial theorem to the expansion of a power of a sum involving an arbitrary number of terms.[1] For non-negative integer and indeterminates , the theorem states that where the sum is taken over all ordered -tuples of non-negative integers satisfying the condition .[6][1] This summation can be expressed more compactly using multi-index notation, where a multi-index is a vector of non-negative integers with , and the coefficient is .[6]Multinomial Coefficients
The multinomial coefficient is denoted by , where and the are non-negative integers satisfying .[7] This notation generalizes the binomial coefficient, which corresponds to the case .[8] The explicit formula for the multinomial coefficient is This expression arises from the factorial representation of permutations adjusted for repeated elements.[7][9] The multinomial coefficient relates recursively to binomial coefficients through the product identity which reflects a sequential partitioning of the total into the subgroups of sizes .[7] This chained product of binomials equals the direct multinomial formula for any such partition.[7] Each multinomial coefficient corresponds uniquely to a multi-index consisting of non-negative integers that sum to , distinguishing it within the expansion of the multinomial theorem.[8]Examples and Alternate Forms
Basic Example
A basic illustration of the multinomial theorem is the expansion of . The theorem states that this equals the sum over all non-negative integers with of the terms , where the multinomial coefficients serve as multipliers for each monomial.[10] To compute step-by-step, consider the possible exponent combinations:- For :
- For :
- For :
- For :
- For :
- For :
- For :
- For :
- For :
- For :
Alternate Expressions
One common notational variant of the multinomial theorem employs multi-index notation, where a multi-index is an -tuple of non-negative integers satisfying . In this form, the theorem states that where and .[11] This symmetric expression compactly captures the expansion by summing over all multi-indices of total order , emphasizing the combinatorial structure of the coefficients. An equivalent expression utilizes falling factorials, defined as for non-negative integers and . Since the exponents satisfy , it follows that , yielding This form highlights the sequential selection interpretation underlying the multinomial coefficients, akin to chained binomial expansions.[12] The multinomial theorem extends naturally to the ring of formal power series over a commutative ring, where the finite expansion holds identically as a polynomial in the formal power series ring , without requiring convergence.[13] This context is particularly useful for algebraic manipulations in enumerative combinatorics, treating the variables as indeterminates. The theorem also relates to exponential generating functions as a foundational setup: the exponential generating function encodes the multinomial expansions, where the coefficient of is precisely .[13]Proofs
Combinatorial Proof
The combinatorial proof of the multinomial theorem proceeds by equating two ways of counting the number of sequences of length over an alphabet of symbols, where each symbol corresponds to one of the variables . One the one hand, each position in the sequence can be assigned any of the symbols independently, yielding a total of possible sequences. On the other hand, group these sequences by their type multiplicities: for a fixed multi-index with and each , the sequences with exactly occurrences of symbol (for each ) contribute to the monomial . The number of such sequences is the number of ways to assign the positions to symbols, accounting for indistinguishability within each symbol type: choose positions out of for symbol 1, then out of the remaining for symbol 2, and so on, giving Thus, the coefficient of in the expansion is .[13] Summing over all multi-indices with accounts for every possible sequence exactly once, so the full expansion of is , where the sum is over all such multi-indices. This matches the total count of sequences, confirming the theorem.[13] Equivalently, the multinomial coefficient counts the number of distinct permutations of a multiset consisting of indistinguishable items of type (for each ), which aligns with the sequential position assignments above.[14]Generating Function Proof
The generating function proof of the multinomial theorem utilizes exponential generating functions and formal power series to derive the expansion of . Consider the exponential generating function , which expands as .[15] This can equivalently be expressed as the product , where each factor . The product of these series yields .[15] To extract the coefficient of , group terms where . For such terms, , so the coefficient is . Equating this to the earlier expansion gives .[15] This derivation holds in the ring of formal power series, where operations are algebraic and convergence is not required, ensuring the equality is valid for coefficient extraction regardless of the specific values of the variables.[15]Properties of Coefficients
Sums and Identities
The sum of all multinomial coefficients over all non-negative integers equals . This identity arises directly from the multinomial theorem by substituting for each , yielding on the left side.[16] The number of such multinomial coefficients, corresponding to the number of distinct terms in the multinomial expansion, is the number of non-negative integer solutions to , which is . This count reflects the dimension of the space of homogeneous polynomials of degree in variables. A key identity involving sums of products of multinomial coefficients is the multinomial Vandermonde convolution, which generalizes the binomial case. It states that where the sum is over all non-negative integers such that for each and . This convolution arises in the product of two multinomial expansions and has applications in generating functions. Sums of multinomial coefficients weighted by powers related to subsets of variables provide additional identities. Consider a subset ; the sum , taken over , equals . This follows from setting for and otherwise in the multinomial theorem, establishing a direct relation to powers of linear combinations.[16]Asymptotic Behavior
The asymptotic behavior of multinomial coefficients for large depends on the regime of the . When the are proportional to , with fixed and , Stirling's approximation applied to the factorials yields , where is the Shannon entropy function.[17] This approximation captures the leading exponential growth, with the maximum occurring at the uniform distribution for fixed . The local central limit theorem provides a refined approximation for the multinomial distribution, describing concentration around the mean for specified probabilities , including the uniform case. Specifically, for near , the probability mass , where is the covariance matrix, with error terms explicit up to order .[18] This implies corresponding asymptotics for the coefficients via normalization by . Ratios of consecutive multinomial coefficients, such as those differing by one in a single , admit bounds and asymptotics expressible via ratios of gamma functions for large . Error terms in these approximations vary by regime: for , the relative error in the Stirling-entropy formula is , improving with higher-order Stirling expansions; for fixed with and the remainder large, , with error . These distinguish the entropic growth for balanced partitions from the polynomial regime for sparse ones.p-adic Valuation
The p-adic valuation of the multinomial coefficient , where and the are nonnegative integers, is defined as the highest exponent of a prime dividing this coefficient.[19] It is given by the formula where denotes the sum of the digits of its argument when expressed in base .[19] This expression arises directly from applying de Polignac's formula (also known as Legendre's formula) to the factorial valuations in the definition , since .[20] An equivalent formulation is provided by the generalization of Kummer's theorem to multinomial coefficients: the valuation equals the total number of carries that occur when adding the base- expansions of to obtain the base- expansion of .[21] This count is independent of the order of addition or any parenthesization of the sum.[21] For instance, consider and , so . In base 2, , , , and . Adding the least significant digits: (write 0, carry 1); next digits: (write 0, carry 1); highest digit: (no carry). Thus, there are two carries, matching the valuation.[21] This valuation connects to Lucas' theorem for multinomial coefficients modulo , which states that if the base- digits of are and of the are with , then where each smaller multinomial is computed over integers less than (hence directly).[22] The coefficient is divisible by (i.e., congruent to 0 modulo ) precisely when there is at least one carry in the base- addition for some digit position , as that makes at least one factor since would be impossible without carry-over.[22] For example, with and , the base-3 digits are all 1 for the and 3 = for ; adding three 1's in the units place gives (one carry), so the units multinomial (invalid), confirming divisibility by 3.[22]Combinatorial Interpretations
Objects into Bins
The multinomial coefficient counts the number of ways to distribute distinct objects into distinct bins such that the -th bin receives exactly objects, where .[4] This combinatorial interpretation arises from partitioning a set of distinct elements into an ordered sequence of labeled subsets of the specified sizes.[4] The bins are distinct (labeled), which distinguishes this from unordered partitions where the subsets are indistinguishable except for their contents. This setup extends the stars-and-bars method, traditionally used for distributing indistinguishable objects into distinct bins without size restrictions (yielding ways, allowing empty bins). In the multinomial case, the fixed sizes and distinct objects adjust the analogy to count ordered partitions of the set, emphasizing the labeled nature of the bins over mere numerical compositions of the integer . For indistinguishable objects with fixed sizes, there would be only one such distribution per labeling, but the distinguishability of objects introduces the factorial structure in the coefficient.[23] A useful visualization involves lining up the distinct objects in a row (in possible orders) and inserting dividers to separate them into consecutive segments of lengths , assigning each segment to the corresponding labeled bin. Since the order within each bin does not matter, divide by for each segment to account for the internal arrangements, yielding the multinomial coefficient. This linear arrangement highlights the ordered partitioning, contrasting with unordered set partitions that would require additional division by symmetries among equal-sized subsets.Permutations of Multisets
The multinomial coefficient gives the number of distinct permutations of a multiset with types of elements, where there are identical elements of type for each and .[7] This formula arises because the total arrangements of items treating all as distinct would be , but identical items within each type lead to overcounting by a factor of for each type, which must be divided out.[12] A classic example is the word "MISSISSIPPI," which forms a multiset with 1 M, 4 I's, 4 S's, and 2 P's (). The number of distinct rearrangements is .[24] This counting principle connects to symmetry groups via the action of the symmetric group on the set of all sequences of length with the specified multiplicities. By the orbit-stabilizer theorem, the size of the orbit corresponding to the distinct permutations equals divided by the order of the stabilizer subgroup, which is the direct product of order .[25] The interpretation extends to general arrangements in sequences, where the multinomial coefficient quantifies the ways to order elements under repetition constraints, such as in forming distinct strings or linear compositions from repeated components.[26] This is combinatorially equivalent to assigning distinct positions to the types with fixed counts per type.[27]Pascal's Triangle Generalization
The multinomial triangle serves as a higher-dimensional analog to Pascal's triangle, extending the binomial structure to account for expansions involving m variables. In this array, rows are indexed by the non-negative integer n, with entries corresponding to all non-negative integer compositions , typically ordered lexicographically. There are entries in row n. Each multinomial coefficient in row n is computed recursively as the sum of the up to m predecessor coefficients from row n-1 obtained by decreasing exactly one by 1 (for to ), provided .[7] A key property of the multinomial triangle arises from the multinomial theorem: the sum of all entries in row n equals , reflecting the evaluation of at . This generalizes the fact that rows in Pascal's triangle (m=2) sum to . For visualization, consider m=3, where the row for n=2 in lexicographic order is [, , , , , ] = [1, 2, 2, 1, 2, 1], summing to 9 = 3^2. The multinomial triangle inherits and extends several properties from Pascal's triangle, including symmetry and generalized summation identities. Symmetry manifests in the equality for any permutation of the indices, ensuring the array is invariant under reordering of the parts.Applications
Probability Distributions
The multinomial distribution arises in probability theory as a generalization of the binomial distribution to scenarios involving multiple categories. It models the outcomes of independent trials, each with possible results having probabilities where . Let denote the counts of occurrences for each category, with . The probability mass function is given by where satisfies and are integers, and the multinomial coefficient is .[23][28] This probability mass function derives its validity directly from the multinomial theorem. The theorem states that , where the sum is over all non-negative integers with . Since , the left side equals , confirming that the probabilities sum to unity and normalize the distribution.[23] The moments of the multinomial distribution reflect its structure as counts from categorized trials. The expected value for each component is . The variance of each component is , and the covariance between distinct components is for , indicating negative dependence since the total count is fixed at .[23][28] As a natural extension of the binomial distribution, which corresponds to the case , the multinomial arises from independent trials each with outcomes. Moreover, the marginal distribution of any single follows a binomial distribution , obtained by grouping all other categories as a single "failure" outcome.[23][28]Generating Functions in Algebra
The multinomial theorem provides a foundational expansion for ordinary generating functions in multiple variables, particularly for fixed degree . It states that where is the multinomial coefficient. This equation directly represents the ordinary generating function whose coefficients are the multinomial coefficients, enabling algebraic manipulation and coefficient extraction in polynomial rings. Extending to an infinite series, the generating function incorporates the theorem's expansion term by term, facilitating the study of formal power series and their algebraic properties in combinatorics.[29] In the realm of exponential generating functions, the multinomial theorem plays a crucial role in enumerating labeled algebraic structures through series expansions. The key identity is where the inner power is expanded via the multinomial theorem to yield coefficients for each monomial term with . This form arises in the multiplication principle for exponential generating functions: the product of such functions has coefficients given by \sum_{\substack{i_1 + \dots + i_d = n \\ i_j \geq 0}} \binom{n}{i_1, \dots, i_d} a_1^{(1)}_{i_1} \cdots a_d^{(d)}_{i_d}, directly invoking the theorem to partition labels among components in algebraic compositions.[30][31] These expansions extend to applications in combinatorial enumeration, such as species theory and cycle indices. In combinatorial species, the exponential formula constructs generating functions for composite structures, where multinomial coefficients account for labeled partitions into connected components, as in the EGF for assemblies of typed atoms. Similarly, the cycle index of the symmetric group , used in algebraic enumeration of symmetries, groups terms by cycle types via multinomial-like factorials: the number of permutations with cycle lengths specified by multiplicities is , derived from partitioning elements. The theorem thus enables inversion by expanding powers to isolate specific coefficients in these series, supporting algebraic derivations in enumeration problems.[31]Computational Aspects
The multinomial coefficient for can be computed directly as the product over successive binomial coefficients, avoiding the direct evaluation of large factorials and that may cause intermediate overflow. This recursive decomposition leverages efficient algorithms for individual binomial coefficients, such as the multiplicative formula that computes in arithmetic operations by iteratively multiplying terms for to .[32] For computing multiple multinomial coefficients or the full expansion of , dynamic programming algorithms construct a table of intermediate values using the recurrence relation , with memoization to avoid redundant calculations; this achieves time complexity , equivalent to operations per coefficient when is the number of terms and the table is built sequentially by adding one unit at a time.[33] Such approaches are particularly useful for generating all coefficients in the expansion efficiently, especially when is small relative to . Symbolic mathematics libraries provide built-in functions for exact computation and expansion under the multinomial theorem. In SymPy, themultinomial function computes coefficients symbolically, while the expand method with multivariate polynomials handles full expansions; these support arbitrary precision via the underlying MPmath library.[34] Similarly, Mathematica's Multinomial[{k1, k2, ..., km}] evaluates coefficients exactly, and Expand[(x1 + x2 + ... + xm)^n] generates the full multinomial expansion, with built-in support for high-precision arithmetic.[35]
Computing multinomial coefficients for large poses challenges due to their rapid growth, often exceeding fixed-precision integer limits and causing overflow. To address this, multiprecision arithmetic libraries such as GMP in C++ or Python's built-in int type (with arbitrary precision) are employed to store and manipulate the large integers required. Alternatively, logarithmic forms compute using summed log-gamma functions for approximation without overflow, suitable for probabilistic applications. For extremely large , where exact computation is infeasible, Stirling's approximation extended to multinomials provides estimates via , drawing on asymptotic behavior for scaling and validation.[36]