Recent from talks
Contribute something
Nothing was collected or created yet.
Multiset
View on WikipediaIn mathematics, a multiset (or bag, or mset) is a modification of the concept of a set that, unlike a set,[1] allows for multiple instances for each of its elements. The number of instances given for each element is called the multiplicity of that element in the multiset. As a consequence, an infinite number of multisets exist that contain only elements a and b, but vary in the multiplicities of their elements:
- The set {a, b} contains only elements a and b, each having multiplicity 1 when {a, b} is seen as a multiset.
- In the multiset {a, a, b}, the element a has multiplicity 2, and b has multiplicity 1.
- In the multiset {a, a, a, b, b, b}, a and b both have multiplicity 3.
These objects are all different when viewed as multisets, although they are the same set, since they all consist of the same elements. As with sets, and in contrast to tuples, the order in which elements are listed does not matter in discriminating multisets, so {a, a, b} and {a, b, a} denote the same multiset. To distinguish between sets and multisets, a notation that incorporates square brackets is sometimes used: the multiset {a, a, b} can be denoted by [a, a, b].[2]
The cardinality or "size" of a multiset is the sum of the multiplicities of all its elements. For example, in the multiset {a, a, b, b, b, c} the multiplicities of the members a, b, and c are respectively 2, 3, and 1, and therefore the cardinality of this multiset is 6.
Nicolaas Govert de Bruijn coined the word multiset in the 1970s, according to Donald Knuth.[3]: 694 However, the concept of multisets predates the coinage of the word multiset by many centuries. Knuth himself attributes the first study of multisets to the Indian mathematician Bhāskarāchārya, who described permutations of multisets around 1150. Other names have been proposed or used for this concept, including list, bunch, bag, heap, sample, weighted set, collection, and suite.[3]: 694
History
[edit]Wayne Blizard traced multisets back to the very origin of numbers, arguing that "in ancient times, the number n was often represented by a collection of n strokes, tally marks, or units."[4] These and similar collections of objects can be regarded as multisets, because strokes, tally marks, or units are considered indistinguishable. This shows that people implicitly used multisets even before mathematics emerged.
Practical needs for this structure have caused multisets to be rediscovered several times, appearing in literature under different names.[5]: 323 For instance, they were important in early AI languages, such as QA4, where they were referred to as bags, a term attributed to Peter Deutsch.[6] A multiset has been also called an aggregate, heap, bunch, sample, weighted set, occurrence set, and fireset (finitely repeated element set).[5]: 320 [7]
Although multisets were used implicitly from ancient times, their explicit exploration happened much later. The first known study of multisets is attributed to the Indian mathematician Bhāskarāchārya circa 1150, who described permutations of multisets.[3]: 694 The work of Marius Nizolius (1498–1576) contains another early reference to the concept of multisets.[8] Athanasius Kircher found the number of multiset permutations when one element can be repeated.[9] Jean Prestet published a general rule for multiset permutations in 1675.[10] John Wallis explained this rule in more detail in 1685.[11]
Multisets appeared explicitly in the work of Richard Dedekind.[12][13]
Other mathematicians formalized multisets and began to study them as precise mathematical structures in the 20th century. For example, Hassler Whitney (1933) described generalized sets ("sets" whose characteristic functions may take any integer value: positive, negative or zero).[5]: 326 [14]: 405 Monro (1987) investigated the category Mul of multisets and their morphisms, defining a multiset as a set with an equivalence relation between elements "of the same sort", and a morphism between multisets as a function that respects sorts. He also introduced a multinumber: a function f (x) from a multiset to the natural numbers, giving the multiplicity of element x in the multiset. Monro argued that the concepts of multiset and multinumber are often mixed indiscriminately, though both are useful.[5]: 327–328 [15]
Examples
[edit]One of the simplest and most natural examples is the multiset of prime factors of a natural number n. Here the underlying set of elements is the set of prime factors of n. For example, the number 120 has the prime factorization which gives the multiset {2, 2, 2, 3, 5}.
A related example is the multiset of solutions of an algebraic equation. A quadratic equation, for example, has two solutions. However, in some cases they are both the same number. Thus the multiset of solutions of the equation could be {3, 5}, or it could be {4, 4}. In the latter case it has a solution of multiplicity 2. More generally, the fundamental theorem of algebra asserts that the complex solutions of a polynomial equation of degree d always form a multiset of cardinality d.
A special case of the above are the eigenvalues of a matrix, whose multiplicity is usually defined as their multiplicity as roots of the characteristic polynomial. However two other multiplicities are naturally defined for eigenvalues, their multiplicities as roots of the minimal polynomial, and the geometric multiplicity, which is defined as the dimension of the kernel of A − λI (where λ is an eigenvalue of the matrix A). These three multiplicities define three multisets of eigenvalues, which may be all different: Let A be a n × n matrix in Jordan normal form that has a single eigenvalue. Its multiplicity is n, its multiplicity as a root of the minimal polynomial is the size of the largest Jordan block, and its geometric multiplicity is the number of Jordan blocks.
Definition
[edit]A multiset may be formally defined as an ordered pair (U, m) where U is a set called a universe or the underlying set, and is a function from U to the nonnegative integers. The value for an element is called the multiplicity of in the multiset and intepreted as the number of occurrences of in the multiset.
The support of a multiset is the subset of formed by the elements such that . A finite multiset is a multiset with a finite support. Most authors define multisets as finite multisets. This is the case in this article, where, unless otherwise stated, all multisets are finite multisets.
Some authors[16] define multisets with the additional constraint that for every , or, equivalently, the support equals the underlying set. Multisets with infinite multiplicities have also been studied;[17] they are not considered in this article. Some authors[who?] define a multiset in terms of a finite index set and a function where the multiplicity of an element is , the number of elements of that get mapped to by .
Multisets may be represented as sets, with some elements repeated. For example, the multiset with support and multiplicity function such that can be represented as {a, a, b}. A more compact notation, in case of high multiplicities is for the same multiset.
If a multiset with support included in is often represented as to which the computation rules of indeterminates can be applied; that is, exponents 1 and factors with exponent 0 can be removed, and the multiset does not depend on the order of the factors. This allows extending the notation to infinite underlying sets as An advantage of notation is that it allows using the notation without knowing the exact support. For example, the prime factors of a natural number form a multiset such that
The finite subsets of a set are exactly the multisets with underlying set , such that for every .
Basic properties and operations
[edit]Elements of a multiset are generally taken in a fixed set U, sometimes called a universe, which is often the set of natural numbers. An element of U that does not belong to a given multiset is said to have a multiplicity 0 in this multiset. This extends the multiplicity function of the multiset to a function from U to the set of non-negative integers. This defines a one-to-one correspondence between these functions and the multisets that have their elements in U.
This extended multiplicity function is commonly called simply the multiplicity function, and suffices for defining multisets when the universe containing the elements has been fixed. This multiplicity function is a generalization of the indicator function of a subset, and shares some properties with it.
The support of a multiset in a universe U is the underlying set of the multiset. Using the multiplicity function , it is characterized as
A multiset is finite if its support is finite, or, equivalently, if its cardinality is finite. The empty multiset is the unique multiset with an empty support (underlying set), and thus a cardinality 0.
The usual operations of sets may be extended to multisets by using the multiplicity function, in a similar way to using the indicator function for subsets. In the following, A and B are multisets in a given universe U, with multiplicity functions and
- Inclusion: A is included in B, denoted A ⊆ B, if
- Union: the union (called, in some contexts, the maximum or lowest common multiple) of A and B is the multiset C with multiplicity function[13]
- Intersection: the intersection (called, in some contexts, the infimum or greatest common divisor) of A and B is the multiset C with multiplicity function
- Sum: the sum of A and B is the multiset C with multiplicity function It may be viewed as a generalization of the disjoint union of sets. It defines a commutative monoid structure on the finite multisets in a given universe. This monoid is a free commutative monoid, with the universe as a basis.
- Difference: the difference of A and B is the multiset C with multiplicity function
Two multisets are disjoint if their supports are disjoint sets. This is equivalent to saying that their intersection is the empty multiset or that their sum equals their union.
There is an inclusion–exclusion principle for finite multisets (similar to the one for sets), stating that a finite union of finite multisets is the difference of two sums of multisets: in the first sum we consider all possible intersections of an odd number of the given multisets, while in the second sum we consider all possible intersections of an even number of the given multisets.[citation needed]
Counting multisets
[edit]
and 3-multisets with elements from a 5-set (right)
So this illustrates that
The number of multisets of cardinality k, with elements taken from a finite set of cardinality n, is sometimes called the multiset coefficient or multiset number. This number is written by some authors as , a notation that is meant to resemble that of binomial coefficients; it is used for instance in (Stanley, 1997), and could be pronounced "n multichoose k" to resemble "n choose k" for Like the binomial distribution that involves binomial coefficients, there is a negative binomial distribution in which the multiset coefficients occur. Multiset coefficients should not be confused with the multinomial coefficients that occur in the multinomial theorem.
The value of multiset coefficients can be given explicitly as where the second expression is as a binomial coefficient;[a] many authors in fact avoid separate notation and just write binomial coefficients. So, the number of such multisets is the same as the number of subsets of cardinality k of a set of cardinality n + k − 1. The analogy with binomial coefficients can be stressed by writing the numerator in the above expression as a rising factorial power to match the expression of binomial coefficients using a falling factorial power:
For example, there are 4 multisets of cardinality 3 with elements taken from the set {1, 2} of cardinality 2 (n = 2, k = 3), namely {1, 1, 1}, {1, 1, 2}, {1, 2, 2}, {2, 2, 2}. There are also 4 subsets of cardinality 3 in the set {1, 2, 3, 4} of cardinality 4 (n + k − 1), namely {1, 2, 3}, {1, 2, 4}, {1, 3, 4}, {2, 3, 4}.
One simple way to prove the equality of multiset coefficients and binomial coefficients given above involves representing multisets in the following way. First, consider the notation for multisets that would represent {a, a, a, a, a, a, b, b, c, c, c, d, d, d, d, d, d, d} (6 as, 2 bs, 3 cs, 7 ds) in this form:
- • • • • • • | • • | • • • | • • • • • • •
This is a multiset of cardinality k = 18 made of elements of a set of cardinality n = 4. The number of characters including both dots and vertical lines used in this notation is 18 + 4 − 1. The number of vertical lines is 4 − 1. The number of multisets of cardinality 18 is then the number of ways to arrange the 4 − 1 vertical lines among the 18 + 4 − 1 characters, and is thus the number of subsets of cardinality 4 − 1 of a set of cardinality 18 + 4 − 1. Equivalently, it is the number of ways to arrange the 18 dots among the 18 + 4 − 1 characters, which is the number of subsets of cardinality 18 of a set of cardinality 18 + 4 − 1. This is thus is the value of the multiset coefficient and its equivalencies:
From the relation between binomial coefficients and multiset coefficients, it follows that the number of multisets of cardinality k in a set of cardinality n can be written Additionally,
Recurrence relation
[edit]A recurrence relation for multiset coefficients may be given as with
The above recurrence may be interpreted as follows. Let be the source set. There is always exactly one (empty) multiset of size 0, and if n = 0 there are no larger multisets, which gives the initial conditions.
Now, consider the case in which n, k > 0. A multiset of cardinality k with elements from [n] might or might not contain any instance of the final element n. If it does appear, then by removing n once, one is left with a multiset of cardinality k − 1 of elements from [n], and every such multiset can arise, which gives a total of possibilities.
If n does not appear, then our original multiset is equal to a multiset of cardinality k with elements from [n − 1], of which there are
Thus,
Generating series
[edit]The generating function of the multiset coefficients is very simple, being As multisets are in one-to-one correspondence with monomials, is also the number of monomials of degree d in n indeterminates. Thus, the above series is also the Hilbert series of the polynomial ring
As is a polynomial in n, it and the generating function are well defined for any complex value of n.
Generalization and connection to the negative binomial series
[edit]The multiplicative formula allows the definition of multiset coefficients to be extended by replacing n by an arbitrary number α (negative, real, or complex):
With this definition one has a generalization of the negative binomial formula (with one of the variables set to 1), which justifies calling the negative binomial coefficients:
This Taylor series formula is valid for all complex numbers α and X with |X| < 1. It can also be interpreted as an identity of formal power series in X, where it actually can serve as definition of arbitrary powers of series with constant coefficient equal to 1; the point is that with this definition all identities hold that one expects for exponentiation, notably
and formulas such as these can be used to prove identities for the multiset coefficients.
If α is a nonpositive integer n, then all terms with k > −n are zero, and the infinite series becomes a finite sum. However, for other values of α, including positive integers and rational numbers, the series is infinite.
Applications
[edit]Multisets have various applications.[7] They are becoming fundamental in combinatorics.[18][19][20][21] Multisets have become an important tool in the theory of relational databases, which often uses the synonym bag.[22][23][24] For instance, multisets are often used to implement relations in database systems. In particular, a table (without a primary key) works as a multiset, because it can have multiple identical records. Similarly, SQL operates on multisets and returns identical records. For instance, consider "SELECT name FROM Student". In the case that there are multiple records with name "Sara" in the student table, all of them are shown. That means the result of an SQL query is a multiset; if the result were instead a set, the repetitive records in the result set would have been eliminated. Another application of multisets is in modeling multigraphs. In multigraphs there can be multiple edges between any two given vertices. As such, the entity that specifies the edges is a multiset, and not a set.
There are also other applications. For instance, Richard Rado used multisets as a device to investigate the properties of families of sets. He wrote, "The notion of a set takes no account of multiple occurrence of any one of its members, and yet it is just this kind of information that is frequently of importance. We need only think of the set of roots of a polynomial f (x) or the spectrum of a linear operator."[5]: 328–329
Generalizations
[edit]Different generalizations of multisets have been introduced, studied and applied to solving problems.
- Signed multisets (in which multiplicity of an element can be any integer)[25]
- Real-valued multisets (in which multiplicity of an element can be any real number)[26]
- Fuzzy multisets[27]
- Rough multisets[28]
- Hybrid sets[29]
- Multisets whose multiplicity is any real-valued step function[30]
- Soft multisets[31]
- Soft fuzzy multisets[32]
- Named sets (unification of all generalizations of sets)[33][34][35][36]
See also
[edit]- Frequency (statistics) as multiplicity analog
- Quasi-sets
- Set theory
- Bag-of-words model
Learning materials related to Partitions of multisets at Wikiversity
Notes
[edit]- ^ The formula does not work for n = 0 (where necessarily also k = 0) if viewed as an ordinary binomial coefficient since it evaluates to , however the formula n(n+1)(n+2)...(n+k−1)/k! does work in this case because the numerator is an empty product giving 1/0! = 1. However does make sense for n = k = 0 if interpreted as a generalized binomial coefficient; indeed seen as a generalized binomial coefficient equals the extreme right-hand side of the above equation.
References
[edit]- ^ Cantor, Georg; Jourdain, Philip E.B. (Translator) (1895). "beiträge zur begründung der transfiniten Mengenlehre" [contributions to the founding of the theory of transfinite numbers]. Mathematische Annalen (in German). xlvi, xlix. New York Dover Publications (1954 English translation): 481–512, 207–246. Archived from the original on 2011-06-10.
By a set (Menge) we are to understand any collection into a whole (Zusammenfassung zu einem Gansen) M of definite and separate objects m (p.85)
- ^ Hein, James L. (2003). Discrete mathematics. Jones & Bartlett Publishers. pp. 29–30. ISBN 0-7637-2210-3.
- ^ a b c Knuth, Donald E. (1998). Seminumerical Algorithms. The Art of Computer Programming. Vol. 2 (3rd ed.). Addison Wesley. ISBN 0-201-89684-2.
- ^ Blizard, Wayne D (1989). "Multiset theory". Notre Dame Journal of Formal Logic. 30 (1): 36–66. doi:10.1305/ndjfl/1093634995.
- ^ a b c d e Blizard, Wayne D. (1991). "The Development of Multiset Theory". Modern Logic. 1 (4): 319–352.
- ^ Rulifson, J. F.; Derkson, J. A.; Waldinger, R. J. (November 1972). QA4: A Procedural Calculus for Intuitive Reasoning (Technical report). SRI International. 73.
- ^ a b Singh, D.; Ibrahim, A. M.; Yohanna, T.; Singh, J. N. (2007). "An overview of the applications of multisets". Novi Sad Journal of Mathematics. 37 (2): 73–92.
- ^ Angelelli, I. (1965). "Leibniz's misunderstanding of Nizolius' notion of 'multitudo'". Notre Dame Journal of Formal Logic (6): 319–322.
- ^ Kircher, Athanasius (1650). Musurgia Universalis. Rome: Corbelletti.
- ^ Prestet, Jean (1675). Elemens des Mathematiques. Paris: André Pralard.
- ^ Wallis, John (1685). A treatise of algebra. London: John Playford.
- ^ Dedekind, Richard (1888). Was sind und was sollen die Zahlen?. Braunschweig: Vieweg. p. 114.
- ^ a b Syropoulos, Apostolos (2000). "Mathematics of multisets". In Calude, Cristian; Paun, Gheorghe; Rozenberg, Grzegorz; Salomaa, Arto (eds.). Multiset Processing, Mathematical, Computer Science, and Molecular Computing Points of View [Workshop on Multiset Processing, WMP 2000, Curtea de Arges, Romania, August 21–25, 2000]. Lecture Notes in Computer Science. Vol. 2235. Springer. pp. 347–358. doi:10.1007/3-540-45523-X_17. ISBN 978-3-540-43063-6.
- ^ Whitney, Hassler (1933). "Characteristic Functions and the Algebra of Logic". Annals of Mathematics. 34 (3): 405–414. doi:10.2307/1968168. JSTOR 1968168.
- ^ Monro, G. P. (1987). "The Concept of Multiset". Zeitschrift für Mathematische Logik und Grundlagen der Mathematik. 33 (2): 171–178. doi:10.1002/malq.19870330212.
- ^ Aluffi, Paolo (2009). Algebra: Chapter 0. American Mathematical Society. ISBN 978-0821847817.
- ^ Cf., for instance, Richard Brualdi, Introductory Combinatorics, Pearson.
- ^ Aigner, M. (1979). Combinatorial Theory. New York/Berlin: Springer Verlag.
- ^ Anderson, I. (1987). Combinatorics of Finite Sets. Oxford: Clarendon Press. ISBN 978-0-19-853367-2.
- ^ Stanley, Richard P. (1997). Enumerative Combinatorics. Vol. 1. Cambridge University Press. ISBN 0-521-55309-1.
- ^ Stanley, Richard P. (1999). Enumerative Combinatorics. Vol. 2. Cambridge University Press. ISBN 0-521-56069-1.
- ^ Grumbach, S.; Milo, T (1996). "Towards tractable algebras for bags". Journal of Computer and System Sciences. 52 (3): 570–588. doi:10.1006/jcss.1996.0042.
- ^ Libkin, L.; Wong, L. (1994). "Some properties of query languages for bags". Proceedings of the Workshop on Database Programming Languages. Springer Verlag. pp. 97–114.
- ^ Libkin, L.; Wong, L. (1995). "On representation and querying incomplete information in databases with bags". Information Processing Letters. 56 (4): 209–214. doi:10.1016/0020-0190(95)00154-5.
- ^ Blizard, Wayne D. (1990). "Negative Membership". Notre Dame Journal of Formal Logic. 31 (3): 346–368. doi:10.1305/ndjfl/1093635499. S2CID 42766971.
- ^ Blizard, Wayne D. (1989). "Real-valued Multisets and Fuzzy Sets". Fuzzy Sets and Systems. 33 (1): 77–97. doi:10.1016/0165-0114(89)90218-2.
- ^ Yager, R. R. (1986). "On the Theory of Bags". International Journal of General Systems. 13 (1): 23–37. doi:10.1080/03081078608934952.
- ^ Grzymala-Busse, J. (1987). "Learning from examples based on rough multisets". Proceedings of the 2nd International Symposium on Methodologies for Intelligent Systems. Charlotte, North Carolina. pp. 325–332.
{{cite book}}: CS1 maint: location missing publisher (link) - ^ Loeb, D. (1992). "Sets with a negative numbers of elements". Advances in Mathematics. 91 (1): 64–74. doi:10.1016/0001-8708(92)90011-9.
- ^ Miyamoto, S. (2001). "Fuzzy Multisets and Their Generalizations". Multiset Processing. Lecture Notes in Computer Science. Vol. 2235. Berlin, Heidelberg: Springer. pp. 225–235. doi:10.1007/3-540-45523-X_11. ISBN 978-3-540-43063-6.
- ^ Alkhazaleh, S.; Salleh, A. R.; Hassan, N. (2011). "Soft Multisets Theory". Applied Mathematical Sciences. 5 (72): 3561–3573.
- ^ Alkhazaleh, S.; Salleh, A. R. (2012). "Fuzzy Soft Multiset Theory". Abstract and Applied Analysis. 2012 350603: 1–20. doi:10.1155/2012/350603.
- ^ Burgin, Mark (1990). "Theory of Named Sets as a Foundational Basis for Mathematics". Structures in Mathematical Theories. San Sebastian. pp. 417–420.
- ^ Burgin, Mark (1992). "On the concept of a multiset in cybernetics". Cybernetics and Systems Analysis. 3: 165–167.
- ^ Burgin, Mark (2004). "Unified Foundations of Mathematics". arXiv:math/0403186.
- ^ Burgin, Mark (2011). Theory of Named Sets. Mathematics Research Developments. Nova Science Pub Inc. ISBN 978-1-61122-788-8.
Multiset
View on GrokipediaFundamentals
Definition
In mathematics, a multiset is a generalization of the concept of a set that allows multiple instances, or multiplicities, of elements, thereby distinguishing it from standard sets where elements are unique. Formally, a multiset is defined as an ordered pair , where is a set (often called the underlying set or the domain) and is a multiplicity function that assigns to each element a non-negative integer representing the number of times occurs in the multiset.[9] This definition assumes familiarity with basic set theory, including the notions of sets and functions.[9] Alternative formalizations of a multiset include representing it as a set in which elements may appear more than once, such as , where repetitions are explicitly listed; however, this notation is informal and less suitable for infinite or abstract contexts. Another equivalent approach specifies a multiset via its support—the subset of elements with positive multiplicity—and the corresponding multiplicity map restricted to that support, ensuring only for in the support.[9] In a more general transfinite setting, as introduced by Richard Rado, a multiset can be viewed as a function from a set to the class of cardinal numbers, with the domain consisting of elements assigned nonzero cardinals.[9] The cardinality of a multiset is defined as the sum of the multiplicities over its underlying set, given by which counts the total number of element occurrences, including repetitions.[9] The empty multiset, denoted or , is the pair where the underlying set is empty (equivalently, for all in some universe), and it has cardinality 0.[9] A singleton multiset consists of an underlying set with one element and , yielding cardinality 1.[9]Examples
A shopping basket containing two apples and one banana exemplifies a multiset, which can be denoted as {apple, apple, banana} to indicate repetitions or more compactly as {apple: 2, banana: 1}, preserving the multiplicity of each item.[10][11] In comparison, viewing the same basket through the lens of a standard set yields {apple, banana}, which eliminates multiplicity information and thus fails to capture the full contents.[12] A deck of cards, particularly when multiple decks are combined in games allowing duplicates of suits and ranks, represents a multiset where identical cards coexist without regard to order.[13] In mathematics, the prime factorization of an integer provides a clear example: the number 12 decomposes into the multiset {2, 2, 3}, reflecting the repeated prime factor 2. Multisets are commonly notated by listing elements with repetitions, such as {a, a, b}, or by specifying multiplicities, as in {a: 2, b: 1} to denote two instances of a and one of b.[11] The cardinality of a multiset, or its total size, sums the multiplicities of its elements; for instance, the multiset {1: 3, 2: 1} has cardinality 4.[12]Properties and Operations
Basic Properties
A multiset over a base set is defined by its multiplicity function , where denotes the non-negative integers, and the multiplicity indicates the number of occurrences of each element . Two multisets and over the same base set are equal, denoted , if and only if their multiplicity functions agree for every element, that is, for all .[14][15] This equality condition directly follows from the definition of multisets via multiplicity functions, as any difference in multiplicities would distinguish the collections.[16] The inclusion relation for multisets generalizes set inclusion to account for multiplicities. A multiset is a submultiset of another multiset , denoted , if for every .[14][15] This relation is reflexive () and transitive ( and imply ), forming a partial order on the collection of multisets over .[16] A proper submultiset holds when the inequality is strict for at least one , ensuring . To verify inclusion, one compares multiplicities element-wise; if all and equality holds everywhere, then , otherwise is a proper submultiset if the condition holds without equality.[14] The support of a multiset , denoted , is the underlying set consisting of elements with positive multiplicity: .[14][16] This support forms an ordinary set without repetitions and determines the distinct elements present in , relating directly to the base set as a subset where multiplicities vanish outside . For submultisets, if , since zero multiplicity in is compatible with positive in , but the converse does not hold without multiplicity checks.[15] When the base set is equipped with a total order , an ordered multiset inherits this order on its elements, allowing identification of maximal and minimal elements based on the order of distinct items in the support. The maximal element of is the largest under , regardless of multiplicity, and similarly the minimal element is the smallest such .[17] In this totally ordered context, these extremal elements coincide with the greatest and least elements of the support, providing endpoints for the ordered structure.[17]Arithmetic Operations
Multisets support a variety of arithmetic operations that extend their set-theoretic counterparts by accounting for element multiplicities, enabling the construction of new multisets from existing ones. These operations are defined via the multiplicity function , where is the universal set and denotes non-negative integers, ensuring that the resulting multiplicity for each element is computed based on the input multiplicities. Such operations are fundamental in combinatorics and computer science for modeling collections with repetitions, like bags in databases or weighted selections.[16] The union of two multisets and , denoted , is the multiset whose multiplicity function is given by for all . This operation retains the highest occurrence count for each element, analogous to set union but preserving multiplicities. For example, if and , then . Union is commutative, as ; associative, since for non-negative integers ; and idempotent, with .[16] The intersection takes the minimum multiplicity: for all , including only elements common to both with their overlapping counts. Using the prior example, . Intersection shares the commutativity and associativity of union, via and , and is idempotent since .[16] The sum, or disjoint union, adds multiplicities directly: for all , useful in combinatorial enumeration for combining independent selections. For the example multisets, . This operation is commutative, as addition of non-negative integers is commutative, and associative, since .[16] The difference subtracts multiplicities, flooring at zero: for all , removing elements from as much as possible from . In the example, , but , highlighting its non-commutativity. Unlike the others, difference lacks general associativity or idempotence.[16] The Cartesian product extends pairs across elements, with multiplicity as the product: for all , forming a multiset over . For and , includes with multiplicity 2 and with multiplicity 4. This operation is commutative in the sense that up to relabeling of pairs and is associative when extended to multiple factors.[18]Enumeration
Recurrence Relations
The number of multisets of cardinality that can be formed from a base set of distinct types, often denoted , counts the combinations where repetitions of types are permitted and the order of elements is irrelevant.[19] A recursive approach computes via the relation for and , with base cases for all (corresponding to the single empty multiset), , and for (as no positive-sized multiset is possible without types).[20] This recurrence derives from partitioning multisets based on the highest type label, assuming types are labeled through . Multisets excluding type match those of size from types, yielding cases. Multisets including at least one type map bijectively to those of size from types by removing one instance of , yielding cases. The addition principle then combines these disjoint cases.[20] For illustration, the table below shows computed values of for small and , obtained by applying the recurrence iteratively from the base cases:| 0 | 1 | 2 | 3 | 4 | |
|---|---|---|---|---|---|
| 0 | 1 | 0 | 0 | 0 | 0 |
| 1 | 1 | 1 | 1 | 1 | 1 |
| 2 | 1 | 2 | 3 | 4 | 5 |
| 3 | 1 | 3 | 6 | 10 | 15 |
Generating Functions
Generating functions provide a powerful tool for enumerating multisets by encoding the number of such objects of various sizes into a formal power series. For multisets formed from distinct types, the ordinary generating function arises from the independent choice of multiplicity for each type, where the multiplicity of a type contributes the geometric series . Thus, the overall generating function is the product , which counts multisets of any size by the exponent of . This product form reflects the independence of choices across types, as each factor corresponds to one type and the coefficient of sums over all ways to distribute indistinguishable elements into types allowing repetitions.[19] The coefficient of in , denoted , gives the number of -element multisets from types, which equals the binomial coefficient . This follows from the generalized binomial theorem, where . For unrestricted multiset size, the full series , with denoting the number of -multisets, provides a compact representation for enumeration problems.[19] In the multivariate case, when types are labeled and multiplicities tracked separately, the generating function becomes . Here, the coefficient of is 1 if each , corresponding to the unique multiset with those multiplicities. This form extends the univariate case and is useful for distinguishing types in applications like partition enumeration.[19] For example, consider integer partitions, which can be viewed as multisets of positive integers. The generating function for unrestricted partitions (allowing repeated parts) is , reflecting unlimited repetitions of each part size. In contrast, partitions into distinct parts (equivalent to sets, not multisets) have generating function , where each part appears at most once, leading to different coefficient growth rates.[21]Connections to Binomial Expansions
The enumeration of multisets is intimately connected to the generalized binomial theorem, particularly through expansions involving negative exponents. The number of multisets of cardinality chosen from distinct elements is given by the binomial coefficient , which arises as the coefficient in the series expansion .[22] This identity follows from the generalized binomial theorem, where the binomial coefficient for negative upper index is , linking the algebraic expansion directly to multiset counting upon accounting for the sign alternation.[23] The negative binomial series converges for , providing an infinite power series whose coefficients precisely enumerate unrestricted multisets of varying sizes from types.[22] For instance, the coefficient of counts the ways to select elements with repetition allowed from options, equivalent to the number of non-negative integer solutions to . A combinatorial proof of this coefficient uses the stars and bars theorem: represent the selected elements as stars () and the dividers between the types as bars (), forming a sequence of positions where are chosen for bars, yielding arrangements.[24] This visualization maps directly to distributing indistinguishable selections into distinct bins, confirming the multiset count. For generalizations, when multiplicities are weighted or bounded, the exponent in the binomial expansion can be adjusted to reflect the structure, such as using fractional or variable exponents in the series to model weighted selections.[23] An example of the negative upper index connection is that the number of multisets of size from elements equals (-1)^k \binom{-n}{k}, illustrating how the combinatorial interpretation "negates" the usual subset selection to allow repetitions. The hockey-stick identity further relates multiset counts by summing them: , which enumerates all multisets of cardinality at most from elements, as the partial sum of the negative binomial series coefficients.[25] This identity provides a closed form for cumulative multiset enumerations, bridging individual counts to total selections up to a limit.Applications
Combinatorics and Algebra
In partition theory, multisets provide a natural framework for representing integer partitions, where the parts of the partition correspond to the elements of the multiset, allowing repetitions to indicate multiplicity in the summands. This perspective is particularly useful for typed or colored partitions, where distinct types distinguish otherwise identical parts, extending classical unrestricted partitions to more general combinatorial objects. For instance, a multiset partition algebra has been developed to handle such structures, generalizing the Robinson-Schensted-Knuth correspondence to insertions on multiset partitions.[26] Multisets play a key role in the theory of symmetric functions by extending classical bases like power sums and elementary symmetric polynomials to account for multiplicities. The power sum symmetric functions and elementary symmetric polynomials can be generalized to multisets, where variables are weighted by their multiplicities, enabling the enumeration of symmetric objects with repeated labels. This extension aligns with Joyal's symmetric species framework, where multisets serve as the domain for functors mapping finite sets to combinatorial structures, yielding generating functions that capture multiplicity-adjusted symmetries. Seminal work in this area establishes operations like sum, product, and plethysm for symmetric species, directly corresponding to those on symmetric functions. In graph theory, multisets model the edges of multigraphs, where parallel edges between vertices form a multiset of edge incidences, allowing multiple connections without loops in simple cases. The degree sequence of a multigraph, defined as the nonincreasing list of vertex degrees, is inherently a multiset, as degrees can repeat across vertices, facilitating the study of graphical realizations and extremal properties. For example, realizing a given multiset as a degree sequence in a multigraph requires the sum of degrees to be even and no degree exceeding the possible connections.[27] Combinatorial identities involving multisets often center on permutation counts, where the number of distinct permutations of a multiset with elements and multiplicities for distinct types is given by the multinomial coefficient . This formula arises from dividing the total permutations of a set by the repetitions within each type, providing a core identity for counting arrangements with indistinguishable objects. Such identities underpin broader results, like those in the multinomial theorem, and extend to colored multisets for multifactorial generalizations.[28] In algebraic structures, multisets appear in commutative algebra through multiset ideals, which generalize monomial ideals by allowing multiplicities in generators, particularly in zero-dimensional cases where they locally resemble monomial ideals under Gröbner bases. These ideals support efficient computations via algorithms like Cerlienco-Mureddu for lexicographic standard monomials. Additionally, in combinatorial species theory, multisets form the basis for species functors, as introduced by Joyal, enabling the algebraic manipulation of labeled structures through exponential generating functions that account for relabeling and multiplicity. Multisets are treated via extensions such as symmetric species, facilitating compositions like substitutions for complex enumerative problems.[29][30] A specific application arises in necklace counting, where multisets encode color repetitions to determine distinct configurations under rotation. Bijections to multisets with divisible subset sums simplify enumeration for bounded colors. This approach equates necklaces with at most colors to multisets of integers modulo whose subset sums are divisible by , providing an explicit combinatorial correspondence.[31]Computer Science and Data Structures
In computer science, multisets are implemented using data structures that efficiently handle duplicate elements and their multiplicities, enabling operations like insertion, deletion, and querying counts. Balanced binary search trees, such as red-black trees, are commonly used for ordered multisets, as in the C++ standard library'sstd::multiset, which maintains elements in sorted order while allowing multiples of the same key. This structure supports logarithmic time complexity for key operations, making it suitable for scenarios requiring sorted access or range queries. Alternatively, hash maps provide an unordered implementation by associating each unique element with a counter for its multiplicity, as seen in libraries like Apache Commons Collections' HashMultiSet, which offers average-case constant-time insertions and lookups at the expense of order preservation.
Algorithms for multiset operations, such as union and intersection, leverage the underlying structure for efficiency. For sorted multisets, union and intersection can be computed in linear time O(n + m) using merge-like procedures on sorted sequences, similar to those in C++'s <algorithm> header for set operations adapted to handle multiplicities by taking maximum or minimum counts, respectively. These algorithms find applications in sorting with duplicates, where multisets preserve order and multiplicity during stable sorts, avoiding the need for auxiliary tracking structures. Arithmetic operations like multiset union and intersection, as defined mathematically, align with these implementations for practical computation.
Programming languages provide built-in or library support for multisets through specialized classes. In Python, the collections.Counter class implements a multiset as a dictionary subclass, enabling frequency counting of hashable objects with methods like update() for adding elements and most_common() for retrieving elements by count. Similarly, Java's Guava library offers a Multiset interface with implementations like HashMultiset and TreeMultiset, supporting operations such as count(), add(), and remove() to manage multiplicities efficiently.
In databases, multisets model duplicate records naturally, with SQL's GROUP BY clause combined with COUNT() aggregating rows by key while tracking frequencies, as standardized in SQL-92 and implemented in systems like SQL Server. For example, querying SELECT column, COUNT(*) FROM table GROUP BY column produces a multiset view of unique values and their occurrences, facilitating duplicate analysis without explicit multiplicity storage.
The time and space complexities of multiset operations vary by implementation. Tree-based multisets, like std::multiset, achieve O(log n) time for insert, delete, and search operations, with O(n) space for n elements, due to the balanced tree overhead. Hash-based versions offer amortized O(1) time for these operations but may degrade to O(n) in worst-case scenarios from hash collisions, still using O(n) space. As of 2025, modern applications include machine learning, where bag-of-words models in scikit-learn use CountVectorizer to represent documents as multisets of terms for frequency-based feature extraction in tasks like text classification. In search engines, term frequencies are handled as multisets to compute relevance scores, such as in TF-IDF weighting, powering inverted indexes in systems like Elasticsearch.
