Ranking
A ranking is a relationship between a set of items, often recorded in a list, such that, for any two items, the first is either "ranked higher than", "ranked lower than", or "ranked equal to" the second.[1] In mathematics, this is known as a weak order or total preorder of objects. It is not necessarily a total order of objects because two different objects can have the same ranking. The rankings themselves are totally ordered. For example, materials are totally preordered by hardness, while degrees of hardness are totally ordered. If two items are the same in rank it is considered a tie.
By reducing detailed measures to a sequence of ordinal numbers, rankings make it possible to evaluate complex information according to certain criteria. Thus, for example, an Internet search engine may rank the pages it finds according to an estimation of their relevance, making it possible for the user quickly to select the pages they are likely to want to see.
Analysis of data obtained by ranking commonly requires non-parametric statistics.
Strategies for handling ties
It is not always possible to assign rankings uniquely. For example, in a race or competition two (or more) entrants might tie for a place in the ranking.[2] When computing an ordinal measurement, two (or more) of the quantities being ranked might measure equal. In these cases, one of the strategies below for assigning the rankings may be adopted.
A common shorthand way to distinguish these ranking strategies is by the ranking numbers that would be produced for four items, with the first item ranked ahead of the second and third (which compare equal) which are both ranked ahead of the fourth.[3] These names are also shown below.
Standard competition ranking ("1224" ranking)
In competition ranking, items that compare equal receive the same ranking number, and then a gap is left in the ranking numbers. The number of ranking numbers that are left out in this gap is one less than the number of items that compared equal. Equivalently, each item's ranking number is 1 plus the number of items ranked above it. This ranking strategy is frequently adopted for competitions, as it means that if two (or more) competitors tie for a position in the ranking, the position of all those ranked below them is unaffected (i.e., a competitor only comes second if exactly one person scores better than them, third if exactly two people score better than them, fourth if exactly three people score better than them, etc.).
Thus if A ranks ahead of B and C (which compare equal) which are both ranked ahead of D, then A gets ranking number 1 ("first"), B gets ranking number 2 ("joint second"), C also gets ranking number 2 ("joint second") and D gets ranking number 4 ("fourth").
This method is called "Low" by IBM SPSS[4] and "min" by the R programming language[5] in their methods to handle ties.
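The rule above ("1 plus the number of items ranked above") can be written directly in a few lines of Python; this is an illustrative sketch, not taken from any cited software:

```python
def competition_rank(scores):
    """Standard competition ("1224") ranking, higher score = better:
    each item's rank is 1 plus the number of items strictly ahead of it."""
    return [1 + sum(other > s for other in scores) for s in scores]

# A=4, B=3, C=3, D=1: B and C are joint second, D is fourth.
print(competition_rank([4, 3, 3, 1]))  # [1, 2, 2, 4]
```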
Modified competition ranking ("1334" ranking)
Sometimes, competition ranking is done by leaving the gaps in the ranking numbers before the sets of equal-ranking items (rather than after them as in standard competition ranking). The number of ranking numbers that are left out in this gap remains one less than the number of items that compared equal. Equivalently, each item's ranking number is equal to the number of items ranked equal to it or above it. This ranking ensures that a competitor only comes second if they score higher than all but one of their opponents, third if they score higher than all but two of their opponents, etc.
Thus if A ranks ahead of B and C (which compare equal) which are both ranked ahead of D, then A gets ranking number 1 ("first"), B gets ranking number 3 ("joint third"), C also gets ranking number 3 ("joint third") and D gets ranking number 4 ("fourth"). In this case, nobody would get ranking number 2 ("second") and that would be left as a gap.
This method is called "High" by IBM SPSS[4] and "max" by the R programming language[5] in their methods to handle ties.
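The equivalent formulation ("the number of items ranked equal to it or above it") gives a one-line sketch, illustrative only:

```python
def modified_competition_rank(scores):
    """Modified competition ("1334") ranking, higher score = better:
    each item's rank equals the number of items ranked equal to or above it."""
    return [sum(other >= s for other in scores) for s in scores]

# A=4, B=3, C=3, D=1: B and C are joint third, nobody is second.
print(modified_competition_rank([4, 3, 3, 1]))  # [1, 3, 3, 4]
```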
Dense ranking ("1223" ranking)
In dense ranking, items that compare equal receive the same ranking number, and the next items receive the immediately following ranking number. Equivalently, each item's ranking number is 1 plus the number of items ranked above it that are distinct with respect to the ranking order.
Thus if A ranks ahead of B and C (which compare equal) which are both ranked ahead of D, then A gets ranking number 1 ("first"), B gets ranking number 2 ("joint second"), C also gets ranking number 2 ("joint second") and D gets ranking number 3 ("third").
This method is called "Sequential" by IBM SPSS[4] and "dense" by the R programming language[6] in their methods to handle ties.
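Counting distinct values ranked above each item, as in the definition, gives this illustrative sketch:

```python
def dense_rank(scores):
    """Dense ("1223") ranking, higher score = better: each item's rank is
    1 plus the number of distinct values strictly ahead of it."""
    return [1 + len({other for other in scores if other > s}) for s in scores]

# A=4, B=3, C=3, D=1: D follows the tied pair immediately, at rank 3.
print(dense_rank([4, 3, 3, 1]))  # [1, 2, 2, 3]
```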
Ordinal ranking ("1234" ranking)
In ordinal ranking, all items receive distinct ordinal numbers, including items that compare equal. The assignment of distinct ordinal numbers to items that compare equal can be done at random, or arbitrarily, but it is generally preferable to use a system that is arbitrary but consistent, as this gives stable results if the ranking is done multiple times. An example of an arbitrary but consistent system would be to incorporate other attributes into the ranking order (such as alphabetical ordering of the competitor's name) to ensure that no two items exactly match.
With this strategy, if A ranks ahead of B and C (which compare equal) which are both ranked ahead of D, then A gets ranking number 1 ("first") and D gets ranking number 4 ("fourth"), and either B gets ranking number 2 ("second") and C gets ranking number 3 ("third") or C gets ranking number 2 ("second") and B gets ranking number 3 ("third").
In computer data processing, ordinal ranking is also referred to as "row numbering".
This method corresponds to the "first", "last", and "random" methods in the R programming language[5] to handle ties.
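An arbitrary-but-consistent scheme can be sketched as below, breaking ties alphabetically by competitor name; the names mirror the A-D example above and the code is illustrative only:

```python
def ordinal_rank(scores):
    """Ordinal ("1234") ranking, higher score = better: ties are broken
    arbitrarily but consistently, here by alphabetical order of the name."""
    order = sorted(scores, key=lambda name: (-scores[name], name))
    return {name: i + 1 for i, name in enumerate(order)}

# B precedes C only because of the alphabetical tie-break.
print(ordinal_rank({"A": 4, "B": 3, "C": 3, "D": 1}))
# {'A': 1, 'B': 2, 'C': 3, 'D': 4}
```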
Fractional ranking ("1 2.5 2.5 4" ranking)
Items that compare equal receive the same ranking number, which is the mean of what they would have under ordinal rankings; equivalently, each item's ranking number is 1 plus the number of items ranked above it plus half the number of other items equal to it. This strategy has the property that the sum of the ranking numbers is the same as under ordinal ranking. For this reason, it is used in computing Borda counts and in statistical tests (see below).
Thus if A ranks ahead of B and C (which compare equal) which are both ranked ahead of D, then A gets ranking number 1 ("first"), B and C each get ranking number 2.5 (average of "joint second/third") and D gets ranking number 4 ("fourth").
Here is an example: Suppose you have the data set 1.0, 1.0, 2.0, 3.0, 3.0, 4.0, 5.0, 5.0, 5.0.
The ordinal ranks are 1, 2, 3, 4, 5, 6, 7, 8, 9.
For v = 1.0, the fractional rank is the average of the ordinal ranks: (1 + 2) / 2 = 1.5. In a similar manner, for v = 5.0, the fractional rank is (7 + 8 + 9) / 3 = 8.0.
Thus the fractional ranks are: 1.5, 1.5, 3.0, 4.5, 4.5, 6.0, 8.0, 8.0, 8.0.
This method is called "Mean" by IBM SPSS[4] and "average" by the R programming language[5] in their methods to handle ties.
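The worked example above can be reproduced with a short sketch (ascending ranks, as in the example; illustrative only):

```python
def fractional_rank(values):
    """Fractional ranking in ascending order: each value receives the mean
    of the ordinal ranks it would occupy, i.e.
    1 + (# strictly below) + (# tied with it - 1) / 2."""
    return [1 + sum(o < v for o in values) + (sum(o == v for o in values) - 1) / 2
            for v in values]

data = [1.0, 1.0, 2.0, 3.0, 3.0, 4.0, 5.0, 5.0, 5.0]
print(fractional_rank(data))  # [1.5, 1.5, 3.0, 4.5, 4.5, 6.0, 8.0, 8.0, 8.0]
```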
Statistics
In statistics, ranking is the data transformation in which numerical or ordinal values are replaced by their rank when the data are sorted.
For example, the ranks of the numerical data 3.4, 5.1, 2.6, 7.3 are 2, 3, 1, 4.
As another example, the ordinal data hot, cold, warm would be replaced by 3, 1, 2. In these examples, the ranks are assigned to values in ascending order, although descending ranks can also be used.
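A minimal sketch of this transformation (ascending ranks, assuming no tied values; negate the values to obtain descending ranks):

```python
def rank_transform(data):
    """Replace each value by its 1-based rank when the data are
    sorted ascending (assumes no tied values)."""
    order = sorted(data)
    return [order.index(x) + 1 for x in data]

print(rank_transform([3.4, 5.1, 2.6, 7.3]))  # [2, 3, 1, 4]
```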
Ranks are related to the indexed list of order statistics, which consists of the original dataset rearranged into ascending order.
Education
League tables are used to compare the academic achievements of different institutions. College and university rankings order institutions in higher education by combinations of factors. In addition to entire institutions, specific programs, departments, and schools are ranked. These rankings are usually compiled by magazines, newspapers, governments, and academics. For example, league tables of British universities are published annually by The Independent, The Sunday Times, and The Times.[7] The primary aim of these rankings is to inform potential applicants about British universities based on a range of criteria. Similarly, league tables are being developed in other countries; in India, the magazine Education World publishes them based on data from TheLearningPoint.net.[citation needed]
Critics complain that ranking England's schools against rigid guidelines, which fail to take wider social conditions into account, actually makes failing schools even worse, because the most involved parents then avoid such schools, leaving only the children of less ambitious parents to attend.[8]
Business
In business, league tables list the leaders in the business activity of a specific industry, ranking companies by criteria including revenue, earnings, and other relevant key performance indicators (such as market share and meeting customer expectations), enabling readers to analyze significant data quickly.[9]
Applications
Ranking according to specific indices is one of the most common systems used by policy makers and international organizations to assess the socio-economic context of countries. Notable examples include the Human Development Index (United Nations), Doing Business Index (World Bank), Corruption Perceptions Index (Transparency International), and Index of Economic Freedom (the Heritage Foundation). For instance, the Doing Business Indicator of the World Bank measures business regulations and their enforcement in 190 countries. Countries are ranked according to ten indicators that are synthesized to produce the final rank. Each indicator is composed of sub-indicators; for instance, the Registering Property Indicator is composed of four sub-indicators measuring time, procedures, costs, and quality of the land registration system. These kinds of ranks are based on subjective criteria for assigning the score. Sometimes, the adopted parameters may produce discrepancies with empirical observations, so potential biases and paradoxes may emerge from the application of these criteria.[10]
Other examples
- In politics, rankings may focus on the comparison of economic, social, environmental and governance performance of countries. Politicians themselves have also been ranked, based on the extent of their activities.[11]
- In relation to credit standing, the ranking of a security refers to where that particular security would stand in a wind up of the issuing company, i.e., its seniority in the company's capital structure. For instance, capital notes are subordinated securities; they would rank behind senior debt in a wind up. In other words, the holders of senior debt would be paid out before subordinated debt holders received any funds.
- Search engines rank web pages by their expected relevance to a user's query using a combination of query-dependent and query-independent methods. Query-independent methods attempt to measure the estimated importance of a page, independent of any consideration of how well it matches the specific query. Query-independent ranking is usually based on link analysis; examples include the HITS algorithm, PageRank and TrustRank. Query-dependent methods attempt to measure the degree to which a page matches a specific query, independent of the importance of the page. Query-dependent ranking is usually based on heuristics that consider the number and locations of matches of the various query words on the page itself, in the URL or in any anchor text referring to the page.
- In webometrics, it is possible to rank institutions according to their presence on the web (number of webpages) and the impact of these contents, as in the Webometrics Ranking of World Universities.
- In video gaming, players may be given a ranking. To "rank up" is to achieve a higher ranking relative to other players, especially with strategies that do not depend on the player's skill.
- The TrueSkill ranking system is a skill based ranking system for Xbox Live developed at Microsoft Research.
- A bibliogram ranks common noun phrases in a piece of text.
- In language, rank refers to the status of an item (usually established through what is known as "downranking" or "rank-shifting") relative to the uppermost rank in a clause; for example, in the sentence "I want to eat the cake you made today", "eat" is on the uppermost rank, but "made" is downranked as part of the nominal group "the cake you made today"; this nominal group behaves as though it were a single noun (i.e., I want to eat it), and thus the verb within it ("made") is ranked differently from "eat".
- Academic journals are sometimes ranked according to impact factor: the number of later articles that cite articles in a given journal.
References
- ^ "Definition of RANKING".
- ^ Sulich, Adam. "The young people's labour market and crisis of integration in European Union". Retrieved 2017-03-04.
- ^ "The Data School - How to Rank by Group in Alteryx - Part 1 - Standard Competition, Dense, Ordinal Ranking". www.thedataschool.co.uk. Retrieved 2023-07-23.
- ^ "Rank Cases: Ties". www.ibm.com. Retrieved 2023-07-23.
- ^ "rank function - RDocumentation". www.rdocumentation.org. Retrieved 2023-07-23.
- ^ "R: Fast Sample Ranks". search.r-project.org. Retrieved 2023-07-23.
- ^ "Rankings of universities in the United Kingdom", Wikipedia, 2024-06-05, retrieved 2024-06-15
- ^ Chris Roberts, Heavy Words Lightly Thrown: The Reason Behind Rhyme, Thorndike Press, 2006 (ISBN 0-7862-8517-6)
- ^ Business Ranking Annual. Gale Research International, Limited. October 2000. p. 740. ISBN 9780787640255.
- ^ RIEDS, Italian Review of Economics Demography and Statistics (2014). "World Bank Doing Business Project and the statistical methods based on ranks: the paradox of the time indicator". Rieds - Rivista Italiana di Economia, Demografia e Statistica - the Italian Journal of Economic, Demographic and Statistical Studies. 68 (1): 79–86.
- ^ Tofallis, Chris (2022). "A Multidimensional Ranking of Members of Parliament" (PDF). Radical Statistics (133): 3–29.
Fundamentals
Definition and Basic Principles
In mathematics and related fields, a ranking refers to the assignment of positions to elements of a set based on a comparative relation, typically yielding a total order in which every pair of distinct elements is strictly comparable, ensuring a complete linear arrangement without ambiguity in relative positioning. This structure contrasts with partial orders, which permit incomparabilities between some elements, as seen in applications like preference aggregation where not all items can be directly ranked against each other. The foundational principle derives from order theory, where a strict ranking relation < must satisfy totality (for any two distinct elements a and b, either a < b or b < a), transitivity (if a < b and b < c, then a < c), and irreflexivity (never a < a).[7]

Basic principles of ranking emphasize the ordinal nature of the output, focusing on relative positions rather than cardinal differences in magnitude, which distinguishes rankings from interval or ratio scales in measurement theory.[1] In statistical contexts, rankings transform raw data by sorting observations and assigning integers corresponding to their order, with ties often resolved via methods like average ranking (e.g., for tied values at positions 3 and 4 in a list of 5, both receive rank 3.5) to maintain consistency.[8] This process preserves the underlying order while mitigating the influence of outliers or non-normal distributions, enabling non-parametric analyses such as the Wilcoxon rank-sum test, which relies on the ranks themselves rather than the original values for hypothesis testing.[9]

Rankings underpin applications across domains, from electoral systems, where voter preferences form individual total orders aggregated into a collective ranking, to search engine algorithms that prioritize results based on relevance scores converted to ranks.[1] However, real-world rankings frequently encounter challenges like intransitivities (e.g., Condorcet cycles in voting, where A beats B, B beats C, and C beats A) or incomplete information, necessitating extensions beyond pure total orders, such as weak orders that incorporate equivalence classes for ties.[8] These principles ensure rankings reflect causal priorities or empirical comparisons faithfully, prioritizing comparability and stability over absolute quantification.[10]

Types of Rankings
Rankings in mathematics and related fields are broadly classified by their completeness, allowing for distinctions between total rankings, where every pair of elements is comparable, and partial rankings, where some elements may remain incomparable.[11] Total rankings correspond to linear orders or total preorders, ensuring a complete serialization of elements, as seen in applications like tournament outcomes or preference aggregation where all items must be ordered relative to one another.[11][1] Partial rankings, modeled as partial orders, permit incomparabilities, which arise in scenarios such as hierarchical structures or incomplete preference data, and can be extended to total rankings via theorems like Szpilrajn's extension theorem.[11] Another key classification distinguishes ordinal from cardinal rankings based on the informational content encoded. Ordinal rankings capture only relative orderings without quantifying the magnitude of differences, as in ranking candidates by pairwise preferences in voting systems.[1] Cardinal rankings incorporate numerical values to represent intensities or utilities, enabling computations like weighted averages, as utilized in methods such as the Analytic Hierarchy Process for decision-making under multiple criteria.[1] This distinction is central in social choice theory, where ordinal approaches avoid interpersonal utility comparisons but may lose efficiency, while cardinal methods can optimize aggregate welfare but risk strategic manipulation.[12] Rankings may further vary by handling of equivalences or ties, leading to strict rankings (antisymmetric, no ties) versus weak rankings (preorders allowing indifference classes). 
Strict rankings enforce unique positions, suitable for competitive zero-sum settings like sports leagues, whereas weak rankings group tied elements, preserving transitivity in broader preference models.[11] These types intersect; for instance, a total ordinal ranking might use dense numbering to assign consecutive integers to tied groups, while cardinal variants could assign identical scores to equivalents.[1]

Historical Development
Origins in Mathematics and Early Applications
The mathematical foundations of ranking emerged in the late 18th century amid debates over fair methods for aggregating individual preferences into collective orders, particularly within the French Academy of Sciences. Jean-Charles de Borda, a mathematician and naval engineer, proposed an early positional ranking system in his 1784 Mémoire sur les élections au scrutin, motivated by flaws in plurality voting observed in academy elections.[13] Borda's method assigned points to candidates based on their ranked positions across voter ballots: in an election with m candidates, the highest-ranked receives m-1 points, the next m-2, down to 0 for the lowest, with the aggregate score determining the overall ranking.[14] This approach aimed to account for the intensity of preferences rather than mere first-place votes, providing a quantitative basis for ordinal comparisons.[13] The Marquis de Condorcet critiqued Borda's system shortly thereafter, publishing his Essai sur l'application de l'analyse à la probabilité des décisions rendues à la pluralité des voix in 1785, which introduced pairwise comparison as a foundational ranking technique.[14] Condorcet's method evaluated candidates by conducting hypothetical head-to-head contests between each pair, declaring a candidate the Condorcet winner if it defeated every opponent by majority vote; rankings followed from the transitive closure of these pairwise majorities where possible.[13] He also identified the Condorcet paradox, where cyclic preferences (e.g., A beats B, B beats C, C beats A by majorities) prevent a stable ranking, highlighting inherent challenges in deriving total orders from partial voter inputs.[14] These innovations drew on probability theory to assess decision reliability, framing ranking as a problem of probabilistic consensus rather than deterministic tallying.[13] Early applications centered on internal academy proceedings, where the French Academy served as a testing ground for refining electoral 
processes during the Enlightenment and French Revolution. Borda's method gained traction post-Revolution, influencing the Academy's adoption of ranked voting for membership selections by 1795, as endorsed by Pierre Daunou, and was used in electing figures like Napoleon to the National Institute between 1795 and 1815.[13] Condorcet's pairwise approach, though less immediately implemented due to paradox risks, informed probabilistic analyses of jury decisions and legislative voting, extending ranking principles to broader decision-making under uncertainty.[14] These developments marked the shift from ad hoc ordering to axiomatic, preference-based ranking, laying groundwork for later extensions in social choice without reliance on cardinal utilities.[13]

Evolution in Statistics and Social Choice
In social choice theory, the formal study of aggregating individual rankings into collective orderings originated in the late 18th century amid French Enlightenment debates on electoral reform. Jean-Charles de Borda introduced the Borda count in 1781, a method assigning points to candidates proportional to the number of alternatives ranked below them by voters, aiming to reflect the intensity of preferences through positional scoring.[15] Shortly thereafter, in 1785, the Marquis de Condorcet advanced pairwise majority comparisons, defining a Condorcet winner as the alternative that defeats every rival in head-to-head contests, while also demonstrating the Condorcet paradox wherein cyclic preferences prevent transitive social rankings despite individual transitivity.[16] These foundational approaches highlighted tensions between majority rule and coherent aggregation, influencing subsequent voting systems like approval and range voting. The 20th century brought rigorous impossibilities to social choice rankings. Kenneth Arrow's 1951 theorem proved that no non-dictatorial method can aggregate three or more ordinal rankings into a social welfare function satisfying unanimity, independence of irrelevant alternatives, and Pareto efficiency, underscoring inherent limitations in deriving transitive group orderings from diverse individual preferences.[17] This result spurred developments in Condorcet extensions and scoring rules, such as the Kemeny-Young method minimizing inversions relative to an ideal ranking, though computational intractability limited practical adoption for large electorates.[17] Parallel evolution occurred in statistics, where rankings provided robust, distribution-free tools for inference amid growing data from non-normal sources. 
Charles Spearman formulated the rank correlation coefficient in 1904, measuring monotonic associations between paired observations by correlating their ranks, thus avoiding parametric assumptions like linearity required by Pearson's correlation and proving effective for ordinal data in psychology and biometrics.[18] Building on this, Frank Wilcoxon developed the rank-sum test in 1945 for two-sample comparisons, ranking combined observations and summing ranks within groups to test location shifts without normality, offering greater power against heavy-tailed distributions than t-tests.[19]

Mid-century extensions generalized ranking to multi-group settings. William Kruskal and W. Allen Wallis introduced the Kruskal-Wallis test in 1952, an analog to one-way ANOVA that ranks all observations across k groups and computes between-group variance in average ranks, detecting differences in medians under minimal assumptions and applicable to heterogeneous variances.[20] These non-parametric advances, rooted in permutation invariance, proliferated in experimental sciences by the 1960s, as evidenced by their integration into standard texts like Hollander and Wolfe's, emphasizing empirical superiority in small samples or outliers over parametric rivals.[21]

The interplay between social choice and statistical ranking intensified post-1950s, with shared concerns over robustness to preference heterogeneity; statistical ranks informed social choice simulations of voting paradoxes, while aggregation challenges inspired rank-based robust estimation in econometrics, such as median-based orderings over means.[22] This convergence yielded hybrid methods, like probabilistic rankings via bootstrap resampling of preferences, prioritizing causal interpretability over idealized axioms.

Core Methods
Strategies for Handling Ties
In ranking systems, ties arise when two or more entities receive identical scores or evaluations, complicating the assignment of distinct ordinal positions. This issue is prevalent in statistical analysis, competitions, and decision-making processes, where unresolved ties can distort measures like rank correlations or aggregate standings. Standard approaches aim to maintain consistency with the underlying data distribution while minimizing bias in downstream computations, such as variance estimates in non-parametric tests.[23][24] The most common statistical strategy is the mid-rank or average rank method, where tied values are assigned the mean of the ranks they would occupy if ordered distinctly. For instance, if two observations tie for positions 3 and 4 in a list of five, both receive rank 3.5, preserving the overall mean rank while reducing variance compared to untied data. This approach is widely adopted in rank-order statistics and non-parametric methods like Spearman's rank correlation coefficient, as it avoids inflating or deflating the sum of ranks and facilitates unbiased estimation of parameters.[23][25] Empirical studies show it performs robustly for tied datasets, though it requires corrections for correlation coefficients to account for reduced variability.[25] Alternative methods include minimum rank assignment, which grants tied entities the lowest possible rank (e.g., both receive rank 3 in the above example, with subsequent items starting at 5), or maximum rank, assigning the highest (e.g., both get 4, skipping none). These are less common in pure statistical contexts due to their asymmetry, which can bias order statistics toward over- or under-ranking, but they appear in software implementations for specific applications like competition scoring. 
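The mid-rank, minimum, and maximum assignments described above can be compared side by side in a short sketch (ascending ranks; an illustration, not any statistical package's implementation):

```python
def rank_with(method, values):
    """Assign ascending ranks with a chosen tie strategy:
    'average' (mid-rank), 'min', or 'max'."""
    out = []
    for v in values:
        below = sum(o < v for o in values)   # items strictly smaller
        ties = sum(o == v for o in values)   # tied group size, incl. itself
        if method == "average":
            out.append(below + (ties + 1) / 2)
        elif method == "min":
            out.append(below + 1)
        else:  # "max"
            out.append(below + ties)
    return out

vals = [7, 7, 8, 9, 9]
print(rank_with("average", vals))  # [1.5, 1.5, 3.0, 4.5, 4.5]
print(rank_with("min", vals))      # [1, 1, 3, 4, 4]
print(rank_with("max", vals))      # [2, 2, 3, 5, 5]
```

The mid-rank row preserves the sum of ranks (15, the same as 1+2+3+4+5), while the min and max rows shift it down and up respectively, which is the bias the text describes.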
Dense ranking assigns identical ranks without skipping (e.g., 3, 3, 4), preserving sequential order and minimizing gaps, while standard competition ranking skips ranks (e.g., 3, 3, 5) to reflect the "lost" positions. The choice impacts aggregation: dense ranking suits dense datasets, whereas competition ranking aligns with ordinal scarcity in events like athletics.[26][24]

In domains requiring decisive outcomes, such as matching algorithms or resource allocation, ties are often resolved via hierarchical tie-breakers (secondary criteria like additional metrics, random lotteries, or predefined priorities) or stochastic methods like single or multiple tie-breaking lotteries. For example, in school choice mechanisms, single tie-breaking uses a uniform random order across all ties, while multiple tie-breaking applies independent lotteries per group, with hybrid variants shown to dominate in efficiency under certain dominance criteria. These prevent indeterminacy but introduce variability, necessitating evaluation against stability metrics. Random tie-breaking, while equitable in expectation, can amplify noise in small samples and is critiqued for lacking reproducibility unless seeded deterministically.[27][28]

Selection of a strategy depends on the ranking's purpose: statistical neutrality favors mid-ranks for preserving distributional properties, while operational contexts prioritize resolvability via tie-breakers to enable actions like winner selection. No universal method eliminates all distortions, as ties inherently compress information, and empirical validation, such as comparing correlation coefficients pre- and post-correction, is recommended for quantitative rankings.[25][29]

Statistical and Non-Parametric Techniques
Statistical techniques for ranking often involve parametric models that assign underlying probabilities or strengths to items, enabling inference on rankings from observed preferences or comparisons. The Bradley-Terry model, introduced in 1952, posits that in pairwise comparisons, the probability that item i outranks item j equals p_i / (p_i + p_j), where p_i represents the latent strength of item i.[30] Parameter estimates are obtained via maximum likelihood, allowing derivation of overall rankings by ordering the p_i. This model underpins applications in sports ratings and paired preference studies, assuming independence of comparisons.[31] The Plackett-Luce model extends this framework to full or partial rankings by modeling the probability of a specific ordering as the product, over successive positions, of the strength of the chosen item divided by the sum of strengths of the remaining items: P(s) = prod_k [ p_s(k) / (p_s(k) + p_s(k+1) + ... + p_s(m)) ], where s is the ranking of the m items.[32] Developed independently by Plackett in 1975 and generalizing Luce's 1959 choice axiom, it accommodates ties and incomplete data through extensions.[33] Estimation typically uses iterative methods like minorization-maximization, with applications in recommender systems and sensory evaluation.[34]

Non-parametric techniques, by contrast, eschew distributional assumptions, relying instead on ranks or permutations for robustness against outliers and non-normality. Spearman's rank correlation coefficient, rho = 1 - 6 * sum(d_i^2) / (n(n^2 - 1)), where the d_i are the rank differences between paired observations, quantifies monotonic association between two rankings, serving as the Pearson correlation on ranked data.[35] It tests concordance without assuming linearity, with significance assessed via permutation or a t-approximation for large n.[36] Kendall's tau, another rank-based measure, counts concordant and discordant pairs: tau = (C - D) / (n(n - 1)/2), where C and D are the numbers of agreeing and disagreeing pairs across the two rankings.[37] Developed by Kendall in 1938, with a 1945 refinement (tau-b) that adjusts for ties, it offers sensitivity to order inversions rather than Spearman's distance-based measure.
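Both coefficients follow directly from their definitions. The sketch below is illustrative and computes the untied forms (tau-a, with no tie correction):

```python
def spearman_rho(x, y):
    """Spearman's rho: 1 - 6*sum(d_i^2) / (n*(n^2 - 1)), no ties assumed."""
    def ranks(v):
        order = sorted(v)
        return [order.index(a) + 1 for a in v]
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))

def kendall_tau(x, y):
    """Kendall's tau-a: (concordant - discordant) / (n choose 2)."""
    n = len(x)
    c = d = 0
    for i in range(n):
        for j in range(i + 1, n):
            s = (x[i] - x[j]) * (y[i] - y[j])
            if s > 0:
                c += 1
            elif s < 0:
                d += 1
    return (c - d) / (n * (n - 1) / 2)

x = [1, 2, 3, 4, 5]
y = [2, 1, 3, 5, 4]   # two adjacent swaps relative to x
print(spearman_rho(x, y))  # 0.8
print(kendall_tau(x, y))   # 0.6
```

The two adjacent swaps produce two discordant pairs out of ten, so tau drops to 0.6, while the small squared rank differences keep rho at 0.8, illustrating tau's sharper sensitivity to inversions.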
Both are distribution-free, with exact p-values computable via hypergeometric probabilities for small samples, and find use in validating aggregated rankings or detecting preference shifts.[38] For hypothesis testing on rankings, non-parametric procedures like the Friedman test extend to multiple related samples, ranking items within blocks and comparing average ranks via a chi-squared approximation: Q = 12 / (n * k * (k + 1)) * sum_j(R_j^2) - 3 * n * (k + 1), where k items are ranked by each of n blocks and R_j is the sum of ranks received by item j.[39] Post-hoc pairwise Wilcoxon signed-rank tests assess specific differences, preserving ordinal structure without parametric forms. These methods, detailed in texts on ranking statistics, prioritize empirical rank distributions over modeled probabilities.[40]

Computational Approaches
Algorithmic Ranking Models
Algorithmic ranking models refer to deterministic computational frameworks that generate total or partial orderings from input data, such as numerical scores, pairwise preferences, or graph structures, by applying fixed rules or optimization procedures rather than data-driven parameter learning. These models prioritize efficiency and interpretability, often solving ranking as a sorting, aggregation, or minimization problem. Common implementations include score aggregation, pairwise tournament resolution, and iterative propagation methods, with roots in operations research and graph theory. Unlike statistical models, they emphasize exact or heuristic solutions to well-defined objectives, though many face challenges from computational intractability for large inputs.[41]

Score-based ranking algorithms assign a scalar utility or relevance value to each item and sort them in descending order, providing a straightforward mechanism for ordinal output. For instance, in information retrieval, the Vector Space Model computes cosine similarity between query and document term vectors, weighting terms by frequency and inverse document frequency (TF-IDF), to rank documents by projected relevance; this approach, formalized in 1975, scales linearly with vector dimensions after indexing. Similarly, BM25, an enhancement of the binary independence model introduced in the 1990s, refines probabilistic scoring with term saturation functions to mitigate length bias, achieving state-of-the-art performance on benchmarks like TREC without iterative training.
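A toy sketch of TF-IDF cosine ranking in the Vector Space Model style; the corpus and query are invented for illustration, and no stemming or length normalization is applied:

```python
import math
from collections import Counter

def tfidf_rank(query, docs):
    """Rank documents by cosine similarity of TF-IDF vectors
    (minimal Vector Space Model sketch)."""
    n = len(docs)
    tokenized = [d.lower().split() for d in docs]
    df = Counter(term for doc in tokenized for term in set(doc))
    idf = {t: math.log(n / df[t]) for t in df}

    def vec(tokens):
        tf = Counter(tokens)
        return {t: tf[t] * idf.get(t, 0.0) for t in tf}

    def cosine(u, v):
        dot = sum(w * v.get(t, 0.0) for t, w in u.items())
        nu = math.sqrt(sum(w * w for w in u.values()))
        nv = math.sqrt(sum(w * w for w in v.values()))
        return dot / (nu * nv) if nu and nv else 0.0

    q = vec(query.lower().split())
    scores = [(cosine(q, vec(d)), i) for i, d in enumerate(tokenized)]
    return [i for _, i in sorted(scores, reverse=True)]

docs = ["ranking of search results", "cooking pasta at home",
        "search engine ranking methods"]
print(tfidf_rank("search ranking", docs))  # the off-topic document ranks last
```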
These methods assume additivity of features, enabling O(n log n) sorting via standard algorithms like quicksort, but they falter when scores lack cardinal meaning or interdependencies exist.[42][43]

Optimization-based models, such as the Kemeny-Young method, aggregate multiple partial rankings by selecting the total order that minimizes the sum of pairwise disagreements (equivalent to Kendall-tau distance) across inputs, formulated as a minimum feedback arc set problem on a weighted tournament graph. Proposed in 1959, it yields a Condorcet-efficient solution when cycles are absent but is NP-hard in general, with exact branch-and-bound solvers achieving feasibility for up to 20-30 items via symmetry reduction and bounding techniques as of 2023. Approximations, including local search or integer programming relaxations, trade optimality for scalability, often within 5-10% of the global minimum on real-world datasets like elections or sports outcomes. This approach excels in consensus-seeking scenarios but requires complete pairwise data, and it remains subject to the limitations that Arrow's impossibility theorem implies for any rank-aggregation rule, including vulnerability to strategic manipulation.[44]

Graph-based iterative algorithms propagate rankings through network structures, computing stationary scores via matrix operations or fixed-point convergence. PageRank, deployed by Google since 1998, models web pages as nodes in a Markov chain, assigning authority proportional to incoming links damped by a teleportation factor (typically 0.15), solved via power iteration in O(E log n) time for sparse graphs with E edges. Variants like HITS (Hyperlink-Induced Topic Search), introduced in 1998, alternately optimize hub and authority eigenvectors, converging in 10-20 iterations for typical corpora but susceptible to spam through link farms. These models capture transitive influence, outperforming flat scoring in directed acyclic graphs, yet demand damping to ensure ergodicity and must handle dangling nodes explicitly.
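The power-iteration procedure for PageRank can be sketched as follows. This is a toy sketch under simplifying assumptions: a fixed iteration count rather than a convergence tolerance, a plain adjacency dict in which every node appears as a key, and uniform redistribution of dangling-node mass.

```python
# Power-iteration PageRank sketch with damping 0.85 (teleportation 0.15).
def pagerank(links, damping=0.85, iters=50):
    """links: dict mapping each node to its list of outbound neighbours."""
    nodes = list(links)
    n = len(nodes)
    rank = {u: 1.0 / n for u in nodes}          # uniform start
    for _ in range(iters):
        new = {u: (1.0 - damping) / n for u in nodes}   # teleportation mass
        for u, outs in links.items():
            if outs:                            # distribute rank along out-links
                share = damping * rank[u] / len(outs)
                for v in outs:
                    new[v] += share
            else:                               # dangling node: spread uniformly
                for v in nodes:
                    new[v] += damping * rank[u] / n
        rank = new
    return rank
```

On a three-node toy graph where "a" links to "b", "b" links to "a" and "c", and "c" is dangling, node "b" ends up ranked highest because it receives all of "a"'s outgoing mass; the scores always sum to 1.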
Empirical evaluations on citation networks confirm their robustness to noise when link quality is high.[45][46]

Learning to Rank in Machine Learning
Learning to rank (LTR) constitutes a supervised machine learning framework designed to train models that generate relevance-based orderings of candidate items, such as documents or products, in response to a query. The core objective is to learn a scoring function f(x) from training data comprising queries q, feature vectors x_i for candidate items, and associated relevance labels y_i (typically ordinal grades, from 0 for irrelevant to 4 for a perfect match), such that sorting items by predicted scores f(x_i) minimizes discrepancies with the ground-truth ranking implied by the labels.[47] This approach addresses the ordinal nature of rankings, where absolute scores matter less than relative positions, and has been empirically validated in information retrieval tasks through datasets like LETOR, which provide thousands of query-document pairs for benchmarking.[48]

LTR methods diverge into three paradigms based on how they formulate the loss during training: pointwise, pairwise, and listwise. Pointwise methods regress or classify individual items' relevance scores independently, akin to standard regression tasks, using losses like squared error or cross-entropy on f(x_i) versus y_i; subsequent ranking follows from sorting these scores. This simplifies computation but overlooks inter-item dependencies, often yielding suboptimal performance on ranking metrics, as evidenced by comparisons in benchmark evaluations where pointwise models lag behind relational approaches by 5-10% in normalized discounted cumulative gain (NDCG).[49] Pairwise approaches, by contrast, focus on relative orders by considering document pairs (i, j) where y_i > y_j, optimizing a loss that penalizes violations of f(x_i) > f(x_j), such as the pairwise logistic loss or the hinge loss in support vector machines for ranking (RankSVM).
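A minimal sketch of the pairwise logistic loss used in RankNet-style training, here applied to a plain linear scorer rather than a neural network; the feature vectors and relevance labels are purely illustrative.

```python
# Pairwise logistic (RankNet-style) loss for a linear scorer f(x) = w.x.
import math

def pairwise_logistic_loss(w, features, labels):
    """Average of -log sigmoid(f(x_i) - f(x_j)) over all pairs with y_i > y_j."""
    scores = [sum(wk * xk for wk, xk in zip(w, x)) for x in features]
    loss, pairs = 0.0, 0
    for i, yi in enumerate(labels):
        for j, yj in enumerate(labels):
            if yi > yj:                          # item i should outrank item j
                margin = scores[i] - scores[j]
                loss += math.log(1.0 + math.exp(-margin))
                pairs += 1
    return loss / pairs if pairs else 0.0
```

A scorer that orders items consistently with the labels drives each margin positive and the loss below log 2, the value attained by a constant (uninformative) scorer; gradient descent on this quantity is what adjusts the model weights in pairwise training.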
Pioneered in models like RankNet, which employs neural networks with gradient-based updates approximating probabilistic pairwise preferences via cross-entropy, these methods directly enforce ordinal constraints and have demonstrated superior empirical results in search engine applications, reducing ranking errors by capturing pairwise swaps more effectively than pointwise alternatives.[47][50] Listwise methods extend this by treating the entire permutation of items as the training instance, directly approximating ranking metrics like NDCG or mean average precision (MAP) through surrogate losses, such as soft-ranking formulations or permutation probability distributions (e.g., ListNet's Kullback-Leibler divergence between predicted and ideal list probabilities). These capture global list structure and align training with the evaluation objective, often outperforming pairwise methods in large-scale evaluations; for instance, LambdaMART, a gradient-boosted tree variant incorporating listwise lambda gradients, achieved state-of-the-art NDCG scores on Yahoo!
Learning to Rank datasets in 2009 competitions.[47][51] Modern implementations, including those in libraries like XGBoost and LightGBM, support listwise objectives with NDCG approximations, enabling scalable training on millions of examples while maintaining statistical consistency with evaluation metrics.[51]

Evaluation in LTR emphasizes position-aware metrics over classification accuracy, with NDCG prioritizing higher ranks for highly relevant items via discounted gains (e.g., NDCG@10 discounts gains logarithmically with rank position, concentrating weight at the top) and MAP averaging precision across recall levels; empirical studies confirm these correlate better with user satisfaction in retrieval tasks than mean squared error.[49] Despite computational demands (listwise methods scale quadratically or worse without approximations), advances in gradient estimation and sampling have rendered LTR viable for production systems, as deployed in search engines since the mid-2000s.[48]

Applications in Specific Domains
Sports and Competitive Events
In sports and competitive events, rankings serve to order participants or teams by performance, influencing seeding, playoff qualification, and resource allocation. These systems aggregate outcomes from matches or competitions, often incorporating factors like opponent strength to mitigate schedule variability. Objective methods, such as rating systems and statistical models, predominate in professional leagues, while subjective polls persist in contexts like collegiate athletics where human judgment accounts for qualitative elements.[52][53]

Elo rating systems, originally developed for chess in the 1960s by Arpad Elo, adapt probabilistic models to predict match outcomes based on rating differentials, adjusting ratings after each game to reflect actual results against expectations. In soccer, FIFA has employed a modified Elo variant known as the "SUM" method since March 2018, updating each team's points as P_after = P_before + I × (W − W_e), where W is the actual result, W_e the expected result derived from the rating gap, and I a match-importance coefficient (e.g., friendlies carry importance 5-10, World Cup finals 60). This yields monthly global rankings for 210+ national teams, with Argentina holding the top spot as of October 2025 after their 2022 World Cup victory. Chess federations like FIDE update Elo ratings after rated games, with top players exceeding 2800 points, emphasizing pairwise comparisons over cumulative points.[54][55][56]

Tennis rankings, managed by the ATP for men, accumulate points from up to 19 tournaments over a rolling 52-week window, with Grand Slam winners earning 2000 points and ATP Masters 1000 champions 1000; older results drop off to prioritize recency. This rolling system ensures dynamic shifts, as seen in Novak Djokovic's record 428 weeks at No. 1 through 2024.
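The Elo update just described reduces to two short functions. This sketch uses the classic chess constants (a 400-point logistic scale and a fixed K of 32); FIFA's SUM variant substitutes its own scale and the match-importance coefficient I in place of K.

```python
# Elo-style rating update sketch: expected score from the rating gap,
# then an adjustment proportional to (actual - expected).
def elo_expected(r_a, r_b):
    """Expected score for player A against player B on a 400-point scale."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))

def elo_update(r_a, r_b, score_a, k=32):
    """New (A, B) ratings after A scores `score_a` (1 win, 0.5 draw, 0 loss)."""
    e_a = elo_expected(r_a, r_b)
    delta = k * (score_a - e_a)
    return r_a + delta, r_b - delta          # zero-sum exchange of points
```

Two equally rated players have an expected score of 0.5 each, so a win transfers K/2 = 16 points from loser to winner; an upset over a much stronger opponent transfers nearly the full K.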
In baseball, Major League Baseball uses win-loss records for divisional standings, supplemented by Pythagorean expectation to estimate underlying talent: expected win percentage = RS^1.83 / (RS^1.83 + RA^1.83), where RS denotes runs scored and RA runs allowed, revealing teams, such as the 2007 Boston Red Sox, whose expected wins exceeded their actual record en route to a World Series title.[57][58]

American college football rankings blend human and computational inputs: the Associated Press Poll, published since 1936, aggregates votes from journalists into a top-25 list, while the College Football Playoff committee, introduced in 2014, selects four semifinalists considering strength of schedule, head-to-head results, and conference championships alongside computer models like the Colley Matrix, which solves a linear system designed to remove scheduling bias from win-loss records. Ties in standings often resolve via head-to-head records, conference winning percentage, or multi-team tiebreakers prioritizing comparative victories. These methods, while predictive, face critiques for undervaluing intangibles like home-field advantage, prompting ongoing refinements toward hybrid objective-subjective frameworks.[53][59]

| Sport | Primary Ranking Method | Key Features | Example Citation |
|---|---|---|---|
| Soccer (FIFA) | Modified Elo (SUM) | Points exchange based on expected vs. actual outcome, match importance multiplier | [55] |
| Tennis (ATP) | Accumulated points (52-week rolling) | Tournament-specific awards, points counted from up to 19 events | [57] |
| Baseball (MLB) | Win-loss + Pythagorean expectation | Runs scored/allowed ratio for expected wins | [58] |
| College Football | Human polls + computer models | Votes adjusted for schedule strength | [53] |
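
The Pythagorean expectation in the baseball row above is a one-line computation; the exponent 1.83 follows the formula given in the text, and the sample run totals in the usage note are illustrative rather than taken from a real season.

```python
# Pythagorean expectation sketch: expected win fraction from run totals.
def pythagorean_win_pct(runs_scored, runs_allowed, exponent=1.83):
    """Expected win percentage = RS^e / (RS^e + RA^e)."""
    rs, ra = runs_scored ** exponent, runs_allowed ** exponent
    return rs / (rs + ra)
```

A team scoring 800 runs while allowing 700 projects to a win fraction of about 0.56 (roughly 91 wins over a 162-game season), while equal run totals yield exactly 0.5.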
