Pairwise comparison (psychology)
from Wikipedia

Pairwise comparison generally is any process of comparing entities in pairs to judge which entity of each pair is preferred, or has a greater amount of some quantitative property, or whether or not the two entities are identical. The method of pairwise comparison is used in the scientific study of preferences, attitudes, voting systems, social choice, public choice, requirements engineering and multiagent AI systems. In the psychology literature, it is often referred to as paired comparison.

Prominent psychometrician L. L. Thurstone first introduced a scientific approach to using pairwise comparisons for measurement in 1927, which he referred to as the law of comparative judgment. Thurstone linked this approach to psychophysical theory developed by Ernst Heinrich Weber and Gustav Fechner. Thurstone demonstrated that the method can be used to order items along a dimension such as preference or importance using an interval-type scale.

Mathematician Ernst Zermelo (1929) first described a model for pairwise comparisons for chess ranking in incomplete tournaments. Although it went uncredited for some time, it serves as the basis for methods such as the Elo rating system and is equivalent to the Bradley–Terry model that was proposed in 1952.

Overview

If an individual or organization expresses a preference between two mutually distinct alternatives, this preference can be expressed as a pairwise comparison. If the two alternatives are x and y, the following are the possible pairwise comparisons:

The agent prefers x over y: "x > y" or "xPy"

The agent prefers y over x: "y > x" or "yPx"

The agent is indifferent between both alternatives: "x = y" or "xIy"

Probabilistic models

In terms of modern psychometric theory, probabilistic models, which include Thurstone's approach (also called the law of comparative judgment), the Bradley–Terry–Luce (BTL) model, and general stochastic transitivity models,[1] are more aptly regarded as measurement models. The BTL model is often applied to pairwise comparison data to scale preferences. It is identical to Thurstone's model if the simple logistic function is used in place of the normal distribution that Thurstone used in applications of the model. Given a suitably chosen scale factor, the simple logistic function varies by less than 0.01 from the cumulative normal ogive across the range.
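
The closeness of the two curves can be checked numerically. The following is a minimal sketch, assuming a scale factor of 1.702 (a value commonly used for this approximation, but not one stated in the text above); the function names are illustrative only.

```python
import math

SCALE = 1.702  # assumed scale factor; the text above does not specify a value


def normal_ogive(x: float) -> float:
    """Cumulative distribution function of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def logistic(x: float) -> float:
    """Simple logistic function."""
    return 1.0 / (1.0 + math.exp(-x))


def max_gap(scale: float = SCALE, lo: float = -6.0, hi: float = 6.0,
            steps: int = 12001) -> float:
    """Largest absolute difference between the two curves on an even grid."""
    width = (hi - lo) / (steps - 1)
    grid = (lo + k * width for k in range(steps))
    return max(abs(normal_ogive(x) - logistic(scale * x)) for x in grid)


gap = max_gap()
print(f"maximum gap on the grid: {gap:.4f}")
```

Changing `SCALE` shows how strongly the agreement depends on the choice of scale factor.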

In the BTL model, the probability that object j is judged to have more of an attribute than object i is:

$$\Pr\{X_{ji} = 1\} = \frac{e^{\delta_j - \delta_i}}{1 + e^{\delta_j - \delta_i}} = \sigma(\delta_j - \delta_i),$$

where $\delta_i$ is the scale location of object $i$; $\sigma$ is the logistic function (the inverse of the logit). For example, the scale location might represent the perceived quality of a product, or the perceived weight of an object.
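
As an illustration, the sketch below evaluates the BTL probability as the logistic function of the difference in scale locations and estimates scale locations from a table of win counts. The fitting routine is one common iterative scheme for this model, chosen here for brevity; the function names, the win counts, and the assumption that every object wins at least once are all hypothetical.

```python
import math


def btl_probability(delta_j: float, delta_i: float) -> float:
    """Probability that object j is judged above object i."""
    return 1.0 / (1.0 + math.exp(-(delta_j - delta_i)))


def fit_btl(wins, iterations: int = 500):
    """Estimate scale locations from wins[i][j] = times i was chosen over j.

    Assumes every object wins at least once and all objects are connected
    through comparisons; otherwise the estimates are not finite.
    """
    n = len(wins)
    strength = [1.0] * n
    for _ in range(iterations):
        updated = []
        for i in range(n):
            total_wins = sum(wins[i][j] for j in range(n) if j != i)
            denom = sum((wins[i][j] + wins[j][i]) / (strength[i] + strength[j])
                        for j in range(n) if j != i)
            updated.append(total_wins / denom)
        # Fix the arbitrary origin: make the log-strengths sum to zero.
        log_mean = sum(math.log(s) for s in updated) / n
        strength = [s / math.exp(log_mean) for s in updated]
    return [math.log(s) for s in strength]


# Hypothetical counts: 60 judgments per pair of three objects.
wins = [
    [0, 20, 12],
    [40, 0, 20],
    [48, 40, 0],
]
deltas = fit_btl(wins)
print([round(d, 3) for d in deltas])
print(round(btl_probability(deltas[2], deltas[0]), 3))
```

Only differences between scale locations are identified by the model, which is why the sketch centres the estimates.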

The BTL model, the Thurstonian model, and the Rasch model for measurement are all closely related and belong to the same class of stochastic transitivity models.

Thurstone used the method of pairwise comparisons as an approach to measuring perceived intensity of physical stimuli, attitudes, preferences, choices, and values. He also studied implications of the theory he developed for opinion polls and political voting (Thurstone, 1959).

Transitivity

For a given decision agent, if the information, objective, and alternatives used by the agent remain constant, then it is generally assumed that pairwise comparisons over those alternatives by the decision agent are transitive. Most agree upon what transitivity is, though there is debate about the transitivity of indifference. The rules of transitivity are as follows for a given decision agent.

  • If xPy and yPz, then xPz
  • If xPy and yIz, then xPz
  • If xIy and yPz, then xPz
  • If xIy and yIz, then xIz

This corresponds to (xPy or xIy) being a total preorder, P being the corresponding strict weak order, and I being the corresponding equivalence relation.
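
The four rules can be checked mechanically for a small set of alternatives. The sketch below is illustrative only: it assumes preferences are stored as a set of ordered pairs `(x, y)` meaning "x is preferred to or indifferent with y", and the helper names are hypothetical.

```python
from itertools import permutations


def classify(weak, x, y):
    """Return 'P' if x is strictly preferred to y, 'I' if indifferent, else None."""
    forward, backward = (x, y) in weak, (y, x) in weak
    if forward and backward:
        return "I"
    if forward:
        return "P"
    return None


# The four transitivity rules: (relation of x,y ; relation of y,z) -> relation of x,z
RULES = {("P", "P"): "P", ("P", "I"): "P", ("I", "P"): "P", ("I", "I"): "I"}


def satisfies_rules(items, weak) -> bool:
    """True if every ordered triple of distinct items obeys the four rules."""
    for x, y, z in permutations(items, 3):
        premise = (classify(weak, x, y), classify(weak, y, z))
        if premise in RULES and classify(weak, x, z) != RULES[premise]:
            return False
    return True


items = ["a", "b", "c"]
consistent = {("a", "b"), ("b", "c"), ("c", "b"), ("a", "c")}  # aPb, bIc, aPc
cyclic = {("a", "b"), ("b", "c"), ("c", "b"), ("c", "a")}      # aPb, bIc, cPa

print(satisfies_rules(items, consistent))
print(satisfies_rules(items, cyclic))
```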

Probabilistic models also give rise to stochastic variants of transitivity, all of which can be verified to satisfy (non-stochastic) transitivity within the bounds of errors of estimates of scale locations of entities. Thus, decisions need not be deterministically transitive in order to apply probabilistic models. However, transitivity will generally hold for a large number of comparisons if models such as the BTL can be effectively applied.

Using a transitivity test[2] one can investigate whether a data set of pairwise comparisons contains a higher degree of transitivity than expected by chance.
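
One simple way to quantify intransitivity in a complete set of strict comparisons, not necessarily the test cited above, is to count cyclic triads (x beats y, y beats z, z beats x) and compare the count with what chance would produce. If every comparison were decided by a fair coin, each triple would be cyclic with probability 1/4. The sketch below assumes no ties and exactly one recorded outcome per pair; the names are illustrative.

```python
import math
from itertools import combinations


def count_cyclic_triads(items, beats) -> int:
    """Count triples of items whose pairwise outcomes form a cycle.

    `beats` is a set of ordered pairs (winner, loser).
    """
    cyclic = 0
    for x, y, z in combinations(items, 3):
        one_way = (x, y) in beats and (y, z) in beats and (z, x) in beats
        other_way = (y, x) in beats and (z, y) in beats and (x, z) in beats
        if one_way or other_way:
            cyclic += 1
    return cyclic


def expected_cyclic_by_chance(n_items: int) -> float:
    """Expected number of cyclic triads when each outcome is a fair coin flip."""
    return math.comb(n_items, 3) / 4.0


items = ["a", "b", "c", "d"]
transitive = {("a", "b"), ("a", "c"), ("a", "d"), ("b", "c"), ("b", "d"), ("c", "d")}
with_cycle = {("a", "b"), ("b", "c"), ("c", "a"), ("a", "d"), ("b", "d"), ("c", "d")}

print(count_cyclic_triads(items, transitive), expected_cyclic_by_chance(len(items)))
print(count_cyclic_triads(items, with_cycle))
```

A count well below the chance expectation suggests more transitivity than random responding would give; a formal test would also need the sampling distribution of the count.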

Argument for intransitivity of indifference

Some contend that indifference is not transitive. Consider the following example. Suppose you like apples and you prefer apples that are larger. Now suppose there exists an apple A, an apple B, and an apple C which have identical intrinsic characteristics except for the following. Suppose B is larger than A, but it is not discernible without an extremely sensitive scale. Further suppose C is larger than B, but this also is not discernible without an extremely sensitive scale. However, the difference in sizes between apples A and C is large enough that you can discern that C is larger than A without a sensitive scale. In psychophysical terms, the size difference between A and C is above the just noticeable difference ('jnd') while the size differences between A and B and B and C are below the jnd.

You are confronted with the three apples in pairs without the benefit of a sensitive scale. Therefore, when presented A and B alone, you are indifferent between apple A and apple B; and you are indifferent between apple B and apple C when presented B and C alone. However, when the pair A and C are shown, you prefer C over A.
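
The apple example can be reproduced with a threshold rule. In the sketch below the sizes and the just noticeable difference are made-up numbers, and the function name is hypothetical; it only illustrates how indifference defined by a threshold fails to be transitive.

```python
JND = 1.0  # hypothetical just noticeable difference, in arbitrary size units


def compare(size_x: float, size_y: float, jnd: float = JND) -> str:
    """Judge two sizes when differences below the jnd cannot be perceived."""
    difference = size_x - size_y
    if abs(difference) < jnd:
        return "indifferent"
    return "prefer first" if difference > 0 else "prefer second"


# Hypothetical apple sizes: each step is below the jnd, two steps exceed it.
apple_a, apple_b, apple_c = 100.0, 100.6, 101.2

print(compare(apple_a, apple_b))  # A versus B
print(compare(apple_b, apple_c))  # B versus C
print(compare(apple_a, apple_c))  # A versus C
```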

Preference orders

If pairwise comparisons are in fact transitive with respect to the four rules mentioned above, then pairwise comparisons for a list of alternatives ($A_1, A_2, A_3, \ldots, A_{n-1}$, and $A_n$) can take the form:

$$A_1 \;(> \text{XOR} =)\; A_2 \;(> \text{XOR} =)\; A_3 \;(> \text{XOR} =)\; \ldots \;(> \text{XOR} =)\; A_{n-1} \;(> \text{XOR} =)\; A_n$$

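For example, if there are three alternatives a, b, and c, then the possible preference orders are:

  • a > b > c
  • a > c > b
  • b > a > c
  • b > c > a
  • c > a > b
  • c > b > a
  • a > b = c
  • b > a = c
  • c > a = b
  • a = b > c
  • a = c > b
  • b = c > a
  • a = b = c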

If the number of alternatives is n, and indifference is not allowed, then the number of possible preference orders for any given n-value is n!. If indifference is allowed, then the number of possible preference orders is the number of total preorders. It can be expressed as a function of n:

$$\sum_{k=1}^{n} k!\, S_2(n, k),$$

where $S_2(n, k)$ is the Stirling number of the second kind.
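
The count of total preorders, the sum over k from 1 to n of k! times the Stirling number of the second kind S2(n, k), can be evaluated directly. The sketch below uses the standard recurrence S2(n, k) = k·S2(n−1, k) + S2(n−1, k−1); the function names are illustrative.

```python
import math
from functools import lru_cache


@lru_cache(maxsize=None)
def stirling2(n: int, k: int) -> int:
    """Stirling number of the second kind: partitions of n items into k blocks."""
    if n == k:
        return 1
    if k == 0 or k > n:
        return 0
    return k * stirling2(n - 1, k) + stirling2(n - 1, k - 1)


def strict_orders(n: int) -> int:
    """Number of preference orders when indifference is not allowed."""
    return math.factorial(n)


def total_preorders(n: int) -> int:
    """Number of preference orders when indifference is allowed."""
    return sum(math.factorial(k) * stirling2(n, k) for k in range(1, n + 1))


for n in range(1, 6):
    print(n, strict_orders(n), total_preorders(n))
```

For three alternatives the two counts are 6 and 13 respectively.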

Applications

One important application of pairwise comparisons is the widely used Analytic Hierarchy Process, a structured technique for helping people deal with complex decisions. It uses pairwise comparisons of tangible and intangible factors to construct ratio scales that are useful in making important decisions.[3][4]
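
To give a sense of how ratio-scale weights can be derived from a matrix of pairwise judgments, the sketch below uses normalised row geometric means. This is a simplification adopted here for brevity rather than the full procedure of the Analytic Hierarchy Process, which is usually described in terms of the principal eigenvector of the matrix; the two agree when the judgments are perfectly consistent. The matrix values and the function name are hypothetical.

```python
import math


def ahp_weights(matrix):
    """Approximate priority weights from a reciprocal judgment matrix.

    matrix[i][j] states how many times more important factor i is than
    factor j, so matrix[j][i] is expected to equal 1 / matrix[i][j].
    """
    n = len(matrix)
    geometric_means = [math.prod(row) ** (1.0 / n) for row in matrix]
    total = sum(geometric_means)
    return [g / total for g in geometric_means]


# Hypothetical, perfectly consistent judgments for three factors.
judgments = [
    [1.0, 2.0, 6.0],
    [1 / 2, 1.0, 3.0],
    [1 / 6, 1 / 3, 1.0],
]
weights = ahp_weights(judgments)
print([round(w, 3) for w in weights])
```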

Another important application is the Potentially All Pairwise RanKings of all possible Alternatives (PAPRIKA) method.[5] The method involves the decision-maker repeatedly pairwise comparing and ranking alternatives defined on two criteria or attributes at a time and involving a trade-off, and then, if the decision-maker chooses to continue, pairwise comparisons of alternatives defined on successively more criteria. From the pairwise rankings, the relative importance of the criteria to the decision-maker, represented as weights, is determined.

from Grokipedia
Pairwise comparison is a psychophysical and scaling technique in psychology wherein participants are presented with successive pairs of stimuli, attitudes, or alternatives and required to judge which one is superior, larger, or preferred on a specified dimension, yielding choice proportions that can be modeled to derive quantitative scales. The method assumes that judgments arise from underlying psychological continua on which discriminable differences are normally distributed, allowing ordinal pairwise data to be converted into approximate interval scales through inverse normal transformations of choice frequencies. Formalized by L. L. Thurstone in his 1927 Law of Comparative Judgment, the approach addressed limitations of earlier psychophysical methods like Fechnerian constants by accommodating fallible, context-dependent human discrimination rather than assuming perfect transitivity. Thurstone's model, $P(A > B) = \Phi(\delta_A - \delta_B)$, where $\Phi$ is the cumulative normal distribution function and $\delta$ represents scale values, enabled applications beyond sensory intensities to subjective domains such as intelligence testing and social attitudes, influencing subsequent frameworks like the Bradley–Terry logistic model for handling ties and intransitivities. Notable for revealing empirical regularities in judgment errors and cyclic preferences, which challenge rational axioms, the technique has since been extended to adaptive testing, though critiques highlight its sensitivity to violations of the independence and unidimensionality assumptions, which have prompted further refinements. Empirical validations in controlled experiments underscore its utility in studying perceptual and preferential hierarchies, prioritizing observable data over reports.

Introduction

Definition and Principles

In psychology, pairwise comparison is a method of measurement in which subjects evaluate two stimuli, alternatives, or attributes at a time, selecting which one exhibits greater magnitude, preference, or quality on a specified dimension. This approach, rooted in psychophysics and applied to domains such as attitude scaling, generates data from binary judgments that can be aggregated to infer underlying psychological continua or preference orders. Unlike holistic rankings, pairwise comparisons decompose complex evaluations into simpler, direct contrasts while still enabling quantitative scaling.

The foundational principle is Thurstone's Law of Comparative Judgment, introduced in 1927, which models judgments as probabilistic outcomes influenced by latent psychological values. For stimuli $i$ and $j$ with scale values $\delta_i$ and $\delta_j$, the probability that $j$ is preferred over $i$ is given by the cumulative normal distribution of their difference, accounting for variability in perceptual or judgmental errors assumed to be normally distributed. This yields $\Pr\{X_{ji} = 1\} = \Phi(\delta_j - \delta_i)$, where $\Phi$ is the standard normal CDF; approximations using the logistic function $\sigma(z) = \frac{1}{1 + e^{-z}}$ are common for computational simplicity and closely align with empirical data. The method assumes stimuli are discriminable and judgments are independent across pairs, allowing scale construction from the proportion of affirmative responses, for example via unfolding techniques.

Key principles include unidimensionality, where comparisons reflect a single underlying trait, and the comparability of judgments across subjects or sessions, often addressed through group averaging or through simplified models like Case V of Thurstone's framework, which assumes equal dispersions across stimuli. Empirical validation relies on goodness-of-fit tests against predicted proportions, with deviations signaling multidimensionality or other effects. This framework underpins later extensions, emphasizing that raw preference frequencies alone do not suffice without probabilistic modeling to derive interval-level measures.

Historical Development

The method of pairwise comparisons emerged in psychology through early psychophysical experiments aimed at quantifying subjective sensory experiences. Ernst Heinrich Weber's investigations into just noticeable differences in the 1830s laid the groundwork by emphasizing comparative judgments of stimulus intensities, though not strictly pairwise ones. Gustav Theodor Fechner formalized pairwise comparisons in his 1860 work Elements of Psychophysics, using them to derive functions from direct head-to-head assessments of sensory magnitudes, such as weight or brightness differences, assuming judgments followed probabilistic discrimination laws akin to Weber's ratio.

A pivotal advancement occurred in 1927 when L. L. Thurstone introduced the Law of Comparative Judgment, extending psychophysical principles to broader psychological scaling, including attitudes and preferences. Thurstone posited that pairwise choices reflect differences along an underlying psychological continuum, modeled probabilistically via the cumulative normal distribution, where the proportion of preferences for one stimulus over another estimates scale separations. This framework, detailed in his seminal paper, enabled quantitative measurement of subjective values, such as the relative seriousness of crimes through aggregated judge responses to paired offenses.

Thurstone's approach addressed limitations in earlier absolute scaling by leveraging comparative data to mitigate individual biases, producing interval-level scales via the inverse normal transformation of preference proportions. Early applications included social attitude surveys, influencing fields like educational testing. Refinements followed, such as Warren Torgerson's 1958 geometric interpretations for multidimensional cases, but Thurstone's 1927 formulation remains foundational for handling judgment variability and for enabling empirical validation of its underlying assumption of discriminable continua.

Theoretical Models

Probabilistic Frameworks

Thurstone's Law of Comparative Judgment, formulated in 1927, provides a foundational probabilistic approach to pairwise comparisons by modeling stimuli as points on an underlying psychological continuum subject to random perceptual error. Each stimulus is associated with a distribution, often assumed Gaussian, representing noisy psychological representations; the probability of judging stimulus $i$ greater than $j$ equals the probability that a random draw from $i$'s distribution exceeds one from $j$'s. In the restrictive Case V, which assumes identical variances across stimuli, this simplifies to $P(i > j) = \Phi\left( \frac{\mu_i - \mu_j}{\sqrt{2}\,\sigma} \right)$.
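
The Case V relationship can be run in both directions: from scale values to choice probabilities, and from observed proportions back to scale values via the inverse normal transformation. The following is a minimal sketch under stated assumptions: the proportions are strictly between 0 and 1, the row-averaging of z-scores is one conventional way of recovering Case V scale values, and the function names and example values are hypothetical.

```python
import math
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def case_v_probability(mu_i: float, mu_j: float, sigma: float = 1.0) -> float:
    """P(i > j) = Phi((mu_i - mu_j) / (sqrt(2) * sigma))."""
    return STANDARD_NORMAL.cdf((mu_i - mu_j) / (math.sqrt(2.0) * sigma))


def case_v_scale(proportions):
    """Recover centred scale values, in units of sqrt(2) * sigma.

    proportions[i][j] is the proportion of judgments in which i was chosen
    over j; every off-diagonal entry must lie strictly between 0 and 1.
    """
    n = len(proportions)
    scale = []
    for i in range(n):
        z_scores = [STANDARD_NORMAL.inv_cdf(proportions[i][j])
                    for j in range(n) if j != i]
        scale.append(sum(z_scores) / n)  # the diagonal contributes z = 0
    return scale


# Hypothetical scale values, used to generate error-free proportions.
true_mu = [0.0, 0.5, 1.2]
proportions = [[case_v_probability(a, b) for b in true_mu] for a in true_mu]
scale = case_v_scale(proportions)
print([round(s * math.sqrt(2.0), 3) for s in scale])
```

With error-free proportions the recovered values reproduce the differences between the generating values; with real data they are estimates whose fit should be checked against the observed proportions.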