All-pairs testing
In computer science, all-pairs testing or pairwise testing is a combinatorial method of software testing that, for each pair of input parameters to a system (typically, a software algorithm), tests all possible discrete combinations of those parameters. Using carefully chosen test vectors, this can be done much faster than an exhaustive search of all combinations of all parameters by "parallelizing" the tests of parameter pairs.[1]
Rationale
In most cases, a single input parameter or an interaction between two parameters is what causes a program's bugs.[2] Bugs involving interactions between three or more parameters are both progressively less common[3] and progressively more expensive to find; such testing has as its limit the testing of all possible inputs.[4] Thus, a combinatorial technique for picking test cases like all-pairs testing is a useful cost-benefit compromise that enables a significant reduction in the number of test cases without drastically compromising functional coverage.[5]
More rigorously, assume that a test case has $N$ parameters given by the set $P = \{P_1, P_2, \ldots, P_N\}$. The range of parameter $P_i$ is $R(P_i) = R_i$, and let $|R_i| = n_i$. The number of all possible test cases is then $\prod_{i=1}^{N} n_i$. Writing the code so that it deals with conditions taking only two parameters at a time can reduce the number of needed test cases.

To demonstrate, suppose there are three parameters $X$, $Y$ and $Z$. We can use a predicate $P(X, Y, Z)$ of order 3, which takes all three as input, or rather three different order-2 predicates of the form $p(u, v)$. $P(X, Y, Z)$ can be written in an equivalent form using $p_{xy}(X, Y)$, $p_{yz}(Y, Z)$ and $p_{zx}(Z, X)$, where the comma denotes any combination. If the code is written as conditions taking "pairs" of parameters, then the collection of range sizes $M = \{n_i\}$ can be a multiset, because several parameters may have the same number of choices.

Let $\max(M)$ denote one of the maxima of the multiset $M$. The number of pair-wise test cases on this test function is then:

$$T = \max(M) \times \max(M \setminus \{\max(M)\}).$$

Therefore, if $n = \max(M)$ and $m = \max(M \setminus \{\max(M)\})$, the number of tests is typically $O(nm)$, where $n$ and $m$ are the numbers of possibilities for each of the two parameters with the most choices, and this can be quite a lot less than the exhaustive $n_1 \times n_2 \times \cdots \times n_N$.
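For instance, with three parameters whose ranges have sizes $M = \{2, 3, 4\}$ (the same sizes as in the example section below), the reduction works out as:

$$\prod_{i} n_i = 2 \cdot 3 \cdot 4 = 24, \qquad T = \max(M) \cdot \max(M \setminus \{\max(M)\}) = 4 \cdot 3 = 12.$$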
N-wise testing
N-wise testing can be considered the generalized form of pair-wise testing.
The idea is to apply sorting to the multiset $M = \{n_i\}$ so that the parameter set $P = \{P_i\}$ gets ordered too. Let the sorted set be an $N$-tuple:

$$M_s = \langle x_1, x_2, \ldots, x_N \rangle, \qquad x_i \le x_j \text{ for } i < j.$$

Now we can take the set $M(2) = \{x_{N-1}, x_N\}$ and call it the pairwise testing. Generalizing further, we can take the set $M(3) = \{x_{N-2}, x_{N-1}, x_N\}$ and call it the 3-wise testing, and eventually $M(T) = \{x_{N-T+1}, \ldots, x_N\}$ the T-wise testing.

The N-wise testing then would just be all possible combinations from the above formula.
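Continuing the same illustration, sorting the range sizes gives the tuple below; pairwise testing draws on the two largest entries, while 3-wise testing of three parameters already requires the full product:

$$M_s = \langle 2, 3, 4 \rangle, \qquad M(2) = \{3, 4\} \Rightarrow 4 \cdot 3 = 12 \text{ tests at minimum}, \qquad M(3) = \{2, 3, 4\} \Rightarrow 2 \cdot 3 \cdot 4 = 24 \text{ tests}.$$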
Example
Consider the parameters shown in the table below.
| Parameter name | Value 1 | Value 2 | Value 3 | Value 4 |
|---|---|---|---|---|
| Enabled | True | False | - | - |
| Choice type | 1 | 2 | 3 | - |
| Category | a | b | c | d |
'Enabled', 'Choice type' and 'Category' have a choice range of 2, 3 and 4 values, respectively. An exhaustive test would involve 24 tests (2 × 3 × 4). Multiplying the two largest values (3 and 4) indicates that pair-wise testing would involve 12 tests. The pairwise test cases, generated by Microsoft's PICT tool, are shown below.
| Enabled | Choice type | Category |
|---|---|---|
| True | 3 | a |
| True | 1 | d |
| False | 1 | c |
| False | 2 | d |
| True | 2 | c |
| False | 2 | a |
| False | 1 | a |
| False | 3 | b |
| True | 2 | b |
| True | 3 | d |
| False | 3 | c |
| True | 1 | b |
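As a quick check of the table above, the following Python sketch (the names are illustrative, not part of any tool) re-encodes the twelve test cases and confirms that every pair of values across the three parameters appears in at least one row:

```python
from itertools import combinations, product

# The twelve PICT-generated test cases from the table above.
TESTS = [
    ("True", "3", "a"), ("True", "1", "d"), ("False", "1", "c"),
    ("False", "2", "d"), ("True", "2", "c"), ("False", "2", "a"),
    ("False", "1", "a"), ("False", "3", "b"), ("True", "2", "b"),
    ("True", "3", "d"), ("False", "3", "c"), ("True", "1", "b"),
]
DOMAINS = [("True", "False"), ("1", "2", "3"), ("a", "b", "c", "d")]

# Every pair of columns must exhibit every combination of their values.
missing = [
    (i, j, vi, vj)
    for i, j in combinations(range(len(DOMAINS)), 2)
    for vi, vj in product(DOMAINS[i], DOMAINS[j])
    if not any(t[i] == vi and t[j] == vj for t in TESTS)
]
print("all pairs covered" if not missing else f"missing: {missing}")
# Expected output: all pairs covered
```

The same check generalizes to any number of parameters, since it simply iterates over all pairs of columns.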
Notes
[edit]- ^ Berger, Bernie (2003). "Efficient Testing with All-Pairs" (PDF). TechWell Corporation. Retrieved 21 November 2023.
- ^ Black, Rex (2007). Pragmatic Software Testing: Becoming an Effective and Efficient Test Professional. New York: Wiley. p. 240. ISBN 978-0-470-12790-2.
- ^ Kuhn, D. Richard; Wallace, Dolores R.; Gallo, Albert M. Jr. (June 2004). "Software Fault Interactions and Implications for Software Testing" (PDF). IEEE Transactions on Software Engineering. 30 (6): 418–421. doi:10.1109/TSE.2004.24. S2CID 206778290.
- ^ Kuhn, D. Richard; Kacker, Raghu N.; Yu Lei (October 2010). Practical Combinatorial Testing. SP 800-142 (Report). National Institute of Standards and Technology. doi:10.6028/NIST.SP.800-142.
- ^ "Test Design: Lessons Learned and Practical Implications". Proceedings from the 5th International Conference on Software Testing and Validation (ICST). Software Competence Center Hagenberg. July 18, 2008. pp. 1–150. doi:10.1109/IEEESTD.2008.4578383. ISBN 978-0-7381-5746-7.
All-pairs testing
Overview
Definition and Principles
All-pairs testing, also known as pairwise testing, is a black-box testing technique in software engineering that systematically exercises every possible pair of input parameter values at least once to uncover defects arising from interactions between parameters.[6] This approach targets interaction faults, which occur when specific combinations of parameter values trigger failures that would not manifest in single-parameter tests.[7] The core principle of all-pairs testing is that the majority of software defects stem from interactions involving one or two parameters, with fewer faults requiring three or more, allowing testers to prioritize pairwise coverage for efficient fault detection.[7] It leverages combinatorial designs, such as covering arrays, to generate a minimal set of test cases that ensure comprehensive pair coverage while avoiding the redundancy of testing higher-order interactions unless necessary.[6] This method contrasts with exhaustive testing by focusing on t-way combinations where t=2, providing a balance between thoroughness and practicality in detecting the most common fault types.

Key terminology includes input parameters, also called factors, which represent independent variables like operating systems or browsers; values, or levels, denoting the discrete options for each parameter (e.g., Windows or Linux); and test cases, which are specific tuples combining values across parameters.[6]

Unlike exhaustive testing, which requires evaluating the full Cartesian product of all parameter values (leading to an exponential number of tests), all-pairs testing reduces the suite size dramatically by covering only pairwise interactions.[6] Mathematically, for $n$ parameters where the $i$-th parameter has $m_i$ values, exhaustive testing demands $\prod_{i=1}^{n} m_i$ test cases, which grows rapidly (e.g., $2^{10} = 1{,}024$ for ten binary parameters).[6] In contrast, all-pairs testing typically requires on the order of $v^2 \log n$ tests, where $v$ is the average number of values per parameter, enabling coverage of all pairs with far fewer cases (e.g., around 10-20 tests for systems with 5-10 parameters and 2-4 values each).[6]

All-pairs testing forms the foundation of broader n-wise testing, where higher n extends coverage to t-way interactions for t > 2.[6]
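A small Python sketch can make these counts concrete; the helper names below are illustrative rather than any standard API:

```python
from itertools import combinations
from math import prod

def exhaustive_size(domain_sizes):
    # Full Cartesian product: the product of the m_i values per parameter.
    return prod(domain_sizes)

def pairs_to_cover(domain_sizes):
    # Distinct value pairs a strength-2 (pairwise) suite must exhibit.
    return sum(a * b for a, b in combinations(domain_sizes, 2))

# Ten binary parameters, as mentioned above: 2**10 = 1,024 exhaustive tests,
# yet only C(10, 2) * 2 * 2 = 180 value pairs have to appear somewhere.
print(exhaustive_size([2] * 10), pairs_to_cover([2] * 10))    # 1024 180

# A small mixed model (2, 3 and 4 values): 24 exhaustive tests, 26 pairs.
print(exhaustive_size([2, 3, 4]), pairs_to_cover([2, 3, 4]))  # 24 26
```

The second model mirrors the mixed-level example used later in this article, where the 24 exhaustive combinations reduce to 26 value pairs that a 12-test suite can cover.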
Historical Development
The roots of all-pairs testing trace back to the application of orthogonal arrays in statistical design of experiments, particularly through Genichi Taguchi's methods developed in the 1980s for manufacturing quality control, which emphasized efficient experimentation to identify parameter interactions with minimal trials.[8] These concepts were adapted to software testing in the late 1980s and early 1990s, when engineers like Madhav S. Phadke introduced tools such as the Orthogonal Array Testing System (OATS) in 1989 to generate test suites that covered pairwise interactions in software parameters, drawing directly from Taguchi's orthogonal arrays to address the combinatorial explosion of test cases.[9]

A pivotal milestone occurred around 2000 with research from the National Institute of Standards and Technology (NIST), where studies from 1999 to 2004 demonstrated that the majority of software faults arise from interactions involving at most two parameters, prompting the formal introduction of combinatorial testing, including all-pairs as a core 2-way approach, in software engineering contexts.[10] This was solidified in the 2004 publication by D. Richard Kuhn, Dolores R. Wallace, and Albert M. Gallo Jr., which analyzed fault interactions and advocated pairwise coverage as an effective strategy for detecting defects triggered by parameter combinations.[11] The approach gained practical traction in the mid-2000s through tools like Microsoft's Pairwise Independent Combinatorial Tool (PICT), released in the early 2000s to automate test case generation for pairwise coverage in industrial settings.[12]

The evolution accelerated in the early 2000s with a shift from manual orthogonal array construction to automated algorithms, exemplified by James Bach's Allpairs tool in 2006, a Perl script that efficiently produced compact test sets ensuring all parameter pairs were covered, making the technique accessible to practitioners.[13] By the 2010s, NIST advanced this further with the Automated Combinatorial Testing for Software (ACTS) framework, initially developed as FireEye and formally introduced in 2013, which supported higher-strength combinatorial testing beyond pairs while building on pairwise foundations.[14] Integration into international standards, such as ISO/IEC/IEEE 29119 published in 2013, formalized pairwise testing as a recommended combinatorial technique within software testing processes, emphasizing its role in systematic test design. Subsequent revisions to ISO/IEC/IEEE 29119 as of 2023 have further incorporated guidance on combinatorial methods, reflecting ongoing refinements as of November 2025.[15]

Key contributions came from figures like D. Richard Kuhn, whose NIST-led studies established the empirical basis for pairwise efficacy, and James Bach, who critiqued and refined the method in works such as "Pairwise Testing: A Best Practice That Isn't" (2004), highlighting practical limitations while promoting its adoption.[16] These efforts, grounded in seminal publications like Cohen et al.'s 1996 work on combinatorial test design, underscore the transition from statistical borrowing to a mature software engineering practice.[1]

Rationale and Benefits
Justification for Pairwise Coverage
Pairwise coverage in all-pairs testing is justified by empirical evidence demonstrating that the majority of software defects arise from interactions involving at most two parameters. Studies analyzing real-world systems, including medical devices, aerospace software, and web browsers, indicate that 70-97% of faults can be triggered by single-parameter issues or pairwise interactions.[17] For instance, an examination of FDA recall data for medical device software revealed that 97% of 109 faults were detectable through pairwise testing, while a NASA Goddard Space Flight Center analysis of 329 error reports from a distributed system found 93.3% fault coverage with pairs.[18] These findings underscore that higher-order interactions (three or more parameters) are progressively rarer, accounting for only a small fraction of defects in diverse domains such as browsers (70.3% pairwise coverage) and compilers.[17]

This empirical basis highlights pairwise testing's superiority in detecting interaction faults overlooked by traditional methods like random testing or equivalence partitioning. In case studies of configuration-heavy systems, such as web browsers and servers, pairwise suites identified 70-76% of faults that single-parameter approaches missed, particularly those stemming from mismatched combinations like operating system and browser versions.[18] For example, testing OS-browser pairs exposed defects in rendering and compatibility that isolated parameter checks failed to reveal, demonstrating pairwise coverage's role in capturing synergistic errors without exhaustive enumeration.[17] Overall, these results position pairwise testing as an effective strategy for achieving substantial fault detection (often 50-90% of total flaws) while focusing resources on prevalent defect patterns.[10]

Theoretically, the rationale for prioritizing pairwise interactions rests on the rarity of complex fault triggers in modern software architectures. Modular design principles and fault isolation mechanisms in contemporary systems limit the propagation of errors to higher-order combinations, making interactions beyond pairs uncommon.[17] As a result, pairwise testing functions as a practical heuristic for "pseudo-exhaustive" coverage, simulating comprehensive interaction scrutiny at a fraction of the cost of full combinatorial explosion. This approach aligns with observed fault patterns across industries, where no reviewed failures required more than six-way interactions, and most were confined to one or two parameters.[18] By targeting these dominant modes, pairwise methods provide robust defect detection grounded in both data and architectural realities.[6]

Efficiency Gains
All-pairs testing, also known as pairwise combinatorial testing, significantly reduces the number of test cases required compared to exhaustive testing. In exhaustive testing, for a system with $k$ parameters each having $v$ possible values, the total number of test cases is $v^k$, which grows exponentially with $k$. In contrast, all-pairs testing generates a covering array that ensures every pair of parameter values is tested at least once, resulting in a test suite size that is roughly proportional to $v^2 \log k$ rather than exponential in $k$. A practical approximation for the number of tests is the product of the two largest of the $v_i$, where $v_i$ is the number of values for parameter $i$; more precisely, for uniform $v$ the size of the minimal covering array grows only logarithmically in $k$ as $k$ grows large. This typically yields test suites that are 10-50% the size of exhaustive ones for moderate $k$, with even greater reductions (up to 99%) in larger configurations.[19][20]

These reductions translate to substantial resource and time savings in practice. For instance, studies have shown pairwise testing can cut testing time by 80-90%, transforming scenarios from thousands of exhaustive cases to hundreds, thereby lowering costs in regression and configuration testing. In industrial applications, such as at Lockheed Martin, pairwise suites comprised less than 13% of exhaustive sets while achieving comparable fault detection, saving up to 20% in test planning and design efforts. Similarly, NIST evaluations across domains like cryptography and web browsers report reductions by factors of 20 to 700, enabling 12-fold efficiency gains, detecting three times more defects in one-quarter the time.[19]

Regarding scalability, all-pairs testing is particularly effective for systems with 5 to 20 parameters, where exhaustive testing becomes computationally infeasible due to exponential growth. For larger numbers of parameters, the method remains a valuable baseline, often requiring only modest increases in test suite size relative to exhaustive alternatives, though higher-strength (t-way) extensions may supplement it for deeper interactions. This makes it suitable for configuration testing in software, networks, and embedded systems, where parameter counts typically fall in this range.[19]

Methods and Techniques
Covering Array Construction
A covering array serves as the foundational mathematical structure for all-pairs testing, enabling the construction of compact test sets that guarantee pairwise interaction coverage. Formally, a covering array $\mathrm{CA}(N; 2, k, v)$ is defined as an $N \times k$ array whose entries are from a set of $v$ symbols, such that for every choice of 2 columns, each possible ordered pair (2-tuple) of symbols appears in at least one row.[21] This ensures that all pairwise combinations of parameter values are exercised in the test suite, minimizing redundancy while achieving complete t-way coverage for t=2.[22]

In practice, parameters often have varying numbers of possible values, leading to mixed-level covering arrays, denoted $\mathrm{MCA}(N; 2, k, (v_1, v_2, \ldots, v_k))$, where the $i$-th column uses exactly $v_i$ distinct symbols. Unlike uniform-level arrays that assume a fixed $v$ across all $k$ parameters, mixed-level variants accommodate real-world scenarios in software testing where factors exhibit heterogeneous domains, such as binary options alongside multi-valued configurations.

Covering arrays differ from orthogonal arrays, which are a stricter subclass providing balanced coverage. An orthogonal array $\mathrm{OA}(N; 2, k, v)$ is a covering array where every 2-tuple appears exactly $\lambda$ times (typically $\lambda = 1$ for index unity), ensuring uniform distribution but often resulting in larger $N$ and requiring uniform $v$ levels.[23] In contrast, covering arrays prioritize minimality and flexibility, allowing uneven tuple frequencies and mixed levels without the balance constraint, which makes them more suitable for efficient test generation in combinatorial testing.[24]

The core property of a strength-2 covering array is its guarantee of t-way coverage for t=2, meaning every possible pair across any two parameters is represented at least once, though higher-order interactions (e.g., some triples) may incidentally be covered depending on the construction.[24] This structure supports extensions to mixed-strength coverage, where specific parameter subsets require higher t (e.g., triples for critical pairs), but the primary focus remains on pairwise exhaustiveness to detect interaction faults economically.[25]

For illustration, consider three parameters A (2 values), B (3 values), and C (4 values). The exhaustive test set would require 2 × 3 × 4 = 24 tests to cover all combinations, but a mixed covering array $\mathrm{MCA}(12; 2, 3, (2, 3, 4))$ achieves full pairwise coverage, encompassing all 2×3 + 2×4 + 3×4 = 26 unique pairs, with only 12 rows, demonstrating substantial reduction in test suite size.[26] Such arrays form the basis for scalable test designs in systems with dozens of parameters.
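The distinction between "at least once" (covering array) and "exactly λ times" (orthogonal array) can be expressed directly in code. The sketch below is a minimal illustration with invented function names, checked here against the classic 4-run orthogonal array on three binary factors:

```python
from itertools import combinations, product
from collections import Counter

def pair_frequencies(rows, domains):
    # For every pair of columns, count how often each value pair occurs.
    freq = {}
    for i, j in combinations(range(len(domains)), 2):
        seen = Counter((row[i], row[j]) for row in rows)
        for vi, vj in product(domains[i], domains[j]):
            freq[(i, j, vi, vj)] = seen[(vi, vj)]
    return freq

def classify(rows, domains):
    freq = pair_frequencies(rows, domains)
    if min(freq.values()) == 0:
        return "not a covering array"            # some value pair never occurs
    if len(set(freq.values())) == 1:
        return "orthogonal array (every pair exactly λ times)"
    return "covering array (every pair at least once)"

# The classic 4-run orthogonal array on three binary factors, OA(4; 2, 3, 2).
oa_rows = [(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 0)]
print(classify(oa_rows, [(0, 1), (0, 1), (0, 1)]))
```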
Algorithms for Test Case Generation
All-pairs testing relies on algorithms that generate test cases to ensure every possible pair of parameter values is covered at least once, typically producing a covering array of strength 2, denoted as $\mathrm{CA}(N; 2, k, v)$, where $N$ is the number of tests, $k$ the number of parameters, and $v$ the number of values per parameter.[27] These algorithms balance computational efficiency with complete pairwise coverage, often starting from a model of parameters and their values. Common approaches include greedy methods, algebraic constructions, and heuristic searches, each suited to different model complexities.

Greedy algorithms iteratively build the test set by selecting test cases that maximize the coverage of uncovered pairs at each step. In a typical greedy procedure, the algorithm begins with an initial test case, such as a random selection or one that covers a high number of pairs, and then generates candidates that prioritize newly covered interactions. For instance, it may enumerate possible extensions of existing tests and choose the one covering the most remaining pairs, repeating until all pairs are satisfied. This approach, as implemented in early tools like AETG, ensures rapid generation but may not always yield the minimal test set size due to local optimization.[28][29]

Algebraic methods leverage mathematical structures like orthogonal arrays (OAs) to construct test sets deterministically, ensuring balanced coverage without exhaustive search. An orthogonal array of strength 2 provides pairwise coverage where every pair appears exactly $\lambda$ times across the tests, often derived from finite fields or recursive constructions for parameters with prime power levels. For example, for parameters with equal numbers of values that are powers of primes, OAs can be built using Galois fields, guaranteeing optimal or near-optimal sizes. These methods, adapted from design of experiments, are particularly efficient for uniform models but require adjustments for mixed levels via techniques like orthogonal mates.[27][11][28]

Heuristic search algorithms, such as genetic algorithms and simulated annealing, address the NP-complete nature of minimizing test set size by exploring solution spaces probabilistically. In genetic algorithms for pairwise generation, test sets are encoded as chromosomes (arrays of value assignments), with fitness based on uncovered pairs; operations like crossover and mutation evolve populations toward full coverage. Simulated annealing, conversely, starts with a random test set and iteratively perturbs it, accepting worse solutions probabilistically to escape local minima, guided by a cooling schedule to converge on minimal covering arrays. These methods excel in handling large or irregular models where exact solutions are infeasible.[30][31]

For models with mixed levels (varying numbers of values per parameter), the In-Parameter-Order (IPO) algorithm generalizes greedy construction by incrementally adding parameters. It begins with a pairwise test set for the first m parameters, then extends it by generating additional tests for the (m+1)th parameter using a greedy selection to cover all new pairs involving prior parameters. This horizontal growth repeats, with vertical extensions refining coverage if needed, ensuring scalability for non-uniform cases. IPO's deterministic nature allows reuse of prior tests when models evolve.[32]
Post-generation verification confirms pairwise coverage by constructing a pairwise interaction table or matrix, where rows and columns represent parameter values, and entries track test coverage for each pair. Algorithms scan the test set against all possible pairs, computing the proportion covered (the simple 2-way coverage metric) to validate completeness, often using bit matrices for efficiency in large models. Any gaps prompt regeneration or refinement.[33]
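To make the greedy idea concrete, here is a minimal, self-contained Python sketch in the spirit of the AETG-style procedure described above. It is not the AETG, PICT, or IPO algorithm itself, and the function name is invented; it also enumerates candidates from the full Cartesian product, which is only practical for small models:

```python
from itertools import combinations, product

def build_pairwise_suite(parameters):
    """Greedy pairwise generator: repeatedly pick the candidate test case
    that covers the most not-yet-covered value pairs.

    `parameters` maps a parameter name to its list of values.
    Returns a list of dicts, each dict being one test case.
    """
    names = list(parameters)
    # All ((param_a, value_a), (param_b, value_b)) pairs that must be covered.
    uncovered = {
        ((a, va), (b, vb))
        for a, b in combinations(names, 2)
        for va in parameters[a]
        for vb in parameters[b]
    }
    suite = []
    while uncovered:
        best_test, best_gain = None, -1
        # Candidate pool: the full Cartesian product (fine for small models;
        # real tools sample candidates or grow tests parameter by parameter).
        for values in product(*(parameters[n] for n in names)):
            test = dict(zip(names, values))
            gain = sum(
                1
                for a, b in combinations(names, 2)
                if ((a, test[a]), (b, test[b])) in uncovered
            )
            if gain > best_gain:
                best_test, best_gain = test, gain
        suite.append(best_test)
        uncovered -= {
            ((a, best_test[a]), (b, best_test[b]))
            for a, b in combinations(names, 2)
        }
    return suite

if __name__ == "__main__":
    model = {  # hypothetical 2 x 3 x 4 model
        "Enabled": ["True", "False"],
        "Choice type": ["1", "2", "3"],
        "Category": ["a", "b", "c", "d"],
    }
    tests = build_pairwise_suite(model)
    print(len(tests), "tests")  # typically close to the 12-test optimum
    for t in tests:
        print(t)
```

For this 2 × 3 × 4 model the greedy loop yields a small suite near the 12-test optimum; production tools add candidate sampling, parameter-by-parameter construction as in IPO, and constraint handling.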
Examples and Applications
Illustrative Example
To illustrate all-pairs testing, consider a hypothetical scenario involving the compatibility testing of a web form application. The parameters under test are Browser (2 values: Chrome, Firefox), Operating System (OS; 3 values: Windows, Linux, macOS), and Screen Resolution (4 values: 800x600, 1024x768, 1280x1024, 1920x1080). An exhaustive testing approach would require evaluating all possible combinations, resulting in 2 × 3 × 4 = 24 test cases, each representing a unique triplet of parameter values.

In contrast, all-pairs testing employs a covering array of strength 2 to generate a reduced set of test cases that ensures every possible pair of parameter values (across any two parameters) appears at least once, without needing the full exhaustive suite. For this scenario, a covering array can be constructed with just 12 test cases, as the minimum size required is bounded by the largest pairwise set (the OS-Resolution pairs, totaling 12 unique combinations); a short computation of this bound follows the pair listing below. The following table presents one such covering array, where each row denotes a test case:

| Test Case | Browser | OS | Resolution |
|---|---|---|---|
| 1 | Chrome | Windows | 800x600 |
| 2 | Firefox | Windows | 1024x768 |
| 3 | Chrome | Windows | 1280x1024 |
| 4 | Firefox | Windows | 1920x1080 |
| 5 | Chrome | Linux | 800x600 |
| 6 | Chrome | Linux | 1024x768 |
| 7 | Firefox | Linux | 1280x1024 |
| 8 | Firefox | Linux | 1920x1080 |
| 9 | Firefox | macOS | 800x600 |
| 10 | Firefox | macOS | 1024x768 |
| 11 | Chrome | macOS | 1280x1024 |
| 12 | Chrome | macOS | 1920x1080 |
Every pair of parameter values appears at least once, as the following breakdown shows:
- Browser-OS pairs: Chrome-Windows (Tests 1, 3), Firefox-Windows (Tests 2, 4), Chrome-Linux (Tests 5, 6), Firefox-Linux (Tests 7, 8), Chrome-macOS (Tests 11, 12), Firefox-macOS (Tests 9, 10).
- Browser-Resolution pairs: Chrome-800x600 (Tests 1, 5), Firefox-800x600 (Test 9), Chrome-1024x768 (Test 6), Firefox-1024x768 (Tests 2, 10), Chrome-1280x1024 (Tests 3, 11), Firefox-1280x1024 (Test 7), Chrome-1920x1080 (Test 12), Firefox-1920x1080 (Tests 4, 8).
- OS-Resolution pairs: Each of the 12 combinations appears exactly once (e.g., Windows-800x600 in Test 1, Linux-1920x1080 in Test 8, macOS-1024x768 in Test 10).
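The lower-bound argument used above (the suite can be no smaller than the largest set of value pairs contributed by any two parameters) can be checked in a few lines of Python; the dictionary simply restates the parameter model:

```python
from itertools import combinations

# Parameter model from the example above: number of values per parameter.
domains = {"Browser": 2, "OS": 3, "Resolution": 4}

# Each unordered pair of parameters contributes v_i * v_j value pairs.
pair_totals = {(a, b): domains[a] * domains[b]
               for a, b in combinations(domains, 2)}

print(pair_totals)                # OS x Resolution contributes the most: 12
print(sum(pair_totals.values()))  # 26 value pairs in total
# A single test covers exactly one value pair per parameter pair, so no suite
# can be smaller than the largest entry; the 12-row array above attains it.
print(max(pair_totals.values()))  # 12
```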
