P versus NP problem
from Wikipedia
Unsolved problem in computer science
If the solution to a problem can be checked in polynomial time, must the problem be solvable in polynomial time?

The P versus NP problem is a major unsolved problem in theoretical computer science. Informally, it asks whether every problem whose solution can be quickly verified can also be quickly solved.

Here, "quickly" means an algorithm exists that solves the task and runs in polynomial time (as opposed to, say, exponential time), meaning the task completion time is bounded above by a polynomial function on the size of the input to the algorithm. The general class of questions that some algorithm can answer in polynomial time is "P" or "class P". For some questions, there is no known way to find an answer quickly, but if provided with an answer, it can be verified quickly. The class of questions where an answer can be verified in polynomial time is "NP", standing for "nondeterministic polynomial time".[Note 1]

An answer to the P versus NP question would determine whether problems that can be verified in polynomial time can also be solved in polynomial time. If P ≠ NP, which is widely believed, it would mean that there are problems in NP that are harder to compute than to verify: they could not be solved in polynomial time, but the answer could be verified in polynomial time.

The problem has been called the most important open problem in computer science.[1] Aside from being an important problem in computational theory, a proof either way would have profound implications for mathematics, cryptography, algorithm research, artificial intelligence, game theory, multimedia processing, philosophy, economics and many other fields.[2]

It is one of the seven Millennium Prize Problems selected by the Clay Mathematics Institute, each of which carries a US$1,000,000 prize for the first correct solution.

Example


Consider the following yes/no problem: given an incomplete Sudoku grid of size n²×n², is there at least one legal solution where every row, column, and n×n square contains the integers 1 through n²? It is straightforward to verify "yes" instances of this generalized Sudoku problem given a candidate solution. However, it is not known whether there is a polynomial-time algorithm that can correctly answer "yes" or "no" to all instances of this problem. Therefore, generalized Sudoku is in NP (quickly verifiable), but may or may not be in P (quickly solvable). (It is necessary to consider a generalized version of Sudoku, as any fixed size Sudoku has only a finite number of possible grids. In this case the problem is in P, as the answer can be found by table lookup.)
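A minimal Python sketch of the verification half of this claim is shown below: it checks a completed n²×n² grid in time polynomial in the grid size (a full verifier for the decision problem would also check that the candidate agrees with the given clues). The grid and helper names are illustrative, not taken from any particular source.

from typing import List

def verify_sudoku(grid: List[List[int]], n: int) -> bool:
    # Accept a completed generalized Sudoku solution of size n^2 x n^2:
    # every row, column, and n x n box must contain 1..n^2 exactly once.
    # Running time is polynomial in the number of cells.
    size = n * n
    want = set(range(1, size + 1))
    rows_ok = all(set(row) == want for row in grid)
    cols_ok = all({grid[r][c] for r in range(size)} == want for c in range(size))
    boxes_ok = all(
        {grid[br + r][bc + c] for r in range(n) for c in range(n)} == want
        for br in range(0, size, n)
        for bc in range(0, size, n)
    )
    return rows_ok and cols_ok and boxes_ok

# A valid 4x4 instance (n = 2):
grid4 = [[1, 2, 3, 4],
         [3, 4, 1, 2],
         [2, 1, 4, 3],
         [4, 3, 2, 1]]
print(verify_sudoku(grid4, 2))  # True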

History


The precise statement of the P versus NP problem was introduced in 1971 by Stephen Cook in his seminal paper "The complexity of theorem proving procedures"[3] (and independently by Leonid Levin in 1973[4]).

Although the P versus NP problem was formally defined in 1971, there were previous inklings of the problems involved, the difficulty of proof, and the potential consequences. In 1955, mathematician John Nash wrote a letter to the National Security Agency, speculating that the time required to crack a sufficiently complex code would increase exponentially with the length of the key.[5] If proved (and Nash was suitably skeptical), this would imply what is now called P ≠ NP, since a proposed key can be verified in polynomial time. Another mention of the underlying problem occurred in a 1956 letter written by Kurt Gödel to John von Neumann. Gödel asked whether theorem-proving (now known to be co-NP-complete) could be solved in quadratic or linear time,[6] and pointed out one of the most important consequences—that if so, then the discovery of mathematical proofs could be automated.

Context


The relation between the complexity classes P and NP is studied in computational complexity theory, the part of the theory of computation dealing with the resources required during computation to solve a given problem. The most common resources are time (how many steps it takes to solve a problem) and space (how much memory it takes to solve a problem).

In such analysis, a model of the computer for which time must be analyzed is required. Typically such models assume that the computer is deterministic (given the computer's present state and any inputs, there is only one possible action that the computer might take) and sequential (it performs actions one after the other).

In this theory, the class P consists of all decision problems (defined below) solvable on a deterministic sequential machine in a duration polynomial in the size of the input; the class NP consists of all decision problems whose positive solutions are verifiable in polynomial time given the right information, or equivalently, whose solution can be found in polynomial time on a non-deterministic machine.[7] Clearly, P ⊆ NP. Arguably, the biggest open question in theoretical computer science concerns the relationship between those two classes:

Is P equal to NP?

Since 2002, William Gasarch has conducted three polls of researchers concerning this and related questions.[8][9][10] Confidence that P ≠ NP has been increasing – in 2019, 88% believed P ≠ NP, as opposed to 83% in 2012 and 61% in 2002. When restricted to experts, 99% of the 2019 respondents believed P ≠ NP.[10] These polls do not imply anything about whether P = NP is true; Gasarch himself stated: "This does not bring us any closer to solving P=?NP or to knowing when it will be solved, but it attempts to be an objective report on the subjective opinion of this era."

NP-completeness

Euler diagram for P, NP, NP-complete, and NP-hard set of problems (excluding the empty language and its complement, which belong to P but are not NP-complete)

To attack the P = NP question, the concept of NP-completeness is very useful. NP-complete problems are problems that any other NP problem is reducible to in polynomial time and whose solution is still verifiable in polynomial time. That is, any NP problem can be transformed into any NP-complete problem. Informally, an NP-complete problem is an NP problem that is at least as "tough" as any other problem in NP.

NP-hard problems are those at least as hard as NP problems; i.e., all NP problems can be reduced (in polynomial time) to them. NP-hard problems need not be in NP; i.e., they need not have solutions verifiable in polynomial time.

For instance, the Boolean satisfiability problem is NP-complete by the Cook–Levin theorem, so any instance of any problem in NP can be transformed mechanically into a Boolean satisfiability problem in polynomial time. The Boolean satisfiability problem is one of many NP-complete problems. If any NP-complete problem is in P, then it would follow that P = NP. However, many important problems are NP-complete, and no fast algorithm for any of them is known.

From the definition alone it is unintuitive that NP-complete problems exist; however, a trivial NP-complete problem can be formulated as follows: given a Turing machine M guaranteed to halt in polynomial time, does a polynomial-size input that M will accept exist?[11] It is in NP because (given an input) it is simple to check whether M accepts the input by simulating M; it is NP-complete because the verifier for any particular instance of a problem in NP can be encoded as a polynomial-time machine M that takes the solution to be verified as input. Then the question of whether the instance is a yes or no instance is determined by whether a valid input exists.

The first natural problem proven to be NP-complete was the Boolean satisfiability problem, also known as SAT. As noted above, this is the Cook–Levin theorem; its proof that satisfiability is NP-complete contains technical details about Turing machines as they relate to the definition of NP. However, after this problem was proved to be NP-complete, proof by reduction provided a simpler way to show that many other problems are also NP-complete, including the game Sudoku discussed earlier. In this case, the proof shows that a solution of Sudoku in polynomial time could also be used to complete Latin squares in polynomial time.[12] This in turn gives a solution to the problem of partitioning tri-partite graphs into triangles,[13] which could then be used to find solutions for the special case of SAT known as 3-SAT,[14] which then provides a solution for general Boolean satisfiability. So a polynomial-time solution to Sudoku leads, by a series of mechanical transformations, to a polynomial time solution of satisfiability, which in turn can be used to solve any other NP-problem in polynomial time. Using transformations like this, a vast class of seemingly unrelated problems are all reducible to one another, and are in a sense "the same problem".

Harder problems


Although it is unknown whether P = NP, problems outside of P are known. Just as the class P is defined in terms of polynomial running time, the class EXPTIME is the set of all decision problems that have exponential running time. In other words, any problem in EXPTIME is solvable by a deterministic Turing machine in O(2^p(n)) time, where p(n) is a polynomial function of n. A decision problem is EXPTIME-complete if it is in EXPTIME, and every problem in EXPTIME has a polynomial-time many-one reduction to it. A number of problems are known to be EXPTIME-complete. Because it can be shown that P ≠ EXPTIME, these problems are outside P, and so require more than polynomial time. In fact, by the time hierarchy theorem, they cannot be solved in significantly less than exponential time. Examples include finding a perfect strategy for chess positions on an N × N board[15] and similar problems for other board games.[16]

The problem of deciding the truth of a statement in Presburger arithmetic requires even more time. Fischer and Rabin proved in 1974[17] that every algorithm that decides the truth of Presburger statements of length n has a runtime of at least 2^(2^(cn)) for some constant c. Hence, the problem is known to need more than exponential run time. Even more difficult are the undecidable problems, such as the halting problem. They cannot be completely solved by any algorithm, in the sense that for any particular algorithm there is at least one input for which that algorithm will not produce the right answer; it will either produce the wrong answer, finish without giving a conclusive answer, or otherwise run forever without producing any answer at all.

It is also possible to consider questions other than decision problems. One such class, consisting of counting problems, is called #P: whereas an NP problem asks "Are there any solutions?", the corresponding #P problem asks "How many solutions are there?". Clearly, a #P problem must be at least as hard as the corresponding NP problem, since the count of solutions immediately tells whether at least one solution exists (namely, exactly when the count is greater than zero). Surprisingly, some #P problems that are believed to be difficult correspond to easy (for example linear-time) P problems.[18] For these problems, it is very easy to tell whether solutions exist, but thought to be very hard to tell how many. Many of these problems are #P-complete, and hence among the hardest problems in #P, since a polynomial time solution to any of them would allow a polynomial time solution to all other #P problems.
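A standard illustration of this gap (not drawn from the cited source) is satisfiability versus counting for formulas in disjunctive normal form: deciding whether a DNF formula has a satisfying assignment takes linear time, while counting its satisfying assignments is #P-complete. The sketch below contrasts the easy existence check with a brute-force count on a tiny, made-up formula:

from itertools import product

# A DNF formula as a list of terms; literal +i means "variable i is true",
# literal -i means "variable i is false".
dnf = [[1, -2], [2, 3], [-1, -3]]
num_vars = 3

def dnf_satisfiable(terms):
    # Existence is easy: a DNF formula is satisfiable exactly when some
    # term does not contain both a variable and its negation.
    return any(all(-lit not in term for lit in term) for term in terms)

def dnf_count(terms, n):
    # Counting is #P-complete in general; here we simply brute-force
    # all 2^n assignments of a tiny instance.
    total = 0
    for bits in product([False, True], repeat=n):
        assign = dict(enumerate(bits, start=1))
        if any(all(assign[abs(l)] == (l > 0) for l in term) for term in terms):
            total += 1
    return total

print(dnf_satisfiable(dnf))      # True
print(dnf_count(dnf, num_vars))  # 6 satisfying assignments out of 8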

Problems in NP not known to be in P or NP-complete


In 1975, Richard E. Ladner showed that if P ≠ NP, then there exist problems in NP that are neither in P nor NP-complete.[19] Such problems are called NP-intermediate problems. The graph isomorphism problem, the discrete logarithm problem, and the integer factorization problem are examples of problems believed to be NP-intermediate. They are some of the very few NP problems not known to be in P or to be NP-complete.

The graph isomorphism problem is the computational problem of determining whether two finite graphs are isomorphic. An important unsolved problem in complexity theory is whether the graph isomorphism problem is in P, NP-complete, or NP-intermediate. The answer is not known, but it is believed that the problem is at least not NP-complete.[20] If graph isomorphism is NP-complete, the polynomial time hierarchy collapses to its second level.[21] Since it is widely believed that the polynomial hierarchy does not collapse to any finite level, it is believed that graph isomorphism is not NP-complete. The best algorithm for this problem, due to László Babai, runs in quasi-polynomial time.[22]

The integer factorization problem is the computational problem of determining the prime factorization of a given integer. Phrased as a decision problem, it is the problem of deciding whether the input has a nontrivial factor less than k. No efficient integer factorization algorithm is known, and this fact forms the basis of several modern cryptographic systems, such as the RSA algorithm. The integer factorization problem is in NP and in co-NP (and even in UP and co-UP[23]). If the problem is NP-complete, the polynomial time hierarchy will collapse to its first level (i.e., NP = co-NP). The most efficient known algorithm for integer factorization is the general number field sieve, which takes expected time of roughly

exp( ((64/9) · n · ln 2)^(1/3) · (ln(n · ln 2))^(2/3) )

to factor an n-bit integer. The best known quantum algorithm for this problem, Shor's algorithm, runs in polynomial time, although this does not indicate where the problem lies with respect to non-quantum complexity classes.
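To give a sense of this growth, the short sketch below evaluates the heuristic expression above for a few common key sizes. It ignores constant factors and the o(1) term, so treat the output as a rough order-of-magnitude work estimate rather than a running-time prediction; the function name is ours.

import math

def gnfs_work_estimate(bits):
    # Heuristic GNFS cost for an n-bit integer N:
    # exp(((64/9) * ln N)^(1/3) * (ln ln N)^(2/3)), with ln N ~ n * ln 2.
    ln_n = bits * math.log(2)
    return math.exp((64 / 9 * ln_n) ** (1 / 3) * math.log(ln_n) ** (2 / 3))

for bits in (256, 512, 1024, 2048):
    print(f"{bits:5d}-bit modulus: ~2^{math.log2(gnfs_work_estimate(bits)):.0f} operations")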

Does P mean "easy"?

The graph shows the running time vs. problem size for a knapsack problem of a state-of-the-art, specialized algorithm. The quadratic fit suggests that the algorithmic complexity of the problem is O((log n)²).[24]

All of the above discussion has assumed that P means "easy" and "not in P" means "difficult", an assumption known as Cobham's thesis. It is a common assumption in complexity theory; but there are caveats.

First, it can be false in practice. A theoretical polynomial algorithm may have extremely large constant factors or exponents, rendering it impractical. For example, the problem of deciding whether a graph G contains H as a minor, where H is fixed, can be solved in a running time of O(n2),[25] where n is the number of vertices in G. However, the big O notation hides a constant that depends superexponentially on H: it exceeds a tower of exponentials in h, expressible only with Knuth's up-arrow notation, where h is the number of vertices in H.[26]

On the other hand, even if a problem is shown to be NP-complete, and even if P ≠ NP, there may still be effective approaches to the problem in practice. There are algorithms for many NP-complete problems, such as the knapsack problem, the traveling salesman problem, and the Boolean satisfiability problem, that can solve to optimality many real-world instances in reasonable time. The empirical average-case complexity (time vs. problem size) of such algorithms can be surprisingly low. An example is the simplex algorithm in linear programming, which works surprisingly well in practice; despite having exponential worst-case time complexity, it runs on par with the best known polynomial-time algorithms.[27]
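As a concrete illustration of the knapsack case (our own example, not from the cited sources), the classic dynamic-programming solver below finds optimal solutions in O(n·W) time for capacity W. This is pseudo-polynomial, i.e. polynomial in the numeric capacity rather than in its bit length, so it does not contradict NP-completeness, yet it solves many practically sized instances instantly:

def knapsack(values, weights, capacity):
    # Exact 0/1 knapsack via dynamic programming over capacities.
    # best[c] = maximum value achievable with total weight at most c.
    best = [0] * (capacity + 1)
    for v, w in zip(values, weights):
        for c in range(capacity, w - 1, -1):  # downward: each item used at most once
            best[c] = max(best[c], best[c - w] + v)
    return best[capacity]

print(knapsack([60, 100, 120], [10, 20, 30], 50))  # 220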

Finally, there are types of computations which do not conform to the Turing machine model on which P and NP are defined, such as quantum computation and randomized algorithms.

Reasons to believe P ≠ NP or P = NP


Cook provides a restatement of the problem in The P Versus NP Problem as "Does P = NP?"[28] According to polls,[8][29] most computer scientists believe that P ≠ NP. A key reason for this belief is that after decades of studying these problems no one has been able to find a polynomial-time algorithm for any of more than 3,000 important known NP-complete problems (see List of NP-complete problems). These algorithms were sought long before the concept of NP-completeness was even defined (Karp's 21 NP-complete problems, among the first found, were all well-known existing problems at the time they were shown to be NP-complete). Furthermore, the result P = NP would imply many other startling results that are currently believed to be false, such as NP = co-NP and P = PH.

It is also intuitively argued that the existence of problems that are hard to solve but whose solutions are easy to verify matches real-world experience.[30]

If P = NP, then the world would be a profoundly different place than we usually assume it to be. There would be no special value in "creative leaps", no fundamental gap between solving a problem and recognizing the solution once it's found.

On the other hand, some researchers believe that it is overconfident to believe P ≠ NP and that researchers should also explore proofs of P = NP. For example, in 2002 these statements were made:[8]

The main argument in favor of P ≠ NP is the total lack of fundamental progress in the area of exhaustive search. This is, in my opinion, a very weak argument. The space of algorithms is very large and we are only at the beginning of its exploration. [...] The resolution of Fermat's Last Theorem also shows that very simple questions may be settled only by very deep theories.

Being attached to a speculation is not a good guide to research planning. One should always try both directions of every problem. Prejudice has caused famous mathematicians to fail to solve famous problems whose solution was opposite to their expectations, even though they had developed all the methods required.

DLIN vs NLIN


When one substitutes "linear time on a multitape Turing machine" for "polynomial time" in the definitions of P and NP, one obtains the classes DLIN and NLIN. It is known[31] that DLIN ≠ NLIN.

Consequences of solution


One of the reasons the problem attracts so much attention is the consequences of the possible answers. Either direction of resolution would advance theory enormously, and perhaps have huge practical consequences as well.

P = NP


A proof that P = NP could have stunning practical consequences if the proof leads to efficient methods for solving some of the important problems in NP. The potential consequences, both positive and negative, arise since various NP-complete problems are fundamental in many fields.

It is also very possible that a proof would not lead to practical algorithms for NP-complete problems. The formulation of the problem does not require that the bounding polynomial be small or even specifically known. A non-constructive proof might show a solution exists without specifying either an algorithm to obtain it or a specific bound. Even if the proof is constructive, showing an explicit bounding polynomial and algorithmic details, if the polynomial is not very low-order the algorithm might not be sufficiently efficient in practice. In this case the initial proof would be mainly of interest to theoreticians, but the knowledge that polynomial time solutions are possible would surely spur research into better (and possibly practical) methods to achieve them.

A solution showing P = NP could upend the field of cryptography, which relies on certain problems being difficult. A constructive and efficient solution[Note 2] to an NP-complete problem such as 3-SAT would break most existing cryptosystems including:

  • Existing implementations of public-key cryptography,[32] a foundation for many modern security applications such as secure financial transactions over the Internet.
  • Symmetric ciphers such as AES or 3DES,[33] used for the encryption of communications data.
  • Cryptographic hashing, which underlies blockchain cryptocurrencies such as Bitcoin, and is used to authenticate software updates. For these applications, finding a pre-image that hashes to a given value must be difficult, ideally taking exponential time. If P = NP, then this can take polynomial time, through reduction to SAT.[34]

These would need modification or replacement with information-theoretically secure solutions that do not assume P ≠ NP.

There are also enormous benefits that would follow from rendering tractable many currently mathematically intractable problems. For instance, many problems in operations research are NP-complete, such as types of integer programming and the travelling salesman problem. Efficient solutions to these problems would have enormous implications for logistics. Many other important problems, such as some problems in protein structure prediction, are also NP-complete;[35] making these problems efficiently solvable could considerably advance life sciences and biotechnology.

These changes could be insignificant compared to the revolution that efficiently solving NP-complete problems would cause in mathematics itself. Gödel, in his early thoughts on computational complexity, noted that a mechanical method that could solve any problem would revolutionize mathematics:[36][37]

If there really were a machine with φ(n) ∼ k⋅n (or even ∼ k⋅n²), this would have consequences of the greatest importance. Namely, it would obviously mean that in spite of the undecidability of the Entscheidungsproblem, the mental work of a mathematician concerning Yes-or-No questions could be completely replaced by a machine. After all, one would simply have to choose the natural number n so large that when the machine does not deliver a result, it makes no sense to think more about the problem.

Similarly, Stephen Cook (assuming not only a proof, but a practically efficient algorithm) says:[28]

... it would transform mathematics by allowing a computer to find a formal proof of any theorem which has a proof of a reasonable length, since formal proofs can easily be recognized in polynomial time. Example problems may well include all of the CMI prize problems.

Research mathematicians spend their careers trying to prove theorems, and some proofs have taken decades or even centuries to find after problems have been stated—for instance, Fermat's Last Theorem took over three centuries to prove. A method guaranteed to find a proof if a "reasonable" size proof exists, would essentially end this struggle.

Donald Knuth has stated that he has come to believe that P = NP, but is reserved about the impact of a possible proof:[38]

[...] if you imagine a number M that's finite but incredibly large—like say the number 10↑↑↑↑3 discussed in my paper on "coping with finiteness"—then there's a humongous number of possible algorithms that do n^M bitwise or addition or shift operations on n given bits, and it's really hard to believe that all of those algorithms fail. My main point, however, is that I don't believe that the equality P = NP will turn out to be helpful even if it is proved, because such a proof will almost surely be nonconstructive.

Diagram of complexity classes provided that P ≠ NP. The existence of problems within NP but outside both P and NP-complete, under that assumption, was established by Ladner's theorem.[19]

P ≠ NP


A proof of P ≠ NP would lack the practical computational benefits of a proof that P = NP, but would represent a great advance in computational complexity theory and guide future research. It would demonstrate that many common problems cannot be solved efficiently, so that the attention of researchers can be focused on partial solutions or solutions to other problems. Due to widespread belief in P ≠ NP, much of this focusing of research has already taken place.[39]

P ≠ NP still leaves open the average-case complexity of hard problems in NP. For example, it is possible that SAT requires exponential time in the worst case, but that almost all randomly selected instances of it are efficiently solvable. Russell Impagliazzo has described five hypothetical "worlds" that could result from different possible resolutions to the average-case complexity question.[40] These range from "Algorithmica", where P = NP and problems like SAT can be solved efficiently in all instances, to "Cryptomania", where P ≠ NP and generating hard instances of problems outside P is easy, with three intermediate possibilities reflecting different possible distributions of difficulty over instances of NP-hard problems. The "world" where P ≠ NP but all problems in NP are tractable in the average case is called "Heuristica" in the paper. A Princeton University workshop in 2009 studied the status of the five worlds.[41]

Results about difficulty of proof


Although the P = NP problem itself remains open despite a million-dollar prize and a huge amount of dedicated research, efforts to solve the problem have led to several new techniques. In particular, some of the most fruitful research related to the P = NP problem has been in showing that existing proof techniques are insufficient for answering the question, suggesting novel technical approaches are required.

As additional evidence for the difficulty of the problem, essentially all known proof techniques in computational complexity theory fall into one of the following classifications, all insufficient to prove P ≠ NP:

  • Relativizing proofs: Imagine a world where every algorithm is allowed to make queries to some fixed subroutine called an oracle (which can answer a fixed set of questions in constant time, such as an oracle that solves any traveling salesman problem in 1 step), and the running time of the oracle is not counted against the running time of the algorithm. Most proofs (especially classical ones) apply uniformly in a world with oracles regardless of what the oracle does. These proofs are called relativizing. In 1975, Baker, Gill, and Solovay showed that P = NP with respect to some oracles, while P ≠ NP for other oracles.[42] As relativizing proofs can only prove statements that are true for all possible oracles, these techniques cannot resolve P = NP.
  • Natural proofs: In 1993, Alexander Razborov and Steven Rudich defined a general class of proof techniques for circuit complexity lower bounds, called natural proofs.[43] At the time, all previously known circuit lower bounds were natural, and circuit complexity was considered a very promising approach for resolving P = NP. However, Razborov and Rudich showed that if one-way functions exist, P and NP are indistinguishable to natural proof methods. Although the existence of one-way functions is unproven, most mathematicians believe that they exist, and a proof of their existence would be a much stronger statement than P ≠ NP. Thus, it is unlikely that natural proofs alone can resolve P = NP.
  • Algebrizing proofs: After the Baker–Gill–Solovay result, new non-relativizing proof techniques were successfully used to prove that IP = PSPACE. However, in 2008, Scott Aaronson and Avi Wigderson showed that the main technical tool used in the IP = PSPACE proof, known as arithmetization, was also insufficient to resolve P = NP.[44] Arithmetization converts the operations of an algorithm to algebraic and basic arithmetic symbols and then uses those to analyze the workings. In the IP = PSPACE proof, they convert the black box and the Boolean circuits to an algebraic problem.[44] As mentioned previously, it has been proven that this method is not viable to solve P = NP and other time complexity problems.

These barriers are another reason why NP-complete problems are useful: if a polynomial-time algorithm can be demonstrated for an NP-complete problem, this would solve the P = NP problem in a way not excluded by the above results.

These barriers lead some computer scientists to suggest the P versus NP problem may be independent of standard axiom systems like ZFC (cannot be proved or disproved within them). An independence result could imply that either P ≠ NP and this is unprovable in (e.g.) ZFC, or that P = NP but it is unprovable in ZFC that any polynomial-time algorithms are correct.[45] However, if the problem is undecidable even with much weaker assumptions extending the Peano axioms for integer arithmetic, then nearly polynomial-time algorithms exist for all NP problems.[46] Therefore, assuming (as most complexity theorists do) some NP problems don't have efficient algorithms, proofs of independence with those techniques are impossible. This also implies proving independence from PA or ZFC with current techniques is no easier than proving all NP problems have efficient algorithms.

Logical characterizations


The P = NP problem can be restated as certain classes of logical statements, as a result of work in descriptive complexity.

Consider all languages of finite structures with a fixed signature including a linear order relation. Then, all such languages in P are expressible in first-order logic with the addition of a suitable least fixed-point combinator. Recursive functions can be defined with this and the order relation. As long as the signature contains at least one predicate or function in addition to the distinguished order relation, so that the amount of space taken to store such finite structures is actually polynomial in the number of elements in the structure, this precisely characterizes P.

Similarly, NP is the set of languages expressible in existential second-order logic—that is, second-order logic restricted to exclude universal quantification over relations, functions, and subsets. The languages in the polynomial hierarchy, PH, correspond to all of second-order logic. Thus, the question "is P a proper subset of NP" can be reformulated as "is existential second-order logic able to describe languages (of finite linearly ordered structures with nontrivial signature) that first-order logic with least fixed point cannot?".[47] The word "existential" can even be dropped from the previous characterization, since P = NP if and only if P = PH (as the former would establish that NP = co-NP, which in turn implies that NP = PH).

Polynomial-time algorithms


No known algorithm for an NP-complete problem runs in polynomial time. However, there are algorithms known for NP-complete problems with the property that if P = NP, then the algorithm runs in polynomial time on accepting instances (although with enormous constants, making the algorithm impractical). These algorithms do not qualify as polynomial time because their running time on rejecting instances is not polynomial. The following algorithm, attributed to Levin (without any citation), is such an example. It correctly accepts the NP-complete language SUBSET-SUM, and it runs in polynomial time on inputs that are in SUBSET-SUM if and only if P = NP:

// Algorithm that accepts the NP-complete language SUBSET-SUM.
//
// this is a polynomial-time algorithm if and only if P = NP.
//
// "Polynomial-time" means it returns "yes" in polynomial time when
// the answer should be "yes", and runs forever when it is "no".
//
// Input: S = a finite set of integers
// Output: "yes" if any subset of S adds up to 0.
// Runs forever with no output otherwise.
// Note: "Program number M" is the program obtained by
// writing the integer M in binary, then
// considering that string of bits to be a
// program. Every possible program can be
// generated this way, though most do nothing
// because of syntax errors.
FOR K = 1...∞
  FOR M = 1...K
    Run program number M for K steps with input S
    IF the program outputs a list of distinct integers
      AND the integers are all in S
      AND the integers sum to 0
    THEN
      OUTPUT "yes" and HALT

This is a polynomial-time algorithm accepting an NP-complete language only if P = NP. "Accepting" means it gives "yes" answers in polynomial time, but is allowed to run forever when the answer is "no" (also known as a semi-algorithm).

This algorithm is enormously impractical, even if P = NP. If the shortest program that can solve SUBSET-SUM in polynomial time is b bits long, the above algorithm will try at least 2b − 1 other programs first.

Formal definitions


P and NP


A decision problem is a problem that takes as input some string w over an alphabet Σ, and outputs "yes" or "no". If there is an algorithm (say a Turing machine, or a computer program with unbounded memory) that produces the correct answer for any input string of length n in at most cn^k steps, where k and c are constants independent of the input string, then we say that the problem can be solved in polynomial time and we place it in the class P. Formally, P is the set of languages that can be decided by a deterministic polynomial-time Turing machine. That is,

P = { L : L = L(M) for some deterministic polynomial-time Turing machine M },

where

L(M) = { w ∈ Σ* : M accepts w },

and a deterministic polynomial-time Turing machine is a deterministic Turing machine M that satisfies two conditions:

  1. M halts on all inputs w; and
  2. there exists k ∈ ℕ such that T_M(n) ∈ O(n^k), where O refers to the big O notation, T_M(n) = max{ t_M(w) : w ∈ Σ*, |w| = n }, and t_M(w) is the number of steps M takes to halt on input w.

NP can be defined similarly using nondeterministic Turing machines (the traditional way). However, a modern approach uses the concept of certificate and verifier. Formally, NP is the set of languages over a finite alphabet that have a verifier running in polynomial time. The following defines a "verifier":

Let L be a language over a finite alphabet, Σ.

L ∈ NP if, and only if, there exists a binary relation R ⊆ Σ* × Σ* and a positive integer k such that the following two conditions are satisfied:

  1. For all x ∈ Σ*, x ∈ L if and only if there exists y ∈ Σ* such that (x, y) ∈ R and |y| ∈ O(|x|^k); and
  2. the language L_R = { x#y : (x, y) ∈ R } over Σ ∪ {#} is decidable by a deterministic Turing machine in polynomial time.

A Turing machine that decides L_R is called a verifier for L, and a y such that (x, y) ∈ R is called a certificate of membership of x in L.

Not all verifiers must be polynomial-time. However, for L to be in NP, there must be a verifier that runs in polynomial time.

Example


Let

COMPOSITE = { x ∈ ℕ | x = pq for integers p, q > 1 }.

Whether a value of x is composite is equivalent to whether x is a member of COMPOSITE. It can be shown that COMPOSITE ∈ NP by verifying that it satisfies the above definition (if we identify natural numbers with their binary representations).

COMPOSITE also happens to be in P, a fact demonstrated by the invention of the AKS primality test.[48]
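A minimal sketch of such a verifier, assuming the certificate is simply a nontrivial divisor of x (the names are illustrative):

def verify_composite(x: int, certificate: int) -> bool:
    # Accept x with certificate d exactly when d is a nontrivial divisor
    # of x. Divisibility testing takes time polynomial in the number of
    # bits of x, and the certificate is no longer than x itself.
    return 1 < certificate < x and x % certificate == 0

print(verify_composite(91, 7))  # True: 91 = 7 * 13
print(verify_composite(97, 5))  # False: 97 is prime, so no certificate works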

NP-completeness


There are many equivalent ways of describing NP-completeness.

Let L be a language over a finite alphabet Σ.

L is NP-complete if, and only if, the following two conditions are satisfied:

  1. L ∈ NP; and
  2. any L′ in NP is polynomial-time-reducible to L (written as L′ ≤p L), where L′ ≤p L if, and only if, the following two conditions are satisfied:
    1. There exists f : Σ* → Σ* such that for all w in Σ* we have: w ∈ L′ if and only if f(w) ∈ L; and
    2. there exists a polynomial-time Turing machine that halts with f(w) on its tape on any input w.

Alternatively, if L ∈ NP, and there is another NP-complete problem that can be polynomial-time reduced to L, then L is NP-complete. This is a common way of proving some new problem is NP-complete.

Claimed solutions


While the P versus NP problem is generally considered unsolved,[49] many amateur and some professional researchers have claimed solutions. Gerhard J. Woeginger compiled a list of 116 purported proofs from 1986 to 2016, of which 61 were proofs of P = NP, 49 were proofs of P ≠ NP, and 6 proved other results, e.g. that the problem is undecidable.[50] Some attempts at resolving P versus NP have received brief media attention,[51] though these attempts have been refuted.

In popular culture

The film Travelling Salesman, by director Timothy Lanzone, is the story of four mathematicians hired by the US government to solve the P versus NP problem.[52]

In the sixth episode of The Simpsons' seventh season "Treehouse of Horror VI", the equation P = NP is seen shortly after Homer accidentally stumbles into the "third dimension".[53][54]

In the second episode of season 2 of Elementary, "Solve for X", Holmes and Watson investigate the murders of mathematicians who were attempting to solve P versus NP.[55][56]

Similar problems

  • R vs. RE problem, where R is the analog of the class P, and RE is the analog of the class NP. These classes are not equal, because undecidable but verifiable problems do exist; for example, Hilbert's tenth problem is RE-complete.[57]
  • A similar problem exists in the theory of algebraic complexity: the VP vs. VNP problem. Like P vs. NP, the answer is currently unknown.[58][57]

from Grokipedia
The P versus NP problem is a central unsolved question in theoretical computer science that asks whether every decision problem for which a proposed solution can be verified quickly, in polynomial time, can also be solved quickly, in polynomial time. Formally, it seeks to determine whether the complexity class P—the set of problems solvable by a deterministic Turing machine in polynomial time—equals the class NP—the set of problems verifiable by such a machine in polynomial time, or equivalently, solvable in polynomial time by a nondeterministic Turing machine. This question, stated precisely as "Does P = NP?", was first posed in this form by Stephen Cook in 1971, building on earlier ideas in computational complexity.

The origins of the problem trace back to foundational work in computability theory, with Alan Turing's 1936 introduction of the Turing machine as a model of computation providing the formal basis for defining time-bounded complexity classes. Cook's seminal paper, "The Complexity of Theorem-Proving Procedures," not only articulated the P versus NP question but also introduced the concept of NP-completeness, demonstrating that the Boolean satisfiability problem (SAT) is NP-complete and that thousands of practical problems reduce to it. Subsequent contributions, such as Richard Karp's 1972 enumeration of 21 NP-complete problems and independent work by Leonid Levin, solidified the theory's foundations and highlighted its broad applicability.

Recognized as one of the seven Millennium Prize Problems by the Clay Mathematics Institute in 2000, solving the P versus NP problem carries a US$1,000,000 reward and would profoundly impact numerous fields. If P = NP, efficient algorithms could be developed for NP-complete problems, revolutionizing optimization in logistics, drug design, and scheduling, while enabling automated theorem proving and efficient learning of minimal consistent models. Conversely, proving P ≠ NP would confirm inherent computational hardness for these problems, validating the security foundations of modern cryptography, such as RSA encryption, which relies on the difficulty of factoring large numbers—an NP problem believed not to be in P. Most experts conjecture that P ≠ NP, though no proof exists after over 50 years of intensive study.

Introduction and Examples

Illustrative Example

The traveling salesman problem (TSP) is a classic problem in which a salesman must visit each of a given set of cities exactly once and return to the starting city, seeking the route with the minimum total distance. Formally, the input consists of a set of n cities and a distance matrix specifying the pairwise distances between them, with the goal of finding a Hamiltonian cycle of minimal length. Solving TSP exactly by brute force requires enumerating all possible tours, a number that grows factorially with the number of cities ((n−1)!/2 tours for symmetric distances), making it computationally intensive even for moderate n, and no polynomial-time algorithm is known for the general case. In contrast, verifying a proposed solution is straightforward: one simply checks that the tour visits each city exactly once and computes the total distance by summing the edge lengths, which takes linear time in n. This asymmetry—difficulty in solving versus ease in verification—illustrates the core intuition behind the P versus NP question, and the decision version of TSP is known to be NP-complete. Consider a small instance with four cities labeled A, B, C, and D, and the following symmetric distance matrix (in arbitrary units):
      A   B   C   D
A     0   2   6   4
B     2   0   3   7
C     6   3   0   5
D     4   7   5   0
To solve this instance by brute force, evaluate all distinct tours starting and ending at A (fixing the start to avoid rotations and assuming undirected edges):
  • A → B → C → D → A: total distance = 2 + 3 + 5 + 4 = 14
  • A → B → D → C → A: total = 2 + 7 + 5 + 6 = 20
  • A → C → B → D → A: total = 6 + 3 + 7 + 4 = 20
  • A → C → D → B → A: total = 6 + 5 + 7 + 2 = 20
  • A → D → B → C → A: total = 4 + 7 + 3 + 6 = 20
  • A → D → C → B → A: total = 4 + 5 + 3 + 2 = 14
The minimal tours are A-B-C-D-A and its reverse A-D-C-B-A, both with length 14. Verification of, say, A-B-C-D-A requires only confirming that the sequence includes all cities without repetition and summing four distances, an O(n) check, far quicker than generating and checking all six tours.
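The brute-force search and the verification step can be sketched in a few lines of Python for this exact instance (the function names are ours):

from itertools import permutations

cities = "ABCD"
dist = {
    "A": {"A": 0, "B": 2, "C": 6, "D": 4},
    "B": {"A": 2, "B": 0, "C": 3, "D": 7},
    "C": {"A": 6, "B": 3, "C": 0, "D": 5},
    "D": {"A": 4, "B": 7, "C": 5, "D": 0},
}

def tour_length(tour):
    # Sum the edge lengths of a closed tour, e.g. ('A','B','C','D') -> A-B-C-D-A.
    return sum(dist[a][b] for a, b in zip(tour, tour[1:] + tour[:1]))

def brute_force_tsp():
    # Solving: try every tour that starts at A ((n-1)! of them).
    rest = min(permutations(cities[1:]), key=lambda r: tour_length(("A",) + r))
    best = ("A",) + rest
    return best, tour_length(best)

def verify_tour(tour, bound):
    # Verifying: check the proposed tour visits every city once and has
    # length at most the claimed bound -- linear time in n.
    return sorted(tour) == sorted(cities) and tour_length(tour) <= bound

print(brute_force_tsp())                      # (('A', 'B', 'C', 'D'), 14)
print(verify_tour(("A", "B", "C", "D"), 14))  # True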

Importance and Motivation

The P versus NP problem stands as one of the most profound open questions in computer science, serving as a cornerstone of computational complexity theory and recognized as one of the seven Millennium Prize Problems by the Clay Mathematics Institute, which offers a $1,000,000 reward for a correct solution. This designation underscores its fundamental importance, as resolving whether every problem whose solution can be verified efficiently (in polynomial time) can also be solved efficiently would redefine the boundaries of algorithmic feasibility across science and engineering.

At its core, the problem probes a philosophical tension: does the ease of verifying a proposed solution to a challenging problem imply that finding such a solution is equally tractable? This question, often framed as asking whether creative problem-solving is reducible to mere recognition (verification), has far-reaching implications for fields reliant on computational limits. In cryptography, for instance, modern encryption schemes like RSA depend on the presumed intractability of problems such as integer factorization, which would collapse if P = NP, potentially rendering secure online transactions and data protection obsolete in minutes using polynomial-time algorithms. Similarly, in optimization domains like logistics and scheduling—where NP-complete problems such as the traveling salesman problem arise—resolving P = NP could enable efficient global solutions, transforming these fields from approximations to exact computations. In artificial intelligence, the problem intersects with search and planning tasks, many of which are NP-complete, where efficient verification of optimal paths or configurations contrasts with the computational explosion in exploring solution spaces; a proof of P = NP might unlock scalable AI systems capable of tackling complex problems under uncertainty.

Many industries, including cybersecurity, currently operate under the widespread assumption that P ≠ NP, basing their protocols and software on the hardness of NP problems to ensure reliability and security. Thus, the stakes extend beyond academia, influencing the practical architecture of technology-dependent societies.

Formal Foundations

Class P

In computational complexity theory, the class P denotes the set of all decision problems that can be solved efficiently by a deterministic Turing machine in polynomial time. Formally, P is defined as the class of languages L such that there exists a deterministic Turing machine M and a polynomial p(n) where, for every input string x ∈ {0,1}*, the machine M accepts x if x ∈ L and rejects x if x ∉ L, halting in at most p(|x|) steps. This polynomial time bound means the running time T(n) of the Turing machine satisfies T(n) ≤ n^k + k for some constant integer k ≥ 1 and all input lengths n ≥ 1, or equivalently, T(n) = O(n^k). Such bounds capture computations where the resource requirements grow at a rate that remains practical even for large input sizes n, distinguishing P from classes requiring superpolynomial time. Problems in P represent those deemed efficiently solvable on classical digital computers, as polynomial-time algorithms scale feasibly with input size, enabling applications in areas like optimization and data processing. Representative examples illustrate this efficiency. The problem of sorting n numbers using comparisons can be solved in O(n log n) time, which is polynomial, via algorithms such as merge sort. Another example is finding the shortest path from a source vertex to all other vertices in a graph with V vertices and non-negative edge weights, solvable in O(V²) time using Dijkstra's algorithm.
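A minimal sketch of the array-based variant of Dijkstra's algorithm, which attains the O(V²) bound cited above (heap-based versions run in O((V+E) log V)); the toy graph is our own:

def dijkstra(adj, source):
    # Single-source shortest paths with non-negative weights.
    # adj[u] maps each neighbor v to the weight of edge (u, v).
    INF = float("inf")
    dist = {u: INF for u in adj}
    dist[source] = 0
    unvisited = set(adj)
    while unvisited:
        u = min(unvisited, key=dist.get)  # O(V) scan per iteration -> O(V^2) total
        unvisited.remove(u)
        for v, w in adj[u].items():       # relax outgoing edges of u
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
    return dist

graph = {
    "s": {"a": 2, "b": 5},
    "a": {"b": 1, "t": 4},
    "b": {"t": 1},
    "t": {},
}
print(dijkstra(graph, "s"))  # {'s': 0, 'a': 2, 'b': 3, 't': 4}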

Class NP

The class NP consists of decision problems that can be solved in polynomial time by a nondeterministic Turing machine (NTM). Formally, a language L ⊆ {0,1}* is in NP if there exists an NTM M and a polynomial p such that for every input x ∈ L, M accepts x along some computation path of length at most p(|x|), and for every x ∉ L, M rejects x on all computation paths. In an NTM, at each step the machine may branch into multiple possible next states, allowing it to explore multiple paths in parallel, but acceptance requires at least one accepting path within the time bound. An equivalent characterization of NP uses the verifier model, where a language L is in NP if there exists a deterministic polynomial-time verifier V and a polynomial q such that for all x, x ∈ L if and only if there exists a certificate y with |y| ≤ q(|x|) for which V(x, y) = 1. The verifier V runs in time bounded by O(n^k) for some constant k, where n = |x|, ensuring that "yes" instances can be certified efficiently. This model highlights that NP problems are those whose solutions can be verified quickly, even if finding them is hard. Examples of problems in NP include the subset sum problem, which asks whether there exists a subset of given positive integers that sums exactly to a target value, and the 3-coloring problem, which determines whether the vertices of a graph can be colored with at most three colors such that no adjacent vertices share the same color. The class P is contained in NP, as any deterministic polynomial-time machine can be simulated nondeterministically within the same time bound.
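For the subset sum problem, a verifier in the sense above is easy to write down: the certificate is the subset itself. A minimal sketch (the example values are our own):

def verify_subset_sum(numbers, target, certificate):
    # Accept exactly when the certificate is a sub-multiset of `numbers`
    # whose elements add up to `target`. This check is polynomial in the
    # input size, even though no polynomial-time algorithm is known for
    # finding such a subset in general.
    remaining = list(numbers)
    for x in certificate:
        if x not in remaining:
            return False
        remaining.remove(x)
    return sum(certificate) == target

print(verify_subset_sum([3, 34, 4, 12, 5, 2], 9, [4, 5]))     # True
print(verify_subset_sum([3, 34, 4, 12, 5, 2], 30, [34, -4]))  # False: -4 is not in the list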

Verification and Decision Problems

The classes P and NP are defined exclusively for decision problems, which are computational tasks that require a yes or no answer regarding whether a given input satisfies a certain property. For instance, the decision version of the Hamiltonian cycle problem asks whether a graph contains a cycle that visits each vertex exactly once, whereas the corresponding search version seeks to output such a cycle if it exists. This focus on decision problems stems from their foundational role in complexity theory, where the yes/no format enables precise classifications based on time bounds for acceptance or rejection. Central to the class NP is the verification paradigm, which posits that for every yes-instance of a problem in NP, there exists a short "witness" or certificate that can be checked in polynomial time by a deterministic verifier to confirm membership. Formally, a language L ∈ NP if there is a polynomial-time decidable relation R ⊆ Σ* × Σ* such that for input w, w ∈ L if and only if there exists a certificate y with |y| ≤ |w|^k (for some constant k) satisfying R(w, y), where the check of R runs in time polynomial in |w|. In the Hamiltonian cycle example, the certificate is a proposed ordering of the vertices forming the cycle; verification involves checking that each consecutive pair is connected by an edge and that all vertices appear exactly once, which can be done in O(n²) time for a graph with n vertices. This process highlights that membership in NP guarantees efficient checkability of solutions but does not necessarily provide an efficient means to generate them from scratch. Decision problems are prioritized in the P versus NP framework because search problems—those requiring the output of a witness or solution—can often be reduced to their decision counterparts in polynomial time, particularly for self-reducible NP problems. For example, self-reducibility allows constructing the search solution by iteratively querying the decision oracle on modified instances, building the full witness one piece at a time. If the decision problem is solvable in polynomial time (i.e., in P), this reduction ensures the search problem is also solvable in polynomial time, linking the complexities of the two formulations without altering the core P versus NP question.
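A minimal sketch of this search-to-decision idea for SAT, using a brute-force routine as a stand-in for the decision oracle (under P = NP the oracle would run in polynomial time; the names and the example formula are ours):

from itertools import product

def brute_force_decide(clauses, num_vars, fixed):
    # Stand-in for a decision oracle: is there a satisfying assignment
    # consistent with the partial assignment `fixed`?
    for bits in product([False, True], repeat=num_vars):
        assign = dict(enumerate(bits, start=1))
        if all(assign[v] == val for v, val in fixed.items()) and all(
            any(assign[abs(l)] == (l > 0) for l in clause) for clause in clauses
        ):
            return True
    return False

def search_via_decision(clauses, num_vars):
    # Self-reducibility: with polynomially many oracle calls, extend a
    # partial assignment one variable at a time, keeping it satisfiable.
    if not brute_force_decide(clauses, num_vars, {}):
        return None
    fixed = {}
    for v in range(1, num_vars + 1):
        fixed[v] = True
        if not brute_force_decide(clauses, num_vars, fixed):
            fixed[v] = False  # the other branch must then be satisfiable
    return fixed

# (x1 or not x2) and (x2 or x3) and (not x1 or not x3), as signed literals.
print(search_via_decision([[1, -2], [2, 3], [-1, -3]], 3))  # {1: True, 2: True, 3: False}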

NP-Completeness

Definition

In computational complexity theory, a decision problem L (viewed as a language) is classified as NP-complete if it belongs to the class NP and every problem in NP is polynomial-time reducible to L. This means L captures the hardest problems within NP under a specific notion of reduction. The standard reductions used for establishing NP-completeness are polynomial-time many-one reductions, also known as Karp reductions, introduced by Richard Karp in 1972. A many-one reduction from a language L′ to L is a polynomial-time computable function f such that for every input x,

x ∈ L′ ⟺ f(x) ∈ L,

where f runs in time polynomial in |x|. In contrast, Stephen Cook's original 1971 definition of NP-completeness employed Turing reductions (also called Cook reductions), which allow a polynomial-time oracle machine to decide L′ by making multiple adaptive queries to an oracle for L. Turing reductions are more general than many-one reductions, as the latter can be viewed as a special case in which a single query suffices, but for the purposes of NP-completeness the two notions coincide due to the structure of nondeterministic polynomial-time computations. Today, many-one reductions are preferred for their simplicity in proofs and implementations.

The foundational result establishing the existence of NP-complete problems is the Cook–Levin theorem, which proves that the Boolean satisfiability problem (SAT)—the problem of determining whether a given propositional formula has a satisfying assignment—is NP-complete. To show SAT is in NP, note that a satisfying assignment serves as a polynomial-time verifiable certificate. For NP-hardness, the proof reduces any NP language L to SAT via an explicit polynomial-time transformation (which establishes many-one hardness for SAT). Given a nondeterministic polynomial-time machine M that decides L, for input x, construct a formula φ_x encoding an accepting computation of M on x. Specifically, the reduction simulates M's nondeterministic computation over p(n) steps, where n = |x|, by representing the machine's configuration at each step as a sequence of variables for tape contents, head position, and state. Local transitions between consecutive configurations are enforced by clauses ensuring that if configuration z_i holds, then z_{i+1} follows the Turing machine's rules (e.g., symbol read, write, move, and state update). Additional clauses fix the initial configuration (input x on the tape, starting state and head position) and require the final configuration to be accepting. The resulting φ_x has size polynomial in n, is computable in polynomial time, and is satisfiable if and only if x ∈ L. This simulation demonstrates that any NP computation can be locally verified by a SAT instance, highlighting the "local" nature of nondeterministic verification.

Examples

One of the foundational NP-complete problems is the Boolean satisfiability problem (SAT), which asks whether there exists a truth assignment to the variables of a given Boolean formula that satisfies the formula. Stephen Cook proved SAT to be NP-complete in 1971, establishing it as the first such problem and the basis for reductions to others. A prominent variant is 3-SAT, where each clause contains exactly three literals; this restriction does not alter the NP-completeness, as general SAT reduces to 3-SAT in polynomial time by splitting longer clauses into multiple three-literal clauses using auxiliary variables. Richard Karp included 3-SAT among his 21 NP-complete problems in 1972, demonstrating its reduction explicitly. The traveling salesman problem (TSP) in decision form—given a graph with edge weights and an integer k, does there exist a Hamiltonian cycle of total weight at most k?—is NP-complete. Karp showed this by reducing from the Hamiltonian cycle problem (itself NP-complete via reduction from 3-SAT): edges in the original graph receive weight 1 and non-edges receive a large weight M, so that a tour of total weight at most k = n exists exactly when the original n-vertex graph has a Hamiltonian cycle. The clique problem asks whether an undirected graph contains a complete subgraph (clique) on at least k vertices. It is NP-complete, with Karp providing a polynomial-time reduction from 3-SAT that constructs a graph gadget for each variable (representing true/false choices) and each clause (enforcing satisfaction constraints via overlapping cliques). The vertex cover problem determines whether there is a set of at most k vertices incident to every edge in a graph. Karp established its NP-completeness via reduction from 3-SAT, creating a graph where vertices correspond to literals and edges enforce clause coverage. The independent set problem—finding k vertices with no edges between them—is closely related and also NP-complete: a set S is independent in a graph G exactly when its complement V ∖ S is a vertex cover of G, so G has an independent set of size k if and only if it has a vertex cover of size n − k (and S is independent in G exactly when S is a clique in the complement graph of G). Over 3,000 NP-complete problems are now known, cataloged in seminal compendiums and subsequent surveys that extend the original lists.
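The independent-set/vertex-cover relationship stated above can be checked mechanically on a small example; the sketch below verifies it for every subset of the vertices of a 4-cycle (the graph is our own choice):

from itertools import combinations

edges = {("a", "b"), ("b", "c"), ("c", "d"), ("d", "a")}  # a 4-cycle
vertices = {"a", "b", "c", "d"}

def is_independent_set(s):
    # No edge has both endpoints inside s.
    return all(not ({u, v} <= s) for u, v in edges)

def is_vertex_cover(s):
    # Every edge has at least one endpoint inside s.
    return all(u in s or v in s for u, v in edges)

# For every subset S: S is independent in G  <=>  V \ S is a vertex cover of G.
assert all(
    is_independent_set(set(s)) == is_vertex_cover(vertices - set(s))
    for k in range(len(vertices) + 1)
    for s in combinations(vertices, k)
)
print("independent-set / vertex-cover duality holds on the 4-cycle")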

The Core Question

Statement

The P versus NP problem is the question of whether the complexity class P equals the complexity class NP, that is, whether every decision problem for which a proposed solution can be verified in polynomial time can also be solved in polynomial time. Formally, P is the class of languages accepted by a deterministic Turing machine in polynomial time, while NP is the class of languages for which membership can be verified by a deterministic Turing machine in polynomial time given a polynomial-length certificate. The problem asks: does P = NP? A key variant arises with NP-complete problems, which are the hardest problems in NP under polynomial-time reductions; if P = NP, then all NP-complete problems would be solvable in polynomial time and thus belong to P. Additionally, if P = NP, then NP = co-NP, where co-NP consists of the complements of NP languages, meaning that for every problem with polynomial-time verifiable yes-instances, the no-instances would also have polynomial-time verifiable certificates. The problem also has a function version concerning search problems. In this formulation, the class FP comprises functions computable in polynomial time by a deterministic Turing machine, while FNP includes function problems whose solutions can be verified in polynomial time given a polynomial-length certificate. The analogous question is whether FP = FNP, which is equivalent to P = NP: if every verifiable solution can be found in polynomial time, then search problems with polynomial-time verification would be solvable in polynomial time. As of 2025, the P versus NP problem remains unsolved and is one of the seven Millennium Prize Problems posed by the Clay Mathematics Institute, with a $1 million prize for a correct solution.

Implications of P = NP

If P = NP, every problem in NP would admit a polynomial-time algorithm for its decision version, allowing efficient solutions to all NP-complete problems such as the Boolean satisfiability problem (SAT) and the traveling salesman problem (TSP). For instance, an algorithm solving 3-SAT in, say, roughly n² steps could conceivably be derived, where n is the input size, enabling rapid determination of whether a Boolean formula has a satisfying assignment. Similarly, search versions of these problems, like finding a satisfying assignment for SAT or an optimal tour for TSP, would also become solvable in polynomial time, as the decision procedure could be extended via self-reducibility techniques. A proof that P = NP would imply the collapse of the entire polynomial hierarchy (PH) to P, meaning that levels beyond NP, such as Σ₂ᵖ and Π₂ᵖ, which involve alternating existential and universal quantifiers over polynomial-time predicates, would no longer introduce additional computational power. This follows by induction on the hierarchy levels: once P = NP, each higher level reduces to computations with NP oracles that are themselves in P.

In cryptography, P = NP would render many systems insecure by placing hard problems like integer factorization in P, thereby breaking RSA encryption, which relies on the difficulty of factoring large semiprimes. Factoring a 200-digit number could become feasible in minutes using such an algorithm, undermining public-key infrastructure. Zero-knowledge proofs, which demonstrate membership in an NP language without revealing the witness, would lose their foundational hardness assumptions, potentially weakening the protocols built on them.

The resolution would revolutionize optimization, providing exact polynomial-time solutions to NP-hard problems in fields like operations research and logistics. For example, exact scheduling for job shops or crew rostering could be computed efficiently without approximations. In biology, protein structure prediction, modeled as finding the minimum-energy conformation of a polypeptide chain and proven NP-hard, would yield precise structures in polynomial time, accelerating drug design and molecular simulations. Economically, P = NP would automate a wide array of hard optimization tasks, drastically reducing computational costs and enabling real-time optimization in industries like transportation and trading. However, this could disrupt employment in some sectors, as routine algorithmic design for approximate solutions becomes obsolete, shifting jobs toward higher-level applications of these efficient tools.

Implications of P ≠ NP

A proof that P ≠ NP would establish that NP-complete problems, such as the traveling salesman problem or 3-SAT, cannot be solved by any deterministic polynomial-time algorithm, requiring super-polynomial time in the worst case under standard models of computation. This hardness result would confirm the inherent intractability of a wide range of decision and optimization problems central to fields like logistics, scheduling, and circuit design.

In cryptography, P ≠ NP provides a foundational justification for the security of public-key systems, including RSA, which depends on the computational difficulty of factoring large composite numbers—a problem in NP but presumed to lie outside P. The separation is necessary, though not by itself sufficient, for this hardness: if P = NP, efficient algorithms could break such schemes, undermining secure communication protocols used in electronic commerce, digital signatures, and related technologies.

The assumption of P ≠ NP motivates the pursuit of approximation algorithms for NP-hard optimization problems, where exact solutions are infeasible but near-optimal results can be obtained efficiently. For example, polynomial-time approximation schemes (PTAS) achieve arbitrarily close approximations for certain geometric NP-hard problems, such as the Euclidean traveling salesman problem, while heuristics like genetic algorithms address others in practice. This approach balances computational limits with usable outcomes in applications like network routing and scheduling (a simple approximation algorithm is sketched below). On the practical side, P ≠ NP validates the continued reliance on exponential-time exact solvers, such as branch-and-bound methods, for small-to-medium instances of NP-complete problems, where worst-case analysis highlights hardness but average-case behavior or instance size often permits feasible computation. Theoretically, it prevents the collapse of the polynomial hierarchy (PH) to P, maintaining distinct levels like Σ₂ᵖ and Π₂ᵖ, and supports the expectation that NP ≠ co-NP, preserving the distinction between problems with short certificates for yes-instances and those with short certificates for no-instances.
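As one concrete illustration of the approximation strategy that P ≠ NP motivates, the following sketch implements the classical maximal-matching 2-approximation for minimum vertex cover (a textbook algorithm chosen for brevity, not one of the PTAS results mentioned above). It runs in polynomial time and returns a cover at most twice the optimal size.

```python
def vertex_cover_2_approx(edges):
    """Classical 2-approximation for minimum vertex cover:
    greedily pick an uncovered edge and add both endpoints.
    The chosen edges form a maximal matching, so the cover is
    at most twice the size of an optimal cover."""
    cover = set()
    for u, v in edges:
        if u not in cover and v not in cover:
            cover.update((u, v))
    return cover

# Example: a 5-cycle; the optimal cover has size 3, and the
# approximation is guaranteed to return at most 6 vertices (here 4).
edges = [(1, 2), (2, 3), (3, 4), (4, 5), (5, 1)]
print(vertex_cover_2_approx(edges))
```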

Historical Context

Origins

The origins of the P versus NP problem trace back to the early 1970s, amid a burgeoning interest in computational complexity theory driven by the need to understand the limits of algorithmic efficiency as computers became more prevalent in scientific and industrial applications. In the preceding decade, foundational work laid the groundwork for these developments. In 1959, Michael O. Rabin and Dana Scott introduced the concept of nondeterministic finite automata in their paper "Finite Automata and Their Decision Problems," which explored machines that could accept languages through multiple computational paths, foreshadowing the role of nondeterminism in complexity classes. Building on this, Juris Hartmanis and Richard E. Stearns formalized measures of time complexity in 1965 with "On the Computational Complexity of Algorithms," establishing hierarchies of problems based on deterministic Turing machine running times and highlighting the need to classify computational resources systematically. Additionally, Walter J. Savitch's 1970 theorem showed that any nondeterministic space-bounded computation can be simulated deterministically with at most a quadratic increase in space, further illuminating the relationship between deterministic and nondeterministic computation. These 1960s advancements, supported by funding from the Advanced Research Projects Agency (ARPA)—which invested heavily in computer science research to address growing algorithmic demands in defense and academia—created the theoretical framework for probing the boundaries of efficient solvability.

The pivotal formulation of the P versus NP question emerged in 1971 through Stephen A. Cook's seminal paper, "The Complexity of Theorem-Proving Procedures," presented at the Third Annual ACM Symposium on Theory of Computing. In this work, Cook defined the class NP as the set of decision problems verifiable in polynomial time by a nondeterministic Turing machine and introduced the notion of NP-completeness, proving that the Boolean satisfiability problem (SAT) is NP-complete via polynomial-time reductions. This established a benchmark for hardness: if any NP-complete problem could be solved in polynomial time, then all problems in NP could be, raising the central question of whether P, the class of problems solvable in polynomial time by deterministic Turing machines, equals NP. Cook's analysis was motivated by theorem proving, where he sought to measure the efficiency of proof procedures, but it generalized to a broad array of combinatorial and logical problems.

The following year, Richard M. Karp significantly expanded the scope of NP-completeness in his 1972 paper, "Reducibility Among Combinatorial Problems," published in the proceedings of the Symposium on the Complexity of Computer Computations. Karp demonstrated that 21 diverse problems—from graph problems (e.g., clique and vertex cover) to optimization problems (e.g., the traveling salesman problem)—are NP-complete by reducing SAT to each via polynomial-time transformations, underscoring the ubiquity of intractability in practical computing tasks. This catalog not only popularized Cook's framework but also highlighted the practical implications for algorithm design, as these problems arose in scheduling, routing, and other real-world applications supported by ARPA-funded research initiatives. Together, these contributions crystallized the P versus NP problem as a cornerstone of computational complexity theory, shifting focus from individual algorithms to the inherent structure of computational difficulty.

Key Developments

In 1975, Theodore Baker, John Gill, and Robert Solovay introduced the technique of relativization, constructing oracles relative to which P = NP holds and others relative to which P ≠ NP, thereby establishing an early barrier: no proof method that relativizes can separate the classes. During the late 1970s and 1980s, Leonid Levin's independent discovery of NP-completeness in the Soviet Union gained wider recognition in the West; his 1973 paper on universal sequential search problems highlighted the theoretical foundations of complete problems under resource constraints, paralleling Stephen Cook's earlier results.

In the 1990s, building on earlier approximation-based circuit lower bounds, Alexander Razborov and Steven Rudich formalized the natural proofs barrier in 1994, showing that any "natural" proof separating P from NP would yield an efficient way to break strong pseudorandom generators, whose existence is a standard cryptographic assumption. The 2000s saw further barriers emerge, including Scott Aaronson and Avi Wigderson's 2008 algebrization technique, which extended relativization to algebraic oracles and showed that even proof methods combining relativization with the arithmetization used in interactive proof systems cannot resolve P versus NP and related separations. Concurrently, Ketan Mulmuley and Milind Sohoni initiated geometric complexity theory in 2001, proposing an algebraic-geometric framework to attack P versus NP by comparing orbit closures of polynomials under group actions, aiming to establish separations through representation-theoretic invariants.

Entering the 2020s, the P versus NP problem remains unresolved, with the Clay Mathematics Institute's Millennium Prize still unclaimed as of 2025. Sporadic claims of resolution circulated in preprints have been scrutinized and dismissed by the community for flaws in reasoning or unverifiability. Recent years have also seen AI-assisted attempts to explore proofs, leveraging machine learning for pattern discovery in complexity structures, though no breakthroughs have materialized.

Arguments for Resolution

Evidence for P ≠ NP

The absence of polynomial-time algorithms for NP-complete problems, despite extensive research efforts spanning over five decades, provides strong empirical evidence suggesting that P ≠ NP. The P versus NP problem was formalized in 1971, and since then thousands of researchers have attempted to find efficient solutions for problems like the traveling salesman problem or Boolean satisfiability, yet none has succeeded in devising a general polynomial-time algorithm. This persistent failure across diverse approaches underscores the apparent inherent difficulty of these problems.

A prominent example is the Boolean satisfiability problem (SAT), the first problem proven NP-complete, in 1971. Modern SAT solvers, such as MiniSat or Glucose, leverage techniques like conflict-driven clause learning and can solve large industrial instances efficiently in practice, but their worst-case running time remains exponential, as they explore vast search spaces on hard instances near the satisfiability phase transition. Empirical benchmarks from the SAT competitions show that solver performance degrades rapidly as instance size increases beyond certain thresholds, reinforcing the belief that no polynomial-time method exists (a small experiment illustrating the phase transition is sketched below).

Theoretical evidence from circuit complexity further supports P ≠ NP by establishing lower bounds on the computational power of restricted circuit classes. In particular, Alexander Razborov proved in 1987 that constant-depth circuits built from AND, OR, NOT, and parity gates require exponential size to compute the majority function, and Roman Smolensky extended this the same year, using algebraic approximation over finite fields, to show that such circuits augmented with MOD-q gates cannot compute the MOD-p function for distinct primes p and q. These results demonstrate that even weak models of computation require superpolynomial resources for certain functions in NC¹, suggesting broader hardness for the more powerful models relevant to P versus NP, although no superpolynomial lower bound is known for general circuits.

Ketan Mulmuley's geometric complexity theory (GCT) program offers a geometric perspective on P versus NP, reformulating the problem in terms of representation theory and invariant theory. Introduced in 2001 with Milind Sohoni, GCT recasts separation questions as comparisons of orbit closures of polynomials such as the determinant and permanent under actions of the general linear group, reducing them to questions in algebraic geometry. This approach aims to prove explicit lower bounds by exploiting symmetries, providing structural evidence that the permanent (complete for #P) cannot be computed as efficiently as the determinant (computable in polynomial time), thus supporting P ≠ NP.

The natural proofs barrier, developed by Razborov and Rudich in 1994, highlights structural challenges in proving P ≠ NP while indirectly suggesting why the separation is plausible. Their framework shows that most known lower-bound techniques are "natural" proofs—combining constructivity, largeness, and usefulness against small circuits—and that no such proof can separate P from NP unless strong pseudorandom generators, and with them one-way functions, fail to exist, contradicting widely believed cryptographic assumptions. This implies that non-natural techniques are needed for a resolution, but the very existence of such barriers aligns with the hardness expected if P ≠ NP.

A strong consensus among complexity theorists favors P ≠ NP: in a poll of 124 respondents conducted by William Gasarch, approximately 88% believed in the separation. This share has grown from 61% in 2002 to 83% in 2012, reflecting accumulated evidence from failed attempts, lower bounds, and theoretical frameworks.
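The phase-transition behaviour cited above can be observed on small instances. The sketch below is an informal experiment, not a benchmark: it generates uniformly random 3-SAT formulas at several clause-to-variable ratios and measures, by brute force, the fraction that are satisfiable; near a ratio of roughly 4.26 the fraction drops sharply, which is also where solvers tend to struggle most. The instance size and trial count are arbitrary small values chosen so the brute-force check finishes quickly.

```python
import random
from itertools import product

def random_3sat(num_vars, num_clauses, rng):
    """Uniformly random 3-SAT: each clause picks 3 distinct variables,
    each negated independently with probability 1/2."""
    return [
        [v if rng.random() < 0.5 else -v
         for v in rng.sample(range(1, num_vars + 1), 3)]
        for _ in range(num_clauses)
    ]

def satisfiable(clauses, num_vars):
    """Brute-force satisfiability check (exponential; tiny num_vars only)."""
    for bits in product([False, True], repeat=num_vars):
        a = {i + 1: bits[i] for i in range(num_vars)}
        if all(any(a[abs(l)] == (l > 0) for l in c) for c in clauses):
            return True
    return False

rng = random.Random(0)
n, trials = 12, 30
for ratio in (3.0, 4.0, 4.26, 5.0, 6.0):
    m = int(ratio * n)
    sat_count = sum(satisfiable(random_3sat(n, m, rng), n) for _ in range(trials))
    print(f"clause/variable ratio {ratio:4.2f}: {sat_count}/{trials} satisfiable")
```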

Evidence for P = NP

One line of argument for P = NP draws on proof complexity, where equality would imply the existence of polynomial-size proofs, in systems such as Frege proofs, for all propositional tautologies that encode the certificates of NP problems, thereby yielding constructive, efficient algorithms for those problems. Specifically, Frege systems, which formalize ordinary propositional reasoning via axiom schemes and rules such as modus ponens, would then provide verifiable, polynomial-length derivations that directly construct solutions to NP-complete tasks such as satisfiability, bridging verification and search in a feasible manner.

In non-standard computational models, such as quantum or parallel settings, complexity classes like BQP (bounded-error quantum polynomial time) and NC (Nick's class of efficiently parallelizable problems) exhibit non-trivial relationships with NP that some take as suggestive. For instance, quantum algorithms achieve polynomial-time solutions for certain structured problems in NP, such as factoring via Shor's algorithm, suggesting that richer models can narrow apparent gaps between searching and verifying without contradicting known separations. Similarly, while NC is contained in P, its ability to parallelize certain NP verifications hints at scalable solving mechanisms in restricted settings.

Average-case analysis provides further suggestive evidence: NP-complete problems like 3-SAT are often solvable in polynomial time on random instances under uniform distributions away from the satisfiability phase transition, where the clause-to-variable ratio is around 4.26. Below this threshold, formulas are satisfiable with high probability and easy to solve using simple heuristics or local search, while far above it they are typically unsatisfiable and often refuted efficiently in practice; this easy-hard-easy pattern shows that "typical" instances lack the worst-case hardness often invoked to argue for separation, which proponents take as a hint that P = NP could hold (a toy local-search solver is sketched below).

Philosophically, the defining property of NP—polynomial-time verifiable solutions—suggests that search and verification are intimately linked, with no proven intrinsic barrier preventing the efficient construction of short certificates, since solving reduces to finding succinct proofs in a verifiable framework. On this view, the observed practical difficulty of NP problems stems from adversarial constructions rather than fundamental limits.
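The claim that random instances below the threshold are often easy can be illustrated with a simple local-search heuristic in the spirit of WalkSAT. This is a toy sketch with no completeness guarantee: it may fail to find an assignment even when one exists, and the parameters (`max_flips`, `p`) are arbitrary choices.

```python
import random

def walksat(clauses, num_vars, max_flips=10_000, p=0.5, rng=None):
    """Toy WalkSAT-style local search: start from a random assignment,
    repeatedly pick an unsatisfied clause and flip one of its variables
    (randomly with probability p, otherwise the variable whose flip
    leaves the fewest unsatisfied clauses). Returns an assignment or
    None if it gives up."""
    rng = rng or random.Random()
    a = {v: rng.random() < 0.5 for v in range(1, num_vars + 1)}

    def unsat(assign):
        return [c for c in clauses if not any(assign[abs(l)] == (l > 0) for l in c)]

    for _ in range(max_flips):
        bad = unsat(a)
        if not bad:
            return a
        clause = rng.choice(bad)
        if rng.random() < p:
            var = abs(rng.choice(clause))
        else:
            def cost(v):                       # unsatisfied clauses after flipping v
                a[v] = not a[v]
                c = len(unsat(a))
                a[v] = not a[v]
                return c
            var = min((abs(l) for l in clause), key=cost)
        a[var] = not a[var]
    return None

# An under-constrained instance is usually solved after a handful of flips.
clauses = [[1, 2, -3], [-1, 3, 4], [2, -4, 5], [-2, -5, 3]]
print(walksat(clauses, 5, rng=random.Random(1)))
```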

Relativization Barriers

In computational complexity theory, an oracle Turing machine is a Turing machine augmented with access to an "oracle": a hypothetical black box that instantaneously decides membership in a fixed language. Formally, an oracle tape allows the machine to query strings and receive an immediate yes-or-no answer at unit cost, enabling the study of relativized complexity classes such as P^A and NP^A, where A denotes the oracle language. This model, introduced to analyze how additional computational power affects class relationships, reveals limitations of proof techniques by constructing "worlds" in which familiar separations or equalities hold or fail.

The seminal result on relativization barriers is the 1975 theorem of Baker, Gill, and Solovay, which shows that the P versus NP question cannot be resolved by relativizing methods. Specifically, they construct two oracles: one, denoted A, such that P^A = NP^A, meaning nondeterministic polynomial time with access to A collapses to deterministic polynomial time; and another, denoted B, such that P^B ≠ NP^B, preserving a strict separation. The first construction takes A to be a PSPACE-complete language, while the second uses diagonalization to build B so that a particular language in NP^B cannot be decided by any deterministic polynomial-time machine with oracle B.

The implications are profound: any proof technique that "relativizes"—that is, remains valid in the presence of an arbitrary oracle—cannot settle whether P = NP in the unrelativized setting, since such a proof would have to hold in both of these contradictory worlds. Relativizing techniques, notably the diagonalization arguments used in the time hierarchy theorem, are thus blocked for this problem. However, not all complexity proofs relativize; for instance, the equality IP = PSPACE, established through interactive proof systems developed in the 1980s and completed by Shamir in 1992, relies on arithmetization and other non-relativizing ingredients such as properties of low-degree polynomials over finite fields, and so avoids the oracle barrier.
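The oracle model can be pictured as ordinary code given free calls to a black-box membership test. In the illustrative sketch below (the oracle is just a Python callable, which of course does not capture the cost-free idealization of the formal model, and the brute-force SAT oracle is a stand-in), a deterministic procedure that makes a single query to a SAT oracle decides unsatisfiability, a co-NP-complete problem—one simple sense in which class relationships can look different relative to an oracle.

```python
from itertools import product

def brute_force_sat(clauses):
    """Toy stand-in for the oracle language A (here: SAT), answered by
    exhaustive search. In the oracle model each query counts as one step."""
    n = max((abs(l) for c in clauses for l in c), default=0)
    return any(
        all(any(bits[abs(l) - 1] == (l > 0) for l in c) for c in clauses)
        for bits in product([False, True], repeat=n)
    )

def unsat_relative_to(sat_oracle, clauses):
    """A deterministic procedure that is polynomial time *relative to* the
    oracle: one unit-cost query plus a negation, yet it decides UNSAT,
    a co-NP-complete problem (so co-NP is contained in P^SAT)."""
    return not sat_oracle(clauses)

print(unsat_relative_to(brute_force_sat, [[1], [-1]]))   # True: unsatisfiable
print(unsat_relative_to(brute_force_sat, [[1, 2]]))      # False: satisfiable
```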

Problems Between P and NP-Complete

Existence of Intermediate Problems

In 1975, Richard Ladner proved that if P ≠ NP, then NP contains problems that are neither solvable in polynomial time (i.e., in P) nor NP-complete. This result, known as Ladner's theorem, establishes the existence of NP-intermediate problems, which lie strictly between P and the NP-complete problems under polynomial-time reductions.

The proof uses a delayed ("lazy") diagonalization over the degrees of NP problems under polynomial-time many-one reductions. Starting from an NP-complete language such as SAT, Ladner constructs an intermediate language L that agrees with SAT on some input lengths and is trivial on others, in effect punching holes in the NP-complete problem. The pattern of holes is chosen so as to diagonalize simultaneously against every polynomial-time algorithm (ensuring L is not in P) and against every polynomial-time reduction from SAT to L (ensuring L is not NP-complete), while keeping L in NP.

The theorem implies that, if P ≠ NP, NP does not split into a simple dichotomy of P and NP-complete problems; indeed, it yields infinitely many distinct polynomial-time degrees within NP. It underscores the richness of the polynomial-time reducibility structure and motivates the study of intermediate problems as a way to probe the boundaries of NP without resolving the full P versus NP question.

Specific Candidates

The graph isomorphism problem, which asks whether two given graphs are isomorphic, is a prominent candidate for an NP-intermediate problem. It is in NP, as a certificate consists of a bijection between the vertex sets that preserves adjacency. In 2015, László Babai announced a quasi-polynomial-time algorithm solving it in time 2^((log n)^O(1)), where n is the number of vertices, but this falls short of polynomial time, and the problem is not known to be in P. Furthermore, graph isomorphism cannot be NP-complete unless the polynomial hierarchy collapses to its second level: the problem has lowness properties with respect to Σ₂ᵖ that would force such a collapse if it were complete.

The decision version of integer factorization—determining whether a given integer N has a nontrivial factor less than or equal to a specified bound k—is another suspected NP-intermediate problem. This formulation is in both NP and co-NP: a yes-instance can be certified by providing the factor, while a no-instance can be certified by providing a complete prime factorization of N in which all prime factors exceed k, together with primality certificates, verifiable in polynomial time by multiplication and primality testing. Membership in NP ∩ co-NP implies it cannot be NP-complete unless NP = co-NP, which would collapse the polynomial hierarchy; and while no classical polynomial-time algorithm is known, Peter Shor's 1994 quantum algorithm solves the search version in polynomial time on a quantum computer.

Inverting one-way functions provides a theoretical candidate for NP-intermediate status, contingent on their existence. A one-way function is a polynomial-time computable function that is easy to evaluate but hard to invert on a significant fraction of inputs; the inversion task belongs to (the search version of) NP via nondeterministic guessing of preimages. The existence of such functions would imply P ≠ NP, since if P = NP every such function could be inverted in polynomial time, contradicting its presumed hardness; however, their existence remains an open question central to cryptography. (Goldreich, Foundations of Cryptography, Vol. 1)

Lattice problems, particularly the shortest vector problem (SVP), are also viewed as potential NP-intermediate candidates. In the decision version of exact SVP, one asks whether the lattice generated by given basis vectors contains a nonzero vector shorter than a specified length λ, a problem clearly in NP via a certificate consisting of such a vector. Exact SVP is not known to be NP-hard under deterministic reductions, while approximation versions (GapSVP) are NP-hard to approximate within factors smaller than √2 under randomized reductions; for the larger, polynomial approximation factors relevant to lattice-based cryptography, neither NP-hardness results nor polynomial-time algorithms are known, which is precisely what gives these problems their suspected intermediate status.
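The certificate-based membership claims in this subsection can be made concrete for graph isomorphism. The sketch below is an ad hoc illustration (the adjacency-dictionary format and function name are assumptions, not a library API): it verifies in polynomial time that a proposed vertex bijection is an isomorphism, which is exactly what places the problem in NP; finding such a bijection is the part with no known polynomial-time algorithm.

```python
def verify_isomorphism(adj1, adj2, mapping):
    """Polynomial-time verification of a graph-isomorphism certificate.
    adj1, adj2: adjacency given as dict vertex -> set of neighbours.
    mapping:    proposed bijection from vertices of graph 1 to graph 2."""
    if set(mapping) != set(adj1) or set(mapping.values()) != set(adj2):
        return False                                  # not a bijection onto V2
    return all(
        (mapping[v] in adj2[mapping[u]]) == (v in adj1[u])
        for u in adj1 for v in adj1
    )

# A 4-cycle relabelled: 1-2-3-4-1 maps onto a-c-b-d-a.
g1 = {1: {2, 4}, 2: {1, 3}, 3: {2, 4}, 4: {1, 3}}
g2 = {'a': {'c', 'd'}, 'c': {'a', 'b'}, 'b': {'c', 'd'}, 'd': {'a', 'b'}}
print(verify_isomorphism(g1, g2, {1: 'a', 2: 'c', 3: 'b', 4: 'd'}))  # True
print(verify_isomorphism(g1, g2, {1: 'a', 2: 'b', 3: 'c', 4: 'd'}))  # False
```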