Search algorithm
from Wikipedia
Visual representation of a hash table, a data structure that allows for fast retrieval of information

In computer science, a search algorithm is an algorithm designed to solve a search problem. Search algorithms work to retrieve information stored within a particular data structure, or calculated in the search space of a problem domain, with either discrete or continuous values.

Although search engines use search algorithms, they belong to the study of information retrieval, not algorithmics.

The appropriate search algorithm to use often depends on the data structure being searched, and may also include prior knowledge about the data. Search algorithms can be made faster or more efficient by specially constructed database structures, such as search trees, hash maps, and database indexes.[1][2]

Search algorithms can be classified based on their mechanism of searching into three types of algorithms: linear, binary, and hashing. Linear search algorithms check every record for the one associated with a target key in a linear fashion.[3] Binary, or half-interval, searches repeatedly target the center of the search structure and divide the search space in half. Comparison search algorithms improve on linear searching by successively eliminating records based on comparisons of the keys until the target record is found, and can be applied on data structures with a defined order.[4] Digital search algorithms work based on the properties of digits in data structures by using numerical keys.[5] Finally, hashing directly maps keys to records based on a hash function.[6]

Algorithms are often evaluated by their computational complexity, or maximum theoretical run time. Binary search functions, for example, have a maximum complexity of O(log n), or logarithmic time. In simple terms, the maximum number of operations needed to find the search target is a logarithmic function of the size of the search space.
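
To make the contrast concrete, here is a minimal Python sketch of a linear and a binary search; the function names and sample data are illustrative, not taken from the sources above.

def linear_search(items, target):
    """Check every element in order; works on unsorted data. O(n) worst case."""
    for i, value in enumerate(items):
        if value == target:
            return i
    return -1  # target absent

def binary_search(sorted_items, target):
    """Repeatedly halve the search interval; requires sorted input. O(log n)."""
    lo, hi = 0, len(sorted_items) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if sorted_items[mid] == target:
            return mid
        elif sorted_items[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1

data = [2, 3, 5, 7, 11, 13, 17]
print(linear_search(data, 11))  # 4
print(binary_search(data, 11))  # 4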

Applications of search algorithms

Specific applications of search algorithms include problems in combinatorial optimization and constraint satisfaction, pathfinding and route planning, game playing, string matching, and searching for the maximum of a function, as detailed in the classes below.

Classes

For virtual search spaces

Algorithms for searching virtual spaces are used in the constraint satisfaction problem, where the goal is to find a set of value assignments to certain variables that will satisfy specific mathematical equations and inequations. They are also used when the goal is to find a variable assignment that will maximize or minimize a certain function of those variables. Algorithms for these problems include the basic brute-force search (also called "naïve" or "uninformed" search), and a variety of heuristics that try to exploit partial knowledge about the structure of this space, such as linear relaxation, constraint generation, and constraint propagation.
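
As a concrete illustration, the following minimal Python sketch performs a brute-force search over all assignments of two variables against one equation and one inequation; the domain and the constraints themselves are assumptions of this sketch.

from itertools import product

# Brute-force ("uninformed") search: enumerate every candidate assignment
# and keep those satisfying all constraints. The constraints are illustrative.
domain = range(10)
solutions = [(x, y) for x, y in product(domain, repeat=2)
             if x + y == 10 and x - y >= 2]
print(solutions)  # [(6, 4), (7, 3), (8, 2), (9, 1)]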

An important subclass is the local search methods, which view the elements of the search space as the vertices of a graph, with edges defined by a set of heuristics applicable to the case, and scan the space by moving from item to item along the edges, for example according to the steepest descent or best-first criterion, or in a stochastic search. This category includes a great variety of general metaheuristic methods, such as simulated annealing, tabu search, A-teams,[7] and genetic programming, that combine arbitrary heuristics in specific ways. The opposite of local search would be global search methods, which are applicable when the search space is not limited and all aspects of the given network are available to the entity running the search algorithm.[8]
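
The following minimal Python sketch illustrates the local search idea under stated assumptions: states are bit strings, neighbors differ by one flipped bit, and the objective function is a toy stand-in.

import random

def score(state):
    return sum(state)  # toy objective: count of 1-bits

def hill_climb(n_bits=20, steps=1000):
    state = [random.randint(0, 1) for _ in range(n_bits)]
    for _ in range(steps):
        i = random.randrange(n_bits)
        neighbor = state[:]
        neighbor[i] ^= 1                      # move along one edge of the neighborhood graph
        if score(neighbor) >= score(state):   # greedy acceptance (stochastic ascent)
            state = neighbor
    return state

print(score(hill_climb()))  # typically 20, the global maximum of this toy objective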

This class also includes various tree search algorithms, which view the elements as vertices of a tree and traverse that tree in some special order. Examples include exhaustive methods such as depth-first search and breadth-first search, as well as various heuristic-based search tree pruning methods such as backtracking and branch and bound. Unlike general metaheuristics, which at best work only in a probabilistic sense, many of these tree-search methods are guaranteed to find the exact or optimal solution, if given enough time. This is called "completeness".

Another important sub-class consists of algorithms for exploring the game tree of multiple-player games, such as chess or backgammon, whose nodes consist of all possible game situations that could result from the current situation. The goal in these problems is to find the move that provides the best chance of a win, taking into account all possible moves of the opponent(s). Similar problems occur when humans or machines have to make successive decisions whose outcomes are not entirely under one's control, such as in robot guidance or in marketing, financial, or military strategy planning. This kind of problem — combinatorial search — has been extensively studied in the context of artificial intelligence. Examples of algorithms for this class are the minimax algorithm, alpha–beta pruning, and the A* algorithm and its variants.
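
A minimal sketch of minimax with alpha-beta pruning follows, operating on a hand-built game tree (nested lists as internal nodes, numbers as terminal payoffs); real implementations generate the tree from game rules rather than storing it explicitly, so the tree here is an assumption of the sketch.

import math

def alphabeta(node, maximizing, alpha=-math.inf, beta=math.inf):
    if isinstance(node, (int, float)):   # leaf: terminal payoff
        return node
    best = -math.inf if maximizing else math.inf
    for child in node:
        value = alphabeta(child, not maximizing, alpha, beta)
        if maximizing:
            best = max(best, value)
            alpha = max(alpha, best)
        else:
            best = min(best, value)
            beta = min(beta, best)
        if beta <= alpha:                # prune the remaining siblings
            break
    return best

tree = [[3, 5], [2, [9, 1]], [6, 4]]
print(alphabeta(tree, maximizing=True))  # 4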

For sub-structures of a given structure

An important and extensively studied subclass are the graph algorithms, in particular graph traversal algorithms, for finding specific sub-structures in a given graph — such as subgraphs, paths, circuits, and so on. Examples include Dijkstra's algorithm, Kruskal's algorithm, the nearest neighbour algorithm, and Prim's algorithm.
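
As an illustration, here is a minimal sketch of Dijkstra's algorithm over an adjacency-list graph; the graph and node names are invented for the example.

import heapq

def dijkstra(graph, source):
    """Shortest distances from source in a graph of nonnegative edge weights."""
    dist = {source: 0}
    pq = [(0, source)]                      # priority queue of (cost, node)
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist.get(u, float("inf")):   # stale queue entry; skip
            continue
        for v, w in graph[u]:
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(pq, (nd, v))
    return dist

graph = {"a": [("b", 2), ("c", 5)], "b": [("c", 1)], "c": []}
print(dijkstra(graph, "a"))  # {'a': 0, 'b': 2, 'c': 3}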

Another important subclass of this category is the string searching algorithms, which search for patterns within strings. Famous examples include the Boyer–Moore and Knuth–Morris–Pratt algorithms, as well as several algorithms based on the suffix tree data structure.
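
A minimal sketch of the Knuth–Morris–Pratt idea follows: a precomputed failure table lets matching resume after a mismatch without re-reading text characters. The inputs are illustrative.

def kmp_search(text, pattern):
    # failure[i] = length of the longest proper prefix of pattern[:i+1]
    # that is also a suffix of it
    failure = [0] * len(pattern)
    k = 0
    for i in range(1, len(pattern)):
        while k > 0 and pattern[i] != pattern[k]:
            k = failure[k - 1]
        if pattern[i] == pattern[k]:
            k += 1
        failure[i] = k
    # scan the text, never moving backwards through it
    k = 0
    for i, ch in enumerate(text):
        while k > 0 and ch != pattern[k]:
            k = failure[k - 1]
        if ch == pattern[k]:
            k += 1
        if k == len(pattern):
            return i - k + 1  # index of the first occurrence
    return -1

print(kmp_search("abxabcabcaby", "abcaby"))  # 6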

Search for the maximum of a function

In 1953, American statistician Jack Kiefer devised Fibonacci search, which can be used to find the maximum of a unimodal function and has many other applications in computer science.
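
Kiefer's method places probe points according to Fibonacci ratios; the closely related ternary search below is a simpler sketch of the same interval-shrinking idea for a unimodal function, offered as an illustration rather than Kiefer's exact procedure.

def ternary_max(f, lo, hi, tol=1e-9):
    """Locate the maximum of a unimodal f on [lo, hi] by shrinking the interval."""
    while hi - lo > tol:
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if f(m1) < f(m2):   # the maximum cannot lie in [lo, m1]
            lo = m1
        else:               # the maximum cannot lie in [m2, hi]
            hi = m2
    return (lo + hi) / 2

# Unimodal example: f peaks at x = 2
print(round(ternary_max(lambda x: -(x - 2) ** 2 + 3, 0, 5), 6))  # ~2.0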

For quantum computers

There are also search methods designed for quantum computers, like Grover's algorithm, that are theoretically faster than linear or brute-force search even without the help of data structures or heuristics. While the ideas and applications behind quantum computers are still entirely theoretical, studies have been conducted with algorithms like Grover's that accurately replicate the hypothetical physical versions of quantum computing systems.[9]

from Grokipedia
A search algorithm in computer science is a systematic procedure designed to locate a specific item, value, or solution within a collection of data or a defined problem space, often by exploring possible states or elements until the target is found or determined to be absent. These algorithms address core challenges in data retrieval and problem-solving, enabling efficient navigation through large datasets or complex structures like graphs and trees.

Search algorithms can be broadly categorized into those for linear data structures and those for state-space or graph-based exploration. For unsorted arrays or lists, linear search iterates through each element sequentially until the target is matched or the end is reached, offering simplicity but with a worst-case time complexity of O(n), where n is the number of elements. In contrast, binary search operates on sorted arrays by repeatedly dividing the search interval in half, comparing the target to the middle element, which reduces the problem size logarithmically and achieves O(log n) efficiency, making it vastly superior for large, ordered datasets.

In more advanced applications, such as artificial intelligence and pathfinding, search algorithms traverse graphs or state spaces to find optimal paths from an initial state to a goal state. Breadth-first search (BFS) explores all nodes level by level using a queue, ensuring the shortest path in unweighted graphs while maintaining completeness and optimality, though at the cost of higher memory usage. Depth-first search (DFS) prioritizes depth over breadth with a stack or recursion, exploring as far as possible along each branch before backtracking, which is memory-efficient but may not guarantee the shortest path. These uninformed (blind) methods form the foundation for informed variants, such as those using heuristics to prune unproductive paths and enhance efficiency in real-world problems like route planning or game AI.

Introduction

Definition and Purpose

A search algorithm is a method for finding a target element or solution within a defined search space, such as arrays, graphs, or abstract state spaces. These algorithms are fundamental in computer science, enabling the systematic exploration of data structures to retrieve specific information or resolve problems. The primary purpose of a search algorithm is to efficiently locate items, paths, or optimal solutions by minimizing computational resources, such as time or memory usage, thereby supporting applications from simple lookups to complex problem-solving.

Basic examples illustrate this purpose. Linear search involves sequential scanning of an array from the first element until the target is found or the end is reached, requiring O(n) time in the worst case, where n is the number of elements. In contrast, hashing provides direct access via hash functions that map keys to array indices, achieving O(1) average-case time complexity for lookups, though it may degrade under collisions.

Key components of a search algorithm include the input, typically the search space and a query or target; the output, which is either the target itself or an indication of its absence; and termination conditions that define when the search concludes, such as exhausting the space or meeting a success criterion. These elements ensure the algorithm operates predictably within the broader context of search spaces and complexity measures explored in foundational concepts.
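
Here is a minimal Python sketch of the two access patterns just described, using a list for sequential scanning and a dict as the hash table; the records are illustrative.

records = [("alice", 31), ("bob", 27), ("carol", 45)]

# Linear search: O(n) worst case
def find_age_linear(name):
    for key, age in records:
        if key == name:
            return age
    return None  # termination: target absent

# Hashing: O(1) average case via a key -> value mapping
index = dict(records)
def find_age_hashed(name):
    return index.get(name)

print(find_age_linear("carol"), find_age_hashed("carol"))  # 45 45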

Historical Development

The conceptual foundations of search algorithms trace back to 19th-century data processing, where precursors to modern searching emerged in sorting techniques that systematically scanned and compared elements, such as the radix sort developed by Herman Hollerith in 1887 for efficient data organization. These methods laid groundwork for algorithmic traversal and comparison, influencing later computational search. In the 1940s, with the rise of electronic computing, Alan Turing's seminal 1936 work on computable numbers and the Turing machine introduced formal models of computation that inherently involved searching for halting states or proofs, establishing search as a core element of decidability and algorithmic processes.

The 1950s and 1960s marked the emergence of explicit search frameworks in artificial intelligence and computer science. Allen Newell, Herbert A. Simon, and J.C. Shaw developed the General Problem Solver (GPS) in 1957, an early AI program that employed tree search and means-ends analysis to simulate human-like problem-solving by exploring state spaces. Complementing this, the Knuth-Morris-Pratt algorithm for string matching, initially conceived by James H. Morris in 1970 and formalized by Donald Knuth, Vaughan Pratt, and Morris in 1977, revolutionized pattern search by preprocessing the pattern to enable linear-time execution without backtracking over the text.

During the 1970s and 1980s, graph-based search algorithms solidified their role in optimization and AI. Edsger W. Dijkstra's 1959 algorithm for finding shortest paths in weighted graphs provided an efficient, greedy method for network traversal, gaining widespread application in routing and network optimization thereafter. Nils J. Nilsson's 1980 textbook Principles of Artificial Intelligence systematically described uninformed search strategies like breadth-first search and depth-first search, framing them within AI problem-solving and influencing subsequent theoretical developments. Richard M. Karp's contributions, including the Edmonds-Karp algorithm for maximum flow in 1972, extended search techniques to network problems by iteratively finding augmenting paths via breadth-first search.

From the late 1960s into the 1990s and beyond, informed and quantum search paradigms advanced efficiency in complex domains. Peter Hart, Nils Nilsson, and Bertram Raphael introduced the A* algorithm in 1968, blending Dijkstra's uniform-cost search with admissible heuristics to guide exploration toward goals, with ongoing expansions enhancing its use in robotics and video games. In 1996, Lov K. Grover developed a quantum algorithm offering a quadratic speedup for unstructured database search, marking a pivotal shift toward quantum-enhanced methods for exhaustive searches. These innovations by key figures like Dijkstra, Karp, and Grover underscore the progression from classical to specialized search paradigms.

Fundamental Concepts

Search Space and Problem Formulation

In artificial intelligence and computer science, the search space, often synonymous with the state space, refers to the complete set of all possible states or configurations that a problem can assume. This space is typically modeled as a graph where nodes represent individual states and directed edges denote valid transitions between them, induced by permissible actions. For example, in route-finding tasks, states might correspond to locations on a map, while in constraint satisfaction problems, the search space encompasses all feasible assignments of values to a set of variables subject to given constraints. The size of the search space can range from small and enumerable to exponentially large, making efficient traversal a core challenge for search algorithms.

Formulating a search problem involves specifying a structured framework that captures its essential elements, enabling algorithms to systematically explore the state space. A standard formulation, as outlined in foundational AI literature, includes the initial state, which denotes the problem's starting configuration; the successor function (or actions and transition model), which defines how to generate adjacent states from any given state; the goal test, a procedure to check whether a state satisfies the objective; and the path cost function, which measures the cumulative expense of a solution path, incorporating step costs for transitions in weighted environments. These components implicitly or explicitly delineate the boundaries and dynamics of the search space, allowing for the identification of optimal or feasible solutions.

In formal terms, a search problem can be represented as a tuple comprising the set of states $S$, the initial state $s_0 \in S$, the set of actions $A(s)$ available in each state $s$, the transition model $T(s, a, s')$ indicating resulting states, the set of goal states $G \subseteq S$, and the step cost function $c(s, a, s')$. This structure applies to deterministic problems where outcomes are predictable, contrasting with variants that incorporate probabilities.

State representation plays a crucial role in managing the search space, balancing completeness with computational feasibility. Explicit representations store states directly, such as arrays for board configurations in puzzles like the 8-queens problem, which is practical for small, finite spaces but becomes prohibitive for larger ones due to memory demands. In contrast, implicit representations generate states dynamically via the successor function, avoiding full enumeration; this is the norm in vast domains like chess, where the state space exceeds $10^{40}$ positions, and only relevant portions are expanded during search. The choice influences algorithm efficiency, with implicit methods enabling exploration of theoretically infinite or uncountably large spaces, such as continuous configurations in robotics.

A representative example of this formulation is the pathfinding problem, where the objective is to determine a sequence of actions leading from an initial state, such as a vehicle's position at a starting city, to a goal state, like arriving at a destination city, while minimizing travel cost in a road network. Here, states are locations, the successor function yields neighboring cities connected by roads, the goal test verifies arrival at the target, and path costs reflect distances or times; this setup underpins applications from GPS navigation to game AI.
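
To ground the tuple definition, here is a minimal Python sketch encoding the route-finding example as such a structure; the map fragment and field names are assumptions of this sketch.

roads = {  # successor function given implicitly by this adjacency map
    "Arad": [("Sibiu", 140), ("Timisoara", 118)],
    "Sibiu": [("Fagaras", 99), ("Arad", 140)],
    "Fagaras": [("Bucharest", 211)],
    "Timisoara": [],
    "Bucharest": [],
}

problem = {
    "initial_state": "Arad",
    "actions": lambda s: [city for city, _ in roads[s]],
    "result": lambda s, a: a,                     # transition model T(s, a)
    "goal_test": lambda s: s == "Bucharest",
    "step_cost": lambda s, a: dict(roads[s])[a],  # c(s, a, s')
}

print(problem["actions"]("Arad"))  # ['Sibiu', 'Timisoara']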

Time and Space Complexity Measures

In search algorithms, time complexity quantifies the computational effort required to find a solution, typically measured by the number of nodes generated or expanded in the search tree. For problems formulated in a search space with branching factor $b$ (average number of successors per node) and solution depth $d$, uninformed search methods like breadth-first search exhibit a time complexity of $O(b^{d+1})$, as they explore all nodes up to depth $d$ and part of depth $d+1$. This arises because the algorithm expands layers of the tree sequentially, leading to exponential growth in the number of nodes processed.

Space complexity assesses the memory usage, often determined by the size of the frontier (the open list of nodes to explore) and the explored set (closed list). In breadth-first search, space requirements reach $O(b^{d+1})$ due to storing all nodes at the deepest level in the queue. By contrast, depth-first search, which delves deeply along one branch before backtracking, has a space complexity of $O(bm)$, where $m$ is the maximum depth of the search tree, as it only maintains a stack proportional to the path length plus unexpanded siblings. These measures highlight how algorithm design influences resource demands in navigating the search space.

Beyond efficiency, search algorithms are evaluated for optimality and completeness. An algorithm is optimal if it guarantees finding a solution with the lowest path cost among all possible solutions, assuming uniform costs or admissible heuristics. Completeness means the algorithm will find a solution if one exists in a finite search space, provided it systematically explores without infinite loops. Breadth-first search achieves both properties in uniform-cost environments, while depth-first search is complete in finite spaces but not optimal due to potential oversight of shorter paths.

A key trade-off exists between time and space complexity, as algorithms optimizing one often exacerbate the other. Iterative deepening search addresses this by performing successive depth-limited searches with increasing limits, achieving completeness and optimality like breadth-first search but with time complexity $O(b^d)$ and space complexity $O(bd)$, thus balancing exponential time growth with linear space in depth. This makes it suitable for memory-constrained settings where full frontier storage is prohibitive (see the sketch at the end of this subsection).

For a concrete example outside tree search, binary search on a sorted array of $n$ elements demonstrates logarithmic efficiency, with a time complexity of $O(\log n)$ in the worst case, as each step halves the search interval until the target is isolated. Space here is $O(1)$ for the iterative version, relying only on index pointers without additional storage.
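
The following minimal sketch of iterative deepening shows the trade-off in code: repeated depth-limited depth-first searches with a growing limit. The tree and helper names are assumptions of this sketch.

def depth_limited(node, goal, successors, limit):
    """DFS that gives up below a fixed depth; returns a path or None."""
    if node == goal:
        return [node]
    if limit == 0:
        return None
    for child in successors(node):
        path = depth_limited(child, goal, successors, limit - 1)
        if path is not None:
            return [node] + path
    return None

def iterative_deepening(start, goal, successors, max_depth=50):
    for limit in range(max_depth + 1):   # successive depth-limited searches
        path = depth_limited(start, goal, successors, limit)
        if path is not None:
            return path                  # shallowest solution, as in BFS
    return None

tree = {"a": ["b", "c"], "b": ["d"], "c": ["e"], "d": [], "e": []}
print(iterative_deepening("a", "e", lambda n: tree[n]))  # ['a', 'c', 'e']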

Types of Search Algorithms

Uninformed Search Strategies

Uninformed search strategies, also known as blind search methods, rely solely on the problem definition to systematically explore the search space without incorporating any knowledge or estimates about the proximity to the goal state. These approaches treat the search as a tree or graph traversal, expanding nodes based on predefined rules like insertion order or depth, making them applicable to any well-defined problem but often inefficient in large spaces due to exhaustive exploration.

Breadth-first search (BFS) performs a level-order traversal of the search tree, expanding all nodes at the current depth before proceeding to the next, using a first-in, first-out (FIFO) queue to manage the frontier of unexpanded nodes. This strategy is complete, guaranteeing a solution if one exists in a finite search space, and optimal for unweighted graphs where all edge costs are uniform, as it finds the shallowest goal state. The time and space complexity of BFS is $O(b^d)$, where $b$ is the branching factor and $d$ is the depth of the shallowest solution, reflecting the exponential growth in nodes explored up to that level. A standard implementation of BFS in the context of tree search follows this pseudocode, adapted for problem-solving agents:

function TREE-SEARCH(problem, frontier) returns a solution or failure
    frontier ← INSERT(MAKE-NODE(INITIAL-STATE(problem)), frontier)
    while not IS-EMPTY(frontier) do
        node ← REMOVE-FRONT(frontier)   // FIFO queue for BFS
        if GOAL-TEST(problem, STATE(node)) then return node
        for each action in ACTIONS(problem, STATE(node)) do
            child ← CHILD-NODE(problem, node, action)
            INSERT(child, frontier)
    return failure


Here, the frontier acts as a queue to ensure level-by-level expansion, and the algorithm terminates upon reaching the goal or exhausting the space.

Depth-first search (DFS) explores as far as possible along each branch before backtracking, using a last-in, first-out (LIFO) stack for the frontier, which makes it more space-efficient than BFS in deep trees. Unlike BFS, DFS is not optimal, as it may find a longer path to the goal, and it is incomplete in infinite spaces due to potential infinite descent without depth limits. The time complexity is $O(b^m)$, where $m$ is the maximum depth, but space complexity is linear in depth, $O(bm)$, for the recursion stack or explicit stack. DFS can be implemented recursively, leveraging the call stack for backtracking, or iteratively using an explicit stack to avoid recursion-depth limits in deep searches; the iterative variant mimics the recursive one but manages the stack manually for better control in resource-constrained environments. Both variants share the same asymptotic complexities but differ in practical overhead, with recursive DFS simpler to code yet riskier for stack overflows.

Uniform-cost search (UCS) extends BFS to weighted graphs by expanding the node with the lowest path cost from the start, using a priority queue ordered by cumulative cost rather than depth. This makes UCS optimal for graphs with nonnegative edge costs, as it guarantees the lowest-cost path, and complete if a finite-cost solution exists. Its complexity is $O(b^{C^*/\epsilon})$, where $C^*$ is the optimal path cost and $\epsilon$ is the smallest step cost, which can exceed that of BFS due to varying step costs.

In tree-based applications, uninformed strategies like DFS are commonly used for puzzle solving, such as the 8-queens problem, where the goal is to place eight queens on a chessboard without mutual attacks by incrementally adding queens column by column and backtracking on conflicts, without relying on heuristics. This exhaustive exploration generates and tests configurations systematically, yielding solutions like one valid board arrangement among 92 total, demonstrating the method's ability to solve problems through pure enumeration; a sketch appears below.

A primary limitation of uninformed search strategies is their susceptibility to combinatorial explosion in the search space, as the number of nodes expands rapidly with the branching factor, rendering them impractical for problems beyond moderate sizes without additional optimizations like pruning. For instance, in puzzles with high branching factors, the time to explore up to depth $d$ can reach $b^d$ operations, often infeasible for real-world scales.
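
Here is the 8-queens sketch referred to above: a minimal depth-first backtracking search that places one queen per column and abandons any partial placement that creates a conflict.

def queens(n=8, placed=()):
    """placed[i] = row of the queen in column i."""
    col = len(placed)
    if col == n:
        return placed                    # all columns filled: a solution
    for row in range(n):
        if all(row != r and abs(row - r) != col - c
               for c, r in enumerate(placed)):   # no shared row or diagonal
            result = queens(n, placed + (row,))
            if result is not None:
                return result
        # conflict or dead end: try the next row (backtrack)
    return None

print(queens())  # one valid arrangement, e.g. (0, 4, 7, 5, 2, 6, 1, 3)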

Informed Search Strategies

Informed search strategies, also known as heuristic search methods, enhance the efficiency of search algorithms by incorporating domain-specific knowledge through heuristic functions that estimate the cost to reach a goal from a given state. These strategies prioritize exploring nodes that appear most promising based on the heuristic, contrasting with uninformed methods like breadth-first search that explore blindly. A key requirement for many informed searches is the use of an admissible heuristic $h(n)$, which never overestimates the true cost from node $n$ to the goal, ensuring $h(n) \leq h^*(n)$, where $h^*(n)$ is the optimal cost.

The A* algorithm exemplifies informed search by evaluating nodes using the function $f(n) = g(n) + h(n)$, where $g(n)$ is the exact cost from the start node to $n$, and $h(n)$ is the heuristic estimate to the goal. A* maintains a priority queue ordered by $f(n)$, expanding the lowest-cost node first, which guarantees finding the optimal path in terms of cost if $h(n)$ is both admissible and consistent (satisfying the triangle inequality $h(n) \leq c(n, n') + h(n')$ for each successor $n'$). This optimality holds because A* never expands a node with $f(n) > C^*$, where $C^*$ is the optimal solution cost, as admissibility ensures $f(n) \leq C^*$ for all nodes on the optimal path.

Greedy best-first search simplifies this approach by prioritizing nodes solely based on $h(n)$, ignoring the path cost $g(n)$, which makes it faster in practice but suboptimal, since it may overlook shorter paths in favor of heuristically attractive but longer routes. Introduced in early graph traversal experiments, it expands the node with the lowest estimated goal distance, potentially leading to incomplete exploration in complex spaces.

To address memory limitations of A* in large state spaces, iterative deepening A* (IDA*) combines the memory efficiency of depth-first search with A*'s heuristic guidance. IDA* performs a series of depth-first searches with increasing cost thresholds, where each iteration cuts off paths exceeding the threshold defined by $f(n) = g(n) + h(n)$, reusing as the next threshold the minimum cost that exceeded the previous one. This yields optimal solutions like A* while requiring only linear space proportional to the solution depth.

Common heuristics in pathfinding include the Manhattan distance, which sums the horizontal and vertical differences between a node's position and the goal, providing an admissible lower bound assuming grid-based movement without obstacles. For instance, in an 8-puzzle, the Manhattan distance for each tile to its goal position (ignoring the blank) is summed, never exceeding the true number of moves needed, since each unit of difference requires at least one move. The Euclidean distance, the straight-line distance $\sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2}$, serves as an admissible alternative when movement is not restricted to grid axes.
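
The following minimal sketch combines the pieces above: A* on a small 4-connected grid using the Manhattan-distance heuristic. The grid layout is an assumption of this sketch.

import heapq

def astar(grid, start, goal):
    h = lambda p: abs(p[0] - goal[0]) + abs(p[1] - goal[1])  # admissible on a grid
    frontier = [(h(start), 0, start, [start])]   # priority queue of (f, g, node, path)
    best_g = {start: 0}
    while frontier:
        f, g, node, path = heapq.heappop(frontier)
        if node == goal:
            return path
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            r, c = node[0] + dr, node[1] + dc
            if 0 <= r < len(grid) and 0 <= c < len(grid[0]) and grid[r][c] == 0:
                ng = g + 1                       # uniform step cost
                if ng < best_g.get((r, c), float("inf")):
                    best_g[(r, c)] = ng
                    heapq.heappush(frontier,
                                   (ng + h((r, c)), ng, (r, c), path + [(r, c)]))
    return None

grid = [[0, 0, 0],
        [1, 1, 0],   # 1 = obstacle
        [0, 0, 0]]
print(astar(grid, (0, 0), (2, 0)))
# [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0)]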