Search Algorithms & Research papers on Internet Searching

Prepared by Vikrant Colaso, 2368

HOME

MY WORK

FRIENDSHIP

PERSONAL

FEEDBACK

Breadth First Search Technique

Algorithm

Create a variable called “Node-List” and set it to the initial state
While goal state not found or Node-list is empty do
- Remove the first element from Node-List and call it E
- For ‘E’ do
  - Repeat procedure to generate a new state
  - If the new state is a goal state, quit else
  - Add the new state to the end of the Node List

Depth First Search Technique

Algorithm

If the initial state is a goal state, quit and return success
Else do the foll:
- Generate successor E. If there are no more successors signal Failure.
- Call Depth-First Search with E as the initial state
- If success is returned, signal success, else continue in the loop

Implementation of DFS and BFS in Java Applets

Demo

The Java Files

Reference

Depth first vs Breadth first Search

Depth first and breadth first search both have some advantages. Which is best depends on properties of the problem you are solving. For tree search at least, depth first search tends to require less memory, as you only need to record nodes on the `current' path. If there are lots of solutions, but all at a comparable `depth' in the tree, then you may hit on a solution having only explored a very small part of the tree. On the other hand, that may not be the best solution. Also, depth first search may get stuck exploring long (and potentially infinite) blind alleys, when there is in fact a solution path of only one or two steps. (We could prevent this by setting a depth limit to our search, but then it wouldn't be exhaustive.). So depth first is good when there are many possible solutions, and you only want one (and you don't care which one). It may be less appropriate when there is only one solution, or if you want the shortest one.

Breadth first search may use more memory, but will not get stuck in blind alleys, and will always find the shortest path first (or at least, the path that involves the least number of steps). It may be more appropriate when exploring very large search spaces where there is an expected solution which takes a relatively small number of steps, or when you are interested in all the solutions (perhaps up to some depth limit).

To summarise so far: Search techniques are used in AI to find a sequence of steps that will get us from some initial state to some goal state(s). You can either search backwards, from the goal state, or forwards, from the initial state. And whichever direction you choose you can use various search algorithms to do the search. We've so far discussed simple breadth first search and depth first search. These are both systematic, exhaustive search techniques that will eventually try all the nodes in the search space (if its finite!). The appropriate direction of search and the appropriate algorithm depend on the nature of the problem you are trying to solve, and in particular the properties of the search space.

Heuristic Search

So far we have looked at two search algorithms that can in principle be used to systematically search the whole search space. Sometimes however it is not feasible to search the whole search space - it's just too big. Imagine searching every possible road and alley in a 400 mile circumference around Edinburgh when looking for a route to London. (OK, so this might be feasible on a fast computer, but some search spaces are REALLY big). In this case we need to use heuristic search.

The basic idea of heuristic search is that, rather than trying all possible search paths, you try and focus on paths that seem to be getting you nearer your goal state. Of course, you generally can't be sure that you are really near your goal state - it could be that you'll have to take some amazingly complicated and circuitous sequence of steps to get there. But we might be able to have a good guess. Heuristics are used to help us make that guess.

To use heuristic search you need an evaluation function that scores a node in the search tree according to how close to the target/goal state it seems to be. This will just be a guess, but it should still be useful. For example, for finding a route between two towns a possible evaluation function might be a ``as the crow flies'' distance between the town being considered and the target town. It may turn out that this does not accurately reflect the actual (by road) distance - maybe there aren't any good roads from this town to your target town. However, it provides a quick way of guessing that helps in the search.

There are a whole batch of different heuristic search algorithms, of which we'll go through the following:

Generate and Test
Hill climbing
Best first search
A*

We'll assume that we are searching trees rather than graphs (ie, there aren't any loops etc). However, all the algorithms can be simply extended for graph search by using a closed list of nodes as described above (though this is unnecessary for hill climbing). (Note that there are many minor variants of these algorithms, which are described in many different ways. Don't expect all the algorithms in different text books to look identical).

Generate and Test Technique

Algorithm

Generate a possible solution. (Either generate a point in the problem space or a path from a start state)
Test to see if this is a real solution by comparing with the set of acceptable goal states
If a solution is found, quit; else repeat the loop

Simple Hill Climbing

Algorithm

Start with current-state = initial-state
Evaluate new state
- If it is a goal state, return and quit
- If not, but it is better than the current state, then make it a the current state
- If no better than the current state, continue in the loop

Steepest-Ascent Hill Climbing Technique

Algorithm

If initial state=goal state, quit, else continue
Loop until solution is found or till there is no change
- Let “SUCC” be the best possible successor state
- Apply the operator to generate a new state
  - If it is a goal state, return and quit
Else compare to SUCC. If better then set SUCC to this state, else proceed
- If SUCC is better than the current state then set current state to SUCC and continue

More on Hill Climbing

Best First Search Technique

Similar to Hill Climbing
Combines of DFS and BFS
Follows a single path at a time but switches paths if a more promising one is found

Algorithm

Start with OPEN=initial state
Until the goal is found or OPEN is empty :
- Pick the best node on OPEN
- Generate its successors
- For each successor:
  - Evaluate and add to OPEN
  - If generated before, change the parent for a better path

More on Best First Search

A* Search Technique

In its simplest form as described above, best first search is useful, but doesn't take into account the cost of the path so far when choosing which node to search from next. So, you may find a solution but it may be not a very good solution. There is a variant of best first search known as A* which attempts to find a solution which minimizes the total length or cost of the solution path. It combines advantages of breadth first search, where the shortest path is found first, with advantages of best first search, where the node that we guess is closest to the solution is explored next.

In the A* algorithm the score which is assigned to a node is a combination of the cost of the path so far and the estimated cost to solution. This is normally expressed as an evaluation function f, which involves the sum of of the values returned by two functions g and h, g returning the cost of the path (from initial state) to the node in question, and h returning an estimate of the remaining cost to the goal state:

f(Node) = g(Node) + h(Node)

The A* algorithm then looks the same as the simple best first algorithm, but we use this slightly more complex evaluation function. (Our best node now will be the one with the lowest cost/score).

The A* algorithm guarantees to find the shortest path first. However, to make this true we have to ensure that h(Node) does not overestimate the cost to the solution. The definition of the A* algorithm includes this requirement.

Introduction to Genetic Algorithms for Searching

GA is used for a big search space
Basic steps in the workflow of genetic algorithm are:
- Representing each solution uniformly in a way suitable for computer processing (often like arrays or strings)
- Randomly choose a few potential solutions
- Pick the best ones from current configuration set (mating pool)
- Perform crossover and mutation on mating pool and thus generating new solutions
- Crossover operator generates new solution by using genetic material of existing solutions in the mating pool
- Mutation operator generates a whole new solution randomly
- New solutions are then inserted in the current configuration set
- If stopping criterion is satisfied algorithm ends, otherwise repeat procedure

More on Genetic Algorithms

Back to My Work