The field of Artificial Intelligence (AI) can be defined in many ways (e.g. trying to model human intelligence, or trying to solve human-level problems by any means, even if quite unlike how humans do it). But one definition is: trying to write programs that come up with some form of "novelty" - that discover a new solution/strategy/algorithm for solving a problem that the human programmer did not imagine. Obviously the programmer must do something to get this discovery process started.
What is a "search" algorithm?
The standard way of achieving "novelty" is to frame the AI problem as a search problem:
Define a "space" of all possible solutions.
Define an algorithm to search that space for a good solution.
If the "space" is small enough, it may be searched exhaustively until we are sure we have the best solution. The interesting (and common) scenario is when the space is too big to search exhaustively. That is, in any finite experiment (and all experiments are finite) we can only ever search part of the space. In this case, we usually have to give up any guarantee of finding the best solution.
Instead, the standard strategy is: define a "heuristic" rule to search the space. This is a "rule of thumb" that does a smart search of the space in the finite time available. It gives a good answer most of the time, but there is no guarantee it won't fail in some situations.
We also need an evaluation function that scores how good each candidate solution is. This may be well-defined (the solution solves the problem and receives some score). Or the evaluation function itself may be a heuristic.
(e.g. The "solution" generated is the next move in a chess game,
that is, a mere step on the way to the ultimate win-lose score.
How "good" is it to reach one given half-way position in chess
versus another?)
Types of heuristic
So we may have:
Heuristic search algorithm.
Heuristic evaluation function.
These are different types of thing!
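Example of a heuristic evaluation function
As a toy sketch (real chess programs use far more elaborate scoring), a heuristic evaluation function for chess might just count material. The board encoding and the use of the conventional rough piece values below are illustrative assumptions, not any real engine's method:

# Toy heuristic evaluation of a chess position.
# A "position" is assumed to be a dict mapping piece letters to counts
# (upper case = our pieces, lower case = opponent's).
PIECE_VALUE = {"p": 1, "n": 3, "b": 3, "r": 5, "q": 9}  # rough conventional values

def evaluate(position):
    # Positive score = good for us. A heuristic guess, not a true win/lose score.
    score = 0
    for piece, count in position.items():
        value = PIECE_VALUE.get(piece.lower(), 0)  # kings not scored
        score += value * count if piece.isupper() else -value * count
    return score

print(evaluate({"P": 8, "R": 2, "p": 8, "r": 1}))  # we are a rook up: prints 5

This scores a half-way position without knowing the eventual win/lose outcome - which is exactly the heuristic part.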
Example of a heuristic search algorithm
Initialise with 20 random solutions.
Repeat:
Pick the best 10.
Make small changes in them, and generate a new pool of 20 candidate solutions.
Loop.
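A minimal sketch of this loop in Python. The fitness() and mutate() functions are hypothetical stand-ins for whatever problem is actually being solved:

import random

def fitness(x):
    return -(x - 3) ** 2                  # toy problem: the best solution is x = 3

def mutate(x):
    return x + random.uniform(-0.1, 0.1)  # small random change to a solution

# Initialise with 20 random solutions.
pool = [random.uniform(-10, 10) for _ in range(20)]
for _ in range(1000):
    # Pick the best 10.
    best = sorted(pool, key=fitness, reverse=True)[:10]
    # Make small changes in them to generate a new pool of 20 candidates.
    pool = best + [mutate(random.choice(best)) for _ in range(10)]

print(max(pool, key=fitness))  # should end up near 3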
Is this a good heuristic?
Consider what happens if an apparently good solution (better than its local neighbours), but a sub-optimal one (compared with the global best), is easy to find at the start. The search could easily converge on 20 similar variants of it.
[Figure: a fitness landscape with an easy-to-find local maximum and a harder-to-find global maximum. X-axis: the solution. Y-axis: the fitness of the solution. The goal is to find the solution with maximum fitness.]
Consider starting with a random x and then trying to improve it. It is easy to get (by chance) onto the slopes of the local maximum, and then repeatedly make small changes in x looking for an improvement, until you reach the local maximum. The local maximum is very "visible", whereas unless you start (by chance) near the global maximum, you may never find out it is there.
Many heuristics might find the local maximum but not the global one. In fact, anything other than exhaustive search would be at risk of this, and the global maximum could be arbitrarily hard to find.
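A hill-climbing sketch of this in Python. The fitness landscape below is made up to have a broad, easy-to-find local maximum and a narrow, hard-to-find global maximum:

import math, random

def fitness(x):
    # broad local maximum near x = 2; narrow global maximum near x = 8
    return math.exp(-(x - 2) ** 2) + 2 * math.exp(-20 * (x - 8) ** 2)

def hill_climb(x, steps=5000):
    for _ in range(steps):
        candidate = x + random.uniform(-0.05, 0.05)  # small change in x
        if fitness(candidate) > fitness(x):          # keep only improvements
            x = candidate
    return x

x = hill_climb(random.uniform(0, 10))  # random starting point
print(x, fitness(x))   # usually converges to the local maximum near x = 2

Unless the random start happens to land near x = 8, the climber only ever sees the slope of the local peak.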
Example - Chess
When playing a game like chess, the AI will search future moves before actually moving.
It will consider what moves it could make, then what replies the opponent could make to each of those, then what replies it could make to each of those, and so on.
Chess has an extremely high branching factor of about 35. This means that at a typical position there are about 35 legal moves on average (the exact number in a given position ranges from 0 to 218).
The opponent can then make about 35 possible moves,
then you can make about 35, and so on.
To search 1 step into the future, need to look at about 35 moves.
To search 2 steps into the future, need to look at about 35^2 moves.
...
To search n steps into the future, need to look at about 35^n moves.
Quickly becomes impossible to do exhaustive search.
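The arithmetic is easy to check with a few lines of Python, using the rough branching factor of 35 (the 80-ply figure for a whole game is explained just below):

# Positions to examine when searching n plies ahead,
# assuming a rough average branching factor of 35.
for n in (1, 2, 3, 6, 12, 80):
    print(f"{n:2} plies: about {35 ** n:.2e} positions")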
Move and Ply:
The average length of a chess game is about 40 "moves" (where one move = each player moving once), i.e. 80 individual turns or "plies" (half-moves).
So exhaustive search of a whole game would need to look ahead at about 35^80 possible situations.
Many of these will be the same situation duplicated many times, but we don't know that until we calculate them.
The AI program Deep Blue beat the best human in the world at chess in 1997, despite not being able to do exhaustive search. It examined about 200 million positions a second, but this is still not (remotely) exhaustive.
To be precise, it could search 6 moves ahead, or 12 plies, or about 35^12 possible situations = approx 3,380,000,000,000,000,000 (about 3.4 × 10^18).
Of course, humans can't do exhaustive search either, so a program doesn't need exhaustive search to beat the best human.
McDermott discusses whether humans work by some unconscious search:
"When people express the opinion that human grandmasters do not examine 200,000,000 move sequences per second,
I ask them, 'How do you know?'
The answer is usually that human grandmasters are not aware
of searching this number of positions, or are
aware of searching many fewer. But almost everything that goes on in our minds we are unaware of.
I tend to agree that grandmasters are not searching the way Deep Blue does,
but whatever they are doing would, if implemented on a computer, seem equally 'blind.'"
While it seems bad to have a rule that may deliver a sub-optimal solution
and cannot guarantee finding the optimal one,
this is just normal in human (and animal) life.
It is proven that some problems need to be solved by heuristics
The knapsack problem and the bin packing problem are NP-hard. This is explained in Complexity and computability, but it basically means there is no known efficient way of finding the global optimal solution short of exhaustive search. Exhaustive search is only possible with a small number of items, and impossible with a large number. With a large number of items, we need to use a heuristic.
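As an illustration, here is the classic greedy heuristic for the knapsack problem, sketched in Python with made-up items: take items in order of value per unit of weight until nothing more fits. It is fast and usually good, but not guaranteed optimal:

def greedy_knapsack(items, capacity):
    # items: list of (value, weight) pairs.
    # Greedy rule of thumb: take items by value-to-weight ratio.
    chosen, total = [], 0
    for value, weight in sorted(items, key=lambda i: i[0] / i[1], reverse=True):
        if weight <= capacity:
            chosen.append((value, weight))
            capacity -= weight
            total += value
    return chosen, total

print(greedy_knapsack([(60, 10), (100, 20), (120, 30)], 50))
# Greedy packs a total value of 160, but the optimal packing
# (100 + 120 = 220) is missed - the usual trade-off with heuristics.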
Heuristic Search as a general-purpose, useful, computer algorithm
Even if one is not interested in AI
or classic AI problems,
the concept of defining a solution space
and using a smart heuristic to search it
is just a general-purpose useful addition to the toolkit of any computer software developer.
Consider where you have a system with 50 parameters to be modified, for example software to control traffic lights. The parameters define the sequences in which the lights change, how long they stay red, synchronisation with lights down the road, changes for rush hour and night-time, and so on.
You define what you think are sensible values for all 50 parameters,
after perhaps a lot of tweaking.
You are still unsure which particular combination of the 50 parameters
leads to the best traffic flow.
You could simply set up a search algorithm.
Pick a random combination of the 50 parameters. Test it.
Pick another one. Test it.
Run in simulation overnight.
The search space could be too large to search exhaustively. We need a heuristic.
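A minimal random-search harness for this in Python. The simulate() function is a hypothetical stand-in for the overnight traffic simulation, which would return a traffic-flow score for a given setting of the 50 parameters:

import random

def simulate(params):
    # Hypothetical stand-in for the traffic simulation:
    # score how well traffic flows with these parameter settings.
    return -sum((p - 0.5) ** 2 for p in params)

best_params, best_score = None, float("-inf")
for trial in range(100000):                        # "run overnight"
    params = [random.random() for _ in range(50)]  # pick a random combination
    score = simulate(params)                       # test it
    if score > best_score:                         # remember the best so far
        best_params, best_score = params, score

print(best_score)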
Heuristic search may be customised for one particular problem
(e.g. specialised chess heuristics).
Or it may be a
Metaheuristic
- General strategy across multiple problems
(e.g. General strategy to maximise a function of n parameters)
No-free-lunch theorem
- The limitations of Metaheuristics.
"a general-purpose universal optimization strategy is theoretically impossible,
and the only way one strategy can outperform another is if it is
specialized to the specific problem under consideration"
Continuum
More domain-specific tweaking. More work for the human. Tends to perform better.
|
|   Domain-specific heuristic
|   - Non-random search of the space, guided by domain-specific knowledge.
|
|   Metaheuristic
|   - Non-random search of the space, but no domain-specific knowledge.
|
|   Random search
|   - Random search of the space. Works OK if the space is small.
|
Less domain-specific. Less work for the human. Tends to perform worse.