Action Selection methods using Reinforcement Learning

Mark Humphrys

Trinity Hall, Cambridge

June 1997

A dissertation submitted for the degree of Doctor of Philosophy in the University of Cambridge
This is the expanded version of my PhD. See full reference.
 Abstract 
The Action Selection problem 
is the problem of run-time choice between conflicting and heterogeneous goals,
a central problem in the simulation of whole creatures
(as opposed to the solution of isolated uninterrupted tasks).
This thesis argues that Reinforcement Learning has been overlooked in the solution of the Action Selection problem.
Considering a decentralised model of mind, 
with internal tension and competition between selfish behaviors,
this thesis introduces an algorithm called "W-learning", 
whereby different parts of the mind modify their behavior based on whether or not they are succeeding in
getting the body to execute their actions. 
This thesis sets W-learning in context among the different ways of exploiting
Reinforcement Learning numbers for the purposes of Action Selection. 
W-learning is a "Minimize the Worst Unhappiness" strategy. 
The different methods are tested in an artificial world,
and their strengths and weaknesses are analysed.
 Contents