Appendix G - References

Aylett, 1995: Aylett, Ruth (1995), Multi-Agent Planning: Modelling Execution Agents, Papers of the 14th Workshop of the UK Planning and Scheduling Special Interest Group.
Baum, 1996: Baum, Eric B. (1996), Toward a Model of Mind as a Laissez-Faire Economy of Idiots, Proceedings of the Thirteenth International Conference on Machine Learning.
Blumberg, 1994: Blumberg, Bruce (1994), Action-Selection in Hamsterdam: Lessons from Ethology, Proceedings of the Third International Conference on Simulation of Adaptive Behavior (SAB-94).
Brooks, 1986: Brooks, Rodney A. (1986), A robust layered control system for a mobile robot, IEEE Journal of Robotics and Automation 2:14-23.
Brooks, 1991: Brooks, Rodney A. (1991), Intelligence without Representation, Artificial Intelligence 47:139-160.
Brooks, 1991a: Brooks, Rodney A. (1991), Intelligence without Reason, Proceedings of the 12th International Joint Conference on Artificial Intelligence (IJCAI-91).
Brooks, 1994: Brooks, Rodney A. (1994), Coherent Behavior from Many Adaptive Processes, Proceedings of the Third International Conference on Simulation of Adaptive Behavior (SAB-94).
Charpillet et al., 1996: Charpillet, Francois; Chevrier, Vincent; Foisel, Remy and Haton, Jean-Paul (1996), Organizing a Society of Softbots for World Wide Web Applications, Workshop on Artificial Intelligence-based Tools to Help W3 Users, Fifth International World Wide Web Conference.
Clocksin and Moore, 1989: Clocksin, William F. and Moore, Andrew W. (1989), Experiments in Adaptive State-Space Robotics, Proceedings of the 7th Conference of the Society for Artificial Intelligence and Simulation of Behaviour (AISB-89).
Dennett, 1978: Dennett, Daniel C. (1978), Why not the whole iguana?, Behavioral and Brain Sciences 1:103-104.
Dennett, 1991: Dennett, Daniel C. (1991), Consciousness Explained, Allen Lane, The Penguin Press.
Digney, 1996: Digney, Bruce L. (1996), Emergent Hierarchical Control Structures: Learning Reactive/Hierarchical Relationships in Reinforcement Environments, Proceedings of the Fourth International Conference on Simulation of Adaptive Behavior (SAB-96).
Edelman, 1989: Edelman, Gerald M. (1989), The Remembered Present: A Biological Theory of Consciousness, Basic Books.
Edelman, 1992: Edelman, Gerald M. (1992), Bright Air, Brilliant Fire: On the Matter of the Mind, Basic Books.
Grefenstette, 1992: Grefenstette, John J. (1992), The Evolution of Strategies for Multi-agent Environments, Adaptive Behavior 1:65-89.
Holland, 1975: Holland, John H. (1975), Adaptation in Natural and Artificial Systems, University of Michigan Press, Ann Arbor.
Humphrys, 1995: Humphrys, Mark (1995), W-learning: Competition among selfish Q-learners, technical report no.362, University of Cambridge, Computer Laboratory.
Humphrys, 1995a: Humphrys, Mark (1995), Towards self-organising Action Selection, Papers of the 14th Workshop of the UK Planning and Scheduling Special Interest Group.
Humphrys, 1996: Humphrys, Mark (1996), Action Selection in a hypothetical house robot: Using those RL numbers, Proceedings of the First International ICSC Symposia on Intelligent Industrial Automation (IIA-96) and Soft Computing (SOCO-96).
Humphrys, 1996a: Humphrys, Mark (1996), Action Selection methods using Reinforcement Learning, PhD thesis (first version), University of Cambridge, Computer Laboratory.
Jackson, 1987: Jackson, John V. (1987), Idea for a Mind, SIGART Newsletter, Number 101, July 1987.
Kaelbling, 1993: Kaelbling, Leslie Pack (1993), Learning in Embedded Systems, The MIT Press/Bradford Books.
Kaelbling, 1993a: Kaelbling, Leslie Pack (1993), Hierarchical Learning in Stochastic Domains, Proceedings of the Tenth International Conference on Machine Learning.
Kaelbling et al., 1996: Kaelbling, Leslie Pack; Littman, Michael L. and Moore, Andrew W. (1996), Reinforcement Learning: A Survey, Journal of Artificial Intelligence Research 4:237-285.
Karlsson, 1997: Karlsson, Jonas (1997), Learning to Solve Multiple Goals, PhD thesis, University of Rochester, Department of Computer Science.
Lin, 1992: Lin, Long-Ji (1992), Self-Improving Reactive Agents Based On Reinforcement Learning, Planning and Teaching, Machine Learning 8:293-321.
Lin, 1993: Lin, Long-Ji (1993), Scaling up Reinforcement Learning for robot control, Proceedings of the Tenth International Conference on Machine Learning.
Maes, 1989: Maes, Pattie (1989), How To Do the Right Thing, Connection Science 1:291-323.
Maes, 1989a: Maes, Pattie (1989), The dynamics of action selection, Proceedings of the 11th International Joint Conference on Artificial Intelligence (IJCAI-89).
Mataric, 1994: Mataric, Maja J. (1994), Learning to behave socially, Proceedings of the Third International Conference on Simulation of Adaptive Behavior (SAB-94).
McFarland, 1989: McFarland, David (1989), Problems of Animal Behaviour, Longman.
Metcalfe and Boggs, 1976: Metcalfe, Robert M. and Boggs, David R. (1976), Ethernet: Distributed Packet Switching for Local Computer Networks, Communications of the ACM 19:395-404.
Minsky, 1986: Minsky, Marvin (1986), The Society of Mind, Simon and Schuster, New York.
Moore, 1990: Moore, Andrew W. (1990), Efficient Memory-based Learning for Robot Control, PhD thesis, University of Cambridge, Computer Laboratory.
Ono et al., 1996: Ono, Norihiko; Fukumoto, Kenji and Ikeda, Osamu (1996), Collective Behavior by Modular Reinforcement-Learning Animats, Proceedings of the Fourth International Conference on Simulation of Adaptive Behavior (SAB-96).
Ray, 1991: Ray, Thomas S. (1991), An Approach to the Synthesis of Life, Artificial Life II.
Ring, 1992: Ring, Mark (1992), Two Methods for Hierarchy Learning in Reinforcement Environments, Proceedings of the Second International Conference on Simulation of Adaptive Behavior (SAB-92).
Rosenblatt, 1995: Rosenblatt, Julio K. (1995), DAMN: A Distributed Architecture for Mobile Navigation, Proceedings of the 1995 AAAI Spring Symposium on Lessons Learned from Implemented Software Architectures for Physical Agents.
Rosenblatt and Thorpe, 1995: Rosenblatt, Julio K. and Thorpe, Charles E. (1995), Combining Multiple Goals in a Behavior-Based Architecture, Proceedings of the 1995 International Conference on Intelligent Robots and Systems (IROS-95).
Ross, 1983: Ross, Sheldon M. (1983), Introduction to Stochastic Dynamic Programming, Academic Press, New York.
Rummery and Niranjan, 1994: Rummery, Gavin and Niranjan, Mahesan (1994), On-line Q-learning using Connectionist systems, technical report no.166, University of Cambridge, Engineering Department.
Sahota, 1994: Sahota, Michael K. (1994), Action Selection for Robots in Dynamic Environments through Inter-behaviour Bidding, Proceedings of the Third International Conference on Simulation of Adaptive Behavior (SAB-94).
Scheier and Pfeifer, 1995: Scheier, Christian and Pfeifer, Rolf (1995), Classification as Sensory-Motor Coordination, Proceedings of the 3rd European Conference on Artificial Life (ECAL-95).
Selfridge and Neisser, 1960: Selfridge, Oliver G. and Neisser, Ulric (1960), Pattern recognition by machine, Scientific American 203:60-68.
Singh, 1992: Singh, Satinder P. (1992), Transfer of Learning by Composing Solutions of Elemental Sequential Tasks, Machine Learning 8:323-339.
Singh et al., 1994: Singh, Satinder P.; Jaakkola, Tommi and Jordan, Michael I. (1994), Learning without state-estimation in Partially Observable Markovian Decision Processes, Proceedings of the Eleventh International Conference on Machine Learning.
Sporns, 1995: Sporns, Olaf (1995), personal communication.
Steels, 1994: Steels, Luc (1994), A case study in the Behavior-Oriented design of Autonomous Agents, Proceedings of the Third International Conference on Simulation of Adaptive Behavior (SAB-94).
Sutton, 1988: Sutton, Richard S. (1988), Learning to Predict by the Methods of Temporal Differences, Machine Learning 3:9-44.
Sutton, 1990: Sutton, Richard S. (1990), Integrated Architectures for Learning, Planning and Reacting Based on Approximating Dynamic Programming, Proceedings of the Seventh International Conference on Machine Learning.
Sutton, 1990a: Sutton, Richard S. (1990), Reinforcement Learning Architectures for Animats, Proceedings of the First International Conference on Simulation of Adaptive Behavior (SAB-90).
Tan, 1993: Tan, Ming (1993), Multi-Agent Reinforcement Learning: Independent vs. Cooperative Agents, Proceedings of the Tenth International Conference on Machine Learning.
Tesauro, 1992: Tesauro, Gerald (1992), Practical Issues in Temporal Difference Learning, Machine Learning 8:257-277.
Tham and Prager, 1994: Tham, Chen K. and Prager, Richard W. (1994), A modular Q-learning architecture for manipulator task decomposition, Proceedings of the Eleventh International Conference on Machine Learning.
Todd et al., 1994: Todd, Peter M.; Wilson, Stewart W.; Somayaji, Anil B. and Yanco, Holly A. (1994), The blind breeding the blind: Adaptive behavior without looking, Proceedings of the Third International Conference on Simulation of Adaptive Behavior (SAB-94).
Tyrrell, 1993: Tyrrell, Toby (1993), Computational Mechanisms for Action Selection, PhD thesis, University of Edinburgh, Centre for Cognitive Science.
Varian, 1993: Varian, Hal R. (1993), Intermediate Microeconomics, W. W. Norton and Co.
Watkins, 1989: Watkins, Christopher J.C.H. (1989), Learning from delayed rewards, PhD thesis, University of Cambridge, Psychology Department.
Watkins and Dayan, 1992: Watkins, Christopher J.C.H. and Dayan, Peter (1992), Technical Note: Q-Learning, Machine Learning 8:279-292.
Weir, 1984: Weir, Michael (1984), Goal-Directed Behaviour, Gordon and Breach.
Whitehead et al., 1993: Whitehead, Steven; Karlsson, Jonas and Tenenberg, Josh (1993), Learning Multiple Goal Behavior via Task Decomposition and Dynamic Policy Merging, in Connell and Mahadevan, eds., Robot Learning, Kluwer Academic Publishers.
Wilson, 1990: Wilson, Stewart W. (1990), The animat path to AI, Proceedings of the First International Conference on Simulation of Adaptive Behavior (SAB-90).
Wixson, 1991: Wixson, Lambert E. (1991), Scaling reinforcement learning techniques via modularity, Proceedings of the Eighth International Conference on Machine Learning.
