Dr. Mark Humphrys

School of Computing. Dublin City University.




F Full list of action selection methods

Here we consider only the quantities that can be computed at a single timestep in which some action is taken. That is, we use only the Q-value an agent has for the executed action, or the loss it suffers. We omit methods that use terms that have no meaning within this timestep, such as (from §10):

displaymath8337

We also only consider maximizing, minimizing and summing. We omit other possible methods, such as product (§15.5.2) or standard deviation (§12.4).

Many methods below make no sense, such as maximizing unhappiness. Ones that make some sense are in bold.





F.1 Search for compromise action

Search for this, and take action a:

displaymath9652



displaymath9653



F.2 Use only suggested actions

Search for this, and take action tex2html_wrap_inline6834 :

displaymath9658



displaymath9659



Search for this, and take action tex2html_wrap_inline6830 :

displaymath9660





Other methods

In fact, we also omitted combinations of two of these terms. For instance, Tony Prescott suggested the following:

displaymath9661

That is, the gain by the winner, minus the losses it causes the others. Note that:

displaymath9662

This method also seems to inhabit that desirable middle ground between Maximize the Best Happiness and the Collective methods. It somewhat answers the criticism of Maximize the Best Happiness: if the winner would make a gain no matter what action is taken, then it will prefer actions that cause smaller losses to the other agents. We see that it would successfully be opportunistic in the example in §9.1. It is also still more individual-driven than the pure Collective methods: it successfully picks tex2html_wrap_inline7368 in the first example in §12.3.

But we only have to increase the number of agents to show that it is still vulnerable to the criticisms of any Collective method. It will fail to pick tex2html_wrap_inline7368 here:

singlespace3823
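The behaviour described above can be illustrated with a minimal sketch. It is one possible reading of "the gain by the winner, minus the losses it causes the others", not the thesis's own formula, which did not survive in this copy. The assumptions are: `q[i][a]` is agent i's Q-value for action a in the current state, each agent suggests its own greedy action, the winner's gain is its Q-value for its suggested action, and each other agent's loss is the difference between its Q-value for its own suggested action and for the winner's action. All function names and all numbers are hypothetical.

```python
from typing import List

QTable = List[List[float]]  # q[i][a]: agent i's Q-value for action a (assumed layout)


def suggested_action(q_row: List[float]) -> int:
    """Greedy action of one agent: the index of its largest Q-value."""
    return max(range(len(q_row)), key=lambda a: q_row[a])


def gain_minus_losses(q: QTable, k: int) -> float:
    """Score for letting agent k win: its gain minus the losses of the others.

    Assumed reading: gain = Q_k(a_k); loss_i = Q_i(a_i) - Q_i(a_k) for i != k.
    """
    a_k = suggested_action(q[k])
    losses = sum(
        q[i][suggested_action(q[i])] - q[i][a_k]
        for i in range(len(q))
        if i != k
    )
    return q[k][a_k] - losses


def select_action(q: QTable) -> int:
    """Execute the suggested action of the agent with the highest score."""
    winner = max(range(len(q)), key=lambda k: gain_minus_losses(q, k))
    return suggested_action(q[winner])


# Hypothetical numbers: agent 0 strongly wants action 0 (Q = 10 versus 0);
# every other agent mildly prefers action 1 (Q = 1 versus 0).
strong = [10.0, 0.0]
mild = [0.0, 1.0]

few_agents = [strong] + [mild] * 2    # score of agent 0: 10 - 2 = 8
many_agents = [strong] + [mild] * 20  # score of agent 0: 10 - 20 = -10

print(select_action(few_agents))   # 0: the strong individual preference wins
print(select_action(many_agents))  # 1: many small losses outweigh it
```

With two mildly interested agents, the strongly motivated agent still wins. With twenty of them, the summed small losses outweigh its gain, which is the Collective-style weakness the text describes.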

The only other method of combining two terms that makes sense is:

displaymath9663

which again is just another Collective method.


