
AI - Adding player agents

This is a step towards using Swoggle as a game for reinforcement learning (RL). We define several agents, each of which follows simple rules when its turn comes.

Random Agent

The random agent works by trial and error: it keeps trying random moves until it finds a valid one.

class RandomAgent[source]

RandomAgent(player)

Given a swoggle board on which it is a player, make a random valid move
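The trial-and-error idea can be sketched independently of the game. This is a minimal illustration, not the RandomAgent source: `random_valid_move`, `is_valid` and `max_tries` are hypothetical names, and `is_valid` stands in for whatever move validation the Swoggle board performs.

```python
import random

BOARD_SIZE = 8  # the boards printed in this document are 8x8


def random_valid_move(is_valid, rng=random, max_tries=1000):
    """Propose random target squares until `is_valid` accepts one.

    `is_valid` is a hypothetical callback taking a (row, col) tuple and
    returning True when the move is allowed.
    """
    for _ in range(max_tries):
        target = (rng.randrange(BOARD_SIZE), rng.randrange(BOARD_SIZE))
        if is_valid(target):
            return target
    # Give up instead of looping forever when no valid move exists.
    return None


# Example: any square is valid unless it is already occupied.
occupied = {(0, 0), (7, 7)}
move = random_valid_move(lambda square: square not in occupied)
```

Capping the number of attempts is a design choice of this sketch; it keeps the agent from hanging when a player has no valid move at all.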

A game with random agents

sr = Swoggle([RandomAgent(i+1) for i in range(4)])
sr.show()
[1.1][...][...][...][...][...][...][4.4]
[...][...][...][...][...][...][...][...]
[...][...][...][...][.d.][...][...][...]
[...][...][.d.][...][...][...][...][...]
[...][...][...][...][...][.d.][...][...]
[...][...][...][.d.][...][...][...][...]
[...][...][...][...][...][...][...][...]
[2.2][...][...][...][...][...][...][3.3]
Spa: []
sr.move_agents()
sr.show()
Moved 1 (0, 0) (1, 0)
Moved 2 (7, 0) (5, 1)
Moved 3 (7, 7) (4, 7)
Moved 4 (0, 7) (6, 3)
[..1][...][...][...][...][...][...][..4]
[1..][...][...][...][...][...][...][...]
[...][...][...][...][.d.][...][...][...]
[...][...][.d.][...][...][...][...][...]
[...][...][...][...][...][.d.][...][3..]
[...][2..][...][.d.][...][...][...][...]
[...][...][...][4..][...][...][...][...]
[..2][...][...][...][...][...][...][..3]
Spa: []

Basic Agent

class BasicAgent[source]

BasicAgent(player)

Given a swoggle board on which it is a player, make a sensible move

Here's a way to get win rates for different agents over n games:

win_rates[source]

win_rates(n, agents)

Interestingly, player 2, who starts diagonally opposite the non-random player (player 4), also gains some advantage, probably because player 4 often removes the opponents nearest to it (players 1 and 3).

print(win_rates(500, [RandomAgent(i+1) for i in range(3)]+[BasicAgent(4)]))
Winner: [4] 35
{2: 110, 3: 51, 4: 335, 1: 4}
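To make the tallying concrete, here is a rough sketch of counting winners over n games and turning the counts into rates. It is not the `win_rates` source: `tally_wins`, `to_rates` and `play_game` are hypothetical names, and `play_game` stands in for setting up and running one full game.

```python
from collections import Counter


def tally_wins(n, play_game):
    """Play n games and count how often each player number wins.

    `play_game` is a hypothetical zero-argument callable that runs one
    game and returns the winning player's number.
    """
    return dict(Counter(play_game() for _ in range(n)))


def to_rates(wins):
    """Convert raw win counts into fractions of all games played."""
    total = sum(wins.values())
    return {player: count / total for player, count in wins.items()}


# Counts taken from the 500-game run shown above.
counts = {2: 110, 3: 51, 4: 335, 1: 4}
rates = to_rates(counts)
```

With these counts the BasicAgent (player 4) wins 67% of games and player 2 wins 22%, while players 1 and 3 together win 11%.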

Some convenience methods for PolicyLearners etc:

swoggle_to_state_vector[source]

swoggle_to_state_vector(sr, player, dice_roll)

swoggle_to_state_vector(sr, 1, 5)
array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1,
       0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0])
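For readers new to this kind of encoding, the sketch below shows one common way to flatten a board into a 0/1 vector: one 8x8 plane per player, followed by a one-hot dice roll. This is an assumed, simplified layout for illustration only. The actual layout used by `swoggle_to_state_vector` is not documented in this section and may differ, and `encode_state` is a hypothetical name. Plain lists are used here to keep the example dependency-free, whereas the real function returns a NumPy array.

```python
BOARD_SIZE = 8
N_PLAYERS = 4
DICE_SIDES = 6
PLANE = BOARD_SIZE * BOARD_SIZE  # cells per player plane


def encode_state(positions, dice_roll):
    """Encode player positions and a dice roll as a flat 0/1 list.

    `positions` maps player number (1-based) to a (row, col) tuple.
    Assumed layout: N_PLAYERS planes of 64 cells, then DICE_SIDES cells.
    """
    vec = [0] * (N_PLAYERS * PLANE + DICE_SIDES)
    for player, (row, col) in positions.items():
        vec[(player - 1) * PLANE + row * BOARD_SIZE + col] = 1
    # One-hot dice roll in the trailing cells (roll 1 -> first cell).
    vec[N_PLAYERS * PLANE + dice_roll - 1] = 1
    return vec


# Player positions from the board printed after move_agents(), dice roll 5.
state = encode_state({1: (1, 0), 2: (5, 1), 3: (4, 7), 4: (6, 3)}, dice_roll=5)
```

Most entries are zero, as in the real output above, because each plane marks only the cells occupied by one kind of piece.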

action_from_number[source]

action_from_number(move, player, sw, dice_roll)

Takes the output of the network, samples from the probs, does the move if possible, returns move_type, the move itself and the move number

action_from_number(random.choice(range(192)), 4, sr, sr.dice())
('drone', (4, (6, 3), (0, 6), 2, True, False), 134)
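The two steps described in the docstring, sampling a move number from the network's probabilities and decoding it, can be sketched as follows. The layout is an assumption: 192 is read as 3 move types times 64 squares, with the move number split into a move-type index and a board square. Only the name 'drone' comes from the output above; `sample_action`, `decode_action` and the other two move-type names are hypothetical placeholders.

```python
import random

BOARD_SIZE = 8
# Assumed ordering; "type_0" and "type_1" are placeholder names.
MOVE_TYPES = ["type_0", "type_1", "drone"]


def sample_action(probs, rng=random):
    """Sample a move number with probability proportional to probs."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


def decode_action(number):
    """Split a move number into (move_type, (row, col)).

    Assumes move numbers are laid out as move_type * 64 + row * 8 + col.
    """
    move_type_idx, square = divmod(number, BOARD_SIZE * BOARD_SIZE)
    row, col = divmod(square, BOARD_SIZE)
    return MOVE_TYPES[move_type_idx], (row, col)


# Uniform probabilities over all 192 move numbers.
number = sample_action([1 / 192] * 192)
move_type, target = decode_action(number)
```

Under this assumed layout, move number 134 decodes to a drone action targeting (0, 6), which agrees with the example output above, although one matching example does not confirm the layout.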
