In this week you will learn about adversarial games and game AIs to implement OCaml 🐫 bots that play tic-tac-toe and Gomoku.
In these exercises you will:
- Play TIC TAC TOE!!
- Write a bot to play TIC TAC TOE!!
- Improve your TIC TAC TOE bot!!
Tic-tac-toe is a game in which two players take turns in placing either
an O
or an X
in one square of a 3x3 grid. The winner is the first
player to get 3 of the same symbols in a row.
Gomoku (also commonly referred to as "Omok"), is very similar to tic-tac-toe, but bigger. Two players play on a 15x15 board and the winner is the first player to get 5 pieces in a row.
You can think of a digital tic-tac-toe board as a "mapping" of "position -> piece" with the following types:
module Position = struct
(* [position] in the board.
(e.g. "top-left" cell is (row, column) (0, 0), and
"bottom mid" cell is (2, 1)) *)
type t =
{ row : int
; column : int
}
end
module Piece = struct
type t =
| X
| O
end
For example, the board:
X | - | -
___|___|___
- | O | X
___|___|___
- | O | -
Can be represented as a mapping of:
(row, column)
(0, 0) => X
(1, 1) => O
(1, 2) => X
(2, 1) => O
What board does the following "mapping" represent? Is there anything interesting happening? (Hint: If you were O, what move would you play?) Feel free to edit the "Answer board" below:
(row, column)
(1, 1) => X
(0, 0) => O
(0, 2) => O
(2, 0) => X
(2, 1) => X
(2, 2) => O
Answer board:
- | - | -
___|___|___
- | - | -
___|___|___
- | - | -
After you've answered look for a fellow fellow near you and discuss your answers!
First, fork this repository by visiting this page and clicking on the green "Create fork" button at the bottom.
Then clone the fork locally (on your AWS machine) to get started. You can clone a repo on
the command line like this (where $USER
is your GitHub username):
$ git clone git@github.com:$USER/tictactoe.git
Cloning into 'tictactoe'...
remote: Enumerating objects: 61, done.
remote: Counting objects: 100% (61/61), done.
remote: Compressing objects: 100% (57/57), done.
remote: Total 61 (delta 2), reused 61 (delta 2), pack-reused 0
Receiving objects: 100% (61/61), 235.81 KiB | 6.74 MiB/s, done.
Resolving deltas: 100% (2/2), done.
Now you should be able to enter into the project's directory, build the starter code, and run the executable binary like this:
$ cd tictactoe
tictactoe$ dune build
tictactoe$ dune runtest
tictactoe$ dune exec student/bin/game_ai.exe help
Bot-running command
game_ai.exe SUBCOMMAND
=== subcommands ===
create-game-and-play . Send off an rpc to create and game and then plays
the game, waiting for the other player to place
pieces
create-game-and-play-against-self
. Send off an rpc to create and then immediately
start playing against itself
join-game-and-play . Send off an rpc to join a game and then randomly
places pieces using web socket rpc
version . print version information
help . explain a given subcommand (perhaps recursively)
NOTE: If you are getting a problem like
Library "bonsai" not found.
or a type error, this is fine and expected. You can run the following commands to fix the problem:
opam update
opam upgrade -y # [-y] stands for "answer yes to all the prompts".
opam install bonsai -y
opam install cohttp_static_handler -y
dune build
"opam" (OCaml's package manager) can be a bit weird at times, so please raise your hand/reach out for a TA if
dune build
does not work afterward.
This repository contains three components:
- A "game server" that keeps track of all running games, and that you can "connect" to and play. It lets you play against bots from other fellows, and also has some bots of its own to play against.
- A web UI where you can see all running games, spectate games, and play directly from the UI. We have a centralized version of the game server and game UI hosted here.
- A directory called [student] with exercises that gradually build up a game AI.
This week you will not be writing game server nor UI code, and will instead focus on the "student" directory:
tictactoe$ tree student
student
├── bin
│ ├── dune
│ ├── game_ai-help-for-review.org
│ ├── game_ai.ml
│ └── game_ai.mli
├── exercises
│ ├── bin
│ │ ├── dune
│ │ ├── tic_tac_toe_exercises-help-for-review.org
│ │ ├── tic_tac_toe_exercises.ml
│ │ └── tic_tac_toe_exercises.mli
│ └── src
│ ├── dune
│ ├── tic_tac_toe_exercises_lib.ml
│ └── tic_tac_toe_exercises_lib.mli
└── src
├── dune
├── tictactoe_game_ai.ml
└── tictactoe_game_ai.mli
6 directories, 14 files
student/bin
contains the "game_ai" executable that you can use to run your tic-tac-toe bot.student/exercises
andstudent/src
are where you'll be implementing your bot.common
contains types likePiece.t
, andPosition.t
and other types shared by the game server and the ui.client
,server
, andrpc-client
are where the UI and server are implemented. You will not be writing code here, but are needed in case you want to "run" the game server and game UI locally.
There are other directories like server
, rpc-client
, and more. These directories
contain the "game server" that you can use to spectate your games!
You can run the game server by running the ./run-game-server.sh
script:
tictactoe$ ./run-game-server.sh -port 8181
Game server running on port 8181
You should then be able to navigate to: http://$YOUR_AWS_HOSTNAME:8181
NOTE: You can find your public AWS hostname with:
curl http://169.254.169.254/latest/meta-data/public-hostname
You should see a game server site like this:
NOTE: If you can't see the above site, do not worry! We have a shared, centralized server hosted here. Its hostname and port are: http://ec2-3-215-240-47.compute-1.amazonaws.com:8181/. We will soon be opening the
8181
port on your AWS boxes so that you can view your locally running web server.
You can think of an AI that plays tic-tac-toe board as a "function" of type
me:Piece.t -> game_state:Game_state.t -> Position.t
.
Where the me
parameter is the "piece" that the bot is playing as, and
Game_state.t
is the state of the board (where all the pieces are, and also
the type of game you're playing (i.e. 3x3 tic-tac-toe vs. 15x15 Omok)), and the
returned position is the place you've picked to put your position.
Over the course of these exercises you will be gradually building such a function.
META NOTE: Throughout these exercises there are notes and hints, but don't feel like you shouldn't read them/spoil yourself due to the hints, it's the opposite! You should feel encouraged to read all the hints/notes in a given exercise's instructions before taking a stab at the exercise.
Your AI needs to make a decision of "which free available spot" it should
pick. Let's find "all free available spots"! Implement
available_moves
in student/exercises/src/tic_tac_toe_exercises_lib.ml
.
val available_moves : game_kind:Game_kind.t -> pieces:Piece.t Position.Map.t -> Position.t list
You can view documentation/definitions of the above types in common/protocol.mli
.
HINT: [Map] is a new OCaml module that you have not seen before! It is OCaml's equivalent of python dictionaries, Java HashMap's or any language's "map". Available functions can be found here: Real World Ocaml. and on ocaml.org's docs page. If map's/dictionaries are a new concept to you this is fine! Feel free to ask a TA for help/ask on the #questions Slack channel if something is unclear!
Here is a [Map] function that you might find useful:
(** Returns a list of the keys in the given map. *)
val keys : ('k, _, _) t -> 'k list
NOTE: The type
('k, 'v, 'cmp) Map.t
means a map from with keys of type'k
and values of type'v
. For example,(Position.t, Piece.t, Position.comparator_witness) Map.t
is a map fromPosition.t
toPiece.t
. The'cmp
parameter is magical, and you can mostly ignore it. If you're interested, read about it on the "Real World OCaml" link above or ask a TA!Something else weird syntax-wise is that the type
(Position.t, Piece.t, Position.comparator_witness) Map.t
is the same asPiece.t Position.Map.t
.
HINT: Look for
Map.to_alist
/Map.keys
in the ocaml docs, and also leverageList.filter
.
HINT: To find all the "free slots", you might first need a function to find "all the slots" for tic-tac-toe/omok.
Once you've implemented [available_moves], feel free to uncomment the "available_moves" tests
near the bottom of student/exercises/src/tic_tac_toe_exercises_lib.ml
, we have prepared some example boards
for you to test with, but feel free to add test cases of your own! You can run tests
with dune runtest
.
We've also prepared some commands that you can run for the exercises to inspect
your output. Feel free to edit and run them for debugging. They are defined
within student/exercises/src/tic_tac_toe_exercises_lib.ml
.
tictactoe$ dune exec student/exercises/bin/tic_tac_toe_exercises.exe help
Tic Tac Toe exercises
tic_tac_toe_exercises.exe SUBCOMMAND
=== subcommands ===
exercise-four . Exercise 4: Is there a losing move?
exercise-one . Exercise 1: Where can I move?
exercise-three . Exercise 3: Is there a winning move?
exercise-two . Exercise 2: Did is the game over?
version . print version information
help . explain a given subcommand (perhaps recursively)
A simple strategy for a game AI is "picking a random available spot".
Implement [random_move_strategy] inside student/src/tictactoe_game_ai.ml
HINT: You'll want to use the
List.random_element_exn
function you learned about during snake, and also the functionavailable_moves
that you just implemented.
HINT 2: Your
available_moves
function is defined withinstudent/exercises/src/tic_tac_toe_exercises_lib.ml
, but therandom_move_strategy
is found withinstudent/src/tictactoe_game_ai.ml
. These are known as different "dune libraries". If you try to useTic_tac_toe_exercises_lib
from withintictactoe_game_ai.ml
, you'll get an error like "Unbound module Tic_tac_toe_exercises_lib". To be able to use it, you'll need to explicitly add it as a dependency within its dune file by adding "tic_tac_toe_exercises_lib" to the "libraries" field insidetictactoe_game_ai
's dune filestudent/src/dune
:
(library
(name tictactoe_game_ai)
(inline_tests)
(preprocess (pps ppx_jane))
- (libraries tic_tac_toe_2023_common))
+ (libraries tic_tac_toe_2023_common tic_tac_toe_exercises_lib))
After you are done, make [compute_next_move] (found within the same file) use [random_move_strategy] to compute its next move.
Let's have your random strategy connect and play against the game server!
The game_ai.exe
executable has three commands that use your [compute_next_move]
function to play a game on the game server. Here are a couple of ways of playing:
tictactoe$ _build/default/student/bin/game_ai.exe --help
Bot-running command
game_ai.exe SUBCOMMAND
=== subcommands ===
create-game-and-play . Send off an rpc to create and game and then plays
the game, waiting for the other player to place
pieces
create-game-and-play-against-self
. Send off an rpc to create and then immediately
start playing against itself
join-game-and-play . Send off an rpc to join a game and then randomly
places pieces using web socket rpc
version . print version information
help . explain a given subcommand (perhaps recursively)
NOTE: We are using
_build/default/student/bin/game_ai.exe --help
instead ofdune exec student/bin/game_ai.exe -- --help
as it makes sending arguments less confusing.
NOTE: These are the "host" and port of the shared whose web ui is viewable at: this link: http://ec2-3-215-240-47.compute-1.amazonaws.com:8181/:
GAME_SERVER_HOST='ec2-3-215-240-47.compute-1.amazonaws.com'
GAME_SERVER_PORT=8181
FAVORITE_POKEMON='bulbasaur' # Pick your own favorite Pokémon!
Optional: you can run your own private/local game server by running the following script:
tictactoe$ ./run-game-server.sh -port 8181
Game server running on port 8181
and your "hostname" would be the output of curl http://169.254.169.254/latest/meta-data/public-hostname
.
Let's play your strategy against itself!
tictactoe$ _build/default/student/bin/game_ai.exe create-game-and-play-against-self -game-kind Tic_tac_toe -host $GAME_SERVER_HOST -username $FAVORITE_POKEMON -port $GAME_SERVER_PORT
You'll see your game being played inside of http://ec2-3-215-240-47.compute-1.amazonaws.com:8181/ (or inside your local web ui if you're running your own game server).
The game might be short-lived, but if you pick "Omok" instead of "Tic_tac_toe", your game will run for a while.
You can play against your bot by running the create-game-and-play
command:
tictactoe$ _build/default/student/bin/game_ai.exe create-game-and-play -port $GAME_SERVER_PORT -game-kind Tic_tac_toe -host $GAME_SERVER_HOST -username $FAVORITE_POKEMON
successfully created game with id: '11'
Waiting for someone to join...
Your game will then show up as a "joinable game", and then you can click it to join it. Also feel free to change your "I am: IP_ADDRESS" to something else. (Sadly there's no authentication/anything like that so pick a unique username like your favorite Pokémon/rodent.)
You can play against the game server bot by specifying -against-bot
to the
create-game-and-play
command:
# Easy
tictactoe$ _build/default/student/bin/game_ai.exe create-game-and-play -port $GAME_SERVER_PORT -game-kind Tic_tac_toe -host $GAME_SERVER_HOST -username $FAVORITE_POKEMON -against-bot Easy
successfully created game with id: '15'
"successfully took turn!"
"successfully took turn!"
"successfully took turn!"
"successfully took turn!"
"successfully took turn!"
("Game is over!" (winner ((Player bulbasaur))))
# Hard
tictactoe$ _build/default/student/bin/game_ai.exe create-game-and-play -port $GAME_SERVER_PORT -game-kind Tic_tac_toe -host $GAME_SERVER_HOST -username $FAVORITE_POKEMON -against-bot Hard
successfully created game with id: '13'
"successfully took turn!"
"successfully took turn!"
"successfully took turn!"
("Game is over!" (winner ((Server_bot Hard))))
You can play against another fellow's strategy by first running
create-game-and-play
, and then the other fellow's strategy can
join your game with the join-game-and-play
with your game's id.
The player that creates the game has the first turn.
Reach out to another fellow here and play against their random bot. If no one is available feel free to move on to exercise 2.
At this point you are done with exercise 1. Congrats! Before moving on to exercise 2, feel free to save your work, and commit it and push it to github.
We now have a working random strategy. Let's improve it!
The next strategies we'll implement are "If there is a move I can make right now that wins, let's do it, otherwise do a random move". To do this we'll need to evaluate all "next possible moves", but first, let's do something simpler:
One question you might ask is: Is already over? Has someone already won? Is there a tie? Is a piece ready to be placed/does the game continue?
Your task for exercise 2 is implementing:
val evaluate : game_kind:Game_kind.t -> pieces:Piece.t Position.Map.t -> Evaluation.t
where evaluation has the type:
module Evaluation = struct
type t =
| Illegal_state
| Game_over of { winner : Piece.t option }
| Game_continues
end
You can implement this function in student/exercises/src/tic_tac_toe_exercises_lib.ml
One way you could imagine implementing this function is by "scanning" the board for patterns of length 3 (for tic tac toe) or 5 (for omok), in rows/columns/or diagonals. For example, the animation below scans for all rows of length 5:
Here are some [Map] functions that you might find useful:
(** Returns [Some value] bound to the given [key], or [None] if none exists. *)
val find : ('k, 'v, 'cmp) Map.t -> 'k -> 'v option
(** [mem map key] tests whether map contains a binding for key. *)
val mem : ('k, _, 'cmp) t -> 'k -> bool
NOTE: The type
('k, 'v, 'cmp) Map.t
means a map with keys of type'k
and values of type'v
. For example,(Position.t, Piece.t, Position.comparator_witness) Map.t
is a map fromPosition.t
toPiece.t
. The'cmp
parameter is magical and you can mostly ignore it. If you're interested, read about it on the "Real World OCaml" link above or ask a TA!Something else weird syntax-wise is that the type
(Position.t, Piece.t, Position.comparator_witness) Map.t
is the same asPiece.t Position.Map.t
.
NOTE 2: You might also find
List.exists
from the snake exercises useful.
Additionally, there are some functions that might be helpful available
for operating on Position.t
's
module Position : sig
(* Top-left is {row = 0; column = 0}. *)
type t =
{ row : int
; column : int
}
val in_bounds : t -> game_kind:Game_kind.t -> bool
val equal : t -> t -> bool
(** [down t] is [t]'s downwards neighbor. *)
val down : t -> t
val right : t -> t
val up : t -> t
val left : t -> t
(** [all_offsets] is a list of functions to compute all 8 neighbors of a
cell (i.e. left, up-left, up, up-right, right, right-down, down,
down-left). *)
val all_offsets : (t -> t) list
end
NOTE: [Position] is defined in
common/protocol.mli
.
Also remember to uncomment the tests at the bottom of
tic_tac_toe_exercises_lib.ml
once you are done.
Using [evaluate] and [available_moves], let's implement winning_moves
in
student/exercises/src/main.ml
.
(* A list the list of moves that win the game in one move. *)
val winning_moves : me:Piece.t -> game_kind:Game_kind.t -> pieces:Piece.t Position.Map.t -> Position.t list
It will look at all available moves, and see if they would win the game in 1 move.
You might find Map.set
useful.
When you're done, uncomment the expect tests and move on to exercise 3.2. Also, remember to commit+push.
Using [winning_moves], let's implement [pick_winning_move_if_possible_strategy] in
tictactoe_game_ai.ml
. If there is a winning move, it'll pick it. Otherwise, it'll
just pick a random move. Let's also update compute_next_move
to use
pick_winning_move_if_possible_strategy
.
Run your bot using the commands from exercise 1.3. Play against it on the UI,
does it pick a winning position if it sees one? Can it beat the Easy
bot more
often?
Alright, now we have a better strategy, let's improve it!
The next strategy we'll implement is "if the bot sees that its opponent can win in their next move, it'll block its opponent from winning".
Your task for exercise 4 is to implement losing_moves
within
tic_tac_toe_exercises_lib.ml
.
HINT: A "losing move" is a move that let's your opponent win in the next move "winning move" for your opponent after all. You might find
Piece.flip
useful!
HINT: You might need to "look 1" step into the future. What will be your opponents game state after you've made your move.
When you're done, uncomment the expect tests and move on to exercise 4.2. Also, remember to commit+push!
Like before, in tictactoe_game_ai.ml
, implement the descriptively named
[pick_winning_move_or_block_if_possible_strategy], and update
[compute_next_move] to use it.
Play against your bot! Can you win against it? How often can it beat the Easy
bot?
We've never mentioned its name thus far, but we've been secretly been implementing a version of the algorithm "Minimax". It was one of the earlier algorithms to beat humans at complex games like chess, and it's also the algorithm we'll be implementing next.
Thus far we've implemented an algorithm that can look 1 move into the future and if it can win/not lose into the future, it can make the best decision. One immediate idea is:
- What if we look further into the future?
Take a moment to intuitively think, for yourself on a piece of paper/on a text file, if you could look 2 moves into the future on tic tac toe/how would you pick your next move? What about 3? Discuss with another fellow or if no one is available talk to a TA!
Minimax has a couple of parts:
- score: What is the current "score" of the game's position? This is a bit
silly on a game like tic-tac-toe/omok, but you could think of "winning" as
+infinity score and losing as -infinity score. You could also come up with
heuristics like, if I see "2" pieces together, then I add 4, if my opponent has
4 pieces together, then I subtract 50. Or even if I have n pieces together, I
add
n*n
to my score. - maximizing player: You win when your score reaches +infinity, so you want to maximize the score.
- minimizing player: Your opponent wins when the score reaches -inifinity, so they will always want to minimize the score.
Minimax assumes that both players will play optimally, so minimax assumes that the maximizing player will always pick the move that results in the maximum score, and that the minimizing player picks the smaller score.
Minimax operates under a "max-depth" it will travese to it will take as a parameter "how many moves into the future" it should try to evaluate. It will then build a tree like the ones shown in this wikipedia page and run the score function on "terminal" nodes that are nodes where either the game ends (score is infinity) or where the max depth has been reached.
Here are more resources on minimax. Also feel free to ask TA's if you have any questions:
You can use your already implemented evaluate
function to implement a score
function. You can implement score
in tictactoe_game_ai.ml
. A very basic
version of score can be -infinity for loss, +infinity for win, [0.0] for the
game continuing.
Read the pseudocode on minimax from wikipedia.
What is a "terminal node"?
A terminal node is the game ending by someone winning/losing or by the game tie'ing due to all of the slots being filled.
What is the "heuristic value of node"?
It would be the score
function you just implemented.
What is the "child of node"?
The available_moves
function you implemented in exercise 1.
If you have any questions please ask a TA!
Implement minimax by following the pseudocode from wikipedia. Some functions you might find handy are:
val min_elt : 'a list -> compare:('a -> 'a -> int) -> 'a option
val max_elt : 'a list -> compare:('a -> 'a -> int) -> 'a option
(** fold t ~init ~f returns f (... f (f (f init e1) e2) e3 ...) en, where e1..en are the elements of t *)
val fold : 'a list -> init:'accum -> f:('accum -> 'a -> 'accum) -> 'accum
HINT: OCaml required the
rec
keyword for recursive functions.
You can write this function, with the parameters of your choosing within
tictactoe_game_ai.ml
.
Once you are done, you can update compute_next_move
to make your AI executable
use minimax and play with your AI! How much better is it? How far deep can you
set the parameter on tic tac toe? What about Omok? How slow/fast is it? Can it beat
the medium server bot? What about the hard server bot? Can you win against it?
This is a list of possible extensions you could implement. You are also free to come up extensions of your own/ideas of your own to beat the "Hard" server bot/be faster than the server bot.
- Change [available_positions] to only pick empty positions that are next to occupied pieces to lower the search space. Does the depth you can explore increase?
- Make your score function "heuristic based" by scoring 2/3/4 consecutive pieces an "n*n" score.
- Implement a minimax optimization called alpha beta pruning. Does the depth that you can explore increase?
- Make evaluation super fast by using bit masks.
- Pick different heuristics for your [score] function.