Reinforced TicTacToe

Test it!

This project has been created as a demonstration and is completely open source.

The opponent AI is based on a reinforcement learning algorithm (On-policy first-visit Monte Carlo) that has been trained on 1 million matches against a random opponent.

From Sutton and Barto (2018) Reinforcement Learning: An Introduction, chapter 5.4

The algorithm had no model of TicTacToe and discovered the best winning strategy play-by-play by discounting a given reward (-1 for a loss / 0 for a draw / 1 for a win) over all previous in-game actions after each episode.

Training

The calculation (training) of the Q-ActionValues has been done in Python and can be found here: reinforced_tictactoe.py. The file contains a class for the TicTacToe game as well as the algorithm used to train the AI.

Website

The website has been build with React and Javascript and utilizes the Q-ActionValues generated in training as an opponent AI. The source code for the website can be found here: /src. It consists of the main game board and also shows the user the top three contemplated moves.

Name		Name	Last commit message	Last commit date
Latest commit History 8 Commits
images		images
public		public
src		src
training		training
.gitignore		.gitignore
README.md		README.md
package-lock.json		package-lock.json
package.json		package.json

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Reinforced TicTacToe

Test it!

Training

Website

About

Releases

Packages

Languages

lukaskuhn-lku/reinforced-tictactoe

Folders and files

Latest commit

History

Repository files navigation

Reinforced TicTacToe

Test it!

Training

Website

About

Topics

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages