NEAT Poker Bot

A Texas Hold'em poker simulation played by bots, which uses the NEAT algorithm to train neural networks for in-game decision-making.

Welcome to the NEAT Poker Bot!

This project contains three modules: poker, neat, and test.

The poker module consists of an object-oriented Texas Hold'em simulation designed to be played by bots. Note: There is no visualization for the poker game.

The neat module contains an implementation of the NEAT algorithm using the NEAT-Python package. It trains a recurrent neural network with a genetic algorithm by evaluating how genomes perform when competing against each other.
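The idea of scoring genomes by having them compete against each other can be sketched as a round-robin fitness function. This is a minimal illustration, not the project's code: StubGenome and play_match are hypothetical stand-ins for a real genome and a real poker match, and only the (genomes, config) shape of eval_genomes is meant to mirror the fitness callback that NEAT-Python calls once per generation.

```python
import itertools
import random


class StubGenome:
    """Hypothetical stand-in for a NEAT genome; only .fitness matters here."""

    def __init__(self, skill):
        self.skill = skill    # made-up hidden quality, used only by the demo
        self.fitness = None


def play_match(genome_a, genome_b, rng):
    """Hypothetical placeholder for a poker match.

    Returns the chips won by genome_a; genome_b loses the same amount.
    """
    return (genome_a.skill - genome_b.skill) + rng.gauss(0, 1)


def eval_genomes(genomes, config, rng=None):
    """Assign fitness by round-robin play.

    genomes is a list of (genome_id, genome) pairs; config is unused
    in this sketch.
    """
    rng = rng or random.Random(0)
    for _, genome in genomes:
        genome.fitness = 0.0
    # Every pair of genomes plays once; the result is zero-sum.
    for (_, a), (_, b) in itertools.combinations(genomes, 2):
        chips = play_match(a, b, rng)
        a.fitness += chips
        b.fitness -= chips


population = [(i, StubGenome(skill=float(i))) for i in range(5)]
eval_genomes(population, config=None)
for genome_id, genome in population:
    print(genome_id, round(genome.fitness, 2))
```

Because fitness here is relative to the current population, a score is only comparable within one generation, which is one reason progress across generations can be hard to judge.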

Additionally, a test module exists, which contains unit tests to validate the classes in the poker module.

Visit the project’s GitHub page for usage instructions and source code:

https://github.com/Jason-Fitzpatrick1/poker-bot

Bot Performance

The genomes were evaluated after training simply by playing against a human. The expectation was that it would be clear whether the bot was making reasoned decisions, because some basic strategies, such as betting when holding a strong hand, were assumed to be correct. A better approach would have been to pit the winning genome against bots that perform random actions and analyze the long-term results. A genome with any coherent strategy should gain value over a long-term evaluation, even in the presence of uncertainty.
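A long-run evaluation of this kind might look like the sketch below. It is an assumption-laden illustration rather than the project's evaluation code: play_hand is a hypothetical one-street toy game (not the poker module), threshold_policy stands in for a trained genome, and random_policy is the random-action baseline. The point is only to show mean profit per hand reported with a standard error, so that a real edge can be separated from noise.

```python
import math
import random


def play_hand(policy, rng):
    """Toy one-street game; returns the policy's profit in chips.

    Both players ante 1 chip. The policy either folds or bets 1 more;
    the random opponent then folds or calls with equal probability.
    """
    my_strength = rng.random()
    opp_strength = rng.random()
    if not policy(my_strength, rng):
        return -1.0                      # folded: lose the ante
    if rng.random() < 0.5:
        return 1.0                       # opponent folded: win their ante
    # Called: 2 chips each are at stake at showdown.
    return 2.0 if my_strength > opp_strength else -2.0


def threshold_policy(strength, rng):
    """Stand-in for a genome with a simple strategy: bet decent hands."""
    return strength > 0.3


def random_policy(strength, rng):
    """Baseline that ignores its hand and bets at random."""
    return rng.random() < 0.5


def evaluate(policy, hands=20000, seed=0):
    """Return (mean profit per hand, standard error of that mean)."""
    rng = random.Random(seed)
    results = [play_hand(policy, rng) for _ in range(hands)]
    mean = sum(results) / hands
    variance = sum((r - mean) ** 2 for r in results) / (hands - 1)
    return mean, math.sqrt(variance / hands)


for name, policy in [("threshold", threshold_policy), ("random", random_policy)]:
    mean, stderr = evaluate(policy)
    print(f"{name}: {mean:+.3f} chips/hand (standard error {stderr:.3f})")
```

If the mean profit sits several standard errors above the random baseline, the strategy is very unlikely to be indistinguishable from random play; if not, more hands are needed before drawing a conclusion.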

In many cases, it was difficult to distinguish the bot's behavior from random actions. With this in mind, it is worth noting that the network was never trained beyond 100 generations because of the long training times. It was also difficult to determine whether additional generations improved the quality of the genomes, especially because Texas Hold'em is a game of imperfect information: in some instances, it may be impossible to know whether a genome made a poor decision.

At the time of writing, no new training has been done since the strengths of starting hands were precomputed, a change that should significantly reduce training time. In the future, a new winner will be trained over many more generations and evaluated against randomized actions.
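One way such a precomputed table could be organized is sketched below. The details are assumptions, not the project's implementation: two-card hands are collapsed into the 169 standard starting-hand classes (pairs, suited, offsuit), and placeholder_strength is a crude made-up score standing in for values that would really come from simulation or enumeration. The lookup itself is then a dictionary access instead of a computation repeated on every hand.

```python
import itertools

RANKS = "23456789TJQKA"
SUITS = "cdhs"


def canonical_key(card_a, card_b):
    """Map two cards such as 'As' and 'Kd' to a class such as 'AKo'."""
    hi, lo = sorted((card_a, card_b),
                    key=lambda card: RANKS.index(card[0]), reverse=True)
    if hi[0] == lo[0]:
        return hi[0] + lo[0]                       # pocket pair, e.g. 'QQ'
    suffix = "s" if hi[1] == lo[1] else "o"        # suited or offsuit
    return hi[0] + lo[0] + suffix


def placeholder_strength(key):
    """Hypothetical score in [0, 1]; NOT a real equity estimate."""
    hi, lo = RANKS.index(key[0]), RANKS.index(key[1])
    score = (hi + lo) / 24.0
    if hi == lo:
        score += 0.5
    elif key.endswith("s"):
        score += 0.05
    return min(score, 1.0)


def build_table(estimator):
    """Run the (possibly slow) estimator once per starting-hand class."""
    deck = [rank + suit for rank in RANKS for suit in SUITS]
    keys = {canonical_key(a, b) for a, b in itertools.combinations(deck, 2)}
    return {key: estimator(key) for key in keys}


TABLE = build_table(placeholder_strength)


def lookup(card_a, card_b):
    """Constant-time strength lookup used during training."""
    return TABLE[canonical_key(card_a, card_b)]


print(len(TABLE), lookup("As", "Ks"), lookup("7d", "2c"))
```

The estimator runs once per class up front, so an expensive strength calculation no longer sits inside the training loop.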
