Neural Caissa: Building a Chess bot with Deep Learning
20 Dec 2020

Neural Caissa is yet another attempt to build a neural-network-powered Chess bot (Caissa is the goddess of Chess). For more details on building and playing against the bot, or on playing Chess puzzles, see the NeuralCaissa README and the code available on Github.

Example of a game between a human and NeuralCaissa.

How does NeuralCaissa work?

In a nutshell, the current version works as follows:

  1. An initial board state is generated, and a valuator strategy is set (code). The valuator is used to compute a score for each potential next state of the board.
  2. Using a depth-limited minimax algorithm with alpha-beta pruning (code), a set of potential next states is explored (code). The valuator computes the score for each state.
  3. The next state with the highest score is chosen as the next move (code).
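The minimax step above can be sketched as follows. This is a minimal, generic sketch: `children` and `value` are hypothetical stand-ins for NeuralCaissa's actual move-generation and valuator code, and the toy tree exists only to exercise the pruning.

```python
def minimax(state, depth, alpha, beta, maximizing, children, value):
    """Depth-limited minimax with alpha-beta pruning.

    `children(state)` yields successor states and `value(state)` returns
    the valuator score; both are stand-ins for the real implementation.
    """
    kids = children(state)
    if depth == 0 or not kids:
        return value(state)
    if maximizing:
        best = float("-inf")
        for child in kids:
            best = max(best, minimax(child, depth - 1, alpha, beta,
                                     False, children, value))
            alpha = max(alpha, best)
            if beta <= alpha:  # prune: the minimizer already has a better line
                break
        return best
    else:
        best = float("inf")
        for child in kids:
            best = min(best, minimax(child, depth - 1, alpha, beta,
                                     True, children, value))
            beta = min(beta, best)
            if beta <= alpha:
                break
        return best

# Toy game tree: leaves carry the valuator scores.
tree = {"root": ["a", "b"], "a": ["a1", "a2"], "b": ["b1", "b2"]}
scores = {"a1": 3, "a2": 5, "b1": 2, "b2": 9}

best = minimax("root", 2, float("-inf"), float("inf"), True,
               lambda s: tree.get(s, []), lambda s: scores.get(s, 0))
# best == 3: subtree "a" guarantees 3; "b" is cut off after b1 scores 2.
```

When the `beta <= alpha` condition fires, the remaining siblings are skipped because the opponent would never allow that branch to be reached.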

Two valuators are currently available: a baseline valuator and a neural valuator. The baseline valuator is fairly simple but works surprisingly well. It implements the following formula:

value = SUM(white piece values)
        - SUM(black piece values)
        + 0.1 * (white pieces mobility score)
        - 0.1 * (black pieces mobility score)
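As a minimal sketch of this formula (the piece values and the mobility counts below are illustrative inputs, not NeuralCaissa's actual implementation, which computes them from a live board):

```python
# Conventional material values; kings are omitted since both sides have one.
PIECE_VALUE = {"P": 1, "N": 3, "B": 3, "R": 5, "Q": 9}

def baseline_value(white_pieces, black_pieces, white_mobility, black_mobility):
    """Baseline valuator: material difference plus a small mobility bonus."""
    material = (sum(PIECE_VALUE[p] for p in white_pieces)
                - sum(PIECE_VALUE[p] for p in black_pieces))
    return material + 0.1 * white_mobility - 0.1 * black_mobility

# Example: White is up a knight and has 4 more legal moves than Black.
v = baseline_value(["P"] * 8 + ["N", "N"], ["P"] * 8 + ["N"], 24, 20)
# v == 3.4  (material +3, mobility +0.4)
```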

The neural valuator is a scoring function trained with a deep learning model built in PyTorch. It is currently a stack of ConvNets with ReLU activations, followed by a final \(tanh\) that maps the output into \([-1, +1]\). The model was trained with the Adam optimizer, MSE as the loss function, mini-batches of size 256, and 100 epochs.

The input is produced by a very simple and naive serialization strategy: each board state is represented as a binary tensor of 12 x (8 x 8) variables. For each piece type \(k\) (\(k \in [1, 12]\)), entry \(i\) is 1 if piece \(k\) occupies square \(i\) and 0 otherwise (\(i \in [1, 64]\)). The target label is either \(+1\) or \(-1\), depending on whether the player to move in the given state went on to win the game.
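A minimal sketch of this serialization, using plain nested lists and a hypothetical rank-string input format (the actual code works on python-chess board objects):

```python
# One plane per piece type; uppercase = White, lowercase = Black.
PLANES = {p: i for i, p in enumerate("PNBRQKpnbrqk")}

def serialize(rank_rows):
    """Turn an 8x8 board, given as 8 strings ('.' for empty squares),
    into a 12x8x8 binary tensor represented as nested lists."""
    tensor = [[[0] * 8 for _ in range(8)] for _ in range(12)]
    for rank, row in enumerate(rank_rows):
        for file, piece in enumerate(row):
            if piece in PLANES:
                tensor[PLANES[piece]][rank][file] = 1
    return tensor

start = ["rnbqkbnr", "pppppppp", "........", "........",
         "........", "........", "PPPPPPPP", "RNBQKBNR"]
t = serialize(start)
# The white-pawn plane has the whole 7th row set; 32 bits are set in total.
```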

The data used to train this model was downloaded from caissabase. All the code used to pre-process and serialize caissabase (which stores games in Portable Game Notation) is available in NeuralCaissa.

A more detailed description of the model is summarized in the following table. The total number of parameters is fairly small.

----------------------------------------------------------------
        Layer (type)               Output Shape         Param #
================================================================
            Conv2d-1             [-1, 16, 8, 8]           1,744
            Conv2d-2             [-1, 16, 8, 8]           2,320
            Conv2d-3             [-1, 32, 3, 3]           4,640
            Conv2d-4             [-1, 32, 3, 3]           9,248
            Conv2d-5             [-1, 32, 3, 3]           9,248
            Conv2d-6             [-1, 64, 1, 1]          18,496
            Conv2d-7             [-1, 64, 2, 2]          16,448
            Conv2d-8             [-1, 64, 3, 3]          16,448
            Conv2d-9            [-1, 128, 1, 1]          32,896
           Conv2d-10            [-1, 128, 1, 1]          16,512
           Conv2d-11            [-1, 128, 1, 1]          16,512
           Conv2d-12            [-1, 128, 1, 1]          16,512
           Linear-13                    [-1, 1]             129
================================================================
Total params: 161,153
Trainable params: 161,153
Non-trainable params: 0
----------------------------------------------------------------
Input size (MB): 0.00
Forward/backward pass size (MB): 0.03
Params size (MB): 0.61
Estimated Total Size (MB): 0.65
----------------------------------------------------------------
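The per-layer counts in the table follow from the standard formulas: \(out \times (in \times k^2 + 1)\) for a Conv2d with bias and \(out \times (in + 1)\) for a Linear layer. The kernel sizes below are inferred back from the parameter counts, so treat them as assumptions rather than the exact configuration:

```python
def conv2d_params(in_ch, out_ch, k):
    """Parameters of a Conv2d with bias: out_ch * (in_ch * k * k + 1)."""
    return out_ch * (in_ch * k * k + 1)

def linear_params(in_f, out_f):
    """Parameters of a Linear layer with bias."""
    return out_f * (in_f + 1)

# (in_channels, out_channels, kernel_size), layer by layer as in the table.
convs = [(12, 16, 3), (16, 16, 3), (16, 32, 3), (32, 32, 3), (32, 32, 3),
         (32, 64, 3), (64, 64, 2), (64, 64, 2), (64, 128, 2),
         (128, 128, 1), (128, 128, 1), (128, 128, 1)]
total = sum(conv2d_params(*c) for c in convs) + linear_params(128, 1)
# total == 161153, matching the summary above.
```

The first layer's 1,744 parameters also confirm the 12-plane input: 16 x (12 x 3 x 3 + 1).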

This model was heavily inspired by @geohot’s Twitchchess.

TODOs

The following is a list of things I’d love to do in the future to improve NeuralCaissa. You are more than welcome to try them out!

  1. Speed up and/or distribute the serialization of the board’s state into its 12x8x8 tensor representation. It currently runs at ~6 it/s, which is slow considering that the millions of games in the training data amount to roughly 38M positions (assuming, on average, 38 moves per game [source]). At that rate, processing would take ~73 days.
  2. Improve the search strategy used when running the minimax algorithm, beyond just alpha-beta pruning (@geohot mentions using beam search for this in Twitchchess).
  3. Implement a MuZero-style strategy to learn how to play Chess from full games, instead of relying on the minimax backtracking algorithm to find the best next move (maybe something like this).
  4. Improve the ChessConvNet structure so it’s better suited for Chess. Currently it’s a very straightforward stack of ConvNets and most likely not the best representation. For example, I think it could be particularly improved by using a different loss function (I personally like the one proposed by @erikbern in his deep-pink implementation).
  5. Improve the serialization strategy, as the current one is extremely simple and naive. For example, under the current serialization there is essentially no difference between a pawn and a queen beyond which binary plane is set.
  6. Improve the minimax algorithm by moving from depth-limited search to probability-limited search (similar to what was done in Giraffe). Note that the Giraffe paper also includes a pretty interesting serialization strategy that accounts for ~300 features of a given board state.
  7. A simple Q-learning version might also be interesting to explore (apparently this Kaggle set of scripts has something that could be useful).
  8. Consider using the Piece Square Tables used in Sunfish (also described in this blog post).
  9. Chess2Vec could provide a better vector representation than the current naive bit-board strategy used to train the deep learning model.
  10. In his work on teaching computers to play Chess, Shannon proposes an interesting equation that might be a better (and more complete) baseline for the valuator (more info in this Stanford project).
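For reference, the Shannon evaluation mentioned in item 10 can be sketched as follows. The dict-based interface is hypothetical; following Shannon's 1950 paper, D, S, and I are the counts of doubled, backward, and isolated pawns, and M is mobility (the number of legal moves):

```python
# Shannon's weights: material, pawn-structure penalties, and mobility.
W = {"K": 200, "Q": 9, "R": 5, "B": 3, "N": 3, "P": 1}

def shannon_eval(white, black):
    """f(p) = sum of weighted material differences
              - 0.5 * (D - D' + S - S' + I - I') + 0.1 * (M - M').
    `white` and `black` are dicts of piece/feature counts per side."""
    material = sum(W[p] * (white[p] - black[p]) for p in W)
    structure = -0.5 * sum(white[f] - black[f] for f in ("D", "S", "I"))
    mobility = 0.1 * (white["M"] - black["M"])
    return material + structure + mobility

start = {"K": 1, "Q": 1, "R": 2, "B": 2, "N": 2, "P": 8,
         "D": 0, "S": 0, "I": 0, "M": 20}
v = shannon_eval(start, dict(start))  # symmetric position scores 0
```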

A note on Chess Puzzles

Besides playing Chess against the bot, you can also play “Chess puzzles”: given a file with existing games, it renders a position two moves before checkmate for you to finish. If you can’t find the mate in two moves, you can keep playing against the computer from that point on. A default set of games is pre-loaded, but you can modify the code to include your own collection.