Chess Mastered with a Self-Play Algorithm

December 6, 2017

A history-changing event in chess play!

From chess24:

DeepMind’s AlphaZero crushes chess

20 years after Deep Blue defeated Garry Kasparov in a match, chess players have awoken to a new revolution. The AlphaZero algorithm developed by Google and DeepMind took just four hours of playing against itself to synthesise the chess knowledge of one and a half millennia and reach a level where it not only surpassed humans but crushed the reigning World Computer Champion Stockfish 28 wins to 0 in a 100-game match. All the brilliant stratagems and refinements that human programmers used to build chess engines have been outdone, and like Go players, we can only marvel at a wholly new approach to the game.

After DeepMind's AlphaZero the chess engine world, and the chess world, will never be quite the same again.

The bombshell came in a quietly released academic paper published on 5 December 2017: Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm.

The full article and a link to download the paper are at:

The contents are stunning. The DeepMind team had managed to prove that a generic version of their algorithm, with no specific knowledge other than the rules of the game, could train itself for four hours at chess, two hours at shogi (Japanese chess) or eight hours at Go and then beat the reigning computer champions, i.e. the strongest known players of those games. In chess it wasn’t just a beating, but sheer demolition.

Stockfish is the reigning TCEC computer chess champion, and while it failed to make the final this year, it went unbeaten in 51 games. In a match with the chess-trained AlphaZero, though, it lost 28 games and won none, with the remaining 72 drawn. With White, AlphaZero scored a phenomenal 25 wins and 25 draws, while with Black it “merely” scored 3 wins and 47 draws. It turns out the starting move is really important after all!
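
For readers who want to check the arithmetic, here is a minimal Python sketch that tallies the match figures quoted above. The win and draw counts come from the article. The helper names (`points`, `games`, `implied_elo_gap`) are made up for illustration, and converting a score into a rating gap with the standard Elo logistic formula is an assumption added here, not something the article or the paper is quoted as doing.

```python
import math

# Results from AlphaZero's side, as reported in the article.
white = {"wins": 25, "draws": 25, "losses": 0}
black = {"wins": 3, "draws": 47, "losses": 0}


def points(result):
    """Chess scoring: 1 point per win, half a point per draw."""
    return result["wins"] + 0.5 * result["draws"]


def games(result):
    """Number of games played with this colour."""
    return result["wins"] + result["draws"] + result["losses"]


def implied_elo_gap(score):
    """Rating gap implied by a score fraction under the Elo logistic model.

    Assumption for illustration only; valid for 0 < score < 1.
    """
    return 400 * math.log10(score / (1 - score))


total_games = games(white) + games(black)
total_points = points(white) + points(black)
score = total_points / total_games

print(f"Games played: {total_games}")
print(f"AlphaZero points: {total_points} ({score:.0%})")
print(f"Score with White: {points(white) / games(white):.0%}")
print(f"Score with Black: {points(black) / games(black):.0%}")
print(f"Implied Elo gap (rough): {implied_elo_gap(score):.0f}")
```

Under these assumptions the totals come to 64 points from 100 games, which the Elo formula translates to a gap of roughly 100 rating points in this particular match. The split by colour (75% with White against 53% with Black) is what makes the first-move advantage stand out.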