
Thread: Chess Mastered with a Self-Play Algorithm

  1. #1
    Wayne Komer's Avatar
    Join Date
    2010-03-04
    Posts
    3,874

    Chess Mastered with a Self-Play Algorithm

    December 6, 2017

    A history-changing event in chess play!

    From chess24:

    DeepMind’s AlphaZero crushes chess

    20 years after Deep Blue defeated Garry Kasparov in a match, chess players have awoken to a new revolution. The AlphaZero algorithm developed by Google and DeepMind took just four hours of playing against itself to synthesise one and a half millennia of chess knowledge and reach a level where it not only surpassed humans but crushed the reigning World Computer Champion Stockfish 28 wins to 0 in a 100-game match. All the brilliant stratagems and refinements that human programmers used to build chess engines have been outdone, and like Go players we can only marvel at a wholly new approach to the game.

    After DeepMind's AlphaZero the chess engine world, and the chess world, will never be quite the same again.

    The bombshell came in a quietly released academic paper published on 5 December 2017: Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm.

    The full article and a link to download the paper are at:

    https://chess24.com/en/read/news/dee...-crushes-chess

    The contents are stunning. The DeepMind team had managed to prove that a generic version of their algorithm, with no specific knowledge other than the rules of the game, could train itself for four hours at chess, two hours in shogi (Japanese chess) or eight hours in Go and then beat the reigning computer champions – i.e. the strongest known players of those games. In chess it wasn’t just a beating, but sheer demolition.

    Stockfish is the reigning TCEC computer chess champion, and while it failed to make the final this year it went unbeaten in 51 games. In a match with the chess-trained AlphaZero, though, it lost 28 games and won none, with the remaining 72 drawn. With White AlphaZero scored a phenomenal 25 wins and 25 draws, while with Black it “merely” scored 3 wins and 47 draws. It turns out the starting move is really important after all!
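
    Out of curiosity, the match score can be translated into rating terms with the standard logistic Elo model. This is a back-of-envelope sketch of my own, not a figure from the paper:

```python
import math

def elo_gap(score: float) -> float:
    """Elo difference implied by an expected score under the logistic model."""
    return -400 * math.log10(1 / score - 1)

# AlphaZero's overall score: (28 wins + 0.5 * 72 draws) / 100 games = 0.64
print(round(elo_gap(0.64)))  # ~100 Elo points
```

    So a 64% score corresponds to roughly a 100-point Elo edge under these conditions.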

  2. #2
    Vadim Tsypin's Avatar
    Join Date
    2013-12-31
    Location
    Pointe-Claire, Quebec
    Posts
    447

    Re: Chess Mastered with a Self-Play Algorithm

    Quote Originally Posted by Wayne Komer View Post
    Chess Mastered with a Self-Play Algorithm
    The contents are stunning. The DeepMind team had managed to prove that a generic version of their algorithm, with no specific knowledge other than the rules of the game, could train itself for four hours at chess, two hours in shogi (Japanese chess) or eight hours in Go and then beat the reigning computer champions – i.e. the strongest known players of those games. In chess it wasn’t just a beating, but sheer demolition.
    Thanks Wayne, this is super-important news. How long would it take this generic algorithm to train to beat proprietary computer trading algorithms, eh?! :)

    Bravo AlphaZero, and kudos to the team behind it! Sid Belzberg has posted a very interesting description in another thread this fall.

  3. #3
    Wayne Komer's Avatar
    Join Date
    2010-03-04
    Posts
    3,874

    Re: Chess Mastered with a Self-Play Algorithm

    Chess Mastered with a Self-Play Algorithm

    December 6, 2017

    When I first read this article, I thought that it was the end of chess as we know it. If Magnus used the program to prepare his games, how could he be beaten?

    This caution today:

    Princeton University AI expert Prof Joanna Bryson added that people should be cautious about buying too deeply into the firm's hype.

    But she added that its knack for good publicity had put it in a strong position against challengers.

    "It's not only about hiring the best programmers," she said.

    "It's also very political, as it helps make Google as strong as possible when negotiating with governments and regulators looking at the AI sector."

    http://www.bbc.com/news/technology-42251535

  4. #4
    Garland Best's Avatar
    Join Date
    2008-05-30
    Posts
    645

    Re: Chess Mastered with a Self-Play Algorithm

    This is for real. And it turns everything about machine chess upside down.

    No opening book.
    No tablebases.
    No brute force approach.

    Just start with the rules and within four hours be better than the best program of 2016.

    Think what happens when this gets applied to stuff other than games.

  5. #5

    Re: Chess Mastered with a Self-Play Algorithm

    Quote Originally Posted by Wayne Komer View Post
    could train itself for four hours at chess [....] and then beat the reigning computer champion
    Speechless!

  6. #6
    Vadim Tsypin's Avatar
    Join Date
    2013-12-31
    Location
    Pointe-Claire, Quebec
    Posts
    447

    Re: Chess Mastered with a Self-Play Algorithm

    Quote Originally Posted by Garland Best View Post
    Just start with the rules and within four hours be better than the best program of 2016.

    Think what happens when this gets applied to stuff other than games.
    Garland, here's a start for one possible scenario. :)

    The First Millions

    Their first target was MTurk, the Amazon Mechanical Turk. After its launch in 2005 as a crowdsourcing Internet marketplace, it had grown rapidly, with tens of thousands of people around the world anonymously competing around the clock to perform highly structured chores called HITs, “Human Intelligence Tasks.” These tasks ranged from transcribing audio recordings to classifying images and writing descriptions of web pages, and all had one thing in common: If you did them well, nobody would know that you were an AI. Prometheus 10.0 was able to do about half of the task categories acceptably well. For each such task category, the Omegas had Prometheus design a lean custom-built narrow AI software module that could do precisely such tasks and nothing else. They then uploaded this module to Amazon Web Services, a cloud-computing platform that could run on as many virtual machines as they rented. For every dollar they paid to Amazon’s cloud-computing division, they earned more than $2 from Amazon’s MTurk division. Little did Amazon suspect that such an amazing arbitrage opportunity existed within their own company!

    To cover their tracks, they had discreetly created thousands of MTurk accounts during the preceding months in the names of fictitious people, and the Prometheus-built modules now assumed their identities. The MTurk customers typically paid after about eight hours, at which point the Omegas reinvested the money in more cloud-computing time, using still better task modules made by the latest version of the ever-improving Prometheus. Because they were able to double their money every eight hours, they soon started saturating MTurk’s task supply, and found that they couldn’t earn more than about $1 million per day without drawing unwanted attention to themselves. But this was more than enough to fund their next step, eliminating any need for awkward cash requests to the chief financial officer.
    Read the excerpt, aptly titled "The Last Invention of Man", or, better, get the whole book; in Quebec it's already available in local libraries. I just finished it last month.

    Interesting times ahead.

  7. #7
    Wayne Komer's Avatar
    Join Date
    2010-03-04
    Posts
    3,874

    Re: Chess Mastered with a Self-Play Algorithm

    Chess Mastered with a Self-Play Algorithm

    December 6, 2017

    Some online comments:

    Maelic - It is a nice step in a different direction, perhaps the start of the revolution, but AlphaZero is not yet better than Stockfish, and if you bear with me I will explain why. Most people are very excited now and hoping for a sensation, so they don't really read the paper or think about what it says, which leads to uninformed opinions.

    The testing conditions were terrible. 1 min/move is not really a suitable time control for engine testing, but you could tolerate that. What is intolerable, though, is the hash-table size: with the 64 cores Stockfish was given, you would expect around 32 GB or more, otherwise it fills up very quickly, leading to a marked reduction in strength. It was given 1 GB, and that is far from an ideal value! Also, SF was not given any endgame tablebases, which are the current norm for any computer chess engine.

    The computational power behind each entity was very different. While SF was given 64 CPU threads (really a lot, I've got to say), AlphaZero was given 4 TPUs. A TPU is a specialized chip for machine learning and neural-network calculations. Its estimated power compared to a classical CPU is roughly 1 TPU ~ 30x E5-2699v3 (an 18-core machine), so AlphaZero had the power of ~2,000 Haswell cores at its back. That is nowhere near a fair match. And yet, even though the result was dominant, it was not what it would be if SF faced itself at 2,000 cores vs 64 cores; in that case the win percentage would be much more heavily in favor of the more powerful hardware.
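
    Taking the commenter's figures at face value (they are rough estimates, not official benchmarks), the core-count arithmetic checks out:

```python
# Commenter's estimate: 1 TPU ~ 30x Xeon E5-2699v3, each an 18-core chip.
tpus = 4
xeons_per_tpu = 30
cores_per_xeon = 18

equivalent_cores = tpus * xeons_per_tpu * cores_per_xeon
print(equivalent_cores)        # 2160 -- the basis of the "~2000 Haswell cores" claim
print(equivalent_cores / 64)   # ~34x Stockfish's 64 threads
```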

    From those observations we can draw a conclusion: AlphaZero is not as close in strength to SF as Google would like us to believe. The flawed match settings suggest either a lack of knowledge about classical brute-force calculating engines and how they are properly configured, or an intention to create conditions in which SF would be defeated.

    With all that said, it is still an amazing achievement and definitely a breath of fresh air in computer chess, most welcome these days. But for the new computer chess champion we will have to wait a little bit longer.
    ________

    - It is not necessary to play perfectly, only better than your opponent.

    - I want this program for stock exchange analysis !!!

    - The funny and ironic thing is that the games between AlphaZero and Stockfish were much more exciting than the games in London Chess Classic.

    - I read the full paper.

    As was said in the first comment above, the result, at least to me as a spectator, appears to be not completely valid, because the hardware was not equivalent between the two engines. I don't understand why the DeepMind team wouldn't grant Stockfish the same hardware as their engine. Alternatively, why not run AlphaZero on the same hardware as Stockfish?
    The time allotted for each move as well as the rules for resignation, on the other hand, seem to have been clearly thought-out and implemented.

    Nevertheless, a stunning result from a program that learned by itself without the benefit of human input in the form of evaluation functions, heuristics, and what not.

    - Hardware was not the same because a different approach is used. Stockfish can't use tensor processing units. In any case, they provide a comparison in the paper of how strength varies with time per move. Stockfish still wins at short time controls, owing to the limitations of the Monte Carlo simulation approach when there are not "enough" simulations.

    - What's the status of the "academic paper"? Has it been peer-reviewed, i.e. checked by independent experts not involved in the study - who may (or may not) raise points here first mentioned by maelic?

    It doesn't seem to be the case - it is just an authors' preprint: no scientific journal, no acknowledgements to reviewers. If so, it won't be taken seriously in the scientific world, but will be considered "grey literature".

    - I do not think the arguments about unfairness in hardware play a major role, because Figure 2 in the report shows that Stockfish reaches a plateau and more thinking time only marginally increases its Elo. So more hardware for Stockfish will not make it much stronger.

    We also have to consider that AlphaZero was at a disadvantage, as it did not use opening books or endgame tablebases, as far as I understand, while Stockfish did. What would happen if AlphaZero could use endgame tablebases and opening books?

    Whether or not the paper is peer-reviewed is not very relevant. If they make a reasonable case, show the games, and others can follow their arguments, not much will change. In theoretical physics, preprints are taken very seriously.
    ________

    Stockfish was reaching a plateau, yes, but maybe it was because of hardware limitations! I'm by no means a computer hardware expert, but I don't think 'better' hardware means more thinking time; rather, it allows faster processing, exploration of more positions, deeper calculations, etc.

    Your point about AlphaZero being at a disadvantage because it didn't have opening books and endgame tablebases is a valid one.

    - Frankly, I find there is a lot of hype around all this, but if one just looks at the games it should become apparent that Stockfish made a number of incomprehensible mistakes. I am referring specifically to the ones which GM Hammer considers positional masterpieces, so I honestly do not share his excitement at all.

    - In essence, this AlphaZero algorithm of tabula rasa (self-play reinforcement learning) is an outside-the-box approach.

    Although it defeated SF (28 wins in 100 games), the hardware used by the DeepMind program was much superior. Since the algorithmic approaches of the two chess engines are radically different, it's premature to conclude that AlphaZero is indeed superior. For example, if Stockfish were backed by the equivalent of the ~2,000 Haswell cores of AlphaZero's 4 TPUs, would the match score even out (or even shift in SF's favor)?

  8. #8
    Egidijus Zeromskis's Avatar
    Join Date
    2008-05-30
    Location
    Aurora/GTCL/ON
    Posts
    4,747

    Re: Chess Mastered with a Self-Play Algorithm

    Do I understand correctly that after the learning period the algorithm gains (statistical) knowledge of all phases of the game (opening, middlegame, endgame) and stores all that information somewhere? If so, the match is not really fair, as Stockfish is not great without opening and endgame databases.
    While the technical specifications of the Stockfish machine look great, its play does not look that great.

    I think that with a longer learning period, AZ would try to solve chess. Hmm, but they say that is impossible with current and future hardware - too many possibilities.

    Anyway, the article gave some new ideas in chess programming. Let's hope that there will be more development in this field, and that AZ will last longer than Deep Blue (dismantled after the match).

  9. #9
    Vlad Drkulec's Avatar
    Join Date
    2008-05-30
    Posts
    3,515

    Re: Chess Mastered with a Self-Play Algorithm

    It seems a bit fishy to me. AlphaZero learned to play book openings by playing itself for a very short period of time. The show Person of Interest had an AI computer learn/solve chess in a similar fashion. I think it would be more interesting and believable if Stockfish were run on equivalent hardware, with an opening book and endgame tablebases, by a separate team of programmers. I wonder if this too is fake news. It is cool if it is real, but...

  10. #10
    Sid Belzberg's Avatar
    Join Date
    2014-05-10
    Posts
    562

    Re: Chess Mastered with a Self-Play Algorithm

    Quote Originally Posted by Vlad Drkulec View Post
    It seems a bit fishy to me. AlphaZero learned to play book openings by playing itself for a very short period of time. The show Person of Interest had an AI computer learn/solve chess in a similar fashion. I think it would be more interesting and believable if Stockfish were run on equivalent hardware, with an opening book and endgame tablebases, by a separate team of programmers. I wonder if this too is fake news. It is cool if it is real, but...
    They married a Monte Carlo Tree Search to a deep residual convolutional neural-network stack. Monte Carlo Tree Search is a heuristic search algorithm. This technology was not around several years ago. It is a radically new methodology for a computer to teach itself starting from a clean slate. I am certain it is for real. These things run on the TensorFlow Research Cloud, which offers 180 petaflops of compute. A petaflop is 10^15 (a quadrillion!) floating-point operations per second, so that is 180,000 trillion calculations per second! It is very real!
    https://www.tensorflow.org/tfrc/

    Here is the paper, with 10 example games between AlphaZero and Stockfish. arxiv.org is a well-known repository of scientific papers. AlphaZero totally crushed Stockfish! Hardly "fake news".
    https://arxiv.org/pdf/1712.01815.pdf
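
    For readers curious what "Monte Carlo Tree Search" means in practice, here is a minimal, purely illustrative sketch of plain MCTS (UCT with random rollouts) on a toy Nim game. This is not DeepMind's implementation: AlphaZero replaces the random rollout with a neural-network evaluation and trains that network through self-play.

```python
import math
import random

# Toy game: single-pile Nim. A move removes 1 or 2 stones; the player who
# takes the last stone wins. A state is (stones_left, player_to_move).

def moves(state):
    n, _ = state
    return [m for m in (1, 2) if m <= n]

def play(state, m):
    n, p = state
    return (n - m, 1 - p)

def winner(state):
    n, p = state
    return 1 - p if n == 0 else None  # the previous player took the last stone

class Node:
    def __init__(self, state, parent=None):
        self.state, self.parent = state, parent
        self.children = {}   # move -> Node
        self.visits = 0
        self.wins = 0.0      # wins for the player who moved into this node

def uct(parent, child, c=1.4):
    """Upper Confidence bound for Trees: exploitation plus exploration."""
    return child.wins / child.visits + c * math.sqrt(math.log(parent.visits) / child.visits)

def rollout(state):
    while winner(state) is None:
        state = play(state, random.choice(moves(state)))
    return winner(state)

def mcts(root_state, iters=2000):
    root = Node(root_state)
    for _ in range(iters):
        node = root
        # 1. Selection: descend through fully expanded nodes via UCT.
        while node.children and len(node.children) == len(moves(node.state)):
            node = max(node.children.values(), key=lambda ch, n=node: uct(n, ch))
        # 2. Expansion: add one untried child if the node is non-terminal.
        untried = [m for m in moves(node.state) if m not in node.children]
        if untried:
            m = random.choice(untried)
            node.children[m] = Node(play(node.state, m), node)
            node = node.children[m]
        # 3. Simulation: AlphaZero replaces this random playout with a
        #    neural-network value estimate.
        w = winner(node.state)
        if w is None:
            w = rollout(node.state)
        # 4. Backpropagation: credit the result up the path.
        while node is not None:
            node.visits += 1
            if node.parent is not None and w == node.parent.state[1]:
                node.wins += 1
            node = node.parent
    # Pick the most-visited root move.
    return max(root.children, key=lambda m: root.children[m].visits)

random.seed(0)
print(mcts((4, 0)))  # optimal play from 4 stones: take 1, leaving a multiple of 3
```

    Stockfish instead uses alpha-beta search with a hand-tuned evaluation function, which is why the two engines cannot simply swap hardware.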
    Last edited by Sid Belzberg; Thursday, 7th December, 2017 at 02:08 AM.
