Chess Mastered with a Self-Play Algorithm


  • #46
    Re: Digging deeper into the AlphaZero Chess paper

    Originally posted by Ted Hsu View Post
Looking at the diagrams in the Silver et al. paper, I find it amusing that after about 2 hours of learning, when it reaches a rating of about 3000, it suddenly "discovers" the Caro-Kann and the French Defense, and then seemingly discards them at still higher ratings.
    Can it even have a rating at all during its learning phase, when it is only playing itself? If yes, how meaningful is such a rating?
    Only the rushing is heard...
    Onward flies the bird.



    • #47
      Re: Chess Mastered with a Self-Play Algorithm

      Originally posted by Sid Belzberg View Post
Above is the main point that is germane to this discussion. Perhaps the law of diminishing returns that you cite with respect to training time is another relevant point. The rest of your long-winded post appears to do nothing more than attempt to validate your technical credentials and present yourself as an altruistic person.

The only technical argument I have seen from you as to why the poker results are not convincing is that the sample size is not large enough and therefore not statistically significant. However, the trend over the several years prior to the publication of this paper has been a steady improvement in AI-driven poker play, as cited in the paper.

The paper is authored by credible researchers in this area and the results are indeed compelling. Although you may want to believe the results are a statistical anomaly, the paper itself presents good mathematical arguments as to why they are statistically significant.

I would agree that this technology is not a panacea for all problems involving incomplete information, such as playing poker at a table of nine players as opposed to heads-up matches.

As far as options trading is concerned, or for that matter bond or equity trading at an institutional level, the main problem traders often face is trying to guess how large a buy or sell order they are up against when trying to predict the price movement.

The friend I mentioned earlier is doing research for a very large bank on deploying the MCTS/neural-network approach to identify trading patterns that are predictive of this. It is early days, but the results of his research are interesting.

I think you are approaching the problem of online anti-cheating in computer chess from the wrong angle. Together with some co-authors, I will be publishing a white paper in the next few days that, among other things, addresses the problem of controlling secure, standardized software deployment and decentralized randomization techniques. It could have some applicability to online gaming. I will share the link when it is up.
Sid, you didn't even get my motivation correct. There was no attempt to portray myself as altruistic. There was only the intention of getting you to see that the long time it is taking for my chess vision to become reality isn't because of laziness or because I'm diverting my money to fancy cars or clothes or material things. It is due to living in the U.S., where a person born with a disability is mostly left on their own, with family or charities as their only recourse. So a vast proportion of my earnings goes to support such a person, because her family cannot do it and because I am in love with her. I don't claim any special status for myself on that basis. Just understand that I have obstacles to overcome to make my vision happen. The other obstacle is the fact that it's chess I am trying to work with, and for investors that is a turn-off, which is why I come down hard on FIDE and anyone who supports the status quo in chess.

I didn't say the poker results were a statistical anomaly. In fact, rather the reverse: humans do it on a regular basis, which makes it something very common in the poker domain. What I am really disputing is your claim that a chess variant with poker-like hidden information has no chance of success (I think you specified online) because AZ technology will conquer it. My first argument is that AZ will only be another very good player on the scene, and will not dominate for the next several decades. My second argument is that even if it were to dominate, my vision is more about brick-and-mortar events anyway, at casinos and hotels around the world. There could be anti-computer provisions at such events, as there are at current chess events. But I don't think that problem is going to occur in the first place.

I really don't know what you mean when you say I'm approaching anti-computer chess from the wrong angle. With respect to the variant which I will monetize and create a federation for, the goal wasn't so much to be anti-computer as it was to introduce randomness, which allows higher variance in results. How could that be the wrong angle to approach chess from? It is the poker model, and it works; just ask the tens of millions of poker players around the world. But chess so far doesn't offer any such product for chess players. Poker has proven that if you build it, they will come. Even if every single one of FIDE's members refused to play, because they turn up their noses at introducing randomness, that still leaves about 599 million potential customers worldwide. Surely, as a businessman, you can appreciate what I am doing from that angle alone.

But there's even more to it: even if I only broke even, I'd still have the satisfaction of creating something that met a demand from lower-level chess players (to have a chance to win big money once in a while) that no one before me was meeting. In a world where millions of aspiring entrepreneurs are searching for an opening, for an unmet demand, I have found one, and I will be first into it! I would bet even Garry Kasparov will be envious when it gets rolling and he realizes it was born from such a simple creative spark... why not give non-elite chess players a chance to win big money? That's what was missing from Maurice Ashley and Amy Lee's Millionaire Chess undertaking. They expected the lower-rated chess players to continue to pay for the rewards of the elite chess players, so what they offered was just a gilded turd for most.

You might even be doing Kasparov a favor if you put him in touch with me about this. But I would understand if he's in no position now to invest in something, considering he and some partners appear to have invested heavily in online chess servers and a Universal Chess Rating, which is going nowhere.

Option Chess is a different variant which I deliberately released into the public domain, and it is meant to be anti-computer. But I won't be doing anything to monetize that or create a federation for it; rather, I leave it for anyone else interested to do that. I will do what I can to get people to try it out, and they can decide for themselves whether or not to play it. Right now, I have a 2400 Elo player in Europe playing me in two correspondence games; we are only up to move 12, and he has already claimed it is the most interesting chess he has ever played. I am not surprised. The game is fantastic! Don't knock it till you've tried it.
      Only the rushing is heard...
      Onward flies the bird.



      • #48
        Re: Chess Mastered with a Self-Play Algorithm

        Originally posted by Paul Bonham
In essence I will agree that this really amounts to an explosion of growth in the search tree. But it is an explosion of such magnitude that it is useless for engine programmers to pursue any of their normal CPU or memory or bandwidth or concurrency optimizations to try and compensate. Within just a few plies, it outstrips all imaginable compensations. It demands either radical new technology... or something more imaginable, a change from brute-force programming into AI programming.
I agree with this statement you made on Chessbase some time ago. Randomness does not seem to figure into option chess; we still have a defined number of possibilities. Are you therefore suggesting that the search tree of a game of Go is smaller than that of option chess? I am very sure it is not. I am confident that AZ could easily master option chess. The number of possible Go games is 10^(5.3 × 10^170), while the number of possible chess games is 10^120. If we add your double-move option to chess, I doubt it is going to bump it up by so many orders of magnitude that it exceeds the game of Go. Perhaps you have already calculated the number of possible option chess games? < 2*(10^120)? I would bet you a donut that it is nowhere near the number 10^(5.3 × 10^170). Show me the math that says otherwise and I will stand corrected.

I also believe enormous shared distributed networks will be available for mass adoption within the next few years, which will mean anyone can run sophisticated AI programs. When I post the white paper, it will become clear what I mean.
        Last edited by Sid Belzberg; Sunday, 10th December, 2017, 02:31 AM.



        • #49
          Re: Chess Mastered with a Self-Play Algorithm

          Originally posted by Sid Belzberg View Post
I agree with this statement you made on Chessbase some time ago. Randomness does not seem to figure into option chess; we still have a defined number of possibilities. Are you therefore suggesting that the search tree of a game of Go is smaller than that of option chess? I am very sure it is not. I am confident that AZ could easily master option chess. The number of possible Go games is 10^(5.3 × 10^170), while the number of possible chess games is 10^120. If we add your double-move option to chess, I doubt it is going to bump it up by so many orders of magnitude that it exceeds the game of Go. Perhaps you have already calculated the number of possible option chess games? I would bet you a donut that it is nowhere near the number 10^(5.3 × 10^170). Show me the math that says otherwise and I will stand corrected.

I also believe enormous shared distributed networks will be available for mass adoption within the next few years, which will mean anyone can run sophisticated AI programs. When I post the white paper, it will become clear what I mean.

          Correct: randomness is not a part of Option Chess. I never stated otherwise. And I have never compared the size of the search tree of Option Chess with anything other than your ego. Ok, ok, that was a cheap shot, I just couldn't resist.... I wouldn't actually call you egotistical, just supremely self-confident and that's ok. Anyone who would bet a whole donut -- literally, a Canadian doughnut -- has to be supremely self-confident.

Let me ask you this: would you say Go has the same level (amount) of overall tactics and strategy as chess? We all know how rich chess is in various strategies and tactics; how does Go compare? I'm guessing that for every ply in Go, while there may be many more move choices than in chess, the strategic and tactical considerations that could be applied to each separate move in Go would be far fewer than those that could be applied to each separate move choice for a single ply in chess. However, there is likely no mathematical way to express that.

Go seems to me to be a game of great simplicity relative to chess. I would summarize it this way: MANY MORE MOVE CHOICES PER PLY, BUT FAR FEWER IMPLICATIONS PER MOVE CHOICE.

          Therefore if we could somehow measure the OVERALL COMPLEXITY of Go versus chess, I'd bet chess turns out to be more complex. Or at least equally complex.

          If that be true, then Option Chess represents a whole new world for AZ. It will learn the game, I have no doubt of that, and that is revolutionary in itself. That is the neural network breakthrough I wish could have happened 20 years ago, when I first heard of Octavius (Garland Best: AZ was not the first neural net chess software that actually learned to play chess, by 20 years -- google "Octavius" together with "chess").

          But AZ will not dominate Option Chess according to my definition, which I'll even lower to winning 3 out of every 5 decisive games. We can argue that back and forth until your donut decomposes into a pile of mould (get it? Canadian spelling).

          If your AI network becomes reality, we may get a better idea within the next 10 years.

          EDIT: by the way, you wrote this: "the number of possible option chess games? < 2*(10^120)"

Really? Come on, are you laying a trap? I'm not falling for it.

If chess has 10^120 possible games and you double the number of move choices PER PLY, you don't get 2 * 10^120 as your number of possible games. You get exponentially more than that!

And Option Chess represents far more than doubling the move choices per ply. If there are 35 choices per ply in chess, Option Chess gives (35 x 34) / 2 choices per ply (divide by 2 because half the combinations are mirror images of each other), except that it eliminates the choices where the first move would be a check or a capture. So maybe (17 x 34) / 2 choices per ply = 17 x 17 choices per ply, a conservative guess as an overall average.
          Last edited by Paul Bonham; Sunday, 10th December, 2017, 03:05 AM.
          Only the rushing is heard...
          Onward flies the bird.



          • #50
            Re: Chess Mastered with a Self-Play Algorithm

            Originally posted by Paul Bonham View Post
            Correct: randomness is not a part of Option Chess. I never stated otherwise. And I have never compared the size of the search tree of Option Chess with anything other than your ego. Ok, ok, that was a cheap shot, I just couldn't resist.... I wouldn't actually call you egotistical, just supremely self-confident and that's ok. Anyone who would bet a whole donut -- literally, a Canadian doughnut -- has to be supremely self-confident.

Let me ask you this: would you say Go has the same level (amount) of overall tactics and strategy as chess? We all know how rich chess is in various strategies and tactics; how does Go compare? I'm guessing that for every ply in Go, while there may be many more move choices than in chess, the strategic and tactical considerations that could be applied to each separate move in Go would be far fewer than those that could be applied to each separate move choice for a single ply in chess. However, there is likely no mathematical way to express that.

Go seems to me to be a game of great simplicity relative to chess. I would summarize it this way: MANY MORE MOVE CHOICES PER PLY, BUT FAR FEWER IMPLICATIONS PER MOVE CHOICE.

            Therefore if we could somehow measure the OVERALL COMPLEXITY of Go versus chess, I'd bet chess turns out to be more complex. Or at least equally complex.

            If that be true, then Option Chess represents a whole new world for AZ. It will learn the game, I have no doubt of that, and that is revolutionary in itself. That is the neural network breakthrough I wish could have happened 20 years ago, when I first heard of Octavius (Garland Best: AZ was not the first neural net chess software that actually learned to play chess, by 20 years -- google "Octavius" together with "chess").

            But AZ will not dominate Option Chess according to my definition, which I'll even lower to winning 3 out of every 5 decisive games. We can argue that back and forth until your donut decomposes into a pile of mould (get it? Canadian spelling).

            If your AI network becomes reality, we may get a better idea within the next 10 years.

            EDIT: by the way, you wrote this: "the number of possible option chess games? < 2*(10^120)"

Really? Come on, are you laying a trap? I'm not falling for it.

If chess has 10^120 possible games and you double the number of move choices PER PLY, you don't get 2 * 10^120 as your number of possible games. You get exponentially more than that!
            Originally posted by Paul Bonham
            Or at least equally complex.
I have only played a few games of Go in my life. I found a lot of the thinking to be chess-like (if I do this then he does that then I do this, etc.). Strategically it is every bit as rich as chess, if not more so. Play a few games (it is enjoyable) and you will know what I mean. In my opinion it is more complex than chess, but that is really subjective.
            Originally posted by Paul Bonham
If chess has 10^120 possible games and you double the number of move choices PER PLY, you don't get 2 * 10^120 as your number of possible games. You get exponentially more than that!
My question refers to the number of possible games. In your rules, let's assume that you have the option of a double move at any time during the game. Then you have twice as many possible moves available to you. Go is played on a 19-by-19 board, much larger than an 8-by-8 chess board. So even with twice as many moves offered per move in option chess, Go has many orders of magnitude more possible games than even option chess. In any event, comparing the search-tree plies of a regular chess engine to MCTS is like comparing an Easter egg hunt in Central Park where in one case you have a dragnet of police searching every square inch of the park and in the other case you deploy a small group of six-year-olds who randomly run through the park.
            Last edited by Sid Belzberg; Sunday, 10th December, 2017, 04:00 AM.



            • #51
              Re: Chess Mastered with a Self-Play Algorithm

              Originally posted by Sid Belzberg View Post
In your rules, let's assume that you have the option of a double move at any time during the game. Then you have twice as many possible moves available to you.
              No, that is incorrect. If you have 35 average possible move choices on a given ply in chess, and you allow all possible double moves with no restrictions on that very same ply, then you would have (35 x 34) / 2 possible moves on that ply. The / 2 factor is because half the choices would be mirror images of each other.

              In Option Chess, the first move cannot be a check or capture. So let's assume half the possible moves on a given ply are either a check or a capture. That would leave us with (17 x 34) /2 moves per ply, which equals (17 x 17) moves per ply, which is 289 moves per ply average for Option Chess, all 289 producing a unique position. Since I've been very conservative, I think we can safely say 300 possible moves per ply AVERAGE in Option Chess.

              But we can add to this the possibility of NOT playing an option, which gives us the original 35 move choices on top of that.

              So 335 average move choices per ply in option chess.
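
A quick back-of-the-envelope check of that arithmetic, under exactly the assumptions above (35 legal moves on average, half of them checks or captures, order-independent pairs):

```
# Rough per-ply move-count estimate for Option Chess, using the assumed figures above.
legal = 35                                  # average legal single moves in chess
quiet = legal // 2                          # ~17 moves that are neither check nor capture
double_moves = quiet * (legal - 1) // 2     # (17 x 34) / 2 = 289 unordered double-move pairs
with_single = double_moves + legal          # plus the option of playing a normal single move
print(double_moves, with_single)            # 289, 324 -- in line with the ~335 figure above
```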

              The one thing we haven't factored in is transpositions. Since Go is simply placement of white stones and black stones on a 19 x 19 grid, it will be unimaginably full of transpositions, orders of magnitude more so than in chess or option chess.
              Only the rushing is heard...
              Onward flies the bird.



              • #52
                Re: Chess Mastered with a Self-Play Algorithm

                Originally posted by Paul Bonham View Post
                No, that is incorrect. If you have 35 average possible move choices on a given ply in chess, and you allow all possible double moves with no restrictions on that very same ply, then you would have (35 x 34) / 2 possible moves on that ply. The / 2 factor is because half the choices would be mirror images of each other.

                In Option Chess, the first move cannot be a check or capture. So let's assume half the possible moves on a given ply are either a check or a capture. That would leave us with (17 x 34) /2 moves per ply, which equals (17 x 17) moves per ply, which is 289 moves per ply average for Option Chess, all 289 producing a unique position. Since I've been very conservative, I think we can safely say 300 possible moves per ply AVERAGE in Option Chess.

                But we can add to this the possibility of NOT playing an option, which gives us the original 35 move choices on top of that.

                So 335 average move choices per ply in option chess.

                The one thing we haven't factored in is transpositions. Since Go is simply placement of white stones and black stones on a 19 x 19 grid, it will be unimaginably full of transpositions, orders of magnitude more so than in chess or option chess.
Even with 335 moves per average ply in option chess, compared to 35 moves in regular chess, that is only a factor of approximately 10. Comparing chess to Go... you can see 10^121 possible option chess games vs 10^(5.3*10^170) possible Go games; the two are not even close. So if the thing can master Go, I don't see any rational argument that it could not master option chess. The more interesting part of the question is that once the thing is trained, the trained program apparently does not require massive computational resources to run, as the Chessbase article I pointed out stated. So the problem is that if big money is up for grabs with an online chess variant, inevitably someone will use a shared TPU facility like the one Google offers to train up a program for option chess. Maybe there will be time before this scenario comes to pass, but make no mistake about it: AZ is a game changer for the world of online games, including poker. It is simply a question of when and not if. Few believed it would beat even a hobbled version of Stockfish within only 6 weeks of mastering Go.
                Last edited by Sid Belzberg; Sunday, 10th December, 2017, 01:51 PM.



                • #53
                  Re: Chess Mastered with a Self-Play Algorithm

                  Originally posted by Sid Belzberg View Post
Even with 335 moves per average ply in option chess, compared to 35 moves in regular chess, that is only a factor of approximately 10. Comparing chess to Go... you can see 10^121 possible option chess games vs 10^(5.3*10^170) possible Go games; the two are not even close. So if the thing can master Go, I don't see any rational argument that it could not master option chess. The more interesting part of the question is that once the thing is trained, the trained program apparently does not require massive computational resources to run, as the Chessbase article I pointed out stated. So the problem is that if big money is up for grabs with an online chess variant, inevitably someone will use a shared TPU facility like the one Google offers to train up a program for option chess. Maybe there will be time before this scenario comes to pass, but make no mistake about it: AZ is a game changer for the world of online games, including poker. It is simply a question of when and not if. Few believed it would beat even a hobbled version of Stockfish within only 6 weeks of mastering Go.

                  Sid, I thought you were better at math than what you are showing in this discussion. The 335 possible moves per ply for Option Chess is 10x the 35 moves per ply for chess. But that is PER PLY! So the overall possibilities are far, far more than 10x.

In chess you have 35 x 35 x 35 ... x 35, N times, where N is the average number of plies per game.
That is 35^N.
In Option Chess it is 335 x 335 x 335 ... x 335, which is 335^N.

                  In just 2 plies, you have 35 x 35 = 1225 possibilities for chess, and you have 335 x 335 = 112,225 .... WAY more than 10x the possibilities for chess. And the difference gets exponentially bigger with each additional ply.
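
To put rough numbers on that growth (taking, purely for illustration, an average game length of N = 80 plies):

```
# Compare the game-tree sizes implied by the per-ply figures above, for an assumed N.
import math

N = 80                                  # illustrative average number of plies per game
chess = N * math.log10(35)              # log10 of 35^N  -> about 124
option = N * math.log10(335)            # log10 of 335^N -> about 202
ratio = N * math.log10(335 / 35)        # log10 of their ratio -> about 78

print(round(chess), round(option), round(ratio))
# roughly 10^124 vs 10^202 possible games: about 10^78 times more, not 10x more
```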

                  So what we have here is a possible games graph where, if the symbol "<<<" means "exponentially less than", you would have

                  chess <<< Option Chess <<< Go

                  So despite your inaccurate math, your main point is still valid (if your figure for possible games in Go is correct, I am reasonably sure you aren't just making it up).

                  My point is that even though Option Chess <<< Go, the overall complexity of Option Chess is >>> Go. You would argue the opposite, but it is a subjective argument that neither of us can win.

                  So we can only await the day when an AZ engine can play Option Chess AND can have an opponent that is "GM Level" or better at Option Chess. Since no one is playing option chess at the moment, that 2nd condition may be many years away. However, I am going to get on TalkChess.com and see if any engine programmers are interested in writing a conventional brute force engine for Option chess.

                  Oh, and Sid, the "big money online chess variant" isn't going to be Option chess. It's a totally different variant where there is way more hidden information than in Option chess, plus there is randomness like in poker. I think this would be the biggest challenge ever for AZ, and it would fail to dominate and might not even do well at all.

Finally, we need AZ to play a fully optimized version of Stockfish and see the results from that. Keep in mind that for such a match, Stockfish will be playing MUCH better, whereas AZ is already at virtual peak playing ability and won't be improving by any measurable amount (maybe a few tens of Elo points). If they play to a virtually even result, my point that AZ won't dominate is already well on its way to being proven.

                  The learning part is remarkable, though.
                  Only the rushing is heard...
                  Onward flies the bird.



                  • #54
                    Re: Chess Mastered with a Self-Play Algorithm

                    Originally posted by Paul Bonham View Post
                    Sid, I thought you were better at math than what you are showing in this discussion. The 335 possible moves per ply for Option Chess is 10x the 35 moves per ply for chess. But that is PER PLY! So the overall possibilities are far, far more than 10x.

In chess you have 35 x 35 x 35 ... x 35, N times, where N is the average number of plies per game.
That is 35^N.
In Option Chess it is 335 x 335 x 335 ... x 335, which is 335^N.

                    In just 2 plies, you have 35 x 35 = 1225 possibilities for chess, and you have 335 x 335 = 112,225 .... WAY more than 10x the possibilities for chess. And the difference gets exponentially bigger with each additional ply.

                    So what we have here is a possible games graph where, if the symbol "<<<" means "exponentially less than", you would have

                    chess <<< Option Chess <<< Go

                    So despite your inaccurate math, your main point is still valid (if your figure for possible games in Go is correct, I am reasonably sure you aren't just making it up).

                    My point is that even though Option Chess <<< Go, the overall complexity of Option Chess is >>> Go. You would argue the opposite, but it is a subjective argument that neither of us can win.

                    So we can only await the day when an AZ engine can play Option Chess AND can have an opponent that is "GM Level" or better at Option Chess. Since no one is playing option chess at the moment, that 2nd condition may be many years away. However, I am going to get on TalkChess.com and see if any engine programmers are interested in writing a conventional brute force engine for Option chess.

                    Oh, and Sid, the "big money online chess variant" isn't going to be Option chess. It's a totally different variant where there is way more hidden information than in Option chess, plus there is randomness like in poker. I think this would be the biggest challenge ever for AZ, and it would fail to dominate and might not even do well at all.

Finally, we need AZ to play a fully optimized version of Stockfish and see the results from that. Keep in mind that for such a match, Stockfish will be playing MUCH better, whereas AZ is already at virtual peak playing ability and won't be improving by any measurable amount (maybe a few tens of Elo points). If they play to a virtually even result, my point that AZ won't dominate is already well on its way to being proven.

                    The learning part is remarkable, though.
Yes, my apologies for the inaccurate math; I was getting a bit punch-drunk, as I sometimes have trouble sleeping. I don't think you really need to hire an engine programmer to do a brute-force model, as it probably would not work very well. I have been watching various open-source MCTS/neural-network projects on GitHub, and it may be easier to wait for one of those, modify the code for option chess, and train it on the shared TPU platform. I think you would be happier with the result. Just for the fun of it, I was going through this example, written in Java, of how to implement MCTS for tic-tac-toe:
                    http://www.baeldung.com/java-monte-carlo-tree-search.
I am not sure if AZ totally flatlined at 4 hours and that's why they stopped the training. It will be interesting to see. We live in exciting times!
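
The core loop is small enough to sketch. Here is a rough, self-contained Python version of the same idea (my own toy illustration, not the Baeldung code; the class and function names are made up for demonstration):

```
# Minimal MCTS (UCT) sketch for tic-tac-toe: selection, expansion, simulation, backpropagation.
import math
import random

class TicTacToe:
    def __init__(self, board=None, player=1):
        self.board = board or [0] * 9            # 0 = empty, 1 = X, -1 = O
        self.player = player                     # player to move

    def moves(self):
        return [i for i, v in enumerate(self.board) if v == 0]

    def play(self, i):
        b = self.board[:]
        b[i] = self.player
        return TicTacToe(b, -self.player)

    def winner(self):
        lines = [(0,1,2),(3,4,5),(6,7,8),(0,3,6),(1,4,7),(2,5,8),(0,4,8),(2,4,6)]
        for a, b, c in lines:
            if self.board[a] != 0 and self.board[a] == self.board[b] == self.board[c]:
                return self.board[a]
        return 0

    def terminal(self):
        return self.winner() != 0 or not self.moves()

class Node:
    def __init__(self, state, parent=None, move=None):
        self.state, self.parent, self.move = state, parent, move
        self.children, self.visits, self.wins = [], 0, 0.0

def uct_search(root_state, iterations=2000, c=1.4):
    root = Node(root_state)
    for _ in range(iterations):
        node = root
        # 1. Selection: descend through fully expanded nodes using the UCT formula.
        while node.children and len(node.children) == len(node.state.moves()):
            node = max(node.children, key=lambda n: n.wins / n.visits
                       + c * math.sqrt(math.log(node.visits) / n.visits))
        # 2. Expansion: add one untried child, unless the position is already over.
        if not node.state.terminal():
            tried = {ch.move for ch in node.children}
            move = random.choice([m for m in node.state.moves() if m not in tried])
            child = Node(node.state.play(move), node, move)
            node.children.append(child)
            node = child
        # 3. Simulation: random playout to the end of the game.
        state = node.state
        while not state.terminal():
            state = state.play(random.choice(state.moves()))
        result = state.winner()
        # 4. Backpropagation: credit each node from the viewpoint of the player who moved into it.
        while node:
            node.visits += 1
            if result == -node.state.player:
                node.wins += 1.0
            elif result == 0:
                node.wins += 0.5
            node = node.parent
    return max(root.children, key=lambda n: n.visits).move

print("MCTS opening move for X:", uct_search(TicTacToe()))
```

An AlphaZero-style program keeps the same outer loop but replaces the random playout with a neural network's value estimate and biases the selection step with the network's move probabilities.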



                    • #55
                      Re: Chess Mastered with a Self-Play Algorithm

Has the author team commented on their paper? I would love to know the prospects of the chess project.



                      • #56
                        Re: Chess Mastered with a Self-Play Algorithm

Looking at pictures of the Google "computer" reminds me of those 50-year-old computers:
                        https://www.nextplatform.com/2017/05...ning-clusters/



                        • #57
                          Re: Chess Mastered with a Self-Play Algorithm

                          Originally posted by Sid Belzberg View Post
Above is the main point that is germane to this discussion. Perhaps the law of diminishing returns that you cite with respect to training time is another relevant point. The rest of your long-winded post appears to do nothing more than attempt to validate your technical credentials and present yourself as an altruistic person.

The only technical argument I have seen from you as to why the poker results are not convincing is that the sample size is not large enough and therefore not statistically significant. However, the trend over the several years prior to the publication of this paper has been a steady improvement in AI-driven poker play, as cited in the paper.

The paper is authored by credible researchers in this area and the results are indeed compelling. Although you may want to believe the results are a statistical anomaly, the paper itself presents good mathematical arguments as to why they are statistically significant.

I would agree that this technology is not a panacea for all problems involving incomplete information, such as playing poker at a table of nine players as opposed to heads-up matches.

As far as options trading is concerned, or for that matter bond or equity trading at an institutional level, the main problem traders often face is trying to guess how large a buy or sell order they are up against when trying to predict the price movement.

The friend I mentioned earlier is doing research for a very large bank on deploying the MCTS/neural-network approach to identify trading patterns that are predictive of this. It is early days, but the results of his research are interesting.

I think you are approaching the problem of online anti-cheating in computer chess from the wrong angle. Together with some co-authors, I will be publishing a white paper in the next few days that, among other things, addresses the problem of controlling secure, standardized software deployment and decentralized randomization techniques. It could have some applicability to online gaming. I will share the link when it is up.
The white paper I mentioned is now posted on our website: https://docs.wixstatic.com/ugd/14667...75d6bcfce9.pdf. Our website URL is ig17.xyz.
                          Last edited by Sid Belzberg; Saturday, 16th December, 2017, 08:22 PM.



                          • #58
                            Re: Chess Mastered with a Self-Play Algorithm

                            Chess Mastered with a Self-Play Algorithm

                            December 19, 2017

                            Ken Regan has weighed in on AlphaZero:

                            https://rjlipton.wordpress.com/2017/...uth-from-zero/

                            Truth From Zero

                            Extracts:

                            Although AlphaZero’s 64-36 margin over Stockfish looks like a shellacking, it amounts to only 100 points difference on the Elo scale. The scale was built around the idea that a 200-point difference corresponds to about 75% expectation for the stronger player—and this applies to all games. Higher gains become multiplicatively harder to achieve and maintain. This makes the huge margins in Go and Shogi all the more remarkable.

                            There has been widespread criticism of the way Stockfish was configured for the match. Stockfish was given 1 minute per move regardless of whether it was an obvious recapture or a critical moment. It played without its customary opening book or endgame tables of perfect play with 6 or fewer pieces. The 64 core threads it was given were ample hardware but they communicated position evaluations via a hash table of only one gigabyte, a lack said to harm the accuracy of deeper searches. However hobbled, what stands out is that Stockfish still drew almost three-fourths of the games, including exactly half the games it played Black.

                            I have fielded numerous queries these past two weeks about how this affects my estimate that perfect play in chess is rated no higher than 3500 or 3600, which many others consider low. Although the “rating of God” moniker is played up for popular attention, it really is a vital component in my model: it is the {Y}-intercept of regressions of player skill versus model parameters and inputs. I’ve justified it intuitively by postulating that slightly randomized versions of today’s champion programs could score at least 10–15% against any strategy. I regard the ratings used for the TCEC championships as truer to the human scale than the CCRL ratings. TCEC currently rates the latest Stockfish version at 3226, then 3224 for Komodo and 3192 for the Houdini version that won the just-completed 10th TCEC championship. CCRL shows all of Houdini, Komodo, and an assembly-coded version of Stockfish above 3400. Using the TCEC ratings and the

Ken proposes possible experiments for AlphaZero – one is on test positions for which perfect answers are known.

                            See the paper and the 35 comments that follow.
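
As a quick check on the Elo arithmetic in the first extract, using the standard logistic Elo expectation formula (a back-of-the-envelope sketch, not anything from Regan's model):

```
# Expected score and rating difference under the standard (logistic) Elo model.
import math

def expected_score(diff):
    """Expected score for the stronger player at a rating advantage of `diff` points."""
    return 1.0 / (1.0 + 10 ** (-diff / 400.0))

def rating_diff(score):
    """Rating difference implied by an expected score (inverse of the above)."""
    return 400.0 * math.log10(score / (1.0 - score))

print(round(rating_diff(0.64)))        # AlphaZero's 64% score ~ a 100-point edge
print(round(expected_score(200), 2))   # a 200-point edge ~ 0.76, roughly the 75% rule of thumb
```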



                            • #59
                              Re: Chess Mastered with a Self-Play Algorithm

In https://www.youtube.com/watch?v=DXNqYSNvnjA Demis Hassabis talks about the work done at DeepMind, the company he founded, on game playing using machine learning. For a chess player, the parts from minute 22:07 onwards are most relevant. I think that the section from the start of the video up to minute 9:14 is also worth watching, since it succinctly summarizes his overall philosophy. This research involves reinforcement learning, where an agent interacts with an environment and gets feedback from it. DeepMind has combined deep convolutional networks with reinforcement learning, a combination which is in itself novel and is called deep reinforcement learning. They were not the first people to use either approach; it is the effective combination of the two which is their main contribution. In my field of computer vision, where the goal is to make computers “see” like humans, deep convolutional networks have already had a big impact. Very quickly and surprisingly, they have outperformed the best hand-crafted systems created by computer vision experts on a variety of tasks. These results can now be easily duplicated by basically anyone who can follow instructions and has a good graphics card in their computer.

However, DeepMind is a private company, and while they do publish papers, they do not provide source code for their experiments, as is now done by most researchers at universities. Their earlier experiments involving deep reinforcement learning for playing video games (such as Breakout or Pong) can now be easily duplicated and work well. However, reinforcement learning was thought to be best suited for robotics applications (such as driving), where the feedback is very fast. It is very surprising that this approach works so well for games like Go and chess, which have a much longer horizon, and until they release their source code there will be questions about this research.

In deep learning the training process is around one hundred times slower than executing the final network, and a lot of work is being done to make this execution process efficient, even for large networks. This means that any neural network trained to play Go, chess, etc. will eventually be able to run on home computers, even if it cannot initially be trained on such machines. This is already the case for deep convolutional networks of all kinds. As I said, it is possible for anyone with a graphics card to execute and even retrain a deep convolutional neural network using what is called transfer learning. This means, for example, that it should be possible to adjust or tweak a neural network to play chess in a way that mimics a particular grandmaster, as long as there are enough games available for the training process. This is not so easy to do with traditional chess-playing programs. Of course this assumes that the DeepMind approach works as well as they claim.
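
To make that concrete, here is a rough sketch of what such fine-tuning might look like in PyTorch (my own illustration, not DeepMind's code; the network, the move encoding, the checkpoint name and the data are all invented for demonstration). The idea is simply to freeze the early layers and retrain only the final layer on the new games.

```
# Hypothetical transfer-learning sketch: fine-tune only the head of a "pretrained" policy net.
import torch
import torch.nn as nn

class PolicyNet(nn.Module):
    """Stand-in for a pretrained chess policy network over 8x8 boards with 12 piece planes."""
    def __init__(self, n_moves=4672):                     # 4672: an AlphaZero-style move encoding
        super().__init__()
        self.trunk = nn.Sequential(
            nn.Conv2d(12, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(),
        )
        self.head = nn.Linear(64 * 8 * 8, n_moves)

    def forward(self, x):
        return self.head(self.trunk(x).flatten(1))

model = PolicyNet()
# model.load_state_dict(torch.load("pretrained_policy.pt"))   # hypothetical checkpoint

# Freeze the trunk; only the final layer is updated during fine-tuning.
for p in model.trunk.parameters():
    p.requires_grad = False

optimizer = torch.optim.Adam(model.head.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

# One illustrative training step on a fake batch of 32 "grandmaster" positions and moves.
boards = torch.randn(32, 12, 8, 8)
target_moves = torch.randint(0, 4672, (32,))
optimizer.zero_grad()
loss = loss_fn(model(boards), target_moves)
loss.backward()
optimizer.step()
print("fine-tuning loss:", loss.item())
```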

In summary, almost no computer scientists in the world predicted this degree of success for any type of deep learning neural network approach ten years ago. In problem after problem these networks have outperformed customized algorithms created by domain experts over many years. They have done so using relatively simple algorithms running on very fast computers, trained with large amounts of data. All these factors (algorithms, computer power, amount of training data) are still improving rapidly. While these networks do not show general intelligence, I believe that they will eventually be “intelligent” enough to create robots that perform a wide variety of tasks such as driving and walking, basically interacting with their environments in a human-like fashion. Making them play games is just a step on the way to creating such robots.

