Poker, Chess and AI

Collapse
X
 
  • Filter
  • Time
  • Show
Clear All
new posts

  • Poker, Chess and AI

    Poker, Chess and AI

    From the Wall Street Journal, Jan. 8, 2015

    http://www.wsj.com/articles/computer...say-1420743623

    Computer Conquers Texas Hold ‘Em, Researchers Say Advances in Artificial Intelligence Allow Program to Play Poker Almost Perfectly, New Paper Asserts

    ROBERT LEE HOTZ

    Artificial intelligence experts said they have developed a computer card shark that plays poker almost perfectly, having mastered a version of a popular game called Texas Hold ’Em.

    While playful at heart, their advance in the computational mathematics of game theory may lead to broader innovations in military strategy, national security, medical decision-making, complex contract negotiations and auctions, experts said.
    The basic poker game “has been essentially solved, ” said Tuomas Sandholm, director of the Electronic Marketplaces Laboratory at Carnegie Mellon University in Pittsburgh, whose own poker-playing program won a 2014 world championship. He wasn’t involved in the project. “This is a breakthrough. ”

    In recent years, high-powered computer programs have outmatched top human players in chess, checkers, Scrabble and the quiz game Jeopardy, but the uncertainties of poker—where so much information about the state of play is hidden—until now had defied the efforts of dozens of research teams.

    _______

    In their work announced Thursday, researchers led by Michael Bowling at Canada’s University of Alberta Computer Poker Research Group in Edmonton tackled a two-player variant of that game called “heads-up limit hold ’em” in which the size of bets and the number of rounds of play during each hand are fixed. It is the simplest version of the game that people usually play. Scientists are still working to solve more complex variations of the poker game. Their Cepheus program not only outplays skilled human players in a fair game but always plays essentially a perfect game.

    ________

    Earlier Computer Successes

    Backgammon. In 1979, a backgammon-playing program defeated world champion Luigi Villa, the first time that a computer program beat a human champion at a board game.

    Chess. In 1997, a chess-playing computer developed by IBM called Deep Blue won a 6-game match against world champion Garry Kasparov.

    Checkers. In 1994, a checkers-playing computer program called Chinook was declared the Man-Machine World Champion in checkers.

    Scrabble. In 2007, a computer program called Quackle beat former Scrabble world champion David Boys.

    Go. In 2010, a program called MoGoTW running on a supercomputer defeated a professional Go player.

    Jeopardy! In 2011, an IBM computer system called Watson beat two human experts to win the quiz show’s first prize of $1 million.

  • #2
    Re: Poker, Chess and AI

    Computers have only beaten strong ("professional") human Go players on 9 x 9 boards, or on the regular 19 x 19 boards with handicaps.

    http://en.wikipedia.org/wiki/Computer_Go

    Comment


    • #3
      Re: Poker, Chess and AI

      Poker, Chess and AI

      The Wall Street Journal report is behind a paywall. These give much the same information:

      http://www.livescience.com/49376-com...ves-poker.html

      http://www.torontosun.com/2015/01/08...-texas-hold-em

      Comment


      • #4
        Re: Poker, Chess and AI

        Originally posted by Wayne Komer View Post
        Poker, Chess and AI

        The Wall Street Journal report is behind a paywall. These give much the same information:

        http://www.livescience.com/49376-com...ves-poker.html

        http://www.torontosun.com/2015/01/08...-texas-hold-em

        Wayne, it should be emphasized the rather severe limitations on this claim of "solving" poker: it is limited to the computer versus 1 other player, and the game being played is LIMIT Texas Hold 'Em. This means the size of the bets are limited, neither player can go all-in any time they want.

        It's also notable how they word their claim: the program CAN beat a human at this particular game. The word 'always' is missing, so we are left to wonder how often this would happen in reality.

        More importantly, I looked at all of the articles and none of them explain exactly how a human opponent was simulated. And to make their claim of "solving" this game totally unprovable, here's how they define "solving" it (from the livescience link you provided):

        "Because poker isn't solvable the way chess or checkers is, Bowling and his team came up with a different set of requirements for calling the game "solved." In scientific terms, the game is "essentially solved," which means that there is a way to exploit the strategy the computer uses. The researchers assumed a person played the computer for 70 years, 365 days per year, for 24 hours a day. The program they wrote played so well that if the big blind — the fixed bet — is $1,000, the most a perfect player can win is about $1 per hand, or 1/1000 of the big blind."

        That last statement is unclear: is it the computer program or is it a 'perfect' human player who can win a maximum of $1 per hand?

        There must be some mathematics proving this claim, and it would be great to see that. I am very skeptical on this claim. How would they simulate a human player? How does a mathematical equation take into account that during that 70 years of play, an ACTUAL human poker player would learn the tendencies of the program and adjust his or her play accordingly?

        That's what good poker players do. They change their play according to what they learn about their opponent. I didn't see anything in the articles claiming that this program has any such learn-opponent-tendencies-and-adjust-play-accordingly capabilities. Without that, I don't think any computer program could IN REALITY come out on top of a human player over that long a match, or even much less. Maybe over a few hours it CAN beat a human player. That's far from even just this one version of poker being "solved".
        Only the rushing is heard...
        Onward flies the bird.

        Comment


        • #5
          Re: Poker, Chess and AI

          Originally posted by Paul Bonham View Post
          Wayne, it should be emphasized the rather severe limitations on this claim of "solving" poker: it is limited to the computer versus 1 other player, and the game being played is LIMIT Texas Hold 'Em. This means the size of the bets are limited, neither player can go all-in any time they want.

          It's also notable how they word their claim: the program CAN beat a human at this particular game. The word 'always' is missing, so we are left to wonder how often this would happen in reality.

          More importantly, I looked at all of the articles and none of them explain exactly how a human opponent was simulated. And to make their claim of "solving" this game totally unprovable, here's how they define "solving" it (from the livescience link you provided):

          "Because poker isn't solvable the way chess or checkers is, Bowling and his team came up with a different set of requirements for calling the game "solved." In scientific terms, the game is "essentially solved," which means that there is a way to exploit the strategy the computer uses. The researchers assumed a person played the computer for 70 years, 365 days per year, for 24 hours a day. The program they wrote played so well that if the big blind — the fixed bet — is $1,000, the most a perfect player can win is about $1 per hand, or 1/1000 of the big blind."

          That last statement is unclear: is it the computer program or is it a 'perfect' human player who can win a maximum of $1 per hand?

          There must be some mathematics proving this claim, and it would be great to see that. I am very skeptical on this claim. How would they simulate a human player? How does a mathematical equation take into account that during that 70 years of play, an ACTUAL human poker player would learn the tendencies of the program and adjust his or her play accordingly?

          That's what good poker players do. They change their play according to what they learn about their opponent. I didn't see anything in the articles claiming that this program has any such learn-opponent-tendencies-and-adjust-play-accordingly capabilities. Without that, I don't think any computer program could IN REALITY come out on top of a human player over that long a match, or even much less. Maybe over a few hours it CAN beat a human player. That's far from even just this one version of poker being "solved".
          The computer doesn't need to do any adjustments to crush. It just plays a strategy that is close enough to unexploitable that it doesn't really matter what the human does or doesn't do. People generally make tons of theoretically bad plays, and a strategy that is close to unexploitable will win at an extremely high rate in practice.

          Comment


          • #6
            Re: Poker, Chess and AI

            Originally posted by Paul Bonham View Post
            Wayne, it should be emphasized the rather severe limitations on this claim of "solving" poker: it is limited to the computer versus 1 other player, and the game being played is LIMIT Texas Hold 'Em. This means the size of the bets are limited, neither player can go all-in any time they want.
            Silly human, did you think poker is immune to Moore's law?
            Last edited by Vlad Drkulec; Friday, 9th January, 2015, 03:05 AM.

            Comment


            • #7
              Re: Poker, Chess and AI

              Originally posted by Digeng Du View Post
              The computer doesn't need to do any adjustments to crush. It just plays a strategy that is close enough to unexploitable that it doesn't really matter what the human does or doesn't do. People generally make tons of theoretically bad plays, and a strategy that is close to unexploitable will win at an extremely high rate in practice.
              Well, I don't think anyone used the word "crush". But besides that, I would need to see a pretty good definition, complete with multiple scenarios, of "unexploitable".

              Also, I saw another writeup on this on phys.org and it appears that the authors refer to winning only in the context of winning more hands. Apparently the human could still win more chips by maximizing the pot values of the hands s/he does win.

              So the authors have redefined "solving poker" and they have also redefined "winning" in terms of a heads-up match. Winning is now winning more hands rather than winning more chips.
              Only the rushing is heard...
              Onward flies the bird.

              Comment


              • #8
                Re: Poker, Chess and AI

                Originally posted by Paul Bonham View Post
                Well, I don't think anyone used the word "crush". But besides that, I would need to see a pretty good definition, complete with multiple scenarios, of "unexploitable".

                Also, I saw another writeup on this on phys.org and it appears that the authors refer to winning only in the context of winning more hands. Apparently the human could still win more chips by maximizing the pot values of the hands s/he does win.

                So the authors have redefined "solving poker" and they have also redefined "winning" in terms of a heads-up match. Winning is now winning more hands rather than winning more chips.
                Unexploitable means exactly what it sounds like... no matter how you play against me, you'll lose in the long run. Adjustments wont save you (in fact, they'll probably hurt you). Unexploitable doesn't mean "best"... maybe if we played, you would fold too much and I should punish you with aggression. It just means "best" against the set of ALL possible strategies you could choose, and that you can't beat me in the long term with any of those strategies.

                There's tons of literature on what unexploitable play looks like in simplistic scenarios that you shouldn't have trouble finding (almost all of it in short-stacked, short-handed play). Here, I found you a couple relatively simple examples. Math incoming:

                http://www.bluff.com/magazine/unexpl...o-use-it-6801/
                http://ca.pokernews.com/strategy/gam...ld-em-5092.htm

                It's really not that hard to imagine a computer could come close to playing like this in a ton of heads-up situations... and if a computer could play in a completely game theory optimal fashion, it would absolutely destroy a human for a massive win-rate. We're just too human :)

                Comment


                • #9
                  Re: Poker, Chess and AI

                  Originally posted by Tyler Longo View Post
                  Unexploitable means exactly what it sounds like... no matter how you play against me, you'll lose in the long run. Adjustments wont save you (in fact, they'll probably hurt you). Unexploitable doesn't mean "best"... maybe if we played, you would fold too much and I should punish you with aggression. It just means "best" against the set of ALL possible strategies you could choose, and that you can't beat me in the long term with any of those strategies.

                  There's tons of literature on what unexploitable play looks like in simplistic scenarios that you shouldn't have trouble finding (almost all of it in short-stacked, short-handed play). Here, I found you a couple relatively simple examples. Math incoming:

                  http://www.bluff.com/magazine/unexpl...o-use-it-6801/
                  http://ca.pokernews.com/strategy/gam...ld-em-5092.htm

                  It's really not that hard to imagine a computer could come close to playing like this in a ton of heads-up situations... and if a computer could play in a completely game theory optimal fashion, it would absolutely destroy a human for a massive win-rate. We're just too human :)

                  Tyler, thanks for those links. So, in the first link, right in the first paragraph there is this example of an 'unexploitable' strategy:

                  "An unexploitable strategy for rock-paper-scissors is to throw rock 1/3 of the time, paper 1/3 of the time, and scissors 1/3 of the time, because there is no strategy your opponent can devise that would cause you to lose in the long term."

                  Now think about that: if you go out and play rock-paper-scissors and use that strategy, could you lose? More importantly, could you lose over and over? Absolutely yes to both questions. Your opponent could find a 'tell' in your play. Highly unlikely, but possible and not accounted for. The strategy doesn't work if you give away information, and until you figure out what that tell is, you are doomed.

                  So the word 'unexploitable' is misused in that example. The strategy isn't unexploitable, because it doesn't account for other factors in the game that make your play knowable.

                  (By the way, the second link uses the same rock-paper-scissors analogy, and also says nothing about preknowledge gained by tells.)

                  So how does this translate to this poker program? Well, let's take the assumptions in the first link. One of the big assumptions is that by raising all-in with A-5 in the small blind, the big blind is going to fold with any hand NOT in the range of [any pair or A-5+].

                  But this is an assumption that some human player could undermine! The human player could figure out that the computer will only go all-in from the small blind (I know, this program plays limit only but because this example is taken from No-Limit I have to go with that to make my point) with a range of A-5+. So the human decides to narrow his or her range for calling. Let's say s/he only calls with a pair of 10's or higher. This totally changes the equation, but the computer can't detect it and continues on with the assumption that human will call with the range that has been hard-coded into it. Now, the human is folding much more often, and calling with much higher quality hands. I don't know what this new equation works out to, but I think it's better for the human long-term, because s/he will be giving up many more small pots to gain more of the all-in pots.

                  The human knows the program is making only 'unexploitable' plays. S/he knows that all these plays are based on assumptions. The human reaction to this is to vary from the assumptions that make the play unexploitable and thus make it exploitable. I'm talking, of course, about a VERY good human poker player, not a fish. And we're talking about 70 straight years of 200 hands per hour for 12 hours each day (from http://poker-blog.srv.ualberta.ca/20...sequences.html). The human is going to learn a LOT about the program during that time!

                  All these contrived, mathematical examples have assumptions that a human player can figure out and adapt to. The computer can't adapt at all. Well, at least not according to the authors, who make no mention of any 'learning' algorithms in the program: "the dummy AI continues to play and evolve until it eventually arrives at a strategy which it no longer wishes to change, a ‘best’ strategy where any change would increase regret." So even as it plays, it won't change it's strategy. Any non-changing poker strategy is exploitable if it can be determined, just like my rock-paper-scissors example. And any good human player WOULD determine it.

                  The worst part is that the claims made by the authors are unprovable. "If a perfect opponent, either human or computerized, were to play a semi-infinite number of hands against Cepheus the best possible result would be for them to break even." What human is going to play a 'semi-infinite' number of hands against Cepheus? We're given a definition of 'semi-infinite' that requires 70 years of a human's life! In short, we're being asked to just believe this claim, and I for one do not believe it. Science demands better than this.

                  And voila, I am not alone in my views. Take a look at this:

                  http://www.theguardian.com/sport/201...t-cepheus-flaw

                  The author's points after playing 400 hands against Cepheus are exactly the points I've been making.
                  Only the rushing is heard...
                  Onward flies the bird.

                  Comment


                  • #10
                    Re: Poker, Chess and AI

                    Human does much worse just calling with T's plus vs calling with A5+ and any pair.
                    the reason should be obvious....

                    Comment


                    • #11
                      Re: Poker, Chess and AI

                      Originally posted by Digeng Du View Post
                      Human does much worse just calling with T's plus vs calling with A5+ and any pair.
                      the reason should be obvious....

                      Against that specific hand, A-5o, yes, but I am talking about against a RANGE of hands.

                      The example talks about you knowing the hand you are playing against as A-5, but the reality is, you NEVER know that. So what you do is (over N hands, maybe many hundreds or thousands of hands) you take note of what range of hands the program is using to go all-in before the flop from the small blind. Maybe A-5 is at the bottom of that range. But remember, you have an advantage in knowing that the program never varies. So whenever you see it go all-in before the flop after N hands, you KNOW it has a hand in this range, because you're a good player and you learned it from the program's play.

                      (Again, we're assuming No Limit because that's the context of the original example -- instead of "whenever you see it go all-in before the flop", you can substitute "whenever you see it max-raise before the flop" if you like.)

                      Let's just say the program's range for doing this is [A-5o to A-9o, A-2s to A-9s, 2-2 to 9-9]. I didn't pick that for any reason, just a random range.

                      I'm not enough of a poker player to know what range of hands you would have to have to call against that specific range. Against a human player, your range in this situation would vary, because predictability is the enemy of any good poker player. Against this program, it's more about just going against the assumptions of the program and you can do that consistently once you know the play of the program.

                      If you knew you were up against A-5o, then you would call with [A-5o+, 2-2+] and that's what the computer would EXPECT you to do for that hand. If you knew you were up against the upper end of the range, 9-9, you would call with [9-9 to A-A], a much tighter range but the range the program would expect you to call with. The median range you should call with might be denoted as Rmedian. A good player would be able to calculate Rmedian. That good player would then tighten his or her range in this situation, so that instead of always using Rmedian, s/he would be calling with a tighter range of Rtighter-than-median.

                      The result of THAT would be +EV for the human, I believe. If not, I'd have to see some math to prove it. The reason I think it would be is that the human would lose more 1BB by folding more often, but would win more 20BB by winning more of the calls at the higher end of the program's range. The 20-to-1 ratio decides it over many thousands of hands in favor of the human, if the range was tightened optimally. However, in Limit that ratio might be much smaller, so the tightening adjustment has to be less to compensate.

                      The program would lose over time because it can't adjust to the human's adjustment.

                      The article I linked to demonstrates this. The author noted that Cepheus was calling all his bluffs, and the author was losing chips. So the author adjusted by being more aggressive with good hands and stopped all bluffing. The program kept calling as it was against the bluffs, but now the author was winning more often and regained the chip lead. And the author is likely not even a really good player. Phil Ivey would decimate this program over many thousands of hands, imo.
                      Last edited by Paul Bonham; Friday, 9th January, 2015, 08:10 PM. Reason: spelling
                      Only the rushing is heard...
                      Onward flies the bird.

                      Comment


                      • #12
                        Re: Poker, Chess and AI

                        Even setting aside tells, etc. it would in large part depend on the SPR (Stack-to-Pot-Ratio). A non-exploitative strategy would vary depending upon how large the smallest stack size in the heads-up match was relative to the size of the pot before the all-in. The computer would have to perform the computation and then calculate or be given what the optimal range is for that SPR. Even knowing what the computer's non-exploitative range is in any specific situation, a person would not be able to do better against the computer relative to any other range chosen by a human or computer for that matter.

                        The article you cited where the author played a few hundred hands is waaaaay too small a sample size to draw any conclusions, imo.

                        As Tyler Longo pointed out, a savvy player might well do better than a computer vs a random player, but that doesn't change the fact that the computer's play is not exploitative, though of course it is not optimal in some circumstances.

                        If you think chess is dominated by opening memorization you can understand how all-in preflop bets and calls with small SPRs are even more based on memorizing a bunch of betting/calling ranges based on various SPRs. Most heads-up matches don't feature a lot of all-in preflop pots [Edit: at least at first when the blinds are really low]. Sit n' Goes (one table tournaments) when the table is down to two players are an exception to this rule because by that point the SPRs are often relatively low (for example you could have 1500 in the pot with the short stack having 5000 chips in front of him) and for those players knowing roughly the non-exploitative ranges are essential, imo.


                        Originally posted by Paul Bonham View Post
                        Let's just say the program's range for doing this is [A-5o to A-9o, A-2s to A-9s, 2-2 to 9-9]. I didn't pick that for any reason, just a random range.

                        I'm not enough of a poker player to know what range of hands you would have to have to call against that specific range.
                        Last edited by Tom O'Donnell; Friday, 9th January, 2015, 09:06 PM.
                        "Tom is a well known racist, and like most of them he won't admit it, possibly even to himself." - Ed Seedhouse, October 4, 2020.

                        Comment


                        • #13
                          Re: Poker, Chess and AI

                          A non-exploitative strategy for a computer in rock/paper/scissors: use a random number generator to create the next "throw". It's obviously not optimal, since it doesn't take into account any patterns the opponent has, but it is non-exploitative in the sense that there is no legal strategy that can be used against it to score more than 50 percent in the long run.
                          "Tom is a well known racist, and like most of them he won't admit it, possibly even to himself." - Ed Seedhouse, October 4, 2020.

                          Comment


                          • #14
                            Re: Poker, Chess and AI

                            Originally posted by Tom O'Donnell View Post
                            A non-exploitative strategy for a computer in rock/paper/scissors: use a random number generator to create the next "throw". It's obviously not optimal, since it doesn't take into account any patterns the opponent has, but it is non-exploitative in the sense that there is no legal strategy that can be used against it to score more than 50 percent in the long run.
                            computers / robots have already taken over rock / paper / scissors :-)

                            http://www.slate.com/blogs/future_te...me_video_.html

                            Comment


                            • #15
                              Re: Poker, Chess and AI

                              Originally posted by Tom O'Donnell View Post
                              The article you cited where the author played a few hundred hands is waaaaay too small a sample size to draw any conclusions, imo.

                              True, and the author admitted that in his article. But what to me was more telling than the overall result was the trending: the computer was winning early on and showed no inclination to fold to a bluff. The author took advantage of that very obvious trend and turned things totally around. An amateur player in just 400 hands! Maybe part of that was luck, and the author didn't include that bit of information, but I'm inclined to think something bigger was at work.

                              So Tom, do you buy into the claims of the authors of this program? Don't you find it suspicious that it's a claim that can never be proven? And if Phil Ivey was to soundly beat this program over say 10,000 hands.... (which we have to ask, how much money would it take to get him to even PLAY anywhere near that many hands against the program?).... the authors can always say it's too small a sample size. They need a sample size of 70 years playing 200 hands an hour for 12 hours a day, 365 days a year. I'm not making that up.

                              Maybe I should write a poker program and claim I only need a sample size of 10 years at 30 hands an hour for 8 hours a day for 313 days a year (one day off every weekend). Who is going to disprove it?

                              [edit]: and at these kind of sample sizes, no one is likely to ever get their losses back if they hit a real bad run in Vegas and lose their life savings because they played 'unexploitable' strategy.
                              Last edited by Paul Bonham; Friday, 9th January, 2015, 11:33 PM.
                              Only the rushing is heard...
                              Onward flies the bird.

                              Comment

                              Working...
                              X