A Numerical Look at Chess Perfection


  • A Numerical Look at Chess Perfection

    With all the talk in another thread about AlphaZero and Stockfish and a paradigm shift, I want to present a "numerical look" at chess perfection. This will be somewhat philosophical.

    So let's play God for a moment, and imagine that we have in front of us the total chess search tree. Every possible variation from the opening position is in there. It's a tree structure, so you can go back and forth along the branches, or jump to any node in the tree at will. All branches terminate with a result, so every move has a score based on how many wins for the moving side it leads to, how many losses for the moving side it leads to, and how many draws it leads to.

    Ok, so let's make a few assumptions:
    (1) the average number of choices at any ply of a chess game is 35.
    (2) the average length of a chess game between any 2 chess engines, not including opening book or endgame tablebase moves, is 80 plies for each engine.

    What we are going to try and do is "score" chess perfection. Not a rating, but a "score" based on the following:

    From assumption (1), if we sort all 35 average choices on a given ply based on the value each move has in God's chess search tree, then we assign a perfection score to each move where the top (best) move gets 35 points, the 2nd best move gets 34 points, 3rd best move gets 33 points, all the way down to worst move gets 1 point. The only value these points have is in assessing how perfect a person or engine is playing chess.

    If we had a perfect chess engine that played the best move on each and every ply, guaranteed, and it played the average of 80 plies per game, its perfection score would be 80 x 35 points = 2800 points (don't confuse this with 2800 Elo; it is not at all the same thing).

    So 2800 points is a typical chess perfection score for a typical 80 ply game.

    Now let's say we have an engine that plays 2nd best move on every ply, guaranteed. Its chess perfection score would be 80 x 34 points = 2720 points. This engine is playing at a perfection rate of 2720/2800 * 100 = 97.143% rounded to three decimal places. Of course, you'd get the same result by doing 34/35 * 100.

    If a person or engine played the worst possible move on every ply, guaranteed, it would earn 1 point per ply, for the lowest possible perfection score over 80 plies: 80 x 1 point = 80 points. That assumes it could still last 80 plies in a typical game; in reality it would last maybe 20, so let's be realistic and give it a perfection score of 20 x 1 point = 20 points.
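    The scoring scheme described so far can be sketched in a few lines of Python. This is only an illustration under the two assumptions above (35 choices per ply, 80 plies per game); the function names perfection_score and perfection_rate are made up for this sketch and do not come from any chess library.

```python
def perfection_score(rank: int, plies: int = 80, choices: int = 35) -> int:
    """Score for a player that always plays the move ranked `rank` (1 = best).

    Assumes a constant number of choices on every ply, which is the
    simplifying assumption (1) from the post, not a fact about real games.
    """
    if not 1 <= rank <= choices:
        raise ValueError("rank must be between 1 and choices")
    points_per_ply = choices - rank + 1  # best move = 35 points, worst = 1
    return plies * points_per_ply


def perfection_rate(rank: int, plies: int = 80, choices: int = 35) -> float:
    """Score as a percentage of the perfect score over the same number of plies."""
    best = perfection_score(1, plies, choices)
    return 100.0 * perfection_score(rank, plies, choices) / best


if __name__ == "__main__":
    for rank in (1, 2, 35):
        print(rank, perfection_score(rank), round(perfection_rate(rank), 3))
    # The "realistic" worst player that only survives 20 plies:
    print(perfection_score(35, plies=20))
```

    Note that the percentage depends only on the rank, not on the game length, which is why 34/35 gives the same figure as 2720/2800.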

    Now here's where things get interesting: where on this scale between 20 points (worst) and 2800 points (best) do you think the current version of Stockfish 8 would be, playing on a typical computer with a quad-core i7 and 16 GB RAM, with opening book and endgame tablebase enabled? (The book and tablebase are not relevant here, because moves taken from them don't count toward the perfection score.)

    Would it be maybe equivalent to a theoretical engine that always plays 2nd best move guaranteed on every ply? Would it be 97.143% perfect?

    Here's why I ask this question: there is a limit to absolutely perfect chess, which I think a lot of people are ignoring. And if we are to believe that AlphaZero scored 28 wins to 0 losses in 100 games against that version of Stockfish on that hardware, then we must believe that this version of Stockfish on that hardware is very far down in perfection score, maybe equivalent to playing only the 4th best move on every ply.

    But when we look at Stockfish's moves versus other engines, can we really believe it is playing only the 4th best move on every ply? That would mean on every ply, there are always 3 better moves!

    But of course, these are averages. Stockfish would perhaps be playing absolute best move on say half the plies. But that means on the other half, it must be playing only the 7th or 8th best move. Does that really sound believable?
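    The averaging argument above can be checked with a little arithmetic. The sketch below assumes, as the post does, that the engine's average move rank is 4 and that some share of its plies are rank-1 (best) moves; the helper name required_other_rank is hypothetical and exists only for this illustration.

```python
def required_other_rank(target_avg_rank: float, share_best: float) -> float:
    """Average rank needed on the non-best plies to hit the target average.

    Solves: share_best * 1 + (1 - share_best) * other = target_avg_rank
    """
    if not 0 <= share_best < 1:
        raise ValueError("share_best must be in [0, 1)")
    return (target_avg_rank - share_best * 1) / (1 - share_best)


if __name__ == "__main__":
    # Share of plies on which the best move is played, average rank fixed at 4.
    for share in (0.0, 0.25, 0.5, 0.6):
        print(share, round(required_other_rank(4, share), 2))
```

    With exactly half the plies at rank 1, the other half must average rank 7. If the share of best moves rises to 60%, the remaining plies must average rank 8.5, so the "7th or 8th best move" estimate corresponds to the engine finding the best move on roughly half to a bit more than half of its plies.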

    It could be playing 7th or 8th best moves in cases where the top 7 or 8 moves are almost identical in score. Remember, the move scores come from a chess tree with more positions than there are atoms in our universe. So these move scores will not be identical; two of them would almost never tie to infinite decimal places. But they could be so close that we could say they are VIRTUALLY identical.

    How often would this come up in a typical game, that there are 7 or 8 moves for the side to move that are just about identical in value on God's search tree? As much as 40 plies of an 80 ply game?

    You can see why I warned this is a philosophical approach: there are no hard and fast answers to these questions. I'm just presenting another way of looking at the whole question of where current engines may be on the path to absolute chess perfection. Up to now we have largely assumed that today's alpha-beta brute-force engines are very close to such perfection.

    If AlphaZero is really that much better, enough to win 28 out of 28 decisive games against Stockfish at or near its best, then it challenges that viewpoint. And THAT means that you should be able to go through a typical Stockfish 8 game played on typical hardware as described, and find a few dozen plies where it played the 5th, 6th, 7th or 8th best move. I wonder if anyone could actually do that: identify those plies, and prove with analysis that on each of them there were a handful of slightly better or at least equal moves that Stockfish did not play.
    Only the rushing is heard...
    Onward flies the bird.

  • #2
    Re: A Numerical Look at Chess Perfection

    Stockfish is designed primarily to play against human-style chess or against other engines also designed this way. If we accept your "perfection score" premise, it certainly could be possible that it's playing the 7th or 8th best move against moves that it's not really designed to cope with.

    One problem with your assumptions is that sometimes the 2nd-best move loses.

    You could in theory have an engine with a perfection score of 99.99% that had a huge blind spot and loses almost every game.
    Last edited by Christopher Mallon; Friday, 8th December, 2017, 08:17 PM.
    Christopher Mallon
    FIDE Arbiter



    • #3
      Re: A Numerical Look at Chess Perfection

      Originally posted by Christopher Mallon
      Stockfish is designed primarily to play against human-style chess or against other engines also designed this way. If we accept your "perfection score" premise, it certainly could be possible that it's playing the 7th or 8th best move against moves that it's not really designed to cope with.

      One problem with your assumptions is that sometimes the 2nd-best move loses.

      You could in theory have an engine with a perfection score of 99.99% that had a huge blind spot and loses almost every game.
      Agreed. The real question is... why do we even bother?



      • #4
        Re: A Numerical Look at Chess Perfection

        Originally posted by Christopher Mallon
        Stockfish is designed primarily to play against human-style chess or against other engines also designed this way. If we accept your "perfection score" premise, it certainly could be possible that it's playing the 7th or 8th best move against moves that it's not really designed to cope with.
        Hmmm.... well, first of all, no engine I know of actually classifies the last move that was played into a certain "type" of move and alters its strategy based on that "type" of move. The engines don't even care what the last move was, they are analyzing the current POSITION.

        So you would be saying Stockfish can only play well from certain types of POSITIONS, and if you put Stockfish consistently into "other types" of positions, it will respond with only 7th or 8th best move. An interesting theory, but my question would be, why hasn't any human player discovered what these "other types" of positions are and won consistently against Stockfish?

        Well, we do have Lyudmil Tsvetkov posting here. He has written a book making basically the same argument, and he says he has actually accomplished this and can beat the top engines very frequently. He's a regular poster at TalkChess.com (ChessTalk flipped around), a site devoted mostly to computer chess programming.

        If his claims are verifiably correct, then you may have a point.



        Originally posted by Christopher Mallon
        One problem with your assumptions is that sometimes the 2nd-best move loses.

        You could in theory have an engine with a perfection score of 99.99% that had a huge blind spot and loses almost every game.
        I am not saying there is an engine in the real world that always plays the 2nd best move, guaranteed. I only used that as a prop to illustrate an engine that is "one level" down from the perfect chess engine. Every non-perfect chess engine would mix best moves with 2nd best moves, 3rd best moves, and so on.

        But yes, if in theory there was an engine that always played 2nd best moves, it would once in a blue moon lose a game here and there spectacularly, with an outright blunder.
        Only the rushing is heard...
        Onward flies the bird.
