Komodo


  • Komodo

    Komodo is the new computer World Champion and is now rated 3322, followed by Stockfish (3300) and Houdini (3297). Carlsen's 2862 rather pales by comparison and the gap continues to widen, now only 40 points from 500 ):

    http://www.extremetech.com/extreme/1...n-grandmasters

  • #2
    Re: Komodo

    Originally posted by Jack Maguire
    Komodo is the new computer World Champion and is now rated 3322, followed by Stockfish (3300) and Houdini (3297). Carlsen's 2862 rather pales by comparison and the gap continues to widen, now only 40 points from 500 ):
    But since the computers generally just play other computers and not humans they form, in effect, another rating pool. Thus there is no real statistical justification for knowing what the 500 point "gap" actually means. We know that the computers are better than the best humans pretty clearly, but we don't really know how much better they are until we have a sufficient sample of games between the best computers and the best humans.
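
    For reference, here is what the Elo formula would predict if the two rating pools were directly comparable, which is exactly the assumption questioned above. A minimal Python sketch: the function name `expected_score` is only illustrative, and the ratings are the ones quoted in the first post.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Standard Elo expected score for player A against player B."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))

# Ratings quoted in the thread; treating them as one pool is an assumption.
KOMODO = 3322
CARLSEN = 2862

if __name__ == "__main__":
    gap = KOMODO - CARLSEN
    print(f"gap = {gap} points")
    print(f"expected score for the engine: {expected_score(KOMODO, CARLSEN):.3f}")
```

    On a single pool, a gap of this size would correspond to an expected score of roughly 93% per game for the higher-rated side. Whether that carries over between the engine pool and the human pool is the open question.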


    • #3
      Re: Komodo

      Originally posted by Jack Maguire
      Komodo is the new computer World Champion and is now rated 3322, followed by Stockfish (3300) and Houdini (3297). Carlsen's 2862 rather pales by comparison and the gap continues to widen, now only 40 points from 500 ):

      http://www.extremetech.com/extreme/1...n-grandmasters

      I went to the TCEC archive site and tabulated some stats on the final match games of all 7 seasons (actually, 6 seasons: season 3 didn't appear to have a finals match, perhaps they used a RR to determine the winner, so I didn't include that season).

      A few interesting things:
      - 89 out of 304 final match games were decisive, for 29.3%
      - out of the 89 decisive games, White won 62 (69.7%)
      - from the Season 1 finals to the Season 4 finals, the time control was 150 minutes + 30 sec increment. But over this span, the decisive game % went from 42.5% (Season 1) down to 20.0% (Season 4). So from Season 5 on, they shortened the time control to 120 minutes + 30 sec increment. Decisive games jumped to 37.5% for Season 5, then dropped to 29.7% for Season 6, then dropped again to 17.2% for the latest season.

      That last point indicates that if time controls are kept constant, but the top 2 engines keep getting better and the hardware they run on keeps getting faster, then fewer and fewer games between those 2 engines become decisive. This seems to support my view that for typical tournament time controls, computer engines and hardware will reach a point, in maybe less than 10 years, at which no decisive games can be produced between the 2 (or perhaps more) top engines.

      The really interesting thing is that over the past 2 season finals, White has won 27 of the 30 decisive games. This also supports the above-stated view, because if smaller and smaller mistakes are being made, not only should decisive games shrink in number, but before they disappear totally, the remaining few should all go in favor of White because of White's built-in first-move advantage. Season 7 finals (64 games): all 11 decisive games were won by White.
      Only the rushing is heard...
      Onward flies the bird.


      • #4
        Re: Komodo

        I'm not 100% convinced that the data supports your hypothesis that if time controls are kept constant the number of draws increases. Correlation does not equal causality. I tend to doubt that a 20% decrease in time controls could directly cause that significant an increase in the number of decisive games. It could be that in season 5 some advancement was made in some algorithms leading to an increase in wins, and then the competing software teams caught up.

        To really prove your hypothesis we would need to run two tournaments with the same hardware at the different time controls, and see how the results compare.

        The recent dearth of wins by black is interesting. Based on your numbers, for the first 4 seasons white won 35 of 59 decisive games, roughly 60%, much closer to normal play in human games. It will be interesting to see if the more recent results are a statistical anomaly or a real trend.
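
        One rough way to check the "anomaly or trend" question is a one-sided binomial tail: if White's true share of decisive games were still 35/59, how often would White take at least 27 of 30? A minimal Python sketch: `binomial_tail` is an illustrative helper, and it assumes decisive games are independent with a fixed White share, which engine games played from a shared opening book may not satisfy.

```python
from math import comb

def binomial_tail(k: int, n: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1.0 - p) ** (n - i) for i in range(k, n + 1))

# Counts from the thread: White won 35 of 59 decisive games in the earlier
# finals and 27 of 30 in the last two.
baseline_share = 35 / 59
tail = binomial_tail(27, 30, baseline_share)

if __name__ == "__main__":
    print(f"baseline White share: {baseline_share:.3f}")
    print(f"P(White wins >= 27 of 30 decisive games): {tail:.6f}")
```

        Because the last two seasons were singled out after looking at the results, a small tail probability here overstates the evidence somewhat.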


        • #5
          Re: Komodo

          Originally posted by Garland Best
          I'm not 100% convinced that the data supports your hypothesis that if time controls are kept constant the number of draws increases. Correlation does not equal causality. I tend to doubt that a 20% decrease in time controls could directly cause that significant an increase in the number of decisive games. It could be that in season 5 some advancement was made in some algorithms leading to an increase in wins, and then the competing software teams caught up.

          To really prove your hypothesis we would need to run two tournaments with the same hardware at the different time controls, and see how the results compare.

          It turns out such a thing has been done. And the results are in favor of my hypothesis, which I will put forward here along with the supporting data and links. It all comes from the CCRL (Computer Chess Rating Lists) website:

          40 MOVES IN 4 MINUTES (ABOUT 6 SECONDS PER MOVE)

          Games Between All Engines

          taken from
          http://www.computerchess.org.uk/ccrl/404/

          Total: 1,225,149 games
          played by 1,414 programs
          ELO ranging from 3368 down to 287.

          White wins: 482,058 (39.3% of all games) (54.9% of decisive games)
          Black wins: 395,971 (32.3% of all games) (45.1% of decisive games)
          Draws: 347,120 (28.3% of all games)

          % of Decisive Games: 71.7%

          NOTE: this % is not reliable as a control because it may include games between engines that are hundreds of ELO rating points apart.



          Code:
          Games between Top 3 Engines
          taken from:	
          	http://www.computerchess.org.uk/ccrl/404/cgi/engine_details.cgi?print=Details&each_game=1&eng=Stockfish%205%2064-bit%204CPU#Stockfish_5_64-bit_4CPU
          and
          	http://www.computerchess.org.uk/ccrl/404/cgi/engine_details.cgi?print=Details&each_game=1&eng=Komodo%208%2064-bit%204CPU#Komodo_8_64-bit_4CPU
          	
          	All engines are the 64-bit 4CPU versions:
          	Stockfish 5 ELO 3368
          	Komodo 8    ELO 3359
          	Houdini 4   ELO 3338
          	
          	Engines differ by no more than 30 ELO rating points.
          	
          	                                Win     Loss    Draw
          	Stockfish 5   vs   Komodo 8     +18     -20     =62
          	Stockfish 5   vs   Houdini 4    +61     -16     =57
          	Komodo 8      vs   Houdini 4    +23     -24     =53
          
          	Total Games: 334
          	Total Draws: 172
          	% of Decisive Games: 48.5%


          40 MOVES IN 40 MINUTES (ABOUT 1 MINUTE PER MOVE)

          taken from
          http://www.computerchess.org.uk/ccrl/4040/

          A much longer time control, which by my hypothesis should result in a lower % of decisive games.

          Code:
          Games between Top 3 Engines
          taken from:	
          	http://www.computerchess.org.uk/ccrl/4040/cgi/engine_details.cgi?print=Details&each_game=1&eng=Komodo%208%2064-bit%204CPU#Komodo_8_64-bit_4CPU
          and
          	http://www.computerchess.org.uk/ccrl/4040/cgi/engine_details.cgi?print=Details&each_game=1&eng=Stockfish%205%2064-bit%204CPU#Stockfish_5_64-bit_4CPU
          	
          	All engines are the 64-bit 4CPU versions:
          	Komodo 8    ELO 3304
          	Stockfish 5 ELO 3283
          	Houdini 4   ELO 3274
          	
          	Engines differ by no more than 30 ELO rating points.
          	
          	                                Win     Loss    Draw
          	Komodo 8      vs   Stockfish 5  +19     -8      =67
          	Komodo 8      vs   Houdini 4    +30     -17     =47
          	Stockfish 5   vs   Houdini 4    +20     -13     =44
          
          	Total Games: 265
          	Total Draws: 158
          	% of Decisive Games: 40.4%
          This is a statistically significant difference, LOWER than the number (48.5%) for the faster time control. This strongly supports my hypothesis.
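
          The significance claim can be checked with a pooled two-proportion z-test on the two tables above. A minimal Python sketch using only the standard library: the helper names are illustrative, and the test assumes the games are independent trials, which repeated openings between the same three engines may violate.

```python
from math import erfc, sqrt

# (wins, losses, draws) per pairing, copied from the two CCRL tables.
FAST_40_4 = [(18, 20, 62), (61, 16, 57), (23, 24, 53)]
SLOW_40_40 = [(19, 8, 67), (30, 17, 47), (20, 13, 44)]

def decisive_counts(records):
    """Return (decisive games, total games) for a list of (W, L, D) tuples."""
    decisive = sum(win + loss for win, loss, _ in records)
    total = sum(win + loss + draw for win, loss, draw in records)
    return decisive, total

def two_proportion_z(x1, n1, x2, n2):
    """Pooled two-proportion z statistic and two-sided p-value."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = sqrt(pooled * (1.0 - pooled) * (1.0 / n1 + 1.0 / n2))
    z = (p1 - p2) / se
    p_value = erfc(abs(z) / sqrt(2.0))  # equals 2 * (1 - Phi(|z|))
    return z, p_value

fast = decisive_counts(FAST_40_4)
slow = decisive_counts(SLOW_40_40)
z, p_value = two_proportion_z(*fast, *slow)

if __name__ == "__main__":
    print(f"40/4:  {fast[0]}/{fast[1]} decisive = {fast[0] / fast[1]:.1%}")
    print(f"40/40: {slow[0]}/{slow[1]} decisive = {slow[0] / slow[1]:.1%}")
    print(f"z = {z:.2f}, two-sided p = {p_value:.3f}")
```

          With these counts z comes out just under 2, so the difference sits right around the conventional 5% threshold rather than far beyond it.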

          Garland, I should also point out that these 2 numbers are BOTH significantly above the % of decisive games from TCEC, where the time controls are very much longer. This again supports my hypothesis, producing a very hefty weight of evidence.

          We can also wait for more results from TCEC events. Now that they are down to 11/64 games decisive, they might shorten the time control again, which according to my hypothesis should result in a one-time jump in the % of games that are decisive. And then after that, with the new shorter time control constant, the % of decisive games should consistently drift downward. Of course, there's always the chance of statistical anomalies.

          Just out of curiosity, Garland, is there something about these hypotheses that you think doesn't make sense? If so, then what would you have us think instead? That the % of decisive games will just keep fluctuating up and down regardless of the playing strength of the opponents and regardless of time controls?
          Only the rushing is heard...
          Onward flies the bird.


          • #6
            Re: Komodo

            Originally posted by Paul Bonham
            This is a statistically significant difference, LOWER than the number (48.5%) for the faster time control. This strongly supports my hypothesis.
            That is barely enough to say anything about engine strength with only ~200 games. (A bad opening tree could ruin all the stats.)

            On my machine, without any help (no -bases), the Stockfish versions win against the others. "Stockfish 5" does not say much either: new revisions are released almost daily.


            • #7
              Re: Komodo

              I didn't think your hypothesis was unreasonable, just that the data presented was insufficient to prove it. Hence my post. I would tend to agree in general that as playing strength increases, the number of draws between two equally skilled players increases, as the errors made in a game become fewer and less severe.


              • #8
                Re: Komodo

                The Wall Street Journal chimes in with 'The Real Kings of Chess Are Computers'.

                http://www.wsj.com/articles/the-real...ers-1420827071


                • #9
                  Re: Komodo

                  The Boston Globe comments on the 100-game match between the top two engines, Komodo and Stockfish. There were only 9 decisive results, 8 wins by Komodo and one solitary win by Stockfish. Surprisingly, 5 of Komodo's 8 wins came with Black, which rather begs the question, is White even an advantage? (:

                  https://www.bostonglobe.com/metro/20...cmJ/story.html


                  • #10
                    Re: Komodo

                    Magnus Carlsen would sit #59 on the following rating list, 516 points below Komodo ):

                    http://www.computerchess.org.uk/ccrl/4040/
