Policy / Politique
The fee for tournament organizers advertising on ChessTalk is $20 per event, or $100 per year for unlimited advertising.
You can e-transfer to Henry Lam at chesstalkforum at gmail dot com
Dark Knight / Le Chevalier Noir
General Guidelines
---- We need a French translation!
Some Basics
1. Under Board "Frequently Asked Questions" (FAQs) there are 3 sections dealing with General Forum Usage, User Profile Features, and Reading and Posting Messages. These deal with everything from Avatars to Your Notifications. Most general technical questions are covered there. Here is a link to the FAQs. https://forum.chesstalk.com/help
2. Consider using the SEARCH button if you are looking for information. You may find your question has already been answered in a previous thread.
3. If you've looked for an answer to a question, and not found one, then you should consider asking your question in a new thread. For example, there have already been questions and discussion regarding: how to do chess diagrams (FENs); crosstables that line up properly; and the numerous little “glitches” that every new site will have.
4. Read pinned or sticky threads, like this one, if they look important. This applies especially to newcomers.
5. Read the thread you're posting in before you post. There are a variety of ways to look at a thread. These are covered under “Display Modes”.
6. Thread titles: please provide some details in your thread title. This is useful for a number of reasons. It helps ChessTalk members to quickly skim the threads. It prevents duplication of threads. And so on.
7. Unnecessary thread proliferation (e.g., deliberately creating a new thread that duplicates existing discussion) is discouraged. Look to see if a thread on your topic may have already been started and, if so, consider adding your contribution to the pre-existing thread. However, starting new threads to explore side-issues that are not relevant to the original subject is strongly encouraged. A single thread on the Canadian Open, with hundreds of posts on multiple sub-topics, is no better than a dozen threads on the Open covering only a few topics. Use your good judgment when starting a new thread.
8. If and/or when sub-forums are created, please make sure to create threads in the proper place.
Debate
9. Give an opinion and back it up with a reason. Throwaway comments such as "Game X pwnz because my friend and I think so!" could be considered pointless at best, and inflammatory at worst.
10. Try to give your own opinions, not simply those copied and pasted from reviews or opinions of your friends.
Unacceptable behavior and warnings
11. In registering here at ChessTalk please note that the same or similar rules apply here as applied at the previous Boardhost message board. In particular, the following content is not permitted to appear in any messages:
* Racism
* Hatred
* Harassment
* Adult content
* Obscene material
* Nudity or pornography
* Material that infringes intellectual property or other proprietary rights of any party
* Material the posting of which is tortious or violates a contractual or fiduciary obligation you or we owe to another party
* Piracy, hacking, viruses, worms, or warez
* Spam
* Any illegal content
* Unapproved commercial banner advertisements or revenue-generating links
* Any link to or any images from a site containing any material outlined in these restrictions
* Any material deemed offensive or inappropriate by the Board staff
12. Users are welcome to challenge other points of view and opinions, but should do so respectfully. Personal attacks on others will not be tolerated. Posts and threads with unacceptable content can be closed or deleted altogether. Furthermore, a range of sanctions are possible - from a simple warning to a temporary or even a permanent banning from ChessTalk.
Helping to Moderate
13. 'Report' links (an exclamation mark inside a triangle) can be found in many places throughout the board. These links allow users to alert the board staff to anything which is offensive, objectionable or illegal. Please consider using this feature if the need arises.
Advice for free
14. You should exercise the same caution with Private Messages as you would with any public posting.
Komodo is the new computer World Champion and is now rated 3322, followed by Stockfish (3300) and Houdini (3297). Carlsen's 2862 rather pales by comparison and the gap continues to widen, now only 40 points from 500 ):
But since the computers generally just play other computers and not humans they form, in effect, another rating pool. Thus there is no real statistical justification for knowing what the 500 point "gap" actually means. We know that the computers are better than the best humans pretty clearly, but we don't really know how much better they are until we have a sufficient sample of games between the best computers and the best humans.
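To put a number on what a gap of this size would mean if both ratings did sit in one pool (which is exactly the assumption questioned above), here is a minimal sketch using the standard Elo expected-score formula. The function name `expected_score` is made up for illustration; the ratings are the ones quoted in the thread.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Standard Elo expected score (win = 1, draw = 0.5) for A against B."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


# Ratings quoted in the thread. ASSUMPTION: both numbers are treated as if
# they came from the same rating pool, which the post above disputes.
komodo, carlsen = 3322, 2862
gap = komodo - carlsen

print(f"gap = {gap} points")
print(f"expected score for the engine = {expected_score(komodo, carlsen):.3f}")
```

The result is only as meaningful as the same-pool assumption, so it is best read as an upper-level illustration of the formula rather than a prediction for an actual match.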
I went to the TCEC archive site and tabulated some stats on the final match games of all 7 seasons (actually, 6 seasons: season 3 didn't appear to have a finals match, perhaps they used a RR to determine the winner, so I didn't include that season).
A few interesting things:
- 89 out of 304 final match games were decisive, for 29.7%
- out of the 89 decisive games, White won 62 (69.7%)
- from Season 1 finals to Season 4 finals, time controls were 150 minutes + 30 sec increment. But over this span, decisive game % went from 42.5% (Season 1) down to 20.0% (Season 4). So from Season 5 on, they shortened the time controls down to 120 minutes + 30 sec increment. Decisive games jumped to 37.5% for Season 5, then dropped to 29.7% for Season 6, then dropped again to 17.2% for the latest season.
That last point indicates that if time controls are kept constant while the top 2 engines keep getting better and the hardware they run on keeps getting faster, then fewer and fewer games between those 2 engines become decisive. This seems to support my view that for typical tournament time controls, computer engines and hardware will reach a point, in maybe less than 10 years, at which no decisive games can be produced between the 2 (or perhaps more) top engines.
The really interesting thing is that over the past 2 season finals, White has won 27 of the 30 decisive games. This also supports the above-stated view, because if smaller and smaller mistakes are being made, not only should decisive games shrink in number, but before they disappear totally, the remaining few should all go in favor of White because of White's built-in first-move advantage. Season 7 finals (64 games): all 11 decisive games were won by White.
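The White-win counts in this post can be split into "last two finals" and "earlier finals" by simple subtraction. The sketch below does that arithmetic with the counts quoted above; the helper name `share` and the variable names are illustrative only.

```python
def share(part: int, whole: int) -> float:
    """Percentage of `whole` made up by `part`, rounded to one decimal."""
    return round(100.0 * part / whole, 1)


# Counts quoted in the post: all finals, and the last two finals only.
white_all, decisive_all = 62, 89
white_recent, decisive_recent = 27, 30

# The earlier finals follow by subtraction.
white_early = white_all - white_recent
decisive_early = decisive_all - decisive_recent

print(f"all finals:      {white_all}/{decisive_all} = {share(white_all, decisive_all)}%")
print(f"earlier finals:  {white_early}/{decisive_early} = {share(white_early, decisive_early)}%")
print(f"last two finals: {white_recent}/{decisive_recent} = {share(white_recent, decisive_recent)}%")
```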
Only the rushing is heard...
Onward flies the bird.
I'm not 100% convinced that the data supports your hypothesis that if time controls are kept constant the number of draws increases. Correlation does not equal causality. I tend to doubt that a 20% decrease in time controls could directly cause that significant an increase in the number of decisive games. It could be that in Season 5 some advancement was made in some algorithms, leading to an increase in wins, and then the competing software teams caught up.
To really prove your hypothesis we would need to run two tournaments with the same hardware at the different time controls, and see how the results compare.
The recent dearth of wins by black is interesting. Based on your numbers, for the first 4 seasons white won 35 of 59 decisive games, roughly 60%, much closer to normal play in human games. It will be interesting to see if the more recent results are a statistical anomaly or a real trend.
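One rough way to judge "anomaly or trend" is to ask how likely 27 or more White wins out of 30 decisive games would be if White's true share were still the earlier 35/59. The sketch below computes that exact binomial tail. It assumes the decisive games are independent trials with a fixed probability, which is a simplification (paired openings and engine updates could break it); `binom_upper_tail` is a hypothetical helper name.

```python
from math import comb


def binom_upper_tail(k: int, n: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p), summed exactly."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Baseline: White's share of decisive games in the earlier finals (35 of 59).
baseline = 35 / 59

# Observed in the last two finals: White won 27 of 30 decisive games.
tail = binom_upper_tail(27, 30, baseline)

print(f"P(White wins >= 27 of 30 | p = {baseline:.3f}) = {tail:.2e}")
```

A small tail probability would suggest the recent results are hard to explain by chance alone under these assumptions; it would not say what caused the shift.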
It turns out such a thing has been done. And the results are in favor of my hypothesis, which I will put forward here along with the supporting data and links. It all comes from the CCRL (Computer Chess Rating Lists) website:
Total: 1,225,149 games
played by 1,414 programs
ELO ranging from 3368 down to 287.
White wins: 482,058 (39.3% of all games) (54.9% of decisive games)
Black wins: 395,971 (32.3% of all games) (45.1% of decisive games)
Draws: 347,120 (28.3% of all games)
% of Decisive Games: 71.7%
NOTE: this % is not reliable for a control because it may include games between engines that are hundreds of ELO rating points apart.
Code:
Games between Top 3 Engines
taken from:
http://www.computerchess.org.uk/ccrl/404/cgi/engine_details.cgi?print=Details&each_game=1&eng=Stockfish%205%2064-bit%204CPU#Stockfish_5_64-bit_4CPU
and
http://www.computerchess.org.uk/ccrl/404/cgi/engine_details.cgi?print=Details&each_game=1&eng=Komodo%208%2064-bit%204CPU#Komodo_8_64-bit_4CPU
All engines are the 64-bit 4CPU versions:
Stockfish 5 ELO 3368
Komodo 8 ELO 3359
Houdini 4 ELO 3338
Engines differ by no more than 30 ELO rating points.
                             Win   Loss   Draw
Stockfish 5 vs Komodo 8      +18    -20    =62
            vs Houdini 4     +61    -16    =57
Komodo 8    vs Houdini 4     +23    -24    =53
Total Games: 334
Total Draws: 172
% of Decisive Games: 48.5%
A much longer time control should, by my hypothesis, result in a lower % of decisive games.
Code:
Games between Top 3 Engines
taken from:
http://www.computerchess.org.uk/ccrl/4040/cgi/engine_details.cgi?print=Details&each_game=1&eng=Komodo%208%2064-bit%204CPU#Komodo_8_64-bit_4CPU
and
http://www.computerchess.org.uk/ccrl/4040/cgi/engine_details.cgi?print=Details&each_game=1&eng=Stockfish%205%2064-bit%204CPU#Stockfish_5_64-bit_4CPU
All engines are the 64-bit 4CPU versions:
Komodo 8 ELO 3304
Stockfish 5 ELO 3283
Houdini 4 ELO 3274
Engines differ by no more than 30 ELO rating points.
                             Win   Loss   Draw
Komodo 8    vs Stockfish 5   +19     -8    =67
            vs Houdini 4     +30    -17    =47
Stockfish 5 vs Houdini 4     +20    -13    =44
Total Games: 265
Total Draws: 158
% of Decisive Games: 40.4%
This is a statistically significant difference, LOWER than the number (48.5%) for the faster time control. This strongly supports my hypothesis.
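For anyone who wants to check the significance claim, here is a minimal sketch of a pooled two-proportion z-test on the two tables above. It assumes every game is an independent trial, which ignores opening selection and repeated pairings, so treat the p-value as a rough guide. The function names are illustrative and not taken from any statistics library.

```python
from math import erfc, sqrt


def decisive_and_total(records):
    """Sum decisive games and all games over (wins, losses, draws) tuples."""
    decisive = sum(w + l for w, l, _ in records)
    total = sum(w + l + d for w, l, d in records)
    return decisive, total


def two_proportion_z(x1, n1, x2, n2):
    """Pooled two-proportion z statistic and two-sided normal p-value."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = erfc(abs(z) / sqrt(2))  # two-sided tail of the standard normal
    return z, p_value


# (wins, losses, draws) per pairing, copied from the two tables above.
faster = [(18, 20, 62), (61, 16, 57), (23, 24, 53)]
longer = [(19, 8, 67), (30, 17, 47), (20, 13, 44)]

d1, n1 = decisive_and_total(faster)
d2, n2 = decisive_and_total(longer)
z, p_value = two_proportion_z(d1, n1, d2, n2)

print(f"faster: {d1}/{n1} decisive, longer: {d2}/{n2} decisive")
print(f"z = {z:.3f}, two-sided p = {p_value:.3f}")
```

Comparing the printed p-value against a threshold chosen in advance (0.05 is the usual convention) shows how much weight samples of this size can carry.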
Garland, I should also point out that these 2 numbers are BOTH significantly above the % of decisive games from TCEC, where the time controls are very much longer. This again supports my hypothesis, producing a very hefty weight of evidence.
We can also wait for more results from TCEC events. Now that they are down to 11/64 games decisive, they might shorten the time control again, which according to my hypothesis should result in a one-time jump in the % of games that are decisive. And then after that, with the new shorter time control held constant, the % of decisive games should consistently drift downward. Of course, there's always the chance of statistical anomalies.
Just out of curiosity, Garland, is there something about these hypotheses that you think doesn't make sense? If so, then what would you have us think instead? That the % of decisive games will just keep fluctuating up and down regardless of the playing strength of the opponents and regardless of time controls?
It is barely enough to say anything about engine strength with only ~200 games (a bad opening tree could ruin all the stats).
On my machine, without any help (no -bases), the Stockfish versions win against the others. "Stockfish 5" does not say much either, since new revisions are released almost daily.
I didn't think your hypothesis was unreasonable, just that the data presented was insufficient to prove it. Hence my post. I would tend to agree in general that as playing strength increases, the number of draws between two equally skilled players increases, as the errors made in a game become fewer and less severe.
The Boston Globe comments on the 100-game match between the top two engines, Komodo and Stockfish. There were only 9 decisive results, 8 wins by Komodo and one solitary win by Stockfish. Surprisingly, 5 of Komodo's 8 wins came with Black, which rather begs the question, is White even an advantage? (: