Your thought experiment is extreme, but not completely nonsensical. We know that Ruy Lopez suggested playing with the sun at your back so that it is in your opponent's eyes, and we know too of Lasker's terrible, stinking cigars that could not have helped his opponents to play their best. Now, of course we could "judge" a game by the extent to which one's moves coincided with those suggested by the top engines and "rate" it accordingly. But I think this would lead to many draws being rated higher than wins, insofar as the wins were provoked by "unsound" moves that introduced complications the opponent was unable to work through, while a game without a single "unsound" move will end in a draw if the opponent plays just as well. If we are going to rate draws higher than wins then there must be something wrong with the system.
You are hitting the nail on the head for sure! But in your last statement, what are you referring to when you use the word "system"? The rating system? No, there would be nothing wrong with the rating system just because draws are rated higher than wins. Unless you want the rating system to compare the "entertainment value" of games?
Chess as a competitive endeavor has a fundamental problem: it is a game of perfect information, and as such perfect play should ALWAYS lead to a draw. This has already been proven in checkers, and it should also be the case in chess, but we will likely never prove it mathematically because the search tree is just too big.
As you know, chess is addressing the problem by going more and more to Rapid and Blitz time controls, and where necessary, even Armageddon. Now remember what I've been saying here: ELO says nothing about quality of games. My rating system does encapsulate quality of games, and yes, absolutely, it will show that draws generally speaking (there are always some so-called "fighting draws") will have higher ratings than decisive games. And it should show that. Absolutely. Because in chess, higher quality means more draws. Just look at correspondence chess for the ultimate proof... now at about 95% draw rate at the top levels, and fewer and fewer people are wanting to play correspondence chess anymore.
My rating system shows that Carlsen playing and winning at Rapid time control played at about 1/7th the strength of Fischer winning at standard slow time control. Ok, that's just a sample size of one, but I fully expect the data to continue to show that pattern.
So the problem isn't the rating system, it's the game itself. It's running out of ways to win under slow time controls. It goes to faster time controls, the games are entertaining to watch, but the quality is drastically reduced. If that's what everyone wants, that's what we'll be left with. Slow time controls will eventually disappear just like correspondence chess. Well, at the top levels anyway; chess clubs are not having the problem to nearly the same extent.
I would sum it up this way: in chess, you cannot provably force yourself to win.... but you can force yourself to lose. So yes, you can play more like Tal, and your quality of play will go down, and there should be a rating system to show that, and to quantify it in a consistent way. So that is what I am providing.
Botvinnik Game Performance Rating (GPR): 1461
Tal Game Performance Rating (GPR): 1803
These numbers DO NOT correspond to ELO ratings, so don't think of them that way.
So as Brad suggested, Tal's number is very low and he possibly "dragged" Botvinnik down to play worse than he would otherwise, and Botvinnik played even worse than Tal. Who knows, this may turn out to be the WC game with the worst overall GPR ever. It will be a long time before we know that, but it could be.
In other words, the worst quality WC game ever? But in terms of entertainment for those playing through it, well, it's not bad at all.
But that leads to the thought that heck, if this was entertaining, there must be tens of thousands of games played by players whose ratings are say 1600 to 2000 ELO which could be considered just as entertaining. Just put Tal and Botvinnik as the player names, and pretend it's for the World Championship! lol
Perhaps the incorporation of the normal rating system based upon wins, losses and draws combined with the engine analysis system of pure accuracy and soundness of moves would in some manner be useful? In your thought experiment the player would not have an especially high rating overall because the engine portion would be very low, thus taking into consideration Lasker's cigars having their desired effect. Tal might get docked more "soundness" points than Capablanca for example and have a lower overall rating as a result.
You mention "dragged" down. Yes, that is the point of many of Tal's moves, and those of many players at many times. It is not uncommon for the best move(s) in the position to be drawish, while certain presumably weaker moves at least infuse a complex challenge into the game wherein both players have a greater chance of going wrong, not just the player who instigated the complications. There must be games wherein Tal outsmarted himself, so to speak. Players that are good at calculating their way through the complications are more likely to provoke them and win or lose, while players who like a safer game will often play the better moves engine-wise, and draw more often as a result. Also, there is the game situation. If you need a draw you may play safer, a win you may go for broke. Neither a one game engine analysis nor the standard system takes these sorts of factors into account.
If someone wants to combine the two (ELO and GPR), they are free to do that, but it's not my goal to do that.
But you said "in some manner be useful", and here's what I think could happen someday and it would be useful: continue to rate tournaments using ELO, and use GPR as a tiebreaker to determine places and prizes. I don't know all the different tiebreak methods being used right now, or how much ELO ratings tie into those methods, but if GPR were used as the sole tiebreak method, then ELO would have no influence in tiebreaks and that would be good, because what the tiebreaks are trying to do is break the ties based on what actually happened IN THE TOURNAMENT. Not all the past stuff that ELO brings into it, which is really not relevant.
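The tiebreak idea above can be sketched in a few lines. This is only an illustration under stated assumptions: the `Standing` record, the player names, and all the numbers are hypothetical, and the `gpr` field stands for whatever per-event GPR figure one would settle on (for example an average over the player's games in that tournament), which this thread does not specify.

```python
from typing import NamedTuple


class Standing(NamedTuple):
    name: str
    points: float  # tournament score (wins and draws)
    gpr: float     # hypothetical per-event GPR figure for this player


def rank_with_gpr_tiebreak(standings):
    """Order by tournament points first, then by GPR, both descending."""
    return sorted(standings, key=lambda s: (-s.points, -s.gpr))


# Made-up example data: A and C are tied on points, so GPR decides.
players = [
    Standing("Player A", 4.0, 2100.0),
    Standing("Player B", 4.5, 1800.0),
    Standing("Player C", 4.0, 2650.0),
]
ranked = rank_with_gpr_tiebreak(players)
for place, s in enumerate(ranked, start=1):
    print(place, s.name, s.points, s.gpr)
```

Note that Player B still finishes first despite the lowest GPR: the points decide the placing, and GPR only separates players who are level on points, using nothing from outside the tournament.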
Of course no system that is involved in making tough decisions can be perfect. We even see mistakes in murder trials.
In chess there is no possible way to take all factors in a chess result into account. GPR just says "Based on the moves made in this game, the White player performed at this level, and the Black player performed at this level." Or as noted elsewhere in this thread, it doesn't have to be a game that is rated, it could be a collection of moves you made in endgame play (just as one example). I guess in that case, the name "Game Performance Rating" is not quite appropriate. Maybe it should be "Move Performance Rating" in that case.
If GPR were in fact being used for tiebreaks, then players would be striving to play "safer" chess in pursuit of prizes. This is an artifact of chess as competition. Chess as competition says "play the best moves at all times". Chess as art and as entertainment says "play the craziest moves at all times". Since chess tournaments and matches are all about prizes and money, the striving to play best moves is bound to win out. That's why there is only one Tal, I suppose.
Wikipedia says Tal was a great chess writer, I'd like to see some of his writings. I wonder if he ever opined on this topic of playing safe versus playing for complications.
Ok, here's another GPR rated game. I took Brad's cue and decided to rate a game between two Canadians, and to save time I just looked at the games Frank Dixon has been providing as so-called "Mystery Games". I found his game #106 which seemed a good example, and it turned out to be very interesting in terms of GPR.
As I started rating the game, I gradually came to realize that for Hebert, this game was playing itself, he didn't even have to think very much. Most of his moves were only moves, or were decisively better than the next best choice. I was already accounting for that in my rating system. What it meant was that for this particular game, Hebert played only 7 moves that were acceptable to be rated. So his sample size is very small.
Once Quan played 13.Ne5, everything seemed to be on full auto. For both players actually, Quan also had a very small sample size.
The other thing of note is that Quan kept playing well beyond the point he should have, and then he played 37.b4??? which even though the game was already well lost, cost him tons of GPR points. So far it's the worst blunder I've seen in all the games I've been rating. With that move included, Quan's GPR for the entire game is really really bad. Take that move out and it improves substantially, although it's still bad.
So I'll include both results, with and without 37.b4??? .........
With 37.b4,
Quan Game Performance Rating (GPR): 291
Hebert Game Performance Rating (GPR): 3030
Without 37.b4,
Quan Game Performance Rating (GPR): 1990
Hebert Game Performance Rating (GPR): 3030
Again, this huge difference with just one move is due to the extraordinarily small sample sizes.
Last edited by Pargat Perrer; Sunday, 1st August, 2021, 04:21 AM.
This is totally unfair. One blunderous move out of 38 good moves destroys the rating. It should only have 1/38 effect.
A simple way of rewarding winners with this system is to add 100 GPR points for winning the game. This might help the unbalance with draws. It is harder to win a double-edged, complex game.
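To put rough numbers on the sample-size point, here is a small sketch. It assumes, purely for illustration, that a game rating is a plain average of per-move scores; the actual GPR formula is not given in this thread, and the per-move scores below (2000 for an ordinary move, -10000 for the blunder) are made up.

```python
def mean_score(scores):
    """Plain average of per-move scores (an assumed stand-in for GPR)."""
    return sum(scores) / len(scores)


def with_one_blunder(n_good, good_score, blunder_score):
    """Average over n_good ordinary moves plus a single blunder."""
    return mean_score([good_score] * n_good + [blunder_score])


# Hypothetical per-move scores, not taken from any rated game.
small_sample = with_one_blunder(7, 2000, -10000)    # 8 rated moves in total
large_sample = with_one_blunder(37, 2000, -10000)   # 38 rated moves in total

print(small_sample)
print(round(large_sample, 2))
```

Under this assumption the blunder already carries only a 1/n weight, yet with 8 rated moves it pulls the average from 2000 down to 500, while with 38 rated moves the same blunder only brings it to about 1684. So a single very bad move can dominate when few moves are rated, even without being given extra weight.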
I'm still taking all suggestions into consideration, so thanks for that Erik.
One thing I could do is have a "desperado" provision, where if the game is already out of hand -- which would be determined by some measurement, such as one player being up by the equivalent of a minor piece for example -- any blunders by the player who is behind would not be included, being considered as some "desperado" attempt to change the game completely. But I wouldn't want to stop the rating altogether at that point, because then any game where a player won despite being down in material would not get properly rated. If a player gave up material as a sacrifice to win, my method will detect that, unless the combination is so deep that no one, human or engine, can recognize it for what it is. I haven't seen anything like that yet, although I do know people have come up with totally weird and ridiculous positions where an engine will think it is totally winning but in fact it can only draw.
Quan's 37.b4 would very easily fall into that desperado category. Also, once a game is in hand, the player who is ahead might start getting a little careless with moves, knowing that any one of a half-dozen or more moves is still going to be winning. That should be reflected in that player's GPR. Anyway, Quan's 37.b4?? is removed from the process and his GPR becomes 1990, much closer to what his level of play really was over the whole game.
I don't know about adding points for winning. It would get back into a discussion of combining with ELO because a win against an 1800 player isn't the same as a win against a 2400 player. If GPR becomes known and established and used in some fashion, then people can discuss how to combine it with wins and losses as both Brad and you are suggesting. I myself don't have an interest in that, I just want a flat system that rates single games based only on the moves, and I think it could be useful for tiebreaks and also for comparing players over many generations.
EDIT: if I start finding games where the respective players' GPRs do not reflect who won the game, then I might think about adding some points for winning.
Last edited by Pargat Perrer; Sunday, 1st August, 2021, 04:43 PM.
Your GPR is relative to the evolving strengths of the engines, but this is not a problem since the engines have gotten so strong and have long since surpassed humans. I remember when the top commercially accessible programs had yet to crack IM status.
Stipulate exactly which moves/types of moves (you suggested openings, a concept vague as to where the middlegame arrives, and you have also suggested "desperados", and so forth) are to be omitted. We may come up with many questions; here are a few. When is a player so far ahead or so far behind that the moves will not be scored? What if someone blunders in this hopeless unscored game and the game is suddenly even again, where do you start scoring again? A player could deploy a fabulous series of perfect engine moves for a very high score, but only after blundering grotesquely, since otherwise the game would have quickly fallen into a necessary threefold repetition; so how heavily can you weigh blunders as opposed to a sequence of many sound, simple perfect moves? You need to spell all of this out. This is the formula we need.
Now, here is the point. In order to make all of these assessments of omissions and appraisals, all of these stipulations as to the particulars, you will need to rely upon your Elo strength. Thus, to do full justice to your notion, only Magnus Carlsen is qualified to fill in the details; you can speak only in generalities and get away with it. What you want generally speaking is a formula that best reflects the quality of a given game in some manner of speaking. But we need at least a team of qualified Grandmasters, hopefully including Magnus, to supply the fine print. Otherwise Elo will have been excluded, but we want it to be included as much as possible.
Hi Brad, I edited my previous post, not sure if you saw the edited version, because I removed the part saying that I would stop rating the game at a certain point where either player is behind by the equivalent of a minor piece.
Instead, I will continue rating all moves at that point, but where one move by the player who is behind (and ONLY the player who is behind) seems to be a "desperado" attempt to change the game completely, and that move turns out to be a hideous blunder even though the very best move at that point is still losing, I will rate the blunder to see what number it comes up with, but I will NOT include it in the totals for that player's GPR. So Quan's 37.b4?? will not affect his game GPR, which will officially for now be 1990 (not to be confused with 1990 ELO, that is a totally different thing).
So that removes all the questions about where will I stop rating the game, and where would I restart it again if the situation suddenly changes. Only that 1 blunder will be removed.
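For anyone who wants to see the desperado rule as a concrete test, here is a minimal sketch. Everything numeric in it is an assumption: the thread only says "the equivalent of a minor piece" and "hideous blunder", so the two 300-centipawn thresholds, the `RatedMove` record, and the example evaluations are hypothetical, and the real cutoffs may differ.

```python
from dataclasses import dataclass

MINOR_PIECE_CP = 300   # assumed value of "a minor piece", in centipawns
BLUNDER_LOSS_CP = 300  # assumed eval drop that counts as a "hideous blunder"


@dataclass
class RatedMove:
    label: str
    eval_before_cp: int  # engine eval before the move, from the mover's side
    best_eval_cp: int    # eval after the engine's best move, mover's side
    played_eval_cp: int  # eval after the move actually played, mover's side


def is_desperado(m):
    """True if the mover is already behind, even the best move still loses,
    and the move played is a large blunder on top of that."""
    behind = m.eval_before_cp <= -MINOR_PIECE_CP
    best_still_losing = m.best_eval_cp <= -MINOR_PIECE_CP
    hideous = (m.best_eval_cp - m.played_eval_cp) >= BLUNDER_LOSS_CP
    return behind and best_still_losing and hideous


def moves_counted(moves):
    """Moves that stay in the player's totals; desperados are dropped."""
    return [m for m in moves if not is_desperado(m)]


# Hypothetical evaluations, not taken from the Quan - Hebert game.
lost_position_blunder = RatedMove("blunder when lost", -500, -480, -1500)
level_position_blunder = RatedMove("blunder when level", 50, 60, -400)
kept = moves_counted([lost_position_blunder, level_position_blunder])
```

Only the first move is dropped. The blunder from a roughly level position stays in the totals, and so would a careless move by the player who is ahead, since the rule applies only to the side that is behind.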
For the openings, I decided on the first 12 moves (24 plies) of the game as being opening. I can't try and decide for each individual game how much of the opening is known theory and where each player actually starts thinking. So I just decided to make it 12 moves, and apply that to all the games. I suppose if I wanted to make it a custom number for each player in each game, I'd have to be supplied with the clock times to see where did each player start actually taking lots of time? And even that wouldn't be truly accurate, since a player may have left the board to go get a coffee before playing his or her 6th move of the game or something. And yes, in some top level games, the opening theory might not end until move 25 or something. I can't know that for sure for each opening, so the only way is to apply a cutoff point that applies universally. As I've already said, we can't be 100% accurate. But... if a player plays any move from the 13th move on and it is part of known opening theory and comes up first in the engine diagnosis, it doesn't necessarily become part of that player's GPR. That's all I can say about that for now, and yes, I can speak only in generalities and get away with it, but so far no one is doing anything with this, so it is simply my project and I am sharing it and willing to take suggestions. Once I finalize it, I may or may not share what I'm doing, but if a large hue and cry arose because everyone and their grandmother wanted an explanation, then maybe I'll provide everything. For now I am just providing some examples, putting out the numbers and seeing what anyone thinks of it.
If I were to start getting inquiries from tournament organizers about using this as a tiebreak formula, then also I would likely reveal how it works. It's not super complicated, but it is time consuming.
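The fixed opening cutoff described above (first 12 moves, 24 plies, skipped for every game) is simple enough to sketch. The function name and the placeholder move list are hypothetical, and the sketch does not attempt the extra rule about later moves that are still known theory.

```python
OPENING_MOVES = 12  # cutoff stated in the post: first 12 moves (24 plies)


def rated_plies(san_moves, opening_moves=OPENING_MOVES):
    """Return (move_number, colour, san) for every ply after the cutoff.

    san_moves is a flat list of moves in game order: White's 1st move,
    Black's 1st move, White's 2nd move, and so on.
    """
    rated = []
    for ply_index, san in enumerate(san_moves):
        move_number = ply_index // 2 + 1
        colour = "White" if ply_index % 2 == 0 else "Black"
        if move_number > opening_moves:
            rated.append((move_number, colour, san))
    return rated


# Placeholder game of 15 moves (30 plies); "m1".."m30" stand in for real moves.
game = [f"m{i}" for i in range(1, 31)]
after_opening = rated_plies(game)
print(after_opening[0], len(after_opening))
```

Making the cutoff a parameter means an exceptional game can be rated from an earlier move by passing a smaller value, for example `rated_plies(game, opening_moves=10)` to start at move 11.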
Well, I was quite surprised because even way back then, these guys knew how to play chess! Anderssen was extremely precise! His GPR for this game is the closest yet to Fischer's game against Smyslov. The results were:
Anderssen Game Performance Rating (GPR): 6132
Steinitz Game Performance Rating (GPR): 1860
Now for Steinitz, it must be noted that he was very much in the game until his move 34. And then his move 35 was also very weak. Steinitz without moves 34 and 35 had a GPR of 5229, which was better than Smyslov's game against Fischer.
One thing I don't know is time control. Did they have chess clocks back in 1866? And if yes, what was the common time control?
I was remiss in mentioning the time control for the Quan - Hebert game (last post). Their time control was G60 +5sec/move, which is not exactly a slow time control yet not Rapid either. I guess it's "Intermediate" time control, and so that could explain why Hebert only had a 3030 GPR.
Yifan Game Performance Rating (GPR): 3158
Polgar Game Performance Rating (GPR): 2837
If indeed this was a slow time control (which I don't know for certain, but I didn't see "Rapid" anywhere in the source or in the naming of the event), then this might have been one of Judit's worst games since becoming a GM. But maybe someone reading this will know her career and can come up with some insight on that.
She got behind at an early stage and just kept making it worse. It's an example of a "bad day at the office" for sure.
Yifan was only 17 at the time of this game, so it may not be an example of her at her best either. She played almost exactly 3 times better than Giri did in his Rapid loss to Carlsen.
Ok, I have another game rated. It's a lengthy one between Korchnoi and Karpov in their 1978 World Championship match. This match had the new rule that draws count for 0 points, and the match winner would be the first to 6 wins. Karpov achieved a 5-2 lead, then slowly Korchnoi came back, and this game, the 31st of the match, was the one where Korchnoi drew even 5-5.
It features a rook-and-pawn endgame in which both players played rather poorly. The middlegame was really quite boring, being mostly without Queens and having many shuffles of rooks and bishops that kept the game score around 0.00 for a long time. It was Karpov who, perhaps from exhaustion, misplayed the endgame to lose the game...
As promised, here now are the GPR results for game 32, the final game of the 1978 World Championship between Karpov and Korchnoi. The latter had come back from a 5-2 deficit to tie things up 5-5, and then there was a 5-day break before the next game. So both players should have been well rested and refreshed.
Yet in this critical game, they both played very poorly. Maybe the worst-played WC game of all time? Here is the pgn...
Although it was played very poorly, the poor play gave rise to many amazing tactical possibilities. Hans Jung, you seem to like tactical lines, if you are reading this, scroll down and make a copy of the pgn that has Stockfish 13 analysis, and after White's move 25 you will find multiple amazing tactical lines that Stockfish found. And even more show up later on.
The GPR result was:
Karpov Game Performance Rating (GPR): 2096
Korchnoi Game Performance Rating (GPR): 1351
To repeat, these are not ELO rating numbers, they are totally different.
Here is the pgn with Stockfish 13 analysis of critical moves:
I'll bet there are a lot of people here who remember the so-called "Game of the Century". It was in 1956 between Donald Byrne and Bobby Fischer. It featured a very pretty mate at the end, and a 17th move by Fischer that has been called the "counter-attack of the century".
Does it really live up to that hype? To find out, I did a GPR on the game. Here is the pgn...
Well, I have to tell you, this game doesn't deserve to be called anything close to the "Game of the Century". I had to modify the GPR method on this game because I normally don't start doing the rating until the 13th move, but Byrne blundered on his 11th move and the game was really over at that point. Fischer just kept building on a huge advantage he had starting with his own 11th move. Byrne kept blundering and making things worse. I don't know what his ELO rating at the time would have been, but I can't imagine much over 1400.
Here are the numbers to show that Byrne played ridiculously badly and that Fischer himself didn't play very well either. His 17th move was nothing spectacular; Stockfish considered it the only reasonable move on the board.
Byrne Game Performance Rating (GPR): 714
Fischer Game Performance Rating (GPR): 1427
Again these are not ELO numbers, but if you have followed this thread, you would know these are both very bad numbers. I wonder if anyone who has long considered this to actually be the game of the 20th Century will dispute what I have found out.
[Date "1956-10-17"]
[Result "0-1"]
[FEN "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1"]
[White "Donald Byrne"]
[Black "Robert Fischer"]
[Event "Third Rosenwald Trophy (1956)"]
[Site "New York, NY USA"]
[Round "8"]
In the 10th annual USCF rating list (May 1956), Donald Byrne was rated 2557 and Fischer, 13 years old, was rated 1726, his first rating.
In the 11th annual USCF rating list (May 1957), Donald Byrne was rated 2468, the sixth-highest player behind Reshevsky, Evans, Robert Byrne, Rossolimo, and Kashdan; Fischer was now 2231.