Ahhh ratings...yes apparently there is deflation going on. Anyway FIDE is asking for your thoughts...go to FIDE.com
Rating changes coming to FIDE?
From my email to Qualification Commission
Hello,
FIDE recently published the recommendations of mathematician Jeff Sonas and of the Qualification Commission. The proposed changes are very radical: they would involve giving many players hundreds of rating points and raising the minimum FIDE rating to 1400, among numerous other changes.
The full text of Sonas' proposal can be found here:
https://www.fide.com/docs/presentati...g%20System.pdf
I would like to express my point of view about the whole situation with FIDE ratings and about the proposed changes.
1. The rating situation – an extensive period of deflation
In the chess world, starting from 2014, ratings have been deflating. The deflationary wave, which started from the bottom of the chess pyramid, is climbing upwards and has been experienced by all players, including the elite (2700+).
The reason for the change in trend (inflation to deflation) was the FIDE's reform in 2014, when nearly every player was given a rating and a rating minimum of 1000 was created. A ton of players, mostly kids, were added to the rating pool.
The rating trajectory of many young players looks something like this: they initially earn a rating a bit higher than 1000 (let's say 1100), play for a few years, boost their rating by a few hundred points (perhaps to 1700), and then quit chess towards the end of their teenage years, unfortunately often permanently. As these players leave the game, they also take with them the hundreds of rating points they gained throughout their careers (600 in my example).
Differing K-factors smooth this effect, which is why the FIDE wisely instituted 3 K-factors in 2014. To be clear, they smooth this change, but they do not erase deflation nor get rid of its causes.
As such, it can be said that deflationary pressures are tied to the very nature of the chess pyramid and its growth. Therefore, monitoring rating trends and responding to them with lasting changes when necessary is certainly desirable for the FIDE.
Unfortunately, FIDE didn't notice the fairly obvious deflationary trend for quite a while. Furthermore, the rule change of January 1, 2022 only increased the rate of rating deflation and made the deflation reach the upper levels of the chess pyramid faster. Under the new rule, if a player plays multiple games in one tournament against opponents with a rating difference of more than 400 points, only one of those games is rated as though the difference were 400 points. Previously, all such games were rated as though the rating difference between the players were 400 points, and not greater.
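The difference between the two versions of the 400-point rule can be sketched roughly as follows. This is only an illustration: the function name and the sample gaps are hypothetical, and the choice of which single game gets capped under the 2022 rule (here, the one with the largest gap) is an assumption that the text above does not spell out.

```python
def effective_gaps(gaps, rule):
    """Return the rating gaps used for rating purposes in one tournament.

    gaps: absolute rating gaps between the player and each opponent.
    rule: "pre2022" caps every gap at 400 points;
          "2022" caps only one game (assumed: the game with the largest gap).
    """
    if rule == "pre2022":
        return [min(400, g) for g in gaps]
    if rule == "2022":
        over = [i for i, g in enumerate(gaps) if g > 400]
        if not over:
            return list(gaps)
        capped_game = max(over, key=lambda i: gaps[i])
        return [400 if i == capped_game else g for i, g in enumerate(gaps)]
    raise ValueError(f"unknown rule: {rule}")


# Hypothetical tournament with three opponents more than 400 points away
games = [450, 620, 380, 510]
print(effective_gaps(games, "pre2022"))  # [400, 400, 380, 400]
print(effective_gaps(games, "2022"))     # [450, 400, 380, 510]
```

Under the 2022 version, the uncapped games are rated at their real gap, so the higher rated player gains less for each win and loses more for each draw or loss.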
a. Deflation and the elite
The deflationary wave did not reach the chess elite immediately, but rather a few years after 2014. Let's try to analyze the deflation of the elite's ratings by defining a number E, the sum over all players rated above 2700 of the difference between their rating and 2700. For instance, Carlsen's current rating is 2835, meaning that he adds 135 points to the count, while a player with a rating of 2700 adds exactly 0. I think that E is a rather informative indicator of how rating deflation has affected the chess elite; it is better than, say, counting the number of players whose ratings exceed 2700.
Here's how E has looked at the start of each year since 2015:
2015 - 1895
2016 - 1864
2017 - 1934
2018 - 1941
2019 - 1810
2020 - 1760
2022 - 1559
2023 - 1483
Based on live ratings, E is now 1407.
E does not include the ratings of inactive chess players such as Kasparov and Kramnik.
Clearly, E reached its peak by 2018 and has since dropped by more than 500 points, which corresponds to about 12 points for each 2700+ rated player on average. It would be logical to expect that, without administrative measures, the deflationary trend would continue at a rate of 3-4 points lost per elite player each year.
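For anyone who wants to recompute E, here is a minimal sketch. The function name and the three-player sample pool are hypothetical; the yearly values are the ones listed above.

```python
def elite_excess(ratings, threshold=2700):
    """Sum of the points by which each rating exceeds the threshold."""
    return sum(r - threshold for r in ratings if r > threshold)


# Hypothetical three-player pool: only the first player contributes (135 points).
print(elite_excess([2835, 2700, 2650]))  # 135

# E at the start of each year, as listed above, plus the live-rating value.
e_by_year = {2015: 1895, 2016: 1864, 2017: 1934, 2018: 1941,
             2019: 1810, 2020: 1760, 2022: 1559, 2023: 1483}
live_e = 1407

peak_year = max(e_by_year, key=e_by_year.get)
drop_from_peak = e_by_year[peak_year] - live_e
print(peak_year, drop_from_peak)  # 2018 534
```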
At lower levels, the deflationary pressure is even stronger and has indubitably been felt after the reforms of 2014, and not after 2018 as was the case for the elite.
b. COVID-19
The pandemic-related two-year-long lack of tournaments put additional pressure onto the rating system. Young players became significantly stronger over 2 years, but their ratings remained unchanged. After most restrictions were lifted in 2022, the young players rapidly began taking rating points from the higher rated and generally older players.
This effect has likely dissipated over the last 1.5 years, and the affected generation has been able to reach an adequate rating reflecting their current level of play.
c. Deflation – not the only problem. Elo tables no longer reflect reality
Of the tables that Jeff Sonas cites in his proposal, the most interesting are the ones that compare data for the years 2008-2012, when no one was aware of deflation yet because there was, in fact, rating inflation. Those tables show that lower rated players had a statistical advantage in expected rating change in games against higher rated opponents. The advantage ranged from 2% (100 point difference) to 7-8% (400 point difference).
Since 2021, according to the tables, this advantage has reached 15%. This is a seriously large amount.
Many experienced chess players remember the main coefficients of the Elo tables. For instance,
0.64 - expected performance when playing against someone rated 100 points lower
0.76 - 200 point rating difference
0.92 - 400 point rating difference
In reality, even during the inflationary period, the numbers 0.62, 0.72, and 0.85 more accurately and fairly represented the expected performance.
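The three familiar coefficients can be reproduced, to two decimals, with a simple normal-distribution model. This is a sketch under an assumption of mine: the standard deviation of 200 * sqrt(2) is chosen for illustration, and the official values come from FIDE's published table rather than from this formula.

```python
import math


def expected_score(diff):
    """Expected score of the higher rated player under a normal model.

    Assumption: the standard deviation is 200 * sqrt(2), so the normal CDF
    reduces to 0.5 * (1 + erf(diff / 400)).
    """
    return 0.5 * (1.0 + math.erf(diff / 400.0))


for diff in (100, 200, 400):
    print(diff, round(expected_score(diff), 2))
# 100 0.64
# 200 0.76
# 400 0.92
```

The observed figures of 0.62, 0.72, and 0.85 quoted above sit below this curve, with the gap widening as the rating difference grows.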
We all respect Professor Arpad Elo. His work and data created the basis of the rating rules that FIDE has relied on for over 50 years. There's no doubt that in 1970, Elo's tables perfectly reflected reality.
However, this was a long time ago, when the rating range was 500 points, from 2200 to 2700. Now the rating range is greater than 1800 points (1000-2800+), and even if Sonas' proposal is accepted, the range will be 1400 points (1400-2800). This is 3 to 3.5 times greater than before. Of course, it comes as no surprise that there are significant differences between rating data from today and that from 50 years ago.
2. The solution
Sonas' proposed reforms will result in great shifts in rating for a huge number of chess players and this will decrease confidence in the rating system, which is a key asset for the FIDE. Sonas proposes extremely harsh measures to deal with long-standing problems that could be solved via much more conventional means. The proposal forgoes one of the key principles of rating – it can be changed only at the chess board. I'd imagine that the FIDE should aim to minimize instances of direct interference. Mathematically speaking, it's illogical to change the value of a function, when it would suffice to simply change the rules of calculating the derivative.
Of course, like in medicine, there are moments when the use of more severe measures is most apt. In this case, however, the relatively calm deflation clearly is not representative of such an instance. The deflation has lasted for only 5 years among the elite, after a much longer and more significant period of inflation. The number E that I defined earlier ranged from 0-30 at the end of the 1970s, when Fischer had quit chess and this number depended on Karpov's rating, given that he was the only 2700+ rated player at the time.
Nonetheless, some changes are necessary and should, alongside combating deflation, help adapt the tables of Elo to today's reality.
My proposals:
a. Modifying Elo's tables.
Instead of 0.64, 0.76, 0.92 (corresponding to rating differences of 100, 200, and 400 points between opponents), I propose 0.62, 0.72, and 0.85. This means decreasing the expected result of the higher rated player by 0.02, 0.04, and 0.07 (2, 4, and 7 percentage points) respectively. Of course, the rest of the numbers in Elo's tables would be modified accordingly.
This change would largely halt rating deflation among the elite. If, say, Carlsen were to play 50 games a year against opponents rated on average 100 points below him, his rating would be 10 points a year higher than under the current tables. Instead of receiving 3.6 rating points (0.36 x 10) per win, he would receive 3.8; he would lose 1.2 instead of 1.4 for a draw, and 6.2 instead of 6.4 for a loss. That is 0.2 points more per game regardless of the result, which over 50 games adds up to 10 points.
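The arithmetic in the Carlsen example can be checked with a short sketch. The season of 20 wins, 25 draws, and 5 losses is purely hypothetical; the 10-point difference does not depend on the results, only on the number of games.

```python
K = 10  # K-factor assumed for an elite player, matching the 0.36 x 10 example


def rating_change(score, expected, k=K):
    """Rating change for one game: K times (actual score minus expected score)."""
    return k * (score - expected)


def season_total(results, expected):
    return sum(rating_change(s, expected) for s in results)


# Hypothetical season: 50 games, all against opponents rated 100 points lower
results = [1.0] * 20 + [0.5] * 25 + [0.0] * 5

current = season_total(results, 0.64)   # current table
proposed = season_total(results, 0.62)  # proposed table
print(round(proposed - current, 1))  # 10.0
```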
For weaker players, the effects of this change will be less significant, but the impacts will be felt at least by players rated above 2400, who play the majority of their games against weaker opponents, and would therefore gain rating as a result of this change.
Another positive effect of this modification is that stronger players' fears of entering tournaments where there are many weaker players will largely disappear. Currently, many strong players aim to minimize their participation in tournaments with a large rating range, which results in such tournaments becoming even weaker. In North America, for example, many large tournaments have significantly lower average ratings than was the case 10 years ago.
For the last several years, the common, and correct, view has been that it is far easier to gain rating against strong players. Is this fair though? For instance, a chess player rated 2000 could consistently crush opponents rated 1800, while another 2000-rated player may be able to snag quite a few points off opponents rated 2200. Recently, the latter player has been deemed better and rewarded more. Why is this? There should be at least rough symmetry here, and a player's ability to score heavily against lower-rated opponents must be adequately rewarded.
Continuation...
b. Fully bringing back the 400-point rule, but making it 500.
In accordance with my proposition to modify Elo's tables, 0.9 would be the coefficient for a 500 point rating difference. No matter how many games a stronger player plays against those more than 500 points weaker than them, they should get at least 1 rating point when they win (2 points when K-factor is 20 and 4 points when K-factor is 40).
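The minimum gain per win under the proposed 500-point rule follows directly from the 0.9 coefficient. A minimal sketch (the function and constant names are hypothetical):

```python
CAPPED_EXPECTED = 0.90  # proposed coefficient for gaps of 500 points or more


def min_win_gain(k, expected=CAPPED_EXPECTED):
    """Smallest possible gain for a win by the higher rated player."""
    return k * (1.0 - expected)


for k in (10, 20, 40):
    print(k, round(min_win_gain(k), 1))
# 10 1.0
# 20 2.0
# 40 4.0
```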
Proposals a and b would stop deflation among strong players and get rid of a key flaw of Elo's tables, referring to the fact that the coefficients he provided no longer correspond with reality. Those in the middle of the chess pyramid would neither profit nor suffer, but those at the lower levels would lose rating. Two further proposals will help deal with deflation at the middle and lower levels.
c. Changing how a player's first FIDE rating is calculated
Jeff Sonas' proposal about changing players' first FIDE rating (adding two draws against 1800 to the count) seems completely logical and would help deal with deflation at the bottom of the pyramid. Jeff suggested specifically 1800 because he also proposed that 1400 become the lowest rating a player could have. I propose that 1000 remain the minimum rating. Accordingly, 1500 or 1400 should replace the proposed 1800 for a similar effect. I would like to point out that adding these two draws against 1400 or 1500 to a player's initial rating should only be applied for players whose initial ratings would increase as a result.
To some extent, this proposal does violate the principle of fairness, artificially increasing one's first rating, but this could, in my opinion, be considered as a sort of rating advance to often young, new players.
Of course, changing the way in which one's first rating is calculated will not immediately stop rating deflation, but the effects will be felt soon enough. Moreover, this benefits specifically the lower levels of the chess pyramid sooner and more than those stronger than them.
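The "two extra draws, but only if they help" idea from proposal c can be sketched as below. Everything about the performance model here is an assumption for illustration: it uses a normal curve with standard deviation 200 * sqrt(2) instead of FIDE's actual conversion table, it ignores rating floors, and the function names and the default anchor of 1500 are hypothetical.

```python
from statistics import NormalDist

# Assumption: a normal model stands in for FIDE's published conversion table.
_MODEL = NormalDist(mu=0.0, sigma=200 * 2 ** 0.5)


def performance(opponents, score):
    """Average opponent rating plus the rating gap implied by the score fraction."""
    p = score / len(opponents)
    p = min(max(p, 0.01), 0.99)  # keep inv_cdf defined for 0% and 100% scores
    return sum(opponents) / len(opponents) + _MODEL.inv_cdf(p)


def first_rating(opponents, score, anchor=1500):
    """Add two hypothetical draws against `anchor`, but only if that raises the result."""
    base = performance(opponents, score)
    boosted = performance(list(opponents) + [anchor, anchor], score + 1.0)
    return max(base, boosted)


# Hypothetical newcomers: one scoring 1/5 against 1300s, one scoring 4/5 against 1900s
weak = first_rating([1300] * 5, 1.0)
strong = first_rating([1900] * 5, 4.0)
print(round(weak), round(strong))
```

In this sketch the weaker newcomer's first rating is lifted by the two added draws, while the stronger newcomer's is left untouched, because the draws would only pull it down.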
d. Rounding ratings monthly upwards
FIDE calculates changes in rating to the tenth of a point, while up-to-date official ratings, published once a month, are rounded to the nearest point. 0.5 is rounded in the direction of the last rating change, i.e., 1700.5 becomes 1701 if the player's rating grew over the last month and 1700 if it fell.
I propose that ratings always be rounded upwards for players rated <2000. For the majority of them, their K-factor is 20 or 40, and thus their rating could end in .0, .2, .4, .6, or .8. 40% of the time, a player would receive an extra point at the end of the month, as compared to the current rules. For the most active players, who play each month, this change would result in 4.8 (0.4 x 12) additional points a year. On average, I assume this effect would be around 2 points per player each year.
Though the change seems small, it will at least reward more active players.
Rounding to the nearest point is the most common approach to such things, but rounding in the player's favor is not at all illogical.
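The 40% figure and the 4.8 points a year from proposal d can be checked with a few lines. The sketch assumes, as the estimate above implicitly does, that each of the five possible fractional parts is equally likely; it works in tenths of a point to avoid floating-point surprises.

```python
# Possible fractional parts, in tenths of a point, when K is 20 or 40
tenths = [0, 2, 4, 6, 8]


def round_nearest(t):
    """Round a fractional part given in tenths to whole points (no .5 case arises here)."""
    return (t + 5) // 10


def round_up(t):
    """Always round upwards (ceiling division on tenths)."""
    return -(-t // 10)


extra = [round_up(t) - round_nearest(t) for t in tenths]
share = sum(extra) / len(tenths)
print(extra, share, round(share * 12, 1))  # [0, 1, 1, 0, 0] 0.4 4.8
```

Only the .2 and .4 endings gain a point from rounding upwards, which is where the two-in-five share comes from.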
3. Side effects
Of course, it is difficult to predict the impact of such relatively large changes on the whole situation with regards to rating. However, unlike Jeff Sonas' proposals, mine are not irreversible and not nearly as huge as handing out hundreds of points to the majority of players.
In my opinion, all that remains is adapting the tables for norms to the new parameters. This is a purely technical question that can be answered easily and quickly.
Victor Plotkin
FM, FT, CFC FIDE Representative
I placed on Facebook and Twitter/X a link to an article of mine from 2019 that corroborates Sonas's tables and main recommendation in an independent manner, with an update added last week at the bottom:
https://rjlipton.wpcomstaging.com/20...range-horizon/
Hence I largely agree with his recommendations. My two main remarks on Sonas's report and Victor's commentary taken together are:
- The analysis for 2020 to the present needs to be redone using my pandemic lag rating adjustments. The adjustment below 2000 remains completely unchanged from the back-of-the-envelope formula in my July 2021 article https://rjlipton.wpcomstaging.com/20.../pandemic-lag/ Above 2000, I now treat the formula f as a differential df that gets integrated, so the rating increases taper off.
- The analysis of Elo's probability table needs to reference Mark Glickman's analysis of the effect of uncertainty. I've written an old article for popular consumption on this too: https://rjlipton.wpcomstaging.com/20...know-a-secret/