All the people who comment on the cheating scandal of Borislav Ivanov should first read http://www.cse.buffalo.edu/~regan/ch...y/Golfers.html and spend some time trying to understand the ideas behind the algorithms. To summarize, Regan says: "statistical analysis can only be supporting evidence of cheating, in cases that have some other concrete distinguishing mark. When such a mark is present, the stats can be effective, and can meet civil court standards of evidence". By "black mark" or "spot" he means that "the spot can only be physical or observational evidence of cheating, something independent of the consideration of chess analysis and statistical matching to a computer." Regan is clear and precise, as one would expect of a respected academic. Finally, he notes that "One thing that does not constitute a black spot is an unfounded accusation of cheating."
People who cannot or will not understand the statistical principles behind the algorithms are confused. At http://www.cse.buffalo.edu/~regan/chess/fidelity/ Regan states: "The main statistical principle which these pages show has been misunderstood by the chess world is that a move that is given a clear standout evaluation by a program is much more likely to be found by a strong human player. And a match to any engine on such a move is much less statistically significant than one on a move given slight but sure preference over many close alternatives. Case in point, 2/23/09: The mention of Rybka in GM Mamedyarov's protest letter at the 2009 Aeroflot Open was evidently this kind of misunderstanding". And finally, in his words again from http://www.cse.buffalo.edu/~regan/ch...y/Golfers.html: "Note that you already need the full model described in my papers even just to judge this kind of outlier---if you merely get a lot of matching, you may simply have played an unusually forcing game."
The other problem is that right now the cheaters are likely using standard chess engines at full strength. But they could "detune" the engines to play at lower strength, and in this way "simulate" the progression of a chess player who is putting a lot of energy into studying and improving. The cheaters might also take open source chess engines and modify their move selection to add some randomness. A very sophisticated cheater might even add "human-like" randomness, so that the engine makes mistakes similar to those made by people, e.g. choosing a wrong move more often in a very complex tactical position. As the cheaters make their engines play more like humans, it will become more difficult for these statistical methods to spot them. In short, it is hard to see how statistical approaches to catching cheaters will ever be anything other than supporting evidence that requires some other physical or observational evidence of cheating.
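To make the detuning idea concrete, here is a minimal sketch of how randomness could be injected into an engine's move selection. This is not any real engine's code: the candidate evaluations and the temperature parameter are invented for illustration.

```python
import math
import random

def pick_move(evals, temperature):
    """Pick a move index from a list of candidate evaluations (in pawns,
    higher is better) using a softmax weighted by `temperature`.
    A low temperature plays near-perfectly; a higher temperature makes
    the engine choose among close alternatives more randomly, while
    still rarely selecting an outright blunder."""
    weights = [math.exp(e / temperature) for e in evals]
    r = random.random() * sum(weights)
    cumulative = 0.0
    for i, w in enumerate(weights):
        cumulative += w
        if r < cumulative:
            return i
    return len(evals) - 1

# Hypothetical evaluations for four candidate moves: three near-equal
# good moves and one losing blunder.
evals = [0.40, 0.35, 0.30, -1.00]
random.seed(1)
strong = [pick_move(evals, 0.01) for _ in range(1000)]  # near-deterministic
weak = [pick_move(evals, 0.50) for _ in range(1000)]    # "human-like" noise
```

Note how the detuned version still almost never plays the blunder: its mistakes are concentrated among plausible alternatives, which is exactly what makes such play hard to distinguish statistically from an improving human.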
Cheaters in standard tournaments either have an accomplice who communicates the moves to them, or, if they are working alone, they are using a phone to cheat. In the first case simply delaying the move transmission should help, and in the second case metal detectors should find their phones. And of course we can also apply the simple rule that no player may have a phone in their possession. So if the tournament is important, it seems inevitable that these kinds of measures will be necessary; sad but true.
For online chess playing systems the situation is more dire. Here there will never be any physical evidence, so what can be done? I do not believe that the current cheating detection systems in the online world are anywhere near as accurate as Regan's system. As far as I know they do not have his software, and while they may sincerely try to implement his principles, their implementation may also be wrong. And remember, they need to test a large number of players for cheating quickly. So they will not be able to spend as much computer time testing whether someone is cheating as Regan does for the small number of high-profile cases he has worked on. This will likely result in a larger number of false positives, where people are accused of cheating simply because their moves correlate with a chess engine's over a small number of games. This situation has already been reported on Chess Talk.
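The scale of the false-positive problem is easy to show with back-of-the-envelope arithmetic. The population size and per-test specificity below are invented numbers for illustration, not figures from any real site:

```python
# Mass screening and false positives: even a detector with a tiny
# per-player false-positive rate will falsely flag many honest players
# when the pool is large.
population = 500_000          # honest players screened (assumed figure)
false_positive_rate = 0.001   # 99.9% specificity per test (assumed)

expected_false_flags = population * false_positive_rate
print(expected_false_flags)   # prints 500.0
```

Five hundred honest players falsely accused, under optimistic assumptions; this is why a fast, approximate implementation applied to everyone behaves very differently from a careful analysis of a handful of high-profile cases.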
You might say: so what, why does being banned from some online chess program matter? Well, recently FIDE announced its very own online chess system http://www.fide.com/component/conten...ine-arena.html where it makes the following claim: "As you know, there are many chess playing platforms. However, FIDE online arena has a unique feature that completely sets it apart: a highly sophisticated chess anti-cheating system." I wonder what Professor Regan thinks of this assertion? Where is the evidence that it is true? You might say: so what, I do not need to use this system, I can ignore it. But such a system may trap the unwary.
Consider the following case: someone innocently plays for a while on the FIDE online system and then its anti-cheating software falsely accuses them of cheating. Is it not possible that this will eventually affect their long time control FIDE rating, or even their FIDE status? I am sure all the people programming and running these online cheating-detection systems have utter confidence in their anti-cheating technology. My response is: so what? Where is the evidence that the FIDE system meets Regan's requirement that "Allegations need to be presented and tested with scientifically rigorous methodology, open for peer review"? My answer to FIDE's claim about the accuracy of its online system is: "please show us some evidence!"
http://www.nytimes.com/2012/03/20/sc...f=science&_r=0 is an example of how complex scientific concepts can quickly become "muddled" as they pass from the expert into the public domain. This article in the New York Times, commenting on Regan's system, uses the word "proof" casually, in a way that someone with even a rudimentary mathematical background never would. The word proof has a precise meaning in the mathematical world. Statistical evidence is not a proof. However, a number of independent items of statistical evidence may be convincing beyond a reasonable doubt. We see from the false accusations made by people with strong chess backgrounds, such as GM Mamedyarov, that being a chess Grandmaster does not by itself qualify one to deal with these issues.
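The difference between one piece of statistical evidence and several independent pieces can be made concrete with a standard textbook technique, Fisher's method for combining independent p-values. This is a generic statistics illustration, not Regan's actual model; the p-values are invented:

```python
import math

def fisher_combined_p(p_values):
    """Fisher's method: combine k independent p-values into one.
    The statistic -2 * sum(ln p) follows a chi-square distribution
    with 2k degrees of freedom; for an even number of degrees of
    freedom the tail probability has a closed form (Erlang tail)."""
    k = len(p_values)
    half = -sum(math.log(p) for p in p_values)  # x/2 where x = -2*sum(ln p)
    return math.exp(-half) * sum(half ** i / math.factorial(i) for i in range(k))

# One moderately suspicious game is unremarkable on its own...
print(fisher_combined_p([0.05]))
# ...but five independent such games are jointly very unlikely by chance,
# while five unremarkable games stay unremarkable.
print(fisher_combined_p([0.05] * 5))
print(fisher_combined_p([0.5] * 5))
```

Evidence accumulates only when the individual items are genuinely independent; this is precisely the distinction between "beyond a reasonable doubt" and a mathematical proof that the newspaper coverage blurred.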