Talk:ELO rating system

The precise statistical model behind the ELO system, and the way its parameters are estimated, are difficult to find on the internet. Therefore I would much appreciate seeing them on this page, especially since it should take only a couple of lines.

Done, roughly speaking. It's not clear what the precise model is, since Elo himself waffled between the normal and logistic curves. Moreover, the implementation of the model varies significantly from one organization to the next. Finally, it should be noted that it is a stretch to label these adjustments of ratings up and down as statistical estimation. Yes, there is a model, but adding and subtracting points on a game-by-game basis is a klutzy way to estimate anything, and highly unlikely to be used in any real statistical application.
The rating systems in place today are a political compromise between mathematicians who would like to estimate hypothetical parameters accurately and players who want each game to be a fight over the rating points they win and lose. Players seem to prefer being able to say, "I beat that guy four games straight and took 45 points from him," as opposed to being able to say, "My rating is accurate to the third digit." They don't want accuracy, they want to win and lose points. That way they have something to fight for every single game, even if they are not in contention to win a given match or tournament. --Fritzlein 20:19 28 Jun 2003 (UTC)
Can't they fight over fractions, or floating points (:)), instead? lysdexia 17:12, 12 Nov 2004 (UTC)

His name

Tidbit: "élő" means "living" in Hungarian. --grin 19:45, 2004 Apr 6 (UTC)

Depth of something ranked with ELO?

I removed the section below from the article, as I can't find any information about this concept elsewhere... can anyone provide a cite? -- The Anome 14:16, 12 Sep 2004 (UTC)

The ELO rating range also says something about the "depth" of a game. The total depth of a game is defined by the two end points of the possible range of skill, from the total beginner to theoretical best play by an infallible, omniscient player.
Neither is easy to establish. Is someone who has just heard the rules already a beginner, thereby setting the lowest standard, or does it take several games until one has absorbed the rules and is able to play on one's own? At the other end of the range one simply has to take the best player at a given time. In Go, the total beginner who can nonetheless play unaided according to the simple rules can safely be set at 30 kyu. Theoretical best play could correspond to the strength of an imaginary 13 dan, according to measurements of standard deviations among professional games.
Taking only 20 kyu and 9 dan as endpoints already makes Go a very deep game. A rating difference of 2900 ELO points between Gu Li and a 20 kyu with 100 ELO points is a difference in insight into the game of 29 times the standard deviation (100 ELO points).
Chess, in comparison, has a similar upper endpoint (Garry Kasparov, once rated 2851 points; see above), yet its standard deviation is set at 200 ELO points. The comparison is harder because of draws, but it gives chess a depth of (only) 14 layers of standard deviation if the total beginner in chess had a rating of zero ELO points (which s/he has not, AFAIK).

I remember reading something similar to this in Chess magazine (London) probably about eight or nine years ago, but I don't have a cite (I've a feeling it was in one of Fox and James' columns, but can't be sure). If I remember correctly, it reported a study which had counted the number of steps one needed to take in a number of games to get from the weakest player in the world to the strongest, where each intermediate player could score 75% against the one below. Go had the most steps by far (and so was considered the most "deep" or "difficult" game); chess was second; various other things were also considered (checkers I remember was in there, backgammon too, I think). But in any case, I'm not sure something like the above really belongs in this article: it's not about the Elo system per se; the Elo system is just being used as a tool to measure the "depth" of chess. Perhaps a mention could be made in the chess or Go articles or in some new comparison of chess and go article. --Camembert

Sorry I didn't chip in on this topic before. Yes, the ELO system has certainly been used to measure the depth of games in the manner described by the paragraphs which were removed from the article. By this measure go is a deeper game than chess, after which checkers, bridge, and poker follow in close succession. However, there is a serious problem in comparing chess to games like bridge and poker: how many hands of the latter are equal to one game of chess? The luck involved in cards means that it may take a whole evening for the superior skill of one player to manifest itself. Also there is a question of the margin of victory, as one big pot in poker can cover lots of small losses.
I think the appropriateness of this section for the article is marginal, because the fundamental concept is not really that of statistical estimation, but that of a "class interval" being a difference in skill such that the stronger player can win 75% of the time. For different games the statistical model may be different. I believe that for go tests have shown that the normal curve approximates performance better than the logistic curve. When two games use a different model it is a stretch to say that you are comparing the range of ELO ratings in each case. On the other hand, the notion of measuring the depth of a game by the number of class intervals is an interesting topic in its own right, and deserves to be covered somewhere in Wikipedia. Maybe it makes more sense for it to be attached to this article than to be put anywhere else?
Oh, and the explosion of scholastic chess in the U.S. has indeed given rise to ratings of zero. It shouldn't be too surprising that a random 6-year-old with no special gift for that game can play that badly. But if you include a zero rating in chess, you have to go down to something like 35 kyu or lower in go. Furthermore the tradition that 9-dan is the highest rank doesn't allow ratings on the upper end to expand as much as they should. Therefore, if we measure chess in a way that shows 15 class intervals, then a comparable measurement in go may show 45 or more class intervals. No matter how you slice it, the class interval measurement asserts that go is vastly deeper than chess. --Fritzlein 16:18, 14 Nov 2004 (UTC)
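For anyone who wants to check the arithmetic in this thread, here is a rough sketch of the class-interval counting. It assumes the logistic form of the model on the usual 400-point scale; the function names `rating_gap_for_score` and `depth` are made up for illustration, and the rating ranges are simply the figures quoted in the removed section, not independent data.

```python
import math

def rating_gap_for_score(p, scale=400.0):
    """Rating gap at which the stronger player's expected score is p.

    Assumes the logistic curve; the normal curve would need a different formula.
    """
    return scale * math.log10(p / (1.0 - p))

def depth(rating_range, interval):
    """Number of class intervals that fit into a rating range."""
    return rating_range / interval

# A 75% expected score corresponds to a gap of roughly 191 points,
# which is close to the conventional 200-point class interval.
print(rating_gap_for_score(0.75))

# Figures quoted in the removed section: go with 100-point steps,
# chess with 200-point steps starting from a hypothetical zero rating.
print(depth(2900, 100))  # 29.0
print(depth(2851, 200))  # about 14.26
```

Note that the two games are being divided by different interval sizes, so the comparison is only as good as the assumption that one interval means the same skill gap in both.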

Glicko system?

Do we have an article about the Glicko rating system, which is gaining popularity? Apparently Glicko-2 could replace Elo one day.--Sonjaaa 02:26, Jan 31, 2005 (UTC)

Glickman's system has real advantages over the current clunky implementations of Elo's model, but that's not enough to make it a likely replacement. Are you suggesting that the USCF might adopt it any time soon? If so, you know more about USCF politics than I do. I was under the impression that the USCF ratings committee was a fairly conservative body. Or is ICC making the switch? Last I knew (and I confess to being out of date) only FICS was using Glicko ratings. Who else is jumping on the bandwagon? --Fritzlein
While the idea that some players have a better determined rating than others is appealing, and may be useful in other sports, actual sports organizations penalize inactivity by taking away points over time, rather than increasing the rating "uncertainty". The Elo system has theoretical underpinnings that make it a true statistical estimator, at least when K is set sufficiently low. But so far there has not been any indication that Glicko is actually an improvement in terms of its predictive ability. Glicko-2 is even less well motivated than Glicko: it has both a rating deviation, RD, and a rating volatility <math>\sigma</math>. I believe that both systems can probably be manipulated by a group of conspirators fixing games against each other in such a way as to drive the ratings up for one of the participants.--Kotika
Glickman is a statistician, so it isn't surprising that he thinks improvements in the rating system will come from doing better statistics on the same data. Unfortunately for his project, the underlying model IS NOT QUITE TRUE. Adding layers of refinement to the estimation technique is akin to finding the radius of the earth to the tenth digit: eventually you must face the fact that the earth is not truly spherical (It is wider at the equator than at the poles.), so extra digits of accuracy in the radius have no meaning.
The most compelling evidence that the Elo model doesn't hold true comes from the on-line chess servers. The blatant counter-example to the truth of the model is computer players, but subtler proof comes from the distortions of ratings that arise from players being able to select their opponents, favoring some and avoiding others. It is no coincidence that many ICC members consider the only accurate ratings on the server to be those from which computer players are barred and the games are paired randomly by the server rather than by choice of the participants themselves.
My opinion is that, since the underlying model is false, it is misguided to focus on more accurate estimation. Rather one should focus on the concern Kotika raises, namely rating manipulation. One's primary focus should be to minimize the opportunities for participants, either singly or in collusion, to distort their ratings, particularly opportunities to inflate their ratings. I suspect that Kotika's imputation is not quite right, i.e. I suspect the Glicko system is if anything slightly less vulnerable to manipulation than plain vanilla Elo ratings. But I do think Glicko's energy is somewhat misdirected. In practice, the biggest accuracy problems with the Elo system don't come from the klunky estimation technique, they come from the model being wrong, and from clever people exploiting the wrong model to cheat the system. --Fritzlein 16:35, 27 Mar 2005 (UTC)

Elo for Multiplayer games??

Is there a version of Elo, or a different rating system that's ideal for rating multiplayer games like Scrabble or what not?--Sonjaaa 13:01, Feb 26, 2005 (UTC)

Scrabble is considered a two-player game by serious Scrabble players, because the multiplayer version is hugely influenced by the order of play, so much so that it seems impossible to make multiplayer Scrabble fair enough for tournament play. Nevertheless your question is valid for true multiplayer games like Diplomacy. There is a natural extension of Elo's basic formula for expected number of wins, which can be expressed on the same logarithmic scale Elo chose, i.e. 200 points for a class interval. If there are N players with ratings R1, R2, ... RN, then the expected wins for player I would be 10^(RI/400)/[10^(R1/400) + 10^(R2/400) + ... + 10^(RN/400)]. Based on this model, one can produce ratings estimates from game results in a variety of ways, including simple linear adjustments parallel to Elo's suggestion for chess.
The validity of this method for any given multiplayer game is very much open to question, but I have never heard of anything better. At least this extension of Elo is plausibly fair to all players. --Fritzlein 04:03, 27 Feb 2005 (UTC)
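A minimal sketch of the formula above, for readers who want to try it. The expected-wins function follows the expression given in the comment; the linear update and the K-factor of 32 are assumptions added for illustration (the comment only says "simple linear adjustments"), and the names `expected_wins` and `update` are hypothetical.

```python
def expected_wins(ratings, scale=400.0):
    """Expected share of wins for each player in one multiplayer game.

    Implements 10^(Ri/400) / sum_j 10^(Rj/400). Ratings are shifted by the
    maximum first, which leaves the ratios unchanged but avoids huge powers.
    """
    top = max(ratings)
    strengths = [10 ** ((r - top) / scale) for r in ratings]
    total = sum(strengths)
    return [s / total for s in strengths]

def update(ratings, winner, k=32.0):
    """One linear adjustment after a game won by the player at index `winner`.

    k=32 is an assumed K-factor, not something fixed by the formula.
    """
    expected = expected_wins(ratings)
    return [
        r + k * ((1.0 if i == winner else 0.0) - e)
        for i, (r, e) in enumerate(zip(ratings, expected))
    ]

ratings = [1200.0, 1600.0, 2000.0]
print(expected_wins(ratings))
print(update(ratings, winner=0))
```

With two players this reduces to the ordinary Elo expectation, and the adjustments always sum to zero, so no points are created or destroyed by a game.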

First of all, I LOVE DIP TOO! I was actually thinking of using it for games like Settlers or Carcassonne or Ticket to Ride in our group of friends. But anyway, what about this idea suggested by a friend: If player A wins against B and C, then the Elo is calculated as if it were 2 games: A beats B, A beats C. Is that any mathematically better or worse than the one you mention?--Sonjaaa 08:18, Feb 27, 2005 (UTC)

Ah, your idea is also superficially reasonable, and in fact it is what Yahoo Games uses for hearts. The winner is assumed to have beaten all three opponents at individual games. However, it is not at all mathematically equivalent to what I propose, and I don't like it one bit, because your rating adjustment depends on who you lose to. This unbalances the incentives and places the players on an uneven footing in the meta-game of ratings.
Let's say we are playing Settlers. I am rated 1200, you are rated 1600, and Jughead is rated 2000. Now it turns out that late in the game I am about to win (lucky dice), Jughead is close behind, but you have slim chances yourself. You do a quick mental calculation and see that if I win you will lose 29 rating points to me, but if Jughead wins you will lose only 3 rating points to him. Therefore you abandon your own slim chances and give all of your resource cards to Jughead for free, and otherwise try in every way to help him win instead of me.
That shouldn't happen. When you sit down to play you should know that you win X points for winning and lose Y points for losing no matter how the other players fare, so you have no incentive to favor anyone. Buz Eddy realized this when he made his Maelstrom ratings for Diplomacy using the extension of Elo ratings I first mentioned, and I haven't seen it improved upon. --Fritzlein 17:02, 27 Feb 2005 (UTC)
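The Settlers numbers above can be reproduced with a short sketch. It assumes a K-factor of 32, which is not stated in the comment but is the value that yields the 29 and 3 quoted there; the function names are hypothetical. The last function shows, for contrast, the loss under the multiplayer extension, which is the same whoever wins.

```python
K = 32.0  # assumed K-factor; chosen because it reproduces the quoted 29 and 3

def expected_score(r_a, r_b):
    """Standard two-player Elo expectation for the player rated r_a."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))

def pairwise_loss(loser, winner, k=K):
    """Points the loser gives up when the result is scored as a one-on-one game."""
    return k * expected_score(loser, winner)

def multiplayer_loss(ratings, loser_index, k=K):
    """Points a non-winner gives up under the multiplayer extension.

    The result depends only on the field, not on which opponent won.
    """
    strengths = [10 ** (r / 400.0) for r in ratings]
    return k * strengths[loser_index] / sum(strengths)

# The 1600 player under the pairwise scheme:
print(round(pairwise_loss(1600, 1200)))  # 29 if the 1200 player wins
print(round(pairwise_loss(1600, 2000)))  # 3 if the 2000 player wins

# The same player under the multiplayer extension, whoever wins:
print(round(multiplayer_loss([1200, 1600, 2000], 1), 2))  # 2.88
```

The gap between 29 and 3 is the incentive to throw the game to the stronger opponent; under the multiplayer formula that gap disappears.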

Confusion About The Confusion

Is there really any likelihood that the "ELO rating system" will be confused with the acronym for the 70's band "Electric Light Orchestra"? --BadSanta

It seems to me that the disambiguation at ELO should be enough, since to even get to this page you have to say something about rating systems. Does the Electric Light Orchestra page need a link to this one for people who want to know about chess ratings?