Friday, January 13, 2012

Analysing the Distribution of Gamespot and Metacritic Scores

So, I spent an hour today graphing the distribution of Gamespot review scores as a line chart. What we find is rather interesting.

Naturally, we see a spike at every whole and half number, which may be attributed to Gamespot's previous practice of not using decimal values. But we also see that the chart is skewed, with 7.0 being both the median and the mode.
So does this mean that there are 70 degrees of badness, but only 30 degrees of goodness? In a symmetric distribution, we would expect a traditional bell curve, with 5.0 as the median. Perhaps this means most games are actually better than average? Or perhaps Gamespot simply likes to give 7.0s to mediocre games?
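To make the "7.0 is both the median and the mode" observation concrete, here is a minimal sketch of how those summary statistics are computed. The score list below is hypothetical, invented to mimic the skew described above; it is not the actual Gamespot data.

```python
from statistics import mean, median, mode

# Hypothetical sample of review scores (NOT the real Gamespot data),
# deliberately skewed toward 7.0 like the distribution in the chart.
scores = [4.5, 5.0, 6.0, 6.5, 7.0, 7.0, 7.0, 7.5, 7.5, 8.0, 8.5, 9.0]

print(median(scores))  # 7.0 — the middle value
print(mode(scores))    # 7.0 — the most common value
print(mean(scores))    # ~6.96 — dragged below 7 by the low tail
```

With a skew like this, the median and mode sit at 7.0 while the mean lands somewhere nearby, which matches the shape of the chart.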

Personally, I believe this goes back to the American education system, which labels 70% as a "C," or "average." There's nothing inherently wrong with this metric, since it's fairly consistent across districts, but after 12 years of learning that "70% is average," these students grow up and carry that mindset throughout their lives. Thus, a score of only 50%, which SHOULD be average, reads as an "F," and thus not worth their time.

Of course, this information and analysis is nothing new, but I think seeing it so clearly makes for an interesting conversation piece.

What are your thoughts?

(UPDATE!) By request, I did the same thing for the review aggregation site Metacritic. These are the results:

And here are the two data sets overlaid:

We can clearly see a much more even distribution of scores between 55 and 85, without spikes at the halves. However, we still see a skewed bell curve, with a mean around 73. While this suggests Metacritic's scores are considerably more trustworthy, since they are collected from many sources around the internet, the data still comes from the same flawed "70% is average" paradigm.


  1. The issue is that game rating is completely subjective, meaning that it is impossible to have a true median. I personally feel that numerical scoring is worthless, and prefer sites like RPS that analyze the game and let the reader make their own decision.

    1. Well, I think that you're a huge nerd.

      O'Doyle rules!

    2. You call him a huge nerd? Really? That's the best you've got? Fuck off, troll.

  2. I think that many truly terrible games (sub-5.0 rating) don't even make it to market, or are badly distributed.

    Therefore, the computed average score should be higher than the expected average (5.0).

  3. I considered that, but wouldn't that mean our perception of "average" would be shifted, since we wouldn't have as much knowledge of sub-par games?

  4. In the UK education system, the C grade is centred around 50% (sometimes skewed/weighted a bit higher or lower depending on the average for that exam that year). But from what I've experienced, UK-based game review sites still exhibit this clustering around 70%.

  5. I don't like the statement that bad games never make it to market meaning we should have an inflated table. Aborted malformed children never make it to market, yet we seem to have no problem with our rating of men/women being centered around a 5.

  6. A bit of an offensive way of putting it, but I totally see your point.

  7. On a 0-10 scale, 5 is the midpoint. It does not necessarily have to be the mean, though.

    When you use a standardized scale, yes, you would want the mode, median and mean to all converge at the center, which would be 5 on a ten point scale.

    However, if the scale is not standardized, and I do not believe there is a standard for game reviews (or every single test taken in the public education system for that matter), the fact that the average is not the median is simply not a problem.

  8. Results aren't showing up for me