Data Official Smogon University Usage Statistics Discussion Thread, mk.3

Status
Not open for further replies.

New gen, so let's go ahead and start a new thread.

The previous rules still apply, namely:

NOTE: DISCUSSION IN THIS THREAD WILL BE LIMITED TO STATISTICS CALCULATIONS, CLARIFICATIONS AND OVERALL TRENDS. DISCUSSIONS OF INDIVIDUAL POKEMON WILL BE DELETED (each Pokemon has its own thread--discuss there). POSTS THAT SIMPLY QUOTE OR REFERENCE STATISTICS WITH PERSONAL COMMENTARY WILL BE DELETED. POSTS DISCUSSING HYPOTHETICAL LOWER TIERS WILL BE DELETED.

and I'll announce each month when the stats are "up."

Feel free to ask any questions you have about how things are calculated, but be sure to first check the FAQ directly below this post.

Enjoy, data junkies!

Link to previous stats discussion thread
 
Frequently Asked Questions
  1. Where are the stats?
    Dude, the link is right at the top of the thread, in bold, no less, but in any case: http://smogon.com/stats/ From there, navigate to the desired month.
  2. Why aren't the stats up for the month yet?
    Because PS sees upwards of 10 million battles a month, and processing everything takes time. Sometimes I'm able to speed things up by preprocessing the stats from the start of the month in advance, but if there's ever a bug, I usually have to start from scratch. That means while the stats for the month could be up as early as the 4th, it could be as late as the 15th. Please don't PM / VM me. They'll be up as soon as possible.
  3. Where are the moveset stats / metagame stats / lead stats / mega stats / changes since last month?
    Respectively, in the "moveset," "metagame," "lead," "mega" and "changes" subfolders of each month.
  4. What's this business with "Raw" and "Real?"
    Jimera0, yeah--I should really include just a bit of text at the top of each Standard Stats post...
    • Usage % : Weighted
    • Raw: Unweighted
    • "Real": Only counts the Pokemon which actually appear in battle (Doubles not supported)
    The reason for the name "real" is historic--back when I first took over the stats and then the running of PO, only the Pokemon that appeared in battle were recorded in the logs, so there was no way to actually *get* the full team stats. When I modified PO to generate logs with full team info in them, we were left with a decision regarding which stats to use, and the argument was that counting only Pokemon appearing in battle was somewhat more legit, because that corresponded to actual, or "real" usage (that argument lost out in the end).
  5. How are usage stats weighted?
    Every player on Pokemon Showdown has a skill rating for each metagame they participate in. This rating--which is different from your ladder score--is calculated using an algorithm called Glicko and consists of an estimated skill value R and an uncertainty in that estimate RD. Based on these two values, we calculate the likelihood that a given player has a "true" skill value above a certain baseline (the conventional baseline was 1500, corresponding to the "average" player). For more about ratings, read here. For more about weightings, read here. Note that, starting with the May stats, if a player has an RD greater than 100, and the baseline is above 1500, then their team is not counted in the stats. Note further that it typically only takes about 5 or 6 battles to get one's RD below 100.
  6. How are tiers determined from usage?
    Tiers are based off a predictive algorithm designed to estimate how often a Pokemon will appear in the next month's usage statistics, based on the usage stats for the past three months (we update our standard tiers every three months). So we start by weighting the last three months' stats like this:
    Code:
    Three month usage= (20x last month + 3x month before that + 1x month before that)/24
    then the "OU" list for that metagame consists of all the Pokemon who appear on at least ~3.41% of teams, which is not as random a number as it might seem. Note that suspect tests are designed to move Pokemon into the Borderline ("BL") tiers, which, like Ubers, are not based on usage statistics.

    As for which stats are used to determine the tiers, we're currently using a baseline of 1695 for OU, 1630 for all other tiers.
  7. Why does "Illuminate" sometimes show up in the abilities section of the moveset stats for Pokemon that can't have Illuminate as an ability?
    "Illuminate" is my placeholder for "no ability," or an ability that simply isn't recognized. This kind of situation happens when Showdown glitches out and should be exceedingly rare. Note that the nature equivalent is Hardy (though all five neutral natures are also aliased to Hardy) and the item equivalent is "nothing" (though that could also correspond to no item).
  8. What's the deal with the file names?
    You'll notice that for each tier and type of analysis, there are a bunch of different files, most with names like uu-1630.0.txt. The first part of the filename is the tier, the second part is the weighting baseline (see (5)). If there's no number following the tier name, then the baseline is 1500. Also note that a baseline of 0.0 means that the stats are basically unweighted.
  9. How should I think about Baseline-0 vs. 1500 vs. 1630/1695 vs. 1760/1825 stats?
    • Baseline-0 (unweighted) stats represent everything in the format, no matter how lulzy the player or team. This is what you'd expect to encounter if we stopped doing matchmaking.
    • 1500 (no extension) stats represent what the average player in the metagame sees. Since Showdown's playerbase is more than just Smogonites, this is considerably "below" what the average person reading this thread sees.
    • 1630 (1695 for OU) stats represent "standard" stats, what the typical competitive player should see and be prepared for.
    • 1760 (1825 for OU) stats represent "1337" stats, what the best-of-the-best in the metagame are doing. To some extent, this is what all players should strive to be doing, but there are some Pokemon and strategies that are difficult to pull off and might require a greater amount of skill than the typical competitive player possesses.
  10. Why are the OU stats for 1695 and 1825 instead of for 1630 and 1760?
    OU, aka "Standard," is, well, our standard tier. It sees more battles than any other format and has the largest playerbase (second only to randbats). It also has the smallest fraction of "competitive players" of all non-random formats, due to its prominence and easy accessibility. Since our rating systems are percentile-based (that is, a rating of x roughly corresponds to being better than y% of the ladder, rather than indicating that the player is the nth best in the metagame), that means that it's a lot easier to get a rating of 1630 in OU than it is in UU or LC. Because of that, and because OU has a larger pool of battles to work with, we can up our baseline to 1695 for the "standard" stats. Similarly, while 1760 is the usual value we use for "elite" stats (the best of the best), the number that works better for OU is 1825.
  11. What's the best way to make use of the moveset stats?
    • If you're trying to figure out what's good in a tier (in terms of movesets), 1760/1825 is probably the way to go, since that tells you what the very top players use on their Pokemon.
    • If you want to determine what the likelihood is that your opponent's Pokemon carries X move or Y item, consult the moveset stats closest to your own Glicko R rating.
    • If you're having trouble dealing with a certain Pokemon and are looking for checks/counters, consult the 1500 (or even possibly the 0) stats: the lack of "1337"ness is vastly preferred to the sheer lack of data you encounter when you get that high.
  12. Can you make an analysis of the win rate for a Pokemon / team type?
    W/L ratio is a horrible metric on PS, because the ladder tries to pair players of similar skill levels. In a perfect world, everyone's W/L ratio would be 50/50, regardless of skill, because you'd always be playing people on your own level. You also don't want to use winrate or average rating or anything like that to measure a Pokemon's effectiveness, because then all you're measuring is how popular the mon is among n00bs. You might want to read into this a bit: http://www.smogon.com/forums/thread...re-of-how-far-a-pokemon-can-take-you.3546373/
  13. Can I perform my own analyses?
    Due to privacy concerns, I can't give you access to the raw logs, but if you have background with a programming language that can parse json, take a look in the "chaos" folder of each month's stats. Those files contain all the information used to generate the moveset statistics and include a lot more data than I could feasibly put into a file. I'm also working on a rewrite of the stats analysis called Project Onix, so if you think your analysis deserves to be part of the monthly suite, or you'd like to contribute to the project (<3), feel free to check it out or submit an issue.
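
For anyone who wants to play with the numbers from FAQ items 5, 6 and 8, here is a rough Python sketch. It is not the actual stats code. Assumptions (mine, not stated in the FAQ): the weight is taken to be the probability that a normally distributed skill with mean R and standard deviation RD exceeds the baseline; the ~3.41% cutoff is reproduced as the usage at which you would have a 50% chance of meeting a Pokemon at least once in 20 battles; and all function names are made up for illustration.

```python
import math

DEFAULT_BASELINE = 1500.0  # FAQ 8: no number in the filename means 1500


def parse_stats_filename(name):
    """Split e.g. 'uu-1630.0.txt' into ('uu', 1630.0)."""
    stem = name.rsplit(".", 1)[0]  # drop the file extension
    tier, sep, baseline = stem.partition("-")
    return tier, float(baseline) if sep else DEFAULT_BASELINE


def glicko_weight(r, rd, baseline):
    """Assumed weighting: P(true skill > baseline) for skill ~ Normal(r, rd)."""
    if rd > 100 and baseline > 1500:
        return 0.0  # FAQ 5: such teams are not counted at all
    return 0.5 * (1.0 + math.erf((r - baseline) / (math.sqrt(2.0) * rd)))


def three_month_usage(last, previous, oldest):
    """FAQ 6: weight the last three months 20 : 3 : 1."""
    return (20 * last + 3 * previous + 1 * oldest) / 24


# Assumed origin of ~3.41%: solve 1 - (1 - p)**20 = 0.5 for p.
OU_CUTOFF = 1 - 0.5 ** (1 / 20)
```

Under these assumptions, a player rated exactly at the baseline gets a weight of 0.5, and a baseline of 0 gives any realistic rating a weight of essentially 1, which is consistent with baseline-0 stats being "basically unweighted."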
More to come!
 
I was wondering if computing a Shapley value in a tier like Monotype would be relevant given how the ladder works. I think it's interesting to see how important a Pokemon can be for its type, and the Shapley value applies well to Monotype since it's basically 18 sets of players facing each other in a cooperative game.

Thanks for providing the usage stats. I rely on them a lot when I build my teams (even if this month's Monotype stats are for the old ladder), and I'm looking forward to more Pokemon analytics!
 
I have downloaded the data, but I have some questions about understanding and normalizing the data. (I'm currently reading ou-1850.json)
  • I see 'usage', which sums up at 6. Does Garchomp's 12% usage imply it's in 12% of the teams? (weighted, of course)
  • Garchomp's 'Moves', however, sums to ~2572. Am I right to assume I should divide any move's value by 2572/4 to get a probability?
  • Same as above for 'Items', 'Abilities' and 'Spreads' (which all add up to 2572/4, so I'm somewhat confident in assuming this)
  • 'Teammates' has negative values for some Pokemon. Is it some kind of modifier? For example, Magnezone has 89. How would I get P(Magnezone|Garchomp)? Is it simply (1+89%)*P(Magnezone)?
 
Nailec:

Not adding any new features or metrics at this time. You might want to consider adding this feature request as an issue on GitHub: https://github.com/Antar1011/Onix/issues I haven't had any time for dev lately, but when I resume work on the stats rewrite, that will be where I'll be looking for stuff like this.

West--all excellent questions.

To get a Pokemon's total usage, sum all their abilities. The divisor (for getting percentage) should be in the top-level metadata.

Not all mons have all moveslots filled, but moves cannot be repeated. That should be enough for you to do your calculations.

Teammate stats, IIRC, are P(Y|X) - P(Y). That is, the fraction of teams with X that also have Y, minus the fraction of all teams with Y.
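
The answers above can be sketched in Python. This is only an illustration on a hand-made entry: the key names 'Abilities' and 'Moves' are the ones quoted in this thread, the numbers are invented, the divisor is passed in as a parameter because the name of the metadata field isn't given here, and the function names are hypothetical. The teammate helper takes the recollection above at face value (fraction of X's teams that also carry Y, minus Y's overall fraction) and assumes the stored value is already a fraction; if the file stores weighted counts instead, the value would first need to be divided by the Pokemon's own weighted count.

```python
# Toy entry shaped like the keys mentioned in this thread; numbers are made up.
garchomp = {
    "Abilities": {"Rough Skin": 600.0, "Sand Veil": 43.0},
    "Moves": {"Earthquake": 610.0, "Stealth Rock": 400.0, "Swords Dance": 250.0},
}


def weighted_count(entry):
    """Total weighted usage: every set has exactly one ability, so sum those."""
    return sum(entry["Abilities"].values())


def usage_fraction(entry, divisor):
    """Share of teams; `divisor` comes from the file's top-level metadata."""
    return weighted_count(entry) / divisor


def move_fraction(entry, move):
    """Moves can't repeat on a set, so this is the share of sets with `move`."""
    return entry["Moves"].get(move, 0.0) / weighted_count(entry)


def teammate_conditional(teammate_value, teammate_usage):
    """Invert value = P(Y|X) - P(Y), assuming both terms are fractions."""
    return teammate_value + teammate_usage


print(round(move_fraction(garchomp, "Earthquake"), 3))
```

A real file from the "chaos" folder could be loaded with the standard json module and its per-Pokemon entries passed to the same helpers, provided the layout matches these assumptions.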
 
So here's the sitch: when Pokebank was released, PS decommissioned the pre-bank tier and renamed the "pokebank" tier to be "gen7ou" (e.g.). This means that "gen7ou" is a combination of pre-bank and post-bank data, and the "pokebank" tier doesn't contain any data from post-pokebank release.

I'm going to rerun my scripts to try to combine all "pokebank" data and exclude post-bank from the "pre-bank" stats, but (1) it's going to take a few days and (2) it's not going to be perfect, since the change happened midway through the 26th, and I have no feasible way of separating stats that are of the same tier name and on the same day.

That being said, I'm going to go ahead and post all the stats I have in their current state and just expect to get a bunch of questions regarding why X is in the pre-bank stats.

I'm also not going to hold off on updating the UU banlist, because it's rare for five days' worth of data to change much. When my rerun finishes, I'll update the OU list as appropriate.
 
Nailec, Yeah, they should generate next month.
For the separate Monotype reports, how does it decide which type in case of splits (e.g., a weird team with just Gliscor + Landorus, or Gyarados + Mantine + Pelipper, or Swampert + Gastrodon + Quagsire)? Does it count for both? Random? Neither? Has it just never come up?
 
I'm confused - why did RU alpha lose mons then?

Edit: It also seems RU Alpha gained Gigalith (who is now usable on the ladder, and wasn't before).
 
DTC, according to this thread, RU Beta doesn't start until next month. I wasn't planning on doing shifts for RU Alpha. If you guys feel improperly served by what's available (the stats rerun should be finished soon), let me know how I can help.

Quoting from Sam's thread: "So the plan is essentially to let Alpha tiers be based off of Beta stats of the above tier, and Beta tiers to be based off of the stats of the first month the tier exits Beta."

Doesn't this mean RU Alpha should be based off of UU Beta's stats (aka, this month's stats)?
 
Okay, UU (and RU, for good measure) stats have been "combined." OU is also clearly split between pre- and post-bank.

Something similar to what I did with OU should have happened with the non-usage-based tiers which were affected by Pokebank (that is, Anything Goes, Ubers, Doubles [OU / Ubers] and LC), but I don't see that reflected, so I'll have to figure out what happened there.

But for you RU folks, you should feel safe using these stats to define your UU/RU cutoff.
 
Does it matter to you if RU uses the all month 1695 stats? http://www.smogon.com/stats/2017-01/gen7uubeta-1695.txt

I don't like the idea of basing it off of such a short time period oo
 
Ok, the 1630 / 1760 stats are from the old run (read: only from after bank got released). They should have been deleted. I accidentally generated 1695 / 1825 instead of 1630 / 1760. Fixing now. Should have everything cleaned up by morning. Still don't understand what happened with Doubles / Ubers / LC.
 