The decision to base UU off of 1760 stats

Status
Not open for further replies.
Isn't it possible that 1760 is too high to be deciding tiers off? I'm speaking as somebody who is really frustrated by this change: with my rating of 1691±42, I was weighted 1.0 previously but now only 0.05. I have been on the OU top 500 list twice this generation, so it seems silly that I now barely count at all for tiering.


Edit: on this thread you say "In the short-term, the best and easiest solution is simply to raise the cutoff to the level we believe corresponds to the strength of the "average competitive" player." But in the usage stats thread you say that 1760 corresponds to the top 2% of the ladder, which would seem to be a substantially higher cutoff than an "average competitive player."
 
Isn't it possible that 1760 is too high to be deciding tiers off?
Yes. It's entirely possible. Next month I'm going to have a talk with the councils about it, based on the analytics I generate.

I'm speaking as somebody who is really frustrated by this change: with my rating of 1691±42, I was weighted 1.0 previously but now only 0.05.
While your WEIGHT is less, your CONTRIBUTION is actually greater:

1.0/1,420,000 ≈ 7.0x10^-7 < 1.2x10^-6 ≈ 0.05/40,800
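
To make that arithmetic concrete, here is a minimal Python sketch. It assumes that 1,420,000 and 40,800 are the total weights summed over the whole ladder under the old and new schemes, which is what the comparison implies but is not spelled out here. The name `contribution` is made up for illustration.

```python
def contribution(weight: float, total_weight: float) -> float:
    """Share of the overall stats attributable to one player's weight."""
    return weight / total_weight


# Figures taken from the comparison above; the denominators are ASSUMED
# to be the summed weights under each scheme.
old = contribution(1.0, 1_420_000)   # weighted 1.0 under the old scheme
new = contribution(0.05, 40_800)     # weighted 0.05 under the 1760 cutoff

print(f"old share: {old:.2e}")
print(f"new share: {new:.2e}")
print(f"new / old: {new / old:.2f}")
```

Under those assumptions the new share comes out roughly 1.7 times the old one, even though the raw weight dropped from 1.0 to 0.05.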

But in the usage stats thread you say that 1760 corresponds to the top 2% of the ladder, which would seem to be a substantially higher cutoff than an "average competitive player."

It's entirely conceivable that 98% of the people playing OU on Showdown are worse than the average competitive player (note I'm not saying that only 2% are competitive--I'm saying that 2% are better than the average competitive player--an important distinction).
 
This means the cutoff won't always be 1760. Some months it might be lower, some higher.

From what I gather, there will be monthly stats (and candles) generated every month, with tier updates occurring every three months as usual. But... will we use one candle/cutoff for the whole period, or will it be possible for the Council to choose one for each month and the stats be generated based on that?

I don't think there's a problem per se doing that, mind you; it just feels... fickle.
 
Mario With Lasers, the idea is that the composition of the playerbase may change month-to-month, based on other things that are going on (suspect tests, the release of Gen VI analyses, the opening up of unrated ladders), so locking into one number for a three-month period is pretty limiting.
 
I'd just like to point out that there are at least a dozen users on the OU top 500 with a Glicko score in the 1600s. I get that the leaderboards are based on Elo, not Glicko, but it seems a bit silly to say that these people on the leaderboards are worse than the average competitive player. It just seems like 1760 is way too high to me.
 
I get that. I'm just trying to show that 1760 is substantially better than the "average competitive player", in hopes that the bar won't be set this high again in the future.
 
While I fully support this decision and think it is the right thing to do, I think you are underestimating a lot of people by dismissing them as not wanting to be competitive. Before I knew anything about competitive Pokemon, when I was playing Pearl, I specifically remember my Giratina had Hyper Beam, and my Gengar had both Shadow Claw and Shadow Ball. This is what is being dismissed as nonsensical, but it was simply uneducated. I legitimately thought that Hyper Beam was the best move in the game and used it for that reason. What appeals to the average Pokemon fan in PS is the ability to use whatever Pokemon they want at first; many of them don't even know what the competitive scene is. So please take note of this and don't dismiss newer players so fast. However, I have another idea, and now is as good a time as ever to launch it:

2 OU ladders. One is beginner, one is normal or "advanced", however you want to do it. The way it works is that the lower ladder is unofficial and does not have stats, and for serious competitive players it should be a breeze; they can graduate to the upper one in half an hour. The catch is the upper ladder isn't available until the lower one has been conquered to a certain rating. This way competitive players get their fill and don't have to deal with less experienced players, but newer/noncompetitive players get to play too, preferably against other players of their caliber. Having an unrated ladder is similar, but who wants to play on that when a rated ladder gives them a shot at being the very best, like no one ever was? For example I just looked at random battles: 812 rated to 31 not rated.
 
While I fully support this decision and think it is the right thing to do, I think you are underestimating a lot of people by dismissing them as not wanting to be competitive. Before I knew anything about competitive Pokemon, when I was playing Pearl, I specifically remember my Giratina had Hyper Beam, and my Gengar had both Shadow Claw and Shadow Ball. This is what is being dismissed as nonsensical, but it was simply uneducated. I legitimately thought that Hyper Beam was the best move in the game and used it for that reason. What appeals to the average Pokemon fan in PS is the ability to use whatever Pokemon they want at first; many of them don't even know what the competitive scene is. So please take note of this and don't dismiss newer players so fast.
But the uneducated people who don't know that Shadow Claw Gengar and Hyper Beam anything are bad are the exact people who we don't want influencing the usage stats. They are the people who will use Donphan in OU because they think that it is good. This is why we changed things in the first place.
 
I did like the unrated ladder idea, but he does have a valid point, and it sort of points at the reasons why it might not work, if it doesn't.
 
For example I just looked at random battles: 812 rated to 31 not rated.

Is random battles really that competitive of a ladder, though? Rated randoms vs. unrated randoms is kind of like the difference between firm and silken tofu: for people who know what they're talking about, they're different products with significantly different uses, but most people don't know the difference and just mentally refer to it as that blobby tasteless hippie stuff.
 
While I'm okay with the idea of raising the cutoff, one thing I'm worried about is that the pool of players that tiers are based off of is too small. Metagames revolve around players checking one another, and if the pool is only 10 players (exaggeration), the OU and UU tiers are going to be heavily based on random error and player preferences. I'd say even though we want tiers decided by competitive players, we need to make sure the pool is large.
 
While I'm okay with the idea of raising the cutoff, one thing I'm worried about is that the pool of players that tiers are based off of is too small. Metagames revolve around players checking one another, and if the pool is only 10 players (exaggeration), the OU and UU tiers are going to be heavily based on random error and player preferences. I'd say even though we want tiers decided by competitive players, we need to make sure the pool is large.

This is a legitimate concern, but I honestly feel that the raw number of battles the OU ladder sees every month makes this an appropriate sample size. The 1760 cutoff corresponds to about the top 2% of the ladder, which is still ~51,000 battles based on last month's stats. Just to put this into perspective, this number is similar to what we saw on the entire ladder of some other official metagames back in BW, sometimes even surpassing them depending on the metagame/month. Besides, Antar has said before that the number might be adjusted each month as the need arises, and I'm sure that the sample size will be one of the factors taken into account regarding these adjustments.
 
mrglass, if the pool were only 10 players, we would never set the cutoff that high. The only reason we're able to do this is because PS is so incredibly active: nearly three million OU battles alone last month. Two years ago, we had less than a tenth of that, and we never ever ever would have considered doing something like this.
 
Seems a bit drastic. How many people will have a "significant" impact? Seems like only 30-40, and the most weight will be given to someone who battles frequently. Seems like it could be easily distorted, no?

Especially when you factor in that these 30-40 people are using the same team for most of the month, or for most of the battles (after all, we will continue to use the team as long as there's a "hot streak"). Antar, you could probably give me a better picture; I might be misunderstanding the #s.
 
Seems a bit drastic. How many people will have a "significant" impact? Seems like only 30-40, and the most weight will be given to someone who battles frequently. Seems like it could be easily distorted, no?

Especially when you factor in that these 30-40 people are using the same team for most of the month, or for most of the battles (after all, we will continue to use the team as long as there's a "hot streak"). Antar, you could probably give me a better picture; I might be misunderstanding the #s.

It sounds like the number of players accounted for in the 1760 stats is much, much larger than 30-40, as a result of exponential growth of the community in recent years.
 
It sounds like the number of players accounted for in the 1760 stats is much, much larger than 30-40, as a result of exponential growth of the community in recent years.

If a 1500 player accounts for 2% the weight of a 1900 player... that's a narrow range no? Then I'll estimate that a player with 1700 will have 30% the weight of a 1900 player...

Especially since you need to play a lot of games to get to 1900... it'll be really top heavy. I can't really speak to how these formulas work, so I'm wondering what you, Antar, have to say. Exactly how many people will have a weight of .5+ if the top player has a 1.0 weight?
 
114, it's true that top players count a lot more than players with decent ratings, but as I showed in the sample calculations, (most) players with decent ratings count just as much as they used to (a bit more, actually) when considering fraction of the entire stats--it's just that players with ratings less than ~1600 (keep in mind, I'm talking Glicko, not Elo) don't count at all, and their contribution is made up for by people at the top.

And no, it turns out that rating does not correlate with number of battles played, even at the top-most levels. Surprising, I know.

As for how many players have a weight of at least 0.5? That's actually an easy question to answer, because all I need to do is count the number of players with Glicko rating of at least 1760: that number is 1510.
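
For anyone wondering where a weight like 0.05 comes from: the numbers quoted in this thread are consistent with weighting each player by the probability that their true rating is at least the cutoff, treating the Glicko rating and deviation as the mean and standard deviation of a normal distribution. The sketch below is written under that assumption and is not taken from the actual stats scripts; `battle_weight` is a hypothetical name.

```python
import math


def battle_weight(rating: float, deviation: float, cutoff: float) -> float:
    """Probability that a player's true rating is >= cutoff.

    ASSUMPTION: the true rating is normally distributed with mean
    `rating` and standard deviation `deviation` (the Glicko RD).
    """
    z = (rating - cutoff) / (deviation * math.sqrt(2))
    return 0.5 * (1.0 + math.erf(z))


CUTOFF = 1760

# The 1691±42 player from the opening post.
print(f"1691±42 -> {battle_weight(1691, 42, CUTOFF):.3f}")

# A player sitting exactly on the cutoff, whatever their deviation.
print(f"1760±42 -> {battle_weight(1760, 42, CUTOFF):.3f}")
```

With these inputs the 1691±42 player gets a weight of about 0.05, matching the figure quoted earlier, and a player rated exactly 1760 gets 0.5. Under this model, "weight of at least 0.5" and "Glicko rating of at least 1760" describe the same set of players, which is why counting ratings is enough.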
 