Announcement Reintroducing COIL

Aberforth

Ubers Leader
As I alluded to in this post, revamping the reqs we use has been discussed among tier leaders for several weeks now, and this is what we've decided to do to try to fix the issues various tiers have with suspect requirements: we are bringing back COIL for tiers that opt in to it.

Before going into what COIL is, I'd like to specify some of the assumptions we make about what a good system for suspect requirements looks like.

1) It should demonstrate comprehension of the element being suspect tested.
2) It should be entirely meritocratic. There should not be any subjective component by which reqs are easier for some than they are for others.
3) Achieving reqs is a way to demonstrate engagement with, and care about, the outcome of the vote by dedicating time to achieve it.
4) Better players should be able to get reqs quicker, as we assume they start with a higher level of comprehension of the element being suspect tested.
5) Enough people should be able to achieve reqs that it is a representative vote of the invested parties in a tier.

If you disagree with any of these principles, it should be the subject of a separate policy review thread. If you believe there is a system that fits these parameters that is better than COIL, by all means bring it up here. This thread should be focused on COIL and any other issues people have with the current suspect process, working within the scope of the five assumptions above.

I'm going to put some of the more maths-heavy explanations of COIL in the spoilers below, for anyone who wants to understand exactly how the system will work. The short version is that COIL is a function of GXE and games played that can be approximated to a skill threshold. It's akin to an Elo score, but one that does not directly affect matchmaking (i.e., you're not only going to play people with similar COIL). As such, it's a more refined version of the system we are currently using for reqs, and it gives each tier far more flexibility than it currently has.

COIL is a function of GXE and games played that follows this formula: COIL = 40 * GXE * 2^(-B/N), where N is the number of games played.

B is a value that the tiers themselves set to tweak the number of games people need to play to reach a target COIL. Essentially, as the number of games increases, your COIL increases towards its maximum value (40 times your GXE). So, if you have 80 GXE, your COIL would eventually approach 3200, but wouldn't go above that unless you managed to increase your GXE.

What this roughly boils down to is that a higher B value is good if you want a greater difference in games played between GXEs, and a higher COIL target is good if you want higher GXE requirements. The equation is set up so that as N increases, the 2^(-B/N) term trends towards 1, as B/N trends lower and lower towards 0. The negative exponent is what makes COIL start small and accumulate as more games are played.
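For anyone who wants to see that behaviour with numbers, here is a minimal Python sketch of the formula above. The function name `coil` and the sample values (80 GXE, B = 9) are my own picks for illustration, not official settings for any tier.

```python
def coil(gxe: float, games: int, b: float) -> float:
    """COIL = 40 * GXE * 2^(-B/N), with N the number of games played."""
    if games <= 0:
        return 0.0  # assumption: no games played means no COIL yet
    return 40.0 * gxe * 2.0 ** (-b / games)


if __name__ == "__main__":
    # Illustrative values only: GXE held constant at 80, B = 9.
    for n in (5, 10, 30, 100, 1000):
        print(f"N={n:>4}: COIL = {coil(80, n, 9):.0f}")
    # COIL climbs with N but always stays below the cap of 40 * GXE = 3200.
```

In practice GXE moves from game to game as well, so a real run will not follow this curve exactly; the sketch only isolates the effect of the games-played term.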

For further explanation, this thread from Antar from back in the day should explain things in greater detail.

I have also made a Desmos graph here that lets you play around with the values in this calculation, if anyone prefers a more visual representation of what COIL means. In this calculation, C is COIL, b is B, and x is the GXE people have. The y-axis is the number of games people would have to play with x GXE. There are some visual aids in the form of bars at GXE values of 70, 80 and 85. In the example it's set to open at, 70 GXE would attain COIL reqs of 2400 after 71 games, 80 GXE would attain them after 38 games, and 85 GXE would attain them after 31. By tweaking these values so that C=2600 and b=9, you can see that it would turn into 85 games, 31 games and 24 games, respectively, always rounding the number of games up. You can't play 0.05 of a game, after all.
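If you would rather compute this than read it off a graph, the same calculation can be sketched in Python by solving the COIL formula for the number of games. The function name `games_needed` is hypothetical, and the sketch assumes GXE stays fixed over the run.

```python
import math


def games_needed(target_coil: float, gxe: float, b: float) -> float:
    """Smallest whole number of games N with 40 * GXE * 2^(-B/N) >= target.

    Solving the formula for N gives N >= B / log2(40 * GXE / target).
    Returns math.inf when 40 * GXE <= target, since COIL never gets there.
    """
    cap = 40.0 * gxe
    if cap <= target_coil:
        return math.inf
    return math.ceil(b / math.log2(cap / target_coil))


if __name__ == "__main__":
    # The C=2600, b=9 example from the post above.
    for gxe in (70, 80, 85):
        print(f"{gxe} GXE: {games_needed(2600, gxe, 9)} games")
    # prints 85, 31 and 24 games respectively
```

The `math.ceil` call is the "always rounding up" step: a fractional result such as 30.04 games means you need 31.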

COIL is not something unfamiliar to Smogon users of a certain age. It was the metric by which reqs were done when I first joined the website, and continued to be used for several years before it was shut down by Zarel on account of people misusing it all the time by having a minimum GXE that didn't match what the actual minimum GXE of the COIL was. In bringing COIL back, it will not be used with game limits of any sort, and if anyone notices a tier using them, message me so I can go yell at the tier leader in question.

COIL is more generally customisable than current GXE-based suspect qualifications. Tiers can customise the formula to suit their own particular ladder needs, both making reqs easier for tiers with inactive ladders, or raising the bar for reqs in tiers where the GXE benchmarks are currently too low. It is also a more tangible goal when laddering for reqs, as COIL will trend upwards over a larger sample of games, as opposed to just having a GXE target.

One of the main aspects of COIL that appeals to me is the fact that it lowers the GXE floor needed for reqs in a lot of cases, especially for lower tiers that currently have ladders where the established GXE values are more difficult to achieve, and that currently suffer from lower voter counts than they used to, in spite of the player base still remaining strong. Getting reqs doesn't clear the hassle hurdle at the moment, and COIL can help overcome that. In addition to that, COIL should also remove a lot of stress from getting reqs, as it trends upwards as the number of games played increases. Getting reqs in 50 games with 80 GXE is easy to measure, but it isn't as stressful if you know that with a GXE of 79.6, you'll still get reqs in 5 more games. This also means players can continue laddering after getting reqs if they want to, because it is easier to gain back any losses in COIL over a number of games played. However, this aspect is not for all tiers, especially those that already have very high voter turnout, which is why this system will be opt-in.

Implementing COIL for a tier will also remove the current 30 game minimum requirement for players who are substantially above the level of the ladder within that tier. Right now, if you can quickly amass a GXE several points higher than the current minimum required in the 30 games, you still have to play the minimum 30 games (potentially losing the GXE to luck along the way). However, with this system, once you reach COIL, you can get reqs immediately, letting players get reqs faster if calibrated right. This will largely be edge cases, or people achieving reqs in around 25 games or so, but is still something that we anticipate to have a positive effect on suspect participation.

This is primarily going to be useful for lower tiers that are currently struggling with the rigidity of the current system. COIL has been introduced to all ladders on PS already, but using it for reqs is opt-in by tier leaders; tiers like OU have already expressed a preference to stick to the current reqs system. Where a tier does not use it for suspects, the COIL tab will be purely cosmetic. COIL also will not be used in any Old Gen OU ladders to get suspect reqs, as those tiers will continue with the existing qualification setup.


TL;DR:
COIL was a system we used to use for rating suspect tests, which we're bringing back now for some tiers with a few tweaks, as we believe it'll be a fairer and more achievable system for getting reqs in tiers that the current system isn't working for. You ladder until your COIL reaches the number specified in the OP of the suspect. This will not be mandatory for all tiers.
 
Hello, if you are a tier leader who is curious about using COIL but has no idea what numbers to use, I have a Google sheet for you.

There are two numbers that matter, the COIL number and the B value. A higher B value means a bigger decline of required GXE with more games. A higher COIL number means a higher GXE required at all levels. You can play around with these two, looking at the GXE required at a few different game numbers, to get a sense of what difficulty of reqs you want. For reference, I also included a few examples of sample values based on common req values.

As a reminder, the only number the average ladderer cares about is the COIL number; ladder until you hit the required COIL. The B value is just a tool for Tier Leaders to set how COIL varies with games played.

For anyone that wants to do the math themselves, or check game counts not on the sheet, the formula is: GXE = COIL / 40 / 2^(-B/N), where N is the number of games.
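As a rough sketch of how you might tabulate that yourself, here is the same formula in Python. The names `required_gxe` and `gxe_table` are made up for this example, the game counts are arbitrary, and the COIL 2600 / B 9 pair is only borrowed from the Desmos example in the opening post, not a recommendation.

```python
def required_gxe(target_coil: float, games: int, b: float) -> float:
    """GXE needed to reach target_coil after `games` games."""
    return target_coil / 40.0 / 2.0 ** (-b / games)


def gxe_table(target_coil: float, b: float,
              game_counts=(30, 40, 50, 75, 100, 200)) -> dict:
    """Map each game count to the GXE required at that point (2 decimals)."""
    return {n: round(required_gxe(target_coil, n, b), 2) for n in game_counts}


if __name__ == "__main__":
    # Illustrative parameters only.
    for games, gxe in gxe_table(2600, 9).items():
        print(f"{games:>3} games: {gxe} GXE")
```

The required GXE falls as the game count grows and levels off towards COIL / 40, which is the lowest GXE that can ever qualify.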
 
To preface this, I am not a complete hater, I simply enjoy this discussion because I think the topic is interesting. But importantly I wanted to bring up that there's a reason that all those years ago we stopped using COIL: people hated it.

COIL is not something unfamiliar to Smogon users of a certain age. It was the metric by which reqs were done when I first joined the website, and continued to be used for several years before it was shut down by Zarel on account of people misusing it all the time by having a minimum GXE that didn't match what the actual minimum GXE of the COIL was. In bringing COIL back, it will not be used with game limits of any sort, and if anyone notices a tier using them, message me so I can go yell at the tier leader in question.

This is a mischaracterization of why COIL was stopped. Yes, people misusing COIL was the final event that led to COIL being discontinued... but why were people misusing COIL in the first place? COIL was not meeting the needs of the community so people started hacking on additional solutions to try and "fix" it. My main fear here is that we are going in a circle and are going to run into the exact same problems that it had 8-9 years ago.

1. Additional Complexity. COIL is absolutely harder to understand than a simple GXE-Games Required table. A GXE-Games Required table doesn't require a calculator or an Excel spreadsheet. As a general rule, for something customer-facing you want to remove as much friction as possible from understanding it. Yes, people can understand COIL if they put their mind to it, but most people won't. COIL is also non-linear, which means it's harder to understand how your COIL is increasing. "COIL go up" is a simple enough answer when people are laddering, but people get increasingly frustrated when the rate of COIL increase slows down and they don't understand why. You end up with a system where noticeably more people are confused about how the system works. Of course, if additional complexity leads to a better solution, then it's definitely worth it. l2p. But COIL does not lead to sufficiently better outcomes to justify using a more complex system.

2. "Bad" Players voting
[Image: minimum games tables for the ongoing RU and NU suspect tests]
These are the current minimum game tables for the ongoing RU and NU suspect tests. People with mediocre GXEs can vote by simply playing hundreds of games. The present opinion is that even though you have a lower GXE, because you played a lot of games you are capable of voting on the metagame. Sounds reasonable enough, and that was the exact ideology people had back then too. Except, after a couple of suspect tests, people hated it once they realized what it actually led to. Someone playing hundreds of games at a 73 GXE isn't qualified to vote on the metagame by virtue of volume; all it means is that even over a large sample, they still don't know what they're doing.

Very famously, Hydreigon was not banned in ORAS UU because a majority of low GXE voters voted No Ban, while a majority of high GXE voters voted Ban. That level of partition was almost definitely an outlier, but it's illustrative of the frustrations the community had with the system.
People rejected the fundamental idea on which COIL is based, that you compensate a low GXE with more games played. People explicitly did not like low GXE voters being part of the vote.

Of course, you can tune the COIL parameters to have a higher minimum GXE... but then you lose all the benefits of COIL. If you're fighting a high GXE requirement, then it's still going to be "stressful". If you have a high GXE requirement, you might as well be using our current system.
[Image: minimum games table for the parameters below]

B Value: 4 Coil: 3025
This is an example of parameters that lead to a COIL system that requires a higher minimum GXE. Is this preferable to the current system? You could make the argument, but it's more of a lateral shift than anything. COIL is essentially GXE with a coat of paint on it, so you end up with a system that is noticeably more complex, but in exchange offers no real benefit. If you want to make your reqs easier, you can use our existing method and lower your GXE cutoff with a simple table just fine, you don't need COIL to do that. Logarithms are a little more mathematically elegant because they have an asymptote built in, but that's not relevant for our use case.

All that being said, the problems of today are different than the problems of 9 years ago. 9 years ago we didn't have lower tier suspects with <10 voters. It could very well be that given the circumstances, we are ok with lower GXE voters participating and COIL is the right mechanism to enable that. There's also nothing really wrong with trying out new solutions. While I'm personally skeptical of COIL being the answer, it's good that we are proactive in taking agency and trying to solve problems. This is a fundamentally tough problem with no magic bullet. I just wanted to share an alternative perspective because I see a lot of optimism around a previously failed idea.
 
Looking at RU and NU specifically, I see a lot of what freezai (and contributors in the linked post) are talking about with regards to potentially less-qualified players being able to skew outcomes of important tiering actions, but my opinion of this is a bit different than his. I wrote the post referenced in the OP about using an alternative to GXE for suspect tests because I wanted to make the suspect laddering experience more rewarding for 'good' players while not being exclusive of 'less-good' players, which COIL intends to do... intends, at least.

In looking at the scales provided for the recent RU and NU suspect tests (table below for easy reference), you can see that COIL doesn't really provide much of a reward for doing well relative to the previous GXE system, as the minimum game requirements for the 82-78 GXE range only changes marginally (and favorably for the 'less-good' pool). This is partly due to the values chosen by RU and NU of course, and this is the first suspect test using COIL since it was reintroduced so I expect the values to be adjusted in some capacity to fine-tune the suspect system, but for now COIL essentially doesn't change much for the 'good' players. If you were getting suspect reqs in these tiers, chances are you were doing so in 30-50 games regardless, so you probably didn't even notice (I didn't). Moving forward, though, the players who previously failed to get reqs will be able to do so, which can be argued as good or bad depending on how you look at it - I think it's somewhere in the middle.

GXE Requirements - Games Needed    COIL Requirements (GXE) - Games Needed
82 - 30                            82 - 31
81 - 35                            81 - 34
80 - 40                            80 - 37
79 - 45                            79 - 41
78 - 50                            78 - 45

Allowing a minimum GXE should be a consideration even outside the scope of COIL - using the tables provided by RU and NU, the minimum GXE listed for reqs is 71, though it takes a significant time investment in the metagame to get there with 343 games minimum. 71 GXE would probably be defined as an okay player, capable of winning games against a fair amount of the community. I think something arbitrary, like 75 or so, would still be inclusive of people who were previously falling shy of suspect reqs without having to set the COIL values so drastically that you essentially use an inferior system like what I believe ZU is doing. For reference, the current system requires 71 games at 75 GXE, so the people qualifying in this range are 'good-ish' players who played enough games to probably know the meta okay.
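To sanity-check the figures quoted above, here is a small Python sketch. The thread never states the exact RU/NU parameters, so COIL = 2800 and B = 7 are my assumption: I picked them only because they reproduce the COIL column of the table, the 343 games at 71 GXE, and the 71 games at 75 GXE. The function name `games_needed` is likewise just illustrative.

```python
import math

# ASSUMED parameters, not confirmed anywhere in this thread.
ASSUMED_COIL = 2800
ASSUMED_B = 7


def games_needed(target_coil: float, gxe: float, b: float) -> float:
    """Games required at a fixed GXE: ceil(B / log2(40 * GXE / target))."""
    cap = 40.0 * gxe
    if cap <= target_coil:
        return math.inf  # this GXE can never reach the target
    return math.ceil(b / math.log2(cap / target_coil))


if __name__ == "__main__":
    for gxe in (82, 81, 80, 79, 78, 75, 71):
        n = games_needed(ASSUMED_COIL, gxe, ASSUMED_B)
        print(f"{gxe} GXE: {n} games")
```

If the real values differ, the shape stays the same: game counts grow slowly through the high-GXE range and then blow up as GXE nears COIL / 40.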

The downsides of COIL are very real, however. As an unofficial metagame, ZU set their COIL value to 2920 and b=4, making a drastic change to their previous requirements, where people were qualifying with ~40 games played at ~75 GXE. To get a suspect vote in this current test at 75 GXE, you'd have to play about 2.5x as many games as in the previous test, which is incredibly arduous for such a small community. On top of this, the values are so drastic that somebody who went undefeated took 27 games to qualify, though you could argue they probably played at less-than-peak times... Then again, the next post is a 27-2 run that also barely got reqs, so clearly the top-GXE requirements are very demanding - perhaps too demanding.

Circling back to the actual votes, I think we can ask a few pertinent questions. Does the 'okay' player who gets reqs now using COIL play enough to know a good amount of the metagame? I can see the argument for either side here - they clearly played enough to understand the basics of the meta, but their GXE is indicative of a lack of thorough understanding that is probably needed to tier effectively. If somebody wants to play 300+ games in a tier they enjoy they should be rewarded, right? Well... maybe? I think people who are that invested in a tier probably should have some sort of input into it, but freezai's point about being 'okay' over a long stretch not being sufficient evidence of metagame understanding is also valid. I don't expect a lot of people to actually grind 100+ games on one alt often, as you're probably just better off creating a new one and trying again with your better understanding of the meta, though this isn't perfect either.

At some point you are going to cap out your skill set and, unless you adapt by figuring out the meta better or you have an incredible run, you'll probably remain at that cap. Defining what that cap needs to be to have influence over tiering will always be conversation worthy, and I agree that having this conversation and implementing changes is good. I have hopes that COIL will increase ladder activity, bring people who previously couldn't be involved in tiering into the conversation, and improve the community as a whole.

Thanks again to the mod team and staff overall - regardless of how COIL pans out moving forward, implementing changes with the intent to improve the community for all is always good.
 
Hello ZUTL here,

The downsides of COIL are very real, however. As an unofficial metagame, ZU set their COIL value to 2920 and b=4, making a drastic change to their previous requirements where people were qualifying with ~40 games played at ~75 GXE. To get a suspect vote in this current test at 75 GXE, you'd have to play 2.5x the amount of games than the previous test you did, which is incredibly arduous for such a small community. On top of this, the values are so drastic that somebody who went undefeated took 27 games to qualify, though you could argue they probably played at less-than-peak times... Then again, the next post is a 27-2 run that also barely got reqs, so clearly the top-GXE requirements are very demanding - perhaps too demanding.
The numbers were picked based on our previous gxe reqs. We used to need 81 gxe with 30 games, and 77 gxe with 50 games. Our current COIL and b values make it so we need 80 gxe for 30 games, 77 gxe for 51 games, and can now qualify with down to 73 gxe, which wasn't possible with the old system.
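Those figures can be checked against the formula directly. This is a minimal sketch in Python using the ZU values quoted above (COIL 2920, b = 4); the helper name `required_gxe` is mine, and the results are unrounded, so expect them to sit slightly above the whole-number GXEs quoted.

```python
def required_gxe(target_coil: float, games: int, b: float) -> float:
    """GXE needed to hit target_coil after `games` games: COIL / 40 / 2^(-B/N)."""
    return target_coil / 40.0 / 2.0 ** (-b / games)


ZU_COIL, ZU_B = 2920, 4  # values quoted in this thread


if __name__ == "__main__":
    for n in (30, 51):
        print(f"{n} games: GXE >= {required_gxe(ZU_COIL, n, ZU_B):.2f}")
    # As the game count grows, the requirement falls towards COIL / 40.
    print(f"GXE floor: {ZU_COIL / 40:.1f}")
```

The floor of 73 is never actually reached in a finite number of games, which is why qualifying right at the bottom of the range takes so long.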

Allowing a minimum GXE should be a consideration even outside the scope of COIL - using the tables provided by RU and NU, the minimum GXE listed for reqs is 71, though it takes a significant time investment in the metagame to get there with 343 games minimum. 71 GXE would probably be defined as an okay player, capable of winning games against a fair amount of the community. I think something arbitrary, like 75 or so, would still be inclusive of people who were previously falling shy of suspect reqs without having to set the COIL values so drastically that you essentially use an inferior system like what I believe ZU is doing. For reference, the current system requires 71 games at 75 GXE, so the people qualifying in this range are 'good-ish' players who played enough games to probably know the meta okay.
At the risk of sounding too elitist, I disagree with that entirely. I wouldn't consider someone who has 71 gxe on the ZU ladder to be knowledgeable enough about the ZU metagame, and setting our COIL to 2920 (i.e. 73 gxe) was already a stretch we made to allow players who got reqs with subpar gxe on our winrate suspect tests to get reqs. Our use of COIL is more akin to what freezai suggests (with lower gxe to reflect ZU's smaller ladder), as we even have the same b-value. COIL is a glorified gxe, but I do believe it is an improvement in every aspect, as it gives us leaders far more control over our reqs. It lets excellent runs get reqs earlier instead of being ruined by luck, with the 5Dots and I's runs being good examples, and it also lets less great runs get reqs if they manage to prove they can be consistent on the ladder and that reaching that gxe was most likely not just due to luck.

1. Additional Complexity. COIL is absolutely harder to understand than a simple GXE-Games Required table. A GXE-Games Required table doesn't require a calculator or an Excel Spreadsheet. As a general rule for something customer-facing you want to remove as much inertia as possible to understanding something. Yes, people can understand COIL if they put their mind to it, but most people won't. COIL is also non linear, which means it's harder to understand how your COIL is increasing. "COIL go up" is a simple enough answer when people are laddering, but people get increasingly frustrated when the rate of coil increase decreases and they don't understand why. You end up with a system where noticeably more people are confused about how the system works. Of course, if additional complexity leads to a better solution, then it's definitely worth it. l2p. But COIL does not sufficiently lead to better outcomes enough to justify using a more complex system.
While I do completely agree with your second point, I think this one isn't fair. There are very few players who know how gxe works itself, so making a linear function of gxe isn't more helpful than a non-linear one. Having to get a set amount of COIL is easy to understand, and you don't have to understand how that value is computed, just like you never needed to understand how gxe worked. In both cases, the better your opponent was the bigger your gxe / coil increases when you win, and the more games you play the less your gxe / coil increases / drops after each game.

I do however want to emphasize the second issue brought up by freezai. Why did we remove winrate reqs for being too easy, only to have reqs that allow players with even worse gxe to qualify? 65 gxe from ORAS was really terrible, but 70 gxe is still mediocre. I understand that there is some level of subjectivity about what a bad gxe is, especially given ladder size. However, the b-value being so high is quite weird. If you believe that someone playing 200 games to get reqs with sub 72 gxe is good enough to vote, then needing above 82 gxe with 30 games is quite inconsistent. We shouldn't reward people for being able to spend 10 hours on the ladder.
 
Why did we remove winrate reqs for being too easy, if it is to have reqs which allow players with even worse gxe to qualify?
Win rate reqs were not "removed for being too easy." Win rate is a bad metric for measuring skill on a ladder, as the ladder attempts to match you with players of similar skill level, so win rate should trend towards 50% over time. It is also easier to manipulate, such as laddering during dead hours when the average opponent is easier or queueing multiple games at once when your ELO is lower.

GXE is just a straight upgrade to win rate, as it weights your wins and losses by quality of opponent. Even if you do aim to manipulate the quality of your opponent, the reward is much smaller and mitigated by taking a larger hit for losing to a lower ranked opponent.
 
Win rate reqs were not "removed for being too easy." Win rate is a bad metric for measuring skill on a ladder, as the ladder attempts to match you with players of similar skill level, so win rate should trend towards 50% over time. It is also easier to manipulate, such as laddering during dead hours when the average opponent is easier or queueing multiple games at once when your ELO is lower.

GXE is just a straight upgrade to win rate, as it weights your wins and losses by quality of opponent. Even if you do aim to manipulate the quality of your opponent, the reward is much smaller and mitigated by taking a larger hit for losing to a lower ranked opponent.
That's not true. During suspect tests, you get matched multiple times with players with bad elo who are just new alts created for the suspect test. That's the main reason why gxe reqs suck: if you lose to an opponent who just started their suspect run (or is even 15 games in and still below 1300, like most runs are), your run is often ruined and you've got to restart. This is a vicious circle where suspect testers keep ruining each other's reqs. Laddering at dead hours lets you avoid these players more often than not, and gxe reqs incentivize playing at those hours (at least for smaller ladders like NU and ZU). My personal experience participating in suspect tests like those is that the worst that can happen to you is laddering while someone else is also trying to get reqs, as you're going to get paired together multiple times, and even going neutral is going to ruin your gxe a ton. Facing players above 1500 elo (high ladder for NU and ZU) is nothing compared to it; the worst that can happen is losing a couple of decimals of gxe if you lose. Suspect alts are often as good as these players, but losing to one of them more than once often means restarting on a new alt. Winrate is not a good metric, but it is surely more fair than gxe on smaller ladders.
 
The main issue I have with reqs is the following.

1) The experience of loading HO for 20 games to get a quick 18-2 start, and then grinding out 10 more games with some mediocre balance realistically does not teach much about the ladder. Yet this is how almost everyone does suspect laddering; you run a style that gives fast games that don't require you to think much, while punishing mediocre building and stomp low ladder, then you swap to something a bit more consistent once you start facing better teams. In this kinda paradigm you only really have 10-20 "good" games in a typical suspect run.

2) Losing to a NUGSS account that is currently 1000 elo really really sucks from a player experience standpoint. You did not really do anything to "deserve" a loss vs someone who has 1500 +-130 glicko, but you lost to them because you just drew the short straw. Frankly it should also be noted that this absolutely sucks for the regular ladder population. A 1000 elo Timmy is going to get stomped relentlessly during suspect season, and it'll probably make them feel bad because they normally would not be queueing into "good" teams nearly as often. I should note that on the smaller lower tier ladders, even at high ladder you'll face a 1040 elo dude, but it's not going to be every single game.

----

We should open a thread at the beginning of each suspect and let people submit their mains to it. This serves to look at their current game count, GXE, and elo. If they can show that they've played 30 games since they submitted their main, and have maintained a qualifying COIL, let them get reqs off of it. This should be faster than getting standard reqs because if your main is already high ladder, and you're getting a fair number of high ladder games, you're probably experiencing the meta better than playing rain for 20 games straight vs Pikachu.

----

It is possible that all we should require is a RD < 50, or some other indicator that folks are actually getting games in, but letting people get their reqs on their main is likely something that helps us out. We should still do alt checks / ensure that the (initial) screenshots are current, but these are logistical issues that can be solved. I'm also not claiming that the numbers I'm citing here are correct, the main issue is just that we're basing the entire suspect system around spamming new accounts, rather than established accounts.

----

Edit 1:
Most of the issues folks have with COIL seem to be the exact numbers set. The numbers chosen by NU and RU imply a min GXE of 70, which likely is too low, but given the ladder size I'd argue not too low by an order of magnitude. If they'd set a min GXE of 74 (2960), and tuned b-values to keep the number of games roughly constant you'd probably have a better experience.

----

Edit 2:
COIL is fine but we should just make the formula COIL = GXE * 2^(-b/games)

The factor of 40 does help obfuscate stuff, but practically speaking COIL just converges to your exact GXE at the limit of games => infinity, so like, just let it be transparently that.
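Written out, and assuming the targets get rescaled along with the formula, the proposal changes nothing about who qualifies. Here T is the current COIL target and T' is my own notation for the rescaled one:

```latex
\[
\mathrm{COIL}' = \mathrm{GXE}\cdot 2^{-b/N} = \frac{\mathrm{COIL}}{40},
\qquad
\mathrm{COIL} \ge T \iff \mathrm{COIL}' \ge T' = \frac{T}{40},
\qquad
\lim_{N\to\infty} \mathrm{COIL}' = \mathrm{GXE}.
\]
```

For example, the target of 2960 from Edit 1 would simply read as 74, which is also the minimum GXE it implies.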

----

The downsides of COIL are very real, however. As an unofficial metagame, ZU set their COIL value to 2920 and b=4, making a drastic change to their previous requirements where people were qualifying with ~40 games played at ~75 GXE. To get a suspect vote in this current test at 75 GXE, you'd have to play 2.5x the amount of games than the previous test you did, which is incredibly arduous for such a small community. On top of this, the values are so drastic that somebody who went undefeated took 27 games to qualify, though you could argue they probably played at less-than-peak times... Then again, the next post is a 27-2 run that also barely got reqs, so clearly the top-GXE requirements are very demanding - perhaps too demanding.

In the old system they'd have to play 5 more games to get reqs. A good player went flawless and got reqs faster than in the previous system, so I'm not really 100% sure why this is a downside to COIL.
 
apropos of freezai’s post, I have laddered since barack obama was elected and have absolutely 0 idea how gxe is calculated. the difficulty of understanding the rating system holds very little stock in my opinion, and I implore policymakers to prioritize effectiveness for suspect testing above all else (which isn’t to say we should lean into convolution, but instead to suggest that this is mostly unimportant)
 
agree with dice
Sure a table of gxe + games played is something you can reference, but you shouldn't have to. As your gxe changes every game there is no sure fire way to know whether you hit reqs without checking the forum thread constantly
and that's not even getting into the idea that your gxe converges as your deviation goes down, leading to scenarios where I might think it's possible to get reqs with just a few more wins but every win gives less and less gxe (which i guess happens with coil too but it should be a bit more obvious)
Also dislike the idea of "reaching" target gxe but needing more games just to fill out the min games requirement. With coil once you reach the target you're done
 