Discord's Privileged Intents: Performative Privacy at its best

Disclaimer: I do not work for Discord, or with Discool. If i wanted to look cool I would call myself an “independent security researcher”, but in the end I just stare at Chromium's DevTools while Discord is open for an hour.

What are Intents?

Discord has unveiled Gateway Intents a while ago. A PR on the Discord API Docs repository was created to help library developers start adding support for Intents, and provides some answers to questions developers raised up.

The general idea of Intents is that instead of having all events being thrown away at your gateway connection, only the events you actually want are sent, this gives huge wins to general network usage. Most bots get a ton of typing events while being unable to do anything about that, for example. There was a stop-gap measure for this via the “guild_subscriptions” field, but the general idea is that Intents is the way to go, as it's more generic.

From all intents, two of them are considered “Privileged”: members and presence, providing the member list and user status (online/offline/what game they're playing) respectively.

I'm writing this on October 3rd, 2020 (with many drafts afterward). I did have a lot of time in my hands to study about Intents but I'm only doing it right now and sharing the results since I had to port my personal bot to it.

What's the actual deal/reasoning with Privileged Intents?

From the original blogpost:

We believe that whitelisting access to certain information at scale, as well as requiring verification to reach that scale, will be a big positive step towards combating bad actors and continuing to uphold the privacy and safety of Discord users.

To use Privileged Intents, your bot: – must not be in 100 guilds, or – if you are in more than 100 guilds, you must be a “verified developer” with a “verified bot”. If you attempt to bring your bot down to less than 100 guilds after that, you can't have the intents back

The verification process involves filling a form describing your bot's functionality, and give a government-issued identification document via Stripe Identity. Depending on your country, your passport may be the only available option. You can check that information on the Stripe Identity documentation.

There is a grace period before bots are forced to be verified: October 7th, 2020. Bots beyond 100 guilds past that date will not be able to join any more guilds.

From the blogpost, and the surrounding timing on it, I can safely say the creation of Privileged Intents is to provide an answer to what I dub the “discool discourse”. They haven't said that publicly, and I can only base this on feeling alone.

Discool was a service that stored a lot of Discord user metadata. You could put a User ID in, and get the list of guilds they were in (of course, not all guilds, but a lot of public guilds had their data scraped).

The way they did it was by creating burner user accounts able to join those guilds, scraping all the data already available to users (the Discord client needs the member list after all), and store them in a database. Give enough time and effort, and you have a pretty big privacy scandal regarding user data and Discord.

From my perspective, Privileged Intents were made to make developers that create a large-scale privacy scandal legally binded to Discord, and go to court, and etc, etc, etc.

Are the Privileged Intents a wrong way to protect data?

I think so, or at least, very flawed. As I said, Discool uses user accounts, they never used bots in the first place, any Intent business would never get to those bad actors. Of course, Discord has been implementing various kinds of user extra-verification with “machine learning”, but the point is that anyone with a browser can start up Selenium, point it to Discord, and start scraping. Hell, given enough time and effort someone could gcore the browser process to scrape data while still being classified as having “human sentient behavior”

Even in the case of an actual privacy scandal on Discord, the damage for it, even though you're limited to 100 guilds, even though it's technically less users it's the same level of data, you can still fuck up the lives of some people, but I think Discord only cares about having less of the numbers.

When giving data out to anyone, you should assume the worst, because you don't control their intent (pun unintended). Sure, requiring ID and so providing the existing legal system as a protection against misuse of user data works with more than 0% efficacy, but I don't think that is a good solution.

In regards to giving out my own data, I'm not sure I can trust Discord in handling of that. I lost my trust on that after they explained that their data collection (the famous /api/science route, that was formerly /api/tracking, but got blocked by extensions to the point they had to rename it), if turned off by the client, would still continue (you can prove this with devtools), and in their FAQ post, just said that it's dropped on the server.

The nitty gritty: when you turn the flag off, the events are sent, but we tell our servers to not store these events. They're dropped immediately — they're not stored or processed at all. The reason that we chose to do it this way is so that when you turn it off on your desktop app it also turns off automatically on your phone – and vice-versa. This allows us to keep things the same across all of our apps and clients, across upgrades.

Believe me when I say, since they already sync user configuration across clients, it's easy to make the app disable tracking upon seeing the already existing configuration option to disable tracking.

Learning to bypass limitations imposed by Privileged Intents

For context: in the library that I'm using for my bot, discord.py, there's a full framework, discord.ext.commands, in it you can declare a command like such:

from discord.ext import commands

@commands.command()
async def some_command(ctx, argument1, argument2):
    ...

One specific feature of the framework is the ability to convert from raw strings to useful objects, for example, if I wanted to receive a user, I could say

async def some_command(ctx, user: discord.User):
    ...

and discord.py will convert either a mention, or an ID, or the username back to the User object. This is possible because Discord gives all member data to the bot upon the bot's startup. It is fully dependent on the members privileged intent.

Without the members intent, the member lists are basically empty, and the feature doesn't work anymore.

Bringing the feature back

Discord's message events contain full user objects (which have ID, name, avatar, discriminator, etc) and member objects, (which contain roles, nickname, etc) since the client must be able to draw out the UI containing that user.

Even though you don't have full member lists on startup, you, using that fact, can start passively collecting data to re-create the feature. It works like this: on any message, store the user in persistent storage, when needed, look up from that storage, and maybe put it in cache for performance reasons.

Considering message events also contain the guild's ID, you can safely say that with any message that contains an author and a guild ID, you can say that author is a member of that guild, and still do the same thing Discool thing. It is technically less data than originally, since you're only collecting data about the people that are actually sending messages in the guild.

Keep in mind that messages being sent also include the system messages created upon a user's join (they contain the same user object). They're messages like any other, and will reach any bot in the guild. You can have the same effect with bots that create welcome messages to new users, as they usually put a mention to the user (“welcome @user to $guild!”), and since mentions internally are represented by putting the user ID on the message, bots can still scrape it up.

You can also use that same method to extract relationships with typing events as well, as they contain guild ID and user ID

Future work

New Discord features

UPDATE: Two months after this blogpost, in December 2020, Discord has released slash commands, which is the finalization of the work-in-process that Discord shared.

Recently I've seen work-in-progress inside the Discord client on a way for bots to declare their own commands (you can also see the Figma prototypes on the “The Future of Bots” blogpost), removing the need for bots to have every single message being sent to them (it currently works with bots checking in if the message starts with a given prefix, etc).

It is defnitely a better idea than the current state of things, but I'm not sure if the boundary of “hey, your user data is now in the hands of the bot developer” is very well defined. Many users would still use the bot, even once, and have their user info stored on a database. I'm not using it for malicious intent, but if we're talking about the technical purpose of Privileged Intents, we must assume every bot is malicious, and design around that.

discord.py using Request Guild Members for the User and Member converters

See this issue. It would sidestep the problem of having to extract info out of messages, which would help my usecase immensely.

I don't know if it'll be implemented, and I'm too anxious to open an issue on discord.py to use it, but it is a nice idea.

Closing thoughts

this music is fucking awesome