Generic input sanitisation functions #120

Open
jenseni-git opened this issue May 12, 2023 · 2 comments
Labels
enhancement (update an existing command or cog for some new functionality), good first issue (Good for newcomers)

Comments

@jenseni-git
Contributor

As commented in PR #117, it would be nice to have some generic input sanitisation functions that are centrally managed and then used bot-wide. This would keep all sanitisation consistent and, if bugs are found, allow them to be patched all at once.

Notes on ideas to include:

  • Stripping leading and trailing whitespace (already covered by message.strip() though)
  • Removing backtick code segments, both single-line and block ('`')
  • Single- and multi-line block quotes ('>')
  • Handling emojis, either by removing the colons and keeping the text (for meaning), or by removing them altogether.
    • An alternative may be to have one function that strips them out and saves them before the method runs, and another that puts them back once the method has finished.
  • Text formatting ('*' and '_')
    • N.B. '*' is not always interpreted as bold or italics. It will also be used for bullets in updates being rolled out soon (hyphens will also add bullets). Depending on the function, it might be desirable to remove bullets but keep text formatting.
  • Not sure how Python handles these in strings, but "||" spoilers may also be a problem.
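A rough sketch of what one central function covering the ideas above might look like. Everything here is an assumption for illustration: the name `sanitise`, the regexes (based on Discord's usual markdown and the `<:name:id>` custom emoji form), and the choice to drop code segments entirely while keeping the text inside spoilers and formatting. It has not been checked against the bot's existing cogs.

```python
import re

# Hypothetical patterns, assuming Discord-style markdown.
CODE_BLOCK = re.compile(r"```.*?```", re.DOTALL)              # multi-line code blocks
INLINE_CODE = re.compile(r"`[^`\n]*`")                         # single-line code segments
BLOCK_QUOTE = re.compile(r"^[ \t]*>{1,3} ?", re.MULTILINE)     # '>' and '>>>' markers
CUSTOM_EMOJI = re.compile(r"<a?:(\w+):\d+>")                   # <:name:id> or <a:name:id>
SHORTCODE_EMOJI = re.compile(r"(?<!\w):(\w+):(?!\w)")          # :name:
SPOILER = re.compile(r"\|\|(.+?)\|\|", re.DOTALL)              # ||hidden||
# The marker must hug the text, so "* bullet" lines are left alone.
ASTERISKS = re.compile(r"(\*{1,3})(?=\S)(.+?)(?<=\S)\1")
# Underscores inside a word (snake_case) are not treated as formatting.
UNDERSCORES = re.compile(r"(?<!\w)(_{1,3})(?=\S)(.+?)(?<=\S)\1(?!\w)")


def sanitise(text: str) -> str:
    """Strip Discord markdown from text, keeping the readable words."""
    text = CODE_BLOCK.sub("", text)            # drop code blocks, contents included
    text = INLINE_CODE.sub("", text)           # drop inline code, contents included
    text = BLOCK_QUOTE.sub("", text)           # remove quote markers, keep the quote
    text = CUSTOM_EMOJI.sub(r"\1", text)       # <:pog:123> -> pog
    text = SHORTCODE_EMOJI.sub(r"\1", text)    # :wave: -> wave
    text = SPOILER.sub(r"\1", text)            # ||secret|| -> secret
    text = ASTERISKS.sub(r"\2", text)          # **bold** -> bold
    text = UNDERSCORES.sub(r"\2", text)        # _italic_ -> italic
    return text.strip()
```

Order matters in this sketch: code segments go first so that formatting characters inside them are never interpreted. Removing a code segment leaves the surrounding spaces in place, so a caller that cares may want to collapse whitespace afterwards.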

A note on all of these: the user obviously put these characters in their message for a reason. Maybe it's to mess with the bot on purpose, or maybe it's to include some text formatting in the output. Either way, each of these functions should, where possible, try to preserve the user's formatting if it doesn't break things. If these characters do break things, it would be good to attempt to put them back at the end.
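One way the "take them out, then put them back at the end" idea could work, sketched for custom emoji only. The helper names `strip_tokens` and `restore_tokens` are made up for this example, and the NUL-delimited placeholder is just an assumption about what is unlikely to appear in a message or be altered by a command.

```python
import re

# Assumed Discord custom-emoji form: <:name:id> or <a:name:id>.
EMOJI = re.compile(r"<a?:\w+:\d+>")
PLACEHOLDER = re.compile(r"\x00(\d+)\x00")


def strip_tokens(text: str) -> tuple[str, list[str]]:
    """Swap each emoji for a numbered placeholder and return the saved originals."""
    saved: list[str] = []

    def _stash(match: re.Match) -> str:
        saved.append(match.group(0))
        return f"\x00{len(saved) - 1}\x00"

    return EMOJI.sub(_stash, text), saved


def restore_tokens(text: str, saved: list[str]) -> str:
    """Put the saved emoji back wherever their placeholders survived."""
    return PLACEHOLDER.sub(lambda m: saved[int(m.group(1))], text)
```

A command would call `strip_tokens()` first, do its work on the stripped text, and call `restore_tokens()` on the result. This only holds up if the command leaves the placeholders intact; anything that reorders or counts characters would need its own handling.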

For a start though, generic, bot-wide sanitisation would be good.

After these are created, they would obviously also need to be actually implemented bot-wide.

jenseni-git added the enhancement and good first issue labels May 12, 2023
@49Indium
Member

I'm just going to link #100 here, as it seems to be a subset of this issue. We might need to remove #100.

@andrewj-brown
Member

This is the sort of trick you could handle with a decorator (and it would be pretty cool to do so). Take a function, find every str argument, then call the original function with sanitise() on all of them.

This wouldn't handle in-depth sanitisation (e.g. when you're pulling message.text from an interaction) but it would make base-level sanitisation very easy. For in-depth, you'd still just have to sanitise() the string.
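A minimal sketch of that decorator idea, for whoever picks this up. The `sanitise()` here is only a placeholder, and the names `sanitised`, `shout` and `echo` are invented for the example. It handles both plain and async functions, since bot commands are usually coroutines, but it has not been tried against the bot's actual command registration.

```python
import asyncio
import functools
import inspect


def sanitise(text: str) -> str:
    """Placeholder sanitiser (assumption): trims whitespace and drops backticks."""
    return text.strip().replace("`", "")


def _clean(args, kwargs):
    """Return copies of args/kwargs with sanitise() applied to every str value."""
    new_args = tuple(sanitise(a) if isinstance(a, str) else a for a in args)
    new_kwargs = {k: sanitise(v) if isinstance(v, str) else v for k, v in kwargs.items()}
    return new_args, new_kwargs


def sanitised(func):
    """Decorator: call func with every str argument passed through sanitise()."""
    if inspect.iscoroutinefunction(func):
        @functools.wraps(func)
        async def async_wrapper(*args, **kwargs):
            args, kwargs = _clean(args, kwargs)
            return await func(*args, **kwargs)
        return async_wrapper

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        args, kwargs = _clean(args, kwargs)
        return func(*args, **kwargs)
    return wrapper


@sanitised
def shout(text: str, times: int = 1) -> str:
    return text.upper() * times


@sanitised
async def echo(text: str) -> str:
    return text
```

Non-string arguments pass through untouched, and strings buried inside other objects are not seen at all, which is the in-depth case mentioned above where you would still call `sanitise()` directly.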

However, that's an implementation detail that I'll leave to whoever picks up this issue. For now, I'm agreeing with Isaac that #100 is a sufficiently narrow subset and closing it.

Quoth Isaac on that PR:

Wow, if I had a nickel for every time I [found a regex for replacing discord emotes] ... I'd have two[three] nickels - which isn't a lot, but it's weird that it happened twice[thrice].

(and also noting that the 3 he identified there are in haiku.py, yelling.py, and cowsay.py)
