Anti-Spam Content Moderation #56

Open
fedetibaldo opened this issue Mar 29, 2020 · 2 comments

fedetibaldo commented Mar 29, 2020

The website has no way to report content. Unfortunately, this means that taking down publicly visible streaks that promote Australian house inspection companies or Egyptian LG retailers requires a painfully slow and cumbersome process, also known as human interaction.

Jokes aside, for our convenience, I've tried to lay the foundation for a self-regulating system aimed at empowering users to get marginal content out of the way.

I'd love to hear what you think about it, both in terms of feasibility and equity.

Formal rules

  • You start with 1 trust point;
  • each day, the first submission to a public streak earns you 1 trust point;
  • when you flag a submission or a streak, the owner loses 1 trust point;
  • you can flag each user only once;
  • if you get to 0 trust points, your profile gets frozen (see frozen profile);
  • you can't flag your own content;
  • you need 10 trust points to flag.
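
To make the rules above concrete, here is a minimal in-memory sketch in Python. It is only an illustration under stated assumptions: the names `User`, `record_public_submission`, `flag`, and `FLAG_THRESHOLD` are hypothetical, nothing here reflects the site's actual code, and the score is clamped at 0, which the rules do not spell out.

```python
from dataclasses import dataclass, field
from datetime import date

FLAG_THRESHOLD = 10  # trust points needed before a user may flag


@dataclass
class User:
    name: str
    trust: int = 1  # you start with 1 trust point
    rewarded_days: set = field(default_factory=set)   # days that already earned a point
    flagged_users: set = field(default_factory=set)   # names this user has already flagged

    @property
    def frozen(self) -> bool:
        # A profile is frozen once it reaches 0 trust points.
        return self.trust <= 0


def record_public_submission(user: User, day: date) -> None:
    """Only the first submission to a public streak on a given day earns a point."""
    if user.frozen:
        raise PermissionError("frozen profiles cannot submit new entries")
    if day not in user.rewarded_days:
        user.rewarded_days.add(day)
        user.trust += 1


def flag(flagger: User, owner: User) -> None:
    """Flag a submission or streak owned by `owner`; the owner loses 1 point."""
    if flagger is owner:
        raise ValueError("you can't flag your own content")
    if flagger.trust < FLAG_THRESHOLD:
        raise PermissionError("you need 10 trust points to flag")
    if owner.name in flagger.flagged_users:
        raise ValueError("you can flag each user only once")
    flagger.flagged_users.add(owner.name)
    owner.trust = max(0, owner.trust - 1)  # assumption: the score never goes negative
```

With this sketch, a fresh account that submitted once has 2 points, so two distinct users with at least 10 points each are enough to freeze it.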

Definitions

Frozen profile: all your submissions become hidden, as well as all the streaks you host, unless they contain non-hidden submissions, in which case you keep your role as host, but you can't manage them anymore. You can't submit new entries. You can't join streaks. You can't create new streaks.
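
The visibility part of this definition could be sketched as follows. This is a hedged illustration, not the site's data model: `Submission`, `Streak`, and the helper functions are hypothetical names, and frozen users are represented simply as a set of user names.

```python
from dataclasses import dataclass, field


@dataclass
class Submission:
    author: str


@dataclass
class Streak:
    host: str
    submissions: list = field(default_factory=list)


def submission_hidden(sub: Submission, frozen: set) -> bool:
    # Every submission by a frozen user is hidden.
    return sub.author in frozen


def streak_hidden(streak: Streak, frozen: set) -> bool:
    # A streak is hidden only if its host is frozen AND it has no visible submissions.
    if streak.host not in frozen:
        return False
    return all(submission_hidden(s, frozen) for s in streak.submissions)


def can_manage(streak: Streak, user: str, frozen: set) -> bool:
    # A frozen host keeps the host role but loses the ability to manage the streak.
    return user == streak.host and user not in frozen
```

A streak hosted by a frozen user stays visible as long as someone else has a visible submission in it, which keeps other participants' work from disappearing along with the host.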

Corollary

When you delete a submission, the trust score for that day must be recomputed.
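
One way to satisfy this corollary is to derive the score from the remaining data rather than store it. The sketch below assumes exactly that; `trust_score` and its parameters are hypothetical, and clamping at 0 is an assumption as well.

```python
from datetime import date


def trust_score(public_submission_days, flags_received):
    """Derive the score: 1 starting point + 1 per distinct active day - 1 per flag."""
    active_days = len(set(public_submission_days))
    return max(0, 1 + active_days - flags_received)


d1, d2 = date(2020, 3, 29), date(2020, 3, 30)

# Two submissions on d1 and one on d2 count as two active days.
before = trust_score([d1, d1, d2], flags_received=0)

# Deleting one of the two d1 submissions leaves the score unchanged...
after_duplicate_delete = trust_score([d1, d2], flags_received=0)

# ...while deleting the only d2 submission removes that day's point.
after_last_delete = trust_score([d1, d1], flags_received=0)
```

Recomputing from scratch avoids having to track which submission "earned" the day's point, at the cost of a query per recalculation.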

Scenarios

A spammer creates a streak

A spammer creates a streak, submits an entry, and then fades into oblivion. One day, at least two loyal and active users stumble across their content. At the time of the finding, the spammer has 2 trust points. The loyal users then each flag either the submission or the streak. The spammer gets to 0 points, and their content gets soft deleted.

A spammer floods others' streaks

In a matter of minutes, a spammer joins many streaks and submits the same entry to each of them. At least two loyal and active users notice, and proceed to flag the suspicious submissions. At the time of the finding, the spammer has 2 trust points. As soon as two people flag him, all the spammer's hard work gets hidden, as he hits 0 trust points.

A spammer abuses the flag system

I hope someone flags them before they gain enough trust. One person can't do much harm, as N+1 users are needed to freeze someone else, where N is the number of active days the latter managed to achieve.
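
The N+1 figure follows directly from the rules. As a short worked derivation, assuming the target has not been flagged before, with T as the target's trust score and F_min as the minimum number of distinct flaggers (both symbols introduced here for illustration):

```latex
\begin{aligned}
T &= 1 + N && \text{(1 starting point, plus 1 point per active day)} \\
F_{\min} &= T = N + 1 && \text{(each flagger removes 1 point, and can flag a given user only once)}
\end{aligned}
```

For example, a user with N = 9 active days holds 10 points, so 10 distinct flaggers, each with at least 10 trust points of their own, would be needed to freeze them.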

An innocent user gets frozen

Highly improbable, almost impossible. We send them an email and tell them they are frozen; they answer back and tell us they did nothing wrong, but that's not true. There must be a reason if several users flagged them. The flag table (which relates the users who flag to those who got flagged) should testify to that.

Notes

For old users, the number of unique days in which they submitted is their initial trust score. This applies to users that exit the frozen state too.
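
A backfill for existing accounts could be as small as the sketch below. It assumes submission timestamps are available as `datetime` objects already expressed in whatever time zone defines a "day" for the user; `initial_trust` is a hypothetical helper name.

```python
from datetime import datetime


def initial_trust(submission_timestamps):
    """Initial score for an existing user: the number of distinct days with a submission."""
    return len({ts.date() for ts in submission_timestamps})


history = [
    datetime(2020, 3, 1, 9, 30),
    datetime(2020, 3, 1, 21, 5),   # same day as the entry above, counted once
    datetime(2020, 3, 4, 12, 0),
]
score = initial_trust(history)
```

The same function would apply when a user leaves the frozen state, since the note gives them the same starting score.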

@fedetibaldo
Author

EDIT March 30th

Added rule 4: "you can flag each user only once".
Fixed scenarios accordingly.

@leafo
Owner

leafo commented Jan 22, 2021

Thanks for writing this idea out. I think that, long term, a community-run moderation system is the best way to go, since I might not be available to clean stuff up. This idea sounds interesting and like it could work. I think having regulars with extra moderation power could be useful as well.

In the meantime though, I wrote a text classification system for spam detection that runs in the background. Most of the spammers currently hitting the site are easily detectable and can be banned outright, so I've added relevant tools to the admin panel to help detect and deal with them. I'm still training the spam database, so I haven't gotten everything yet, but I hope to get it all cleaned up over time (there are about 20k accounts registered on streak club, and I would guess maybe 2/3 are spam accounts?).

Going forward, new accounts that trip the spam detection system will be in quarantine and only logged in users can view the pages created by those accounts until the page is reviewed. This should hopefully deter most spammers as they create these pages only because they think they are publicly available.
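
The quarantine rule described here might look roughly like the following check. This is only a sketch, not the actual implementation: `Page`, its fields, and `can_view` are hypothetical, and it assumes a reviewed page has been cleared (a page whose owner gets banned would be handled elsewhere).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Page:
    owner_tripped_spam_detection: bool
    reviewed: bool = False  # assumption: set to True once an admin clears the page


def can_view(page: Page, viewer_id: Optional[int]) -> bool:
    """Quarantined pages are visible to logged-in users only."""
    quarantined = page.owner_tripped_spam_detection and not page.reviewed
    if not quarantined:
        return True
    return viewer_id is not None  # None stands for an anonymous visitor
```

Because anonymous visitors and crawlers never see a quarantined page, the spammer gets no public exposure from it even before anyone reviews it.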

We haven't yet had an issue where spammers attack existing streaks with submissions, but that will be something else to consider. Probably a similar quarantine system.
