Anti-Spam Content Moderation #56

fedetibaldo · 2020-03-29T21:31:59Z

The website lacks any way to report content. Unfortunately, this means that taking down publicly visible streaks that promote Australian house inspection companies or Egyptian LG retailers requires a painfully slow and cumbersome process, also known as human interaction.

Jokes aside, for our convenience, I've tried to lay the foundation of a self-regulating system aimed at empowering users to get marginal content out of the way.

I'd love to hear what you think about it, both in terms of feasibility and equity.

Formal rules

1. You start with 1 trust point.
2. Each day, your first submission to a public streak earns you 1 trust point.
3. When you flag a submission or a streak, its owner loses 1 trust point.
4. You can flag each user only once.
5. If you get to 0 trust points, your profile gets frozen (see frozen profile).
6. You can't flag your own content.
7. You need 10 trust points to flag.
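
To make the rules above easier to poke at, here is a minimal in-memory sketch in Python. Everything in it is an assumption for illustration: the names `User`, `record_public_submission`, and `flag` are hypothetical, rules are numbered in the order listed, and nothing here reflects how the site actually stores users.

```python
from dataclasses import dataclass, field
from datetime import date

FLAG_THRESHOLD = 10  # rule 7: trust points needed to flag


@dataclass
class User:
    name: str
    trust: int = 1  # rule 1: everyone starts with 1 trust point
    rewarded_days: set = field(default_factory=set)  # days that already earned a point
    flagged_users: set = field(default_factory=set)  # names this user has already flagged

    @property
    def frozen(self) -> bool:
        return self.trust <= 0  # rule 5


def record_public_submission(user: User, day: date) -> bool:
    """Rule 2: only the first public submission of a day earns a point."""
    if user.frozen or day in user.rewarded_days:
        return False
    user.rewarded_days.add(day)
    user.trust += 1
    return True


def flag(flagger: User, owner: User) -> bool:
    """Rules 3, 4, 6 and 7: return True if the flag was accepted."""
    if flagger.name == owner.name:           # rule 6: no self-flagging
        return False
    if flagger.trust < FLAG_THRESHOLD:       # rule 7: not enough trust
        return False
    if owner.name in flagger.flagged_users:  # rule 4: one flag per user
        return False
    flagger.flagged_users.add(owner.name)
    owner.trust = max(0, owner.trust - 1)    # rule 3: owner loses a point
    return True
```

With this sketch, a fresh account that submits once sits at 2 trust points, and it takes accepted flags from two different users with at least 10 points each to freeze it.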

Definitions

Frozen profile: all your submissions become hidden, as do all the streaks you host. The exception is a hosted streak that contains non-hidden submissions: you keep your role as host, but you can no longer manage it. You can't submit new entries, join streaks, or create new streaks.
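
The visibility part of this definition can be sketched as a pair of predicates. This is only one reading of the definition: it assumes a submission is hidden exactly when its author is frozen, and `Submission`, `submission_hidden`, and `streak_hidden` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Submission:
    author: str


def submission_hidden(sub: Submission, frozen_users: set) -> bool:
    # Assumption: a submission is hidden exactly when its author is frozen.
    return sub.author in frozen_users


def streak_hidden(host: str, submissions: list, frozen_users: set) -> bool:
    # A streak is hidden only if its host is frozen AND it contains no
    # visible submissions (an empty streak counts as having none).
    if host not in frozen_users:
        return False
    return all(submission_hidden(s, frozen_users) for s in submissions)
```

Under this reading, a frozen host's streak survives as long as at least one non-frozen user has submitted to it.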

Corollary

When you delete a submission, the trust score for that day must be recomputed.
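
One possible way to read this corollary, sketched in Python: the point earned on a given day is kept only while at least one public submission from that day still exists. The function name and its arguments are hypothetical, and it assumes the deleted submission belonged to a public streak.

```python
from datetime import date


def trust_change_on_delete(deleted_day: date, remaining_public_days: list) -> int:
    """Trust adjustment after deleting a public submission made on deleted_day.

    remaining_public_days holds the days of the user's public submissions
    that still exist after the deletion.
    """
    # If another public submission from the same day survives, the daily
    # point is still justified; otherwise it is taken back.
    return 0 if deleted_day in remaining_public_days else -1
```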

Scenarios

A spammer creates a streak

A spammer creates a streak, submits an entry, and then fades into oblivion. One day, ~~one~~ at least two loyal and active users stumble across their content. At the time of the finding, the spammer has 2 trust points (the starting point plus one for the submission). The loyal users then proceed to flag ~~both the submission and the streak~~ either the submission or the streak. The spammer gets to 0 points, and their content gets soft-deleted.

A spammer floods others' streaks

In a matter of minutes, a spammer joins many streaks and submits the same entry to each of them. At least ~~one~~ two loyal and active users notice, and proceed to flag ~~all~~ their suspicious submissions. At the time of the finding, the spammer has 2 trust points. As soon as two ~~submissions are flagged~~ people flag them, all the spammer's hard work gets hidden, as they hit 0 trust points.

A spammer abuses the flag system

~~I hope someone flags them before they gain enough trust.~~ One person can't do much harm, as N+1 users are needed to freeze someone else, where N is the number of active days the target has accumulated (their trust score is then N+1, and each user can take away only one point).
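
The N+1 figure follows directly from the rules, and a tiny helper makes the arithmetic explicit. The function name is hypothetical, and it assumes the target's trust is 1 plus their active days minus the flags already received.

```python
def flags_needed_to_freeze(active_days: int, flags_received: int = 0) -> int:
    """Distinct additional flaggers needed to bring a user to 0 trust points."""
    trust = 1 + active_days - flags_received  # starting point + one per active day
    return max(0, trust)
```

For a user who has been active for 30 days, that is 31 distinct flaggers, each of whom needs 10 trust points of their own.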

An innocent user gets frozen

Highly improbable, almost impossible. We send them an email and tell them they are frozen; they answer back and tell us they did nothing wrong; ~~we manually check, find out it's true, restore their trust score, and find out who abused the system, since we thankfully kept a log of the most recent trust transactions.~~ but that's not true. There must be a reason why several users flagged them. The flag table (which links the users who flag to those who got flagged) should bear that out.

Notes

For old users, the number of unique days on which they submitted is their initial trust score. ~~This applies to users that exit the frozen state too.~~
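
A backfill along these lines could look like the sketch below. It assumes submission timestamps are timezone-aware and that "day" means a UTC calendar day; both are my assumptions, as is the name `initial_trust`. Whether the starting point from the formal rules is added on top is left open here.

```python
from datetime import datetime, timezone


def initial_trust(submission_times: list) -> int:
    """Number of distinct UTC calendar days with at least one submission.

    Expects timezone-aware datetime objects.
    """
    return len({t.astimezone(timezone.utc).date() for t in submission_times})
```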

fedetibaldo · 2020-03-30T19:53:07Z

EDIT March 30th

Added rule 4: "you can flag each user only once".
Fixed scenarios accordingly.

leafo · 2021-01-22T20:31:31Z

Thanks for writing this idea out. I think that, long term, a community-run moderation system is the best way to go, since I might not be available to clean stuff up. This idea sounds interesting and like it could work. I think having regulars who have extra moderation power could be useful as well.

In the meantime, though, I wrote a text classification system for spam detection that runs in the background. Most of the spammers currently hitting the site are easily detectable and can be outright banned, so I've added relevant tools to the admin panel to help detect and deal with them. I'm still training the spam database, so I haven't caught everything yet, but I hope to get it all cleaned up over time. (There are about 20k accounts registered on streak club, and I would guess maybe 2/3 are spam accounts?)

Going forward, new accounts that trip the spam detection system will be quarantined, and only logged-in users will be able to view the pages created by those accounts until the page is reviewed. This should hopefully deter most spammers, as they create these pages only because they think they are publicly available.
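
As a rough illustration of the quarantine rule described above (not the actual implementation), the visibility check boils down to something like this. The function and parameter names are hypothetical, and it assumes a completed review clears the page for everyone.

```python
def can_view_page(author_quarantined: bool, page_reviewed: bool,
                  viewer_logged_in: bool) -> bool:
    """Decide whether a viewer may see a page under the quarantine rule."""
    # Pages from non-quarantined accounts, or pages that passed review,
    # are visible to everyone.
    if not author_quarantined or page_reviewed:
        return True
    # Otherwise only logged-in users can see the page.
    return viewer_logged_in
```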

We haven't yet had an issue where spammers attack existing streaks with submissions, but that will be something else to consider. Probably a similar quarantine system.
