Help migrating away from ORES #444

Open
isaranto opened this issue Jul 28, 2023 · 13 comments

Comments

@isaranto

Hi! I am part of the Wikimedia ML team. We are starting to migrate ORES clients to another infrastructure, since we are planning to deprecate ORES. More info at https://wikitech.wikimedia.org/wiki/ORES

TL;DR:

  • The ORES infrastructure is going to be replaced by Lift Wing, a more modern, Kubernetes-based service.
  • All the ORES models (damaging, goodfaith, etc.) are running on Lift Wing; more on how to use them at https://wikitech.wikimedia.org/wiki/Machine_Learning/LiftWing/Usage
  • We have new models called Revert Risk that can replace goodfaith and damaging, for example. They are available on Lift Wing, and we'd like to offer them as a valid, more precise and performant alternative to the ORES models. If you'd like to try them, we'd be glad to help with the migration!

Thanks in advance,

ML team

@welcome

welcome bot commented Jul 28, 2023

Thanks for opening your first issue here! Be sure to follow the issue template!

@xinbenlv
Contributor

Hi, @isaranto , that would be awesome!

@AikoChou

AikoChou commented Aug 8, 2023

Hello! We have noticed that Wikiloop might be using the mediawiki.revision-score stream. However, the mediawiki.revision-score stream will also be deprecated with ORES. For users who use the stream, the Wikimedia ML team plans to offer several streams, each associated with a single model score, such as:

mediawiki.revision-score-goodfaith
mediawiki.revision-score-damaging

Alternatively, we have new models called Revert Risk to replace goodfaith and damaging, and we could provide a stream for the revert-risk score.

If Wikiloop is currently ingesting events from the mediawiki.revision-score stream, please let us know your preference.

You can find more information in our thread: https://lists.wikimedia.org/hyperkitty/list/[email protected]/thread/X5KUTNHW646KYGE7V6SDSHVGVOL5DFDX/

@elukey

elukey commented Sep 6, 2023

@xinbenlv Hi! Is what @AikoChou wrote good in your opinion? We are trying to figure out remaining users of the revision-score stream :)

@xinbenlv
Contributor

xinbenlv commented Sep 6, 2023

I will take a look. Thank you!

@xinbenlv
Contributor

xinbenlv commented Sep 6, 2023

It would be great if we could get a score of "borderline-ness", because we want to let humans prioritize reviewing the edits that are borderline between damaging and goodfaith.

@elukey

elukey commented Sep 7, 2023

> It would be great if we could get a score of "borderline-ness", because we want to let humans prioritize reviewing the edits that are borderline between damaging and goodfaith.

@xinbenlv could you clarify the above point? More specifically, we'd need to understand whether you need streams or whether you'd be happy to query the new API (https://wikitech.wikimedia.org/wiki/Machine_Learning/LiftWing/Usage).

We also offer a new model called Revert Risk Language Agnostic (specs, API), that should be a replacement of both damaging and goodfaith (they are still available via Lift Wing though, if needed).
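
For the API route, an on-demand request for a Revert Risk Language Agnostic score could look roughly like the sketch below. The endpoint URL, the request fields (`rev_id`, `lang`), and the response shape are assumptions based on the Lift Wing usage page linked above and should be checked against it. `FetchLike` and all function names are hypothetical helpers introduced only for this illustration.

```typescript
// Assumed endpoint (from the Lift Wing usage page); verify before relying on it.
const LIFT_WING_URL =
  "https://api.wikimedia.org/service/lw/inference/v1/models/revertrisk-language-agnostic:predict";

// Assumed response shape; only the fields read below are modeled.
interface RevertRiskResponse {
  output: { prediction: boolean; probabilities: { true: number; false: number } };
}

interface RequestInitLike {
  method: string;
  headers: Record<string, string>;
  body: string;
}

// Minimal fetch-compatible signature, so the caller decides which HTTP client to use.
type FetchLike = (
  url: string,
  init: RequestInitLike
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

// Build the POST request for one revision. Kept separate so it can be checked offline.
function buildRevertRiskRequest(revId: number, lang: string): { url: string; init: RequestInitLike } {
  return {
    url: LIFT_WING_URL,
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ rev_id: revId, lang: lang }),
    },
  };
}

// Extract the probability that the revision should be reverted, or throw if the
// payload does not match the assumed shape.
function parseRevertProbability(payload: unknown): number {
  const p = (payload as RevertRiskResponse)?.output?.probabilities?.true;
  if (typeof p !== "number" || !isFinite(p) || p < 0 || p > 1) {
    throw new Error("Unexpected Lift Wing response shape");
  }
  return p;
}

// Perform the request with the supplied fetch implementation.
function fetchRevertProbability(revId: number, lang: string, fetchImpl: FetchLike): Promise<number> {
  const { url, init } = buildRevertRiskRequest(revId, lang);
  return fetchImpl(url, init)
    .then((res) => {
      if (!res.ok) throw new Error(`Lift Wing returned HTTP ${res.status}`);
      return res.json();
    })
    .then(parseRevertProbability);
}
```

In a runtime with a global `fetch` (for example Node 18 or later), that function can be passed as `fetchImpl`.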

@xinbenlv
Contributor

xinbenlv commented Sep 7, 2023

Let me give a bit of context about why we use ORES in WikiLoop DoubleCheck in the first place: WikiLoop DoubleCheck intends to "put humans in the loop" for fact checking with "AI support", so we use ORES to find "borderline suspicious edits".

"Borderline" means:

  • when an edit is obviously bad, it's an easy revert, so it's less valuable to spend a human's time on it.
  • when an edit is obviously good, it's an easy OK, so we deprioritize it for review too.
  • when an edit is neither obviously good nor obviously bad, it's the best use of a human's time.

With such context, what's your suggested API?
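
One way to make "borderline-ness" concrete, assuming the model returns a single probability `p` in [0, 1], is to measure how close `p` is to 0.5. The formula `1 - |2p - 1|` and the names below are assumptions made for this sketch; they are not something ORES or Lift Wing provides.

```typescript
// Map a probability in [0, 1] to a borderline-ness score in [0, 1]:
// 1 when p = 0.5 (the model is maximally unsure), 0 when p = 0 or p = 1.
function borderlineness(p: number): number {
  if (!isFinite(p) || p < 0 || p > 1) {
    throw new RangeError("probability must be within [0, 1]");
  }
  return 1 - Math.abs(2 * p - 1);
}

// Hypothetical record: a revision id paired with its model probability.
interface ScoredRevision {
  revId: number;
  probability: number;
}

// Return a new array with the most borderline revisions first,
// leaving the input array untouched.
function prioritizeForReview(revs: ScoredRevision[]): ScoredRevision[] {
  return revs
    .slice()
    .sort((a, b) => borderlineness(b.probability) - borderlineness(a.probability));
}
```

With this ordering, reviewers see the edits the model is least sure about first, while edits scored near 0 or 1 sink to the bottom of the queue.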

@elukey

elukey commented Sep 9, 2023

@xinbenlv thanks for the explanation! I'd go for Revert Risk for two reasons:

  1. It is a brand new model, trained with recent data, and fully supported by the WMF Research team. The goodfaith/damaging models are still supported, but they will not be improved any further, since they are old and difficult to manage (so we'd prefer to simply deprecate them in the future).
  2. It gives a single score for a specific rev-id, a value that tells how confident the model is that a revert needs to happen. Based on this score you can decide whether the edit fits your obviously good/bad cases or not. The score is basically a probability, so something like 1-10% or 95-99% could be ranges where you don't want a human involved, while for the rest you do (I am writing numbers without much thinking, just to give an idea :)).
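
The banding described in point 2 can be sketched as a small triage function. The 10% and 95% cut-offs below simply reuse the rough numbers from the comment and are placeholders to be tuned, not recommended values; the type and function names are hypothetical.

```typescript
type Triage = "likely-good" | "needs-human-review" | "likely-revert";

interface TriageThresholds {
  low: number;  // at or below this probability, treat the edit as obviously good
  high: number; // at or above this probability, treat the edit as obviously bad
}

// Placeholder cut-offs taken from the rough ranges mentioned in the thread.
const EXAMPLE_THRESHOLDS: TriageThresholds = { low: 0.10, high: 0.95 };

// Classify a revert-risk probability into one of three bands.
function triage(p: number, t: TriageThresholds = EXAMPLE_THRESHOLDS): Triage {
  if (!(t.low < t.high)) {
    throw new RangeError("thresholds must satisfy low < high");
  }
  if (!isFinite(p) || p < 0 || p > 1) {
    throw new RangeError("probability must be within [0, 1]");
  }
  if (p <= t.low) return "likely-good";
  if (p >= t.high) return "likely-revert";
  return "needs-human-review";
}
```

Only revisions that land in `"needs-human-review"` would be queued for reviewers under this scheme.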

On the implementation side, we (as ML WMF) are trying to deprecate the revision-score stream from https://stream.wikimedia.org, since we'd like to break it down into multiple ones. Basically, instead of having a lot of scores from different models for every revision-id (as in revision-score), we will have one stream per model (rev-id -> model score). We still don't have a stream for Revert Risk, but we are planning to add one soon-ish.
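
To illustrate the difference between one combined event per revision and one event per model, the sketch below splits a combined record into per-model records. Both event shapes are simplified, hypothetical stand-ins and not the actual stream schemas; `splitByModel` is likewise an illustrative name.

```typescript
// Hypothetical, simplified combined event: many model scores for one revision.
interface CombinedScoreEvent {
  rev_id: number;
  scores: Record<string, { probability: Record<string, number> }>;
}

// Hypothetical per-model event: one model score for one revision.
interface SingleModelScoreEvent {
  rev_id: number;
  model: string;
  probability: Record<string, number>;
}

// Produce one per-model event for every model present in the combined event.
function splitByModel(event: CombinedScoreEvent): SingleModelScoreEvent[] {
  return Object.keys(event.scores).map((model) => ({
    rev_id: event.rev_id,
    model: model,
    probability: event.scores[model].probability,
  }));
}
```

A consumer that only cares about one model would then subscribe to that model's stream and handle records of the second shape directly.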

We checked your code and we found references of revision-score, so what we are wondering is:

  1. Are you still actively consuming data from it? Or do you get your scores directly from the ORES API on demand?
  2. If you use the stream, would it be ok to move to another stream (like Revert Risk, if you decide to migrate to that model) during the next couple of months (waiting for us to make it available)? In this case it would be without any data from revision-score, since we'd deprecate it for good.

We don't want to break users, so we are trying to follow up as best as we can to support all of you :) Lemme know!

@elukey

elukey commented Sep 12, 2023

To be more precise: https://github.com/google/wikiloop-doublecheck/blob/master/server/ingest/ores-stream.ts#L26

The above is the snippet of code that we are referring to, but since I don't see any trace of traffic from you related to it, I am wondering if it is running or not :)

@elukey

elukey commented Sep 13, 2023

@xinbenlv thoughts? :)

@xinbenlv
Contributor

Sorry for the late response. Let me take a look.

@elukey

elukey commented Sep 15, 2023

Thanks! We have already stopped the stream (https://phabricator.wikimedia.org/T342116), lemme know if it impacts your project.
