Help migrating away from ORES #444

isaranto · 2023-07-28T14:48:45Z

Hi! I am part of the Wikimedia ML team. We are starting to migrate ORES clients to another infrastructure, since we are planning to deprecate ORES. More info in https://wikitech.wikimedia.org/wiki/ORES

TL;DR:

The ORES infrastructure is going to be replaced by Lift Wing, a more modern, Kubernetes-based service.
All the ORES models (damaging, goodfaith, etc.) are running on Lift Wing; more on how to use them in https://wikitech.wikimedia.org/wiki/Machine_Learning/LiftWing/Usage
We have new models called Revert Risk that can replace goodfaith and damaging, for example. They are available on Lift Wing, and we'd like to offer them as a valid and more precise/performant alternative to the ORES models. If you'd like to try them, we'd be happy to help with the migration process!
Thanks in advance,

ML team

welcome · 2023-07-28T14:48:47Z

Thanks for opening your first issue here! Be sure to follow the issue template!

xinbenlv · 2023-07-30T01:53:56Z

Hi, @isaranto , that would be awesome!

AikoChou · 2023-08-08T10:51:37Z

Hello! We have noticed that Wikiloop might be using the mediawiki.revision-score stream. However, the mediawiki.revision-score stream will also be deprecated with ORES. For users who use the stream, the Wikimedia ML team plans to offer several streams, each associated with a single model score, such as:

mediawiki.revision-score-goodfaith
mediawiki.revision-score-damaging

Alternatively, we have new models called Revert Risk to replace goodfaith and damaging, and we could provide a stream for the revert-risk score.

If Wikiloop is currently ingesting events from the mediawiki.revision-score stream, please let us know your preference.

You can find more information in our thread: https://lists.wikimedia.org/hyperkitty/list/[email protected]/thread/X5KUTNHW646KYGE7V6SDSHVGVOL5DFDX/

elukey · 2023-09-06T15:14:42Z

@xinbenlv Hi! Is what @AikoChou wrote good in your opinion? We are trying to figure out remaining users of the revision-score stream :)

xinbenlv · 2023-09-06T15:26:27Z

I will take a look. thank you!

xinbenlv · 2023-09-06T15:27:44Z

It would be great if we could get a score of "borderline-ness", because we want to let humans prioritize reviewing the edits that are borderline between damaging and goodfaith.

elukey · 2023-09-07T08:19:17Z

> It would be great if we can get a score of "borderline-ness" because we want to let human prioritize reviewing those borderline between damaging and goodfaith

@xinbenlv could you clarify the above point? More specifically, we'd need to understand whether you need streams or whether you'd be happy to query the new API (https://wikitech.wikimedia.org/wiki/Machine_Learning/LiftWing/Usage).

We also offer a new model called Revert Risk Language Agnostic (specs, API), that should be a replacement of both damaging and goodfaith (they are still available via Lift Wing though, if needed).
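For readers who want to see what querying the API on demand might look like, here is a minimal TypeScript sketch. The endpoint path, the `rev_id`/`lang` request fields, and the `output.probabilities` response shape are assumptions based on the Lift Wing usage page linked above and should be checked against the current documentation; the function names are hypothetical.

```typescript
// Sketch only: the base URL, model name, and payload/response fields below are
// assumptions taken from the Lift Wing usage docs; verify them before relying on this.
const LIFT_WING_BASE = "https://api.wikimedia.org/service/lw/inference/v1/models";

// Assumed shape of the part of the response we care about.
interface RevertRiskResponse {
  output: {
    prediction: boolean;
    probabilities: { true: number; false: number };
  };
}

// Build the URL and JSON body for a Revert Risk Language Agnostic request.
function buildRevertRiskRequest(revId: number, lang: string): { url: string; body: string } {
  return {
    url: `${LIFT_WING_BASE}/revertrisk-language-agnostic:predict`,
    body: JSON.stringify({ rev_id: revId, lang }),
  };
}

// Extract the probability that the revision will need to be reverted.
function revertProbability(res: RevertRiskResponse): number {
  return res.output.probabilities.true;
}

// Query the API for one revision (needs a runtime with a global fetch, e.g. Node 18+).
async function fetchRevertRisk(revId: number, lang: string): Promise<number> {
  const { url, body } = buildRevertRiskRequest(revId, lang);
  const res = await fetch(url, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body,
  });
  if (!res.ok) {
    throw new Error(`Lift Wing request failed with status ${res.status}`);
  }
  return revertProbability((await res.json()) as RevertRiskResponse);
}
```

Calling `fetchRevertRisk(revId, "en")` would then return a single number between 0 and 1 for the given revision, which is what the rest of this thread refers to as the score.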

xinbenlv · 2023-09-07T16:47:51Z

Let me give a bit of context about why we use ORES in WikiLoop DoubleCheck in the first place: WikiLoop DoubleCheck intends to "put humans in the loop" for fact checking with "AI support", so we use ORES to find "borderline suspicious edits".

"Borderline" means:

when an edit is obviously bad, it's an easy revert, so it's a less valuable use of a human's time.
when an edit is obviously good, it's an easy ok, so deprioritize it for review too.
when an edit is neither obviously good nor obviously bad, it's the best use of a human's time.

With such context, what's your suggested API?

elukey · 2023-09-09T13:08:18Z

@xinbenlv thanks for the explanation! I'd go for Revert Risk for two reasons:

It is a brand new model, trained with recent data, and fully supported by the WMF Research team. The goodfaith/damaging models are still supported but they will not be improved any further, since they are old and difficult to manage (so we'd prefer to simply deprecate them in the future).
It gives a single score for a specific rev-id, a value that tells how confident the model is that a revert needs to happen. Based on this score you can decide whether the edit fits your obviously good/bad cases or not. The score is basically a probability, so something like 1-10% or 95-99% could be ranges where you don't want a human involved, while for the rest you do (I am writing numbers without much thinking, just to give an idea :)).
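To make the idea concrete, here is a small TypeScript sketch of how a single revert-risk probability could be mapped onto the "obviously good / borderline / obviously bad" buckets described earlier in the thread. The 10% and 95% cut-offs are just the illustrative numbers from the comment above, not tuned values, and the `borderlineness` measure is a hypothetical helper, not something the model provides.

```typescript
type Triage = "likely-good" | "borderline" | "likely-bad";

// Illustrative cut-offs only (taken from the rough ranges mentioned above).
const LOW_THRESHOLD = 0.10;
const HIGH_THRESHOLD = 0.95;

// Map a revert probability in [0, 1] to a review bucket.
function triage(p: number, low = LOW_THRESHOLD, high = HIGH_THRESHOLD): Triage {
  if (Number.isNaN(p) || p < 0 || p > 1) {
    throw new RangeError(`probability must be in [0, 1], got ${p}`);
  }
  if (p <= low) return "likely-good";
  if (p >= high) return "likely-bad";
  return "borderline";
}

// Hypothetical "borderline-ness" measure: 1 when the model is maximally
// unsure (p = 0.5), falling linearly to 0 at p = 0 or p = 1.
function borderlineness(p: number): number {
  return 1 - Math.abs(2 * p - 1);
}

// Keep only borderline revisions and put the most uncertain ones first.
function reviewQueue(scores: { revId: number; p: number }[]): { revId: number; p: number }[] {
  return scores
    .filter((s) => triage(s.p) === "borderline")
    .sort((a, b) => borderlineness(b.p) - borderlineness(a.p));
}
```

With this shape, a tool could skip the two extreme buckets and show human reviewers the `reviewQueue` output in order; where the thresholds should actually sit would need to be decided from real data.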

On the implementation side, we (as ML WMF) are trying to deprecate the revision-score stream from https://stream.wikimedia.org since we'd like to break it down into multiple ones. Basically, instead of having a lot of scores from different models for every revision-id (like in revision-score), we will have a stream for every model (rev-id -> model score). We still don't have a stream for Revert Risk, but we are planning to add one soon-ish.

We checked your code and we found references of revision-score, so what we are wondering is:

Are you still actively consuming data from it? Or do you get your scores directly from the ORES API on demand?
If you use the stream, would it be ok to move to another stream (like Revert Risk, if you decide to migrate to that model) during the next couple of months (waiting for us to make it available)? In this case it would be without any data from revision-score, since we'd deprecate it for good.

We don't want to break users, so we are trying to follow up as best as we can to support all of you :) Lemme know!

elukey · 2023-09-12T12:17:47Z

To be more precise: https://github.com/google/wikiloop-doublecheck/blob/master/server/ingest/ores-stream.ts#L26

The above is the snippet of code that we are referring to, but since I don't see any trace of traffic from you related to it, I am wondering if it is running or not :)

elukey · 2023-09-13T16:34:52Z

@xinbenlv thoughts? :)

xinbenlv · 2023-09-14T20:44:16Z

Sorry for the late response. Let me take a look.

elukey · 2023-09-15T07:38:29Z

Thanks! We have already stopped the stream (https://phabricator.wikimedia.org/T342116), lemme know if it impacts your project.
