SGD - Noctua migration #34
Comments
Update on managers call:
|
Was delayed because of GPI issue (now fixed) |
Managers call: @suzialeksander says data not yet loaded |
@suzialeksander Can you add the current state of the project here please, i.e., remaining work and planned delivery date? Thanks, Pascale |
These models are in Noctua Dev, and SGD is testing. If there are no major issues, we can hopefully ask to have these loaded to prod quite soon (a couple of weeks?). Other remaining work includes switching pipelines at SGD/GO to not scoop QuickGO anymore and making sure we aren't eating tails anywhere. |
|
SGD is ready for this project to move forward. Steps to get into prod:
|
Thanks @srengel ! Just to check:
|
Hi @pgaudet
FYI, we currently have GO annotations in SGD from these sources: |
Update: SGD is ready for a new file. Pending production of the new file by @alexsign, we should be able to load a new file next week with help from @dustine32. There are a few pipeline tweaks and data checks for both P2GO and SGD after that. |
There is some documentation at https://docs.google.com/document/d/1PZH2SiyF9FJhvW_M_cr3GlReSZfvbj96AkFHs9DD6Qc. To expand the process:
in progress |
@suzialeksander As we get closer to the end of this, it would be good to work out a final timeline for these steps to make sure that we're not causing any double-ups or gaps anywhere. @vanaukenk It might also be interesting to see what the profile of this import set is when viewed through the lens of @balhoff's recent tooling for geneontology/go-shapes#306. It might help contextualize some choices before final commitments are made. |
Final (?) planning call today: agenda in Shared Drive. Slides with current yeast dataflow, and nearly identical flow post-project. Next steps: Feb 1:
Upon successful snapshot containing above models:
|
Also, SGD is waiting for the remainders file that @alexsign is working on. |
After the outage, the models seem to have landed as intended. Success!! However, a tiny issue emerged during spot checking: two curators were assigned the same ORCID when the files were converted for loading, affecting ~647 of the 7075 models loaded. Next steps:
|
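A spot check like the ORCID one described above could be automated along the following lines. This is only a minimal sketch, not the actual SGD or GO tooling: the record layout (`model_id`, `curator`, `orcid`), the function names `find_shared_orcids` and `affected_models`, and all sample values are assumptions made up for illustration.

```python
from collections import defaultdict


def find_shared_orcids(records):
    """Return {orcid: sorted curator names} for ORCIDs used by more than one curator.

    `records` is an iterable of (model_id, curator, orcid) tuples. This layout
    is hypothetical and is not the real Noctua model metadata format.
    """
    curators_by_orcid = defaultdict(set)
    for _model_id, curator, orcid in records:
        curators_by_orcid[orcid].add(curator)
    return {
        orcid: sorted(names)
        for orcid, names in curators_by_orcid.items()
        if len(names) > 1
    }


def affected_models(records, shared):
    """List the ids of models whose contributor ORCID appears in `shared`."""
    return sorted(
        {model_id for model_id, _curator, orcid in records if orcid in shared}
    )


# Made-up sample data (placeholder ORCIDs, not real curators).
sample = [
    ("model-1", "curator A", "0000-0000-0000-0001"),
    ("model-2", "curator B", "0000-0000-0000-0001"),  # same ORCID, different curator
    ("model-3", "curator C", "0000-0000-0000-0002"),
]

shared = find_shared_orcids(sample)
print(shared)
print(affected_models(sample, shared))
```

Grouping by ORCID first, then listing the affected models, keeps the report small enough to review by hand before any contributor fields are rewritten.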
@suzialeksander remainders file available now. same name, same place. please take a look and let me know if it's good. |
Noting that #34 (comment) has changed given recent discussions: we will be doing a full clobber, with the expectation that this is essentially a re-run of yesterday (as SGD is still in their curation freeze). |
Update from the 8 Feb Noctua outage/load: Spot checking has revealed some extraneous inferred annotations, specifically "reproductive process" from
The immediate actions are:
|
Just to clarify in advance of today's call - I can't speak directly for them, but I doubt that MGI would also want redundant ancestor/child annotations with the same evidence code from the same paper. There may, however, be other inferred annotations that they do want. One other option we've considered for the GPAD output is to give inferred annotations their own evidence code so they could more readily be filtered if groups do not want them. That said, it would still be nice to create useful inferences wherever possible. |
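To make the "redundant ancestor/child annotation" case concrete, here is a small sketch of how such annotations could be filtered. It is an illustration under stated assumptions, not the GPAD export logic: annotations are simplified to `(gene, term, evidence, reference)` tuples, the ontology is a toy `{child: [parents]}` mapping with placeholder term names, and `ancestors` and `drop_redundant` are hypothetical helpers.

```python
def ancestors(term, parents):
    """Return every ancestor of `term` in a {child: [parents]} mapping."""
    seen = set()
    stack = list(parents.get(term, []))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, []))
    return seen


def drop_redundant(annotations, parents):
    """Drop an annotation when a more specific one exists for the same
    gene, evidence code, and reference (i.e. its term is an ancestor of
    another annotated term in that same group)."""
    kept = []
    for gene, term, evidence, reference in annotations:
        redundant = any(
            g == gene
            and e == evidence
            and r == reference
            and term in ancestors(t, parents)
            for g, t, e, r in annotations
        )
        if not redundant:
            kept.append((gene, term, evidence, reference))
    return kept


# Toy ontology and annotations; all identifiers are placeholders.
parents = {"child_term": ["parent_term"], "parent_term": ["root_term"]}
annotations = [
    ("GENE1", "child_term", "IMP", "REF:1"),
    ("GENE1", "parent_term", "IMP", "REF:1"),  # redundant with the line above
    ("GENE1", "parent_term", "IDA", "REF:2"),  # kept: different evidence and reference
]

filtered = drop_redundant(annotations, parents)
print(filtered)
```

Giving inferred annotations their own evidence code, as suggested above, would make this kind of filter unnecessary for groups that simply want to exclude them, since they could match on the evidence code alone.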
After today's call, @kltm and @cmungall will look into diff'ing the terms and seeing if adding Managers agreed that dealing with inferred annotations is really a separate project from the import, and further work in this new project would include giving these inferred annotations a more accurate EC, rather than implying the curator made these inferred annotations directly. The inferred annotation situation is analogous to when GO inferred BP-MF annotations, then backtracked. |
Noting that the diff/exploration has a ticket here: geneontology/noctua-models#271 |
After spot checking some models, there are ShEx violations in several models: incorrect relations for the terms, etc. Waiting on the violation report to see a full list, but the ones that have come up are individually fixable so far. As for the inferences, it seems these might be fixable through ontology improvements. Still waiting on a release to make sure the entire SGD-GO cycle works, but starting to test the snapshot that just came out. |
Discussing with @pgaudet |
@alexsign Please start Noctua import. After it’s done, please cross check annotations and delete old ones. Then please make NoctuaSGD public. (This is our understanding of the remaining steps for this project. Please correct us if this is wrong.) For reference, this is Suzi's email from last Thursday, Mar 28: Hi Alex, Thanks for these files. We've looked at them, specifically the P2GO_not_in_Noctua, and it looks like these are left out mostly due to being not yeast, or not in the protein-centric world (this is expected, lots of RNAs and such). The latest Snapshot is 2024-03-21, and Pascale and I cleared it for SGD annotations this week, although it doesn't have a lot of our latest edits to save annotations that failed the import. I think everything looks good for you to proceed with Step 4, deleting SGD data from P2GO & making NoctuaSGD public. Thanks |
Project link
https://github.com/orgs/geneontology/projects/61
Project description
Tasks needed to complete the migration of manual SGD annotations from Protein2GO to Noctua for full adoption of Noctua as SGD GO curation tool.
PI
Mike
Project owner (PO)
@suzialeksander
Technical lead (TL)
@dustine32
Other personnel (OP)
TBD
Technical specs
Using current system
Other comments
Code changes will likely be in https://github.com/biolink/ontobio; tickets about bugs/code will be here.
SGD models will be isolated in https://github.com/geneontology/sgd-go-cams; tickets about project progress will be here.