popV crosswalk for heart lists hepatocyte (liver) #37

Open
andreasbueckle opened this issue Jul 24, 2024 · 8 comments

Comments

@andreasbueckle

andreasbueckle commented Jul 24, 2024

See https://github.com/hubmapconsortium/hra-workflows-runner/blob/main/crosswalking-tables/popv.csv#L88

See:

| | Organ_ID | Annotation_Label | Annotation_Label_ID | CL_Label | CL_ID | CL_Match |
|---|---|---|---|---|---|---|
| heart | UBERON:0000948 | hepatocyte | PV:0000077 | hepatocyte | CL:0000182 | |

This has ramifications for these two datasets from the same team:

@axdanbol
Collaborator

I did a quick investigation and discovered that hepatocyte is present in some of the popv algorithm's heart models. So this seems to be a problem with popv itself rather than our processing.

@emquardokus
Collaborator

While working with Supriya on the 2D FTU manuscript, I asked to see which organs had cells in common with each other. For the most part, the common cells were immune cells (B cells, T cells, etc.), connective tissue cells (fibroblasts), and cells shared where nested FTUs exist: nephron<--glomerulus, nephron<--tubules (this includes all the parts of the tubules that we have as separate 2D FTUs). I then also noticed the outlier of hepatocyte in both heart and liver, which is WRONG. Tracking it back, I found it in the popV crosswalk file, but that file was derived from a list provided by Bruce, and I'm not sure where he obtained it.
I will upload a new popv crosswalk to revise/fix this mistake.
Reviewing the file filtered-CTs-with-datasets-with-organ.csv, I saw "hepatocyte associated with heart", which is clearly incorrect, and I was able to determine from this file that it came in through enrichment from the popV crosswalk. This crosswalk was published in the 7th release in HRA-KG-->digital objects-->ctann and is also used in the hra-workflows-runner GitHub repo for CTAnn/HRApop work.
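
For anyone who wants to repeat the "which organs share cell types" check, here is a minimal sketch. The organ-to-cell-type sets are made up for illustration and the function name `shared_cell_types` is hypothetical; this is not the code used for the manuscript.

```python
from itertools import combinations

def shared_cell_types(organ_to_cell_types):
    """Return the cell types shared by each pair of organs (pairs with none are omitted)."""
    shared = {}
    for a, b in combinations(sorted(organ_to_cell_types), 2):
        common = organ_to_cell_types[a] & organ_to_cell_types[b]
        if common:
            shared[(a, b)] = common
    return shared

# Illustrative data only, not taken from the crosswalk.
organs = {
    "heart": {"cardiac muscle cell", "fibroblast", "T cell", "hepatocyte"},
    "liver": {"hepatocyte", "fibroblast", "T cell"},
    "kidney": {"podocyte", "fibroblast", "T cell"},
}
shared = shared_cell_types(organs)
for pair, cell_types in shared.items():
    print(pair, sorted(cell_types))
```

Immune and connective tissue cells show up in every pair, as expected; a tissue-specific cell type such as hepatocyte appearing in a heart pair is the kind of outlier worth tracing back.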

@andreasbueckle
Author

Now also captured in x-atlas-consortia/hra-pop#106

@emquardokus
Collaborator

The main reason we can NOT have a table that answers "Need list of cell types that are not supposed to be in organ model" is that some cell types, like various immune cells, will always appear in all organs, just in different numbers or as different specific immune cells.
The hepatocyte example is the only one we found because it's such an obvious mistake: hepatocytes should only be in liver, not heart or lung.
The other possible issue is a sample block with contaminating tissue from surrounding organs. For example, liver is in close proximity to both heart and lung. In this case, one would expect the number of these contaminating cell types in a dataset to fall below the threshold one would expect for cells that natively exist in the primary tissue. This is one of the preprocessing steps for single cell RNA sequencing analysis: remove cells that are 3 or below. This threshold can be modified during the analysis.
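
A rough sketch of that kind of threshold check, assuming the threshold is applied to the number of cells per annotated cell type in one dataset (that reading is an assumption, as are the function name `split_by_count` and the made-up labels):

```python
from collections import Counter

def split_by_count(labels, threshold=3):
    """Split cell-type labels into kept and flagged groups by cell count.

    Cell types with `threshold` cells or fewer are flagged as possible contamination.
    """
    counts = Counter(labels)
    kept = {ct: n for ct, n in counts.items() if n > threshold}
    flagged = {ct: n for ct, n in counts.items() if n <= threshold}
    return kept, flagged

# Made-up labels for one hypothetical heart dataset.
labels = ["cardiac muscle cell"] * 120 + ["fibroblast"] * 40 + ["hepatocyte"] * 2
kept, flagged = split_by_count(labels)
print("kept:", kept)
print("flagged:", flagged)
```

A check like this only catches low-count contamination. It would not catch the crosswalk problem in this issue if the model assigns the hepatocyte label to many heart cells.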

@andreasbueckle
Author

From Katy:
Please omit the entire organ from popV CTann.
Document everything well in code and paper.

@andreasbueckle
Author

@bherr2 pushed a change that maps all hepatocyte => cell for the heart/popV.
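
To illustrate what such a remap does to a crosswalk row (this is a sketch, not @bherr2's actual change), the snippet below rewrites the CL columns for one organ/label pair to the generic CL term "cell" (CL:0000000). The function name `remap_to_generic_cell` is hypothetical, the column names follow the crosswalk header quoted above, and the liver row is an illustrative addition.

```python
def remap_to_generic_cell(rows, organ_id, label):
    """Return copies of rows with the given organ/label pair mapped to generic 'cell'."""
    fixed = []
    for row in rows:
        row = dict(row)  # copy so the input rows stay unchanged
        if row["Organ_ID"] == organ_id and row["Annotation_Label"] == label:
            row["CL_Label"] = "cell"
            row["CL_ID"] = "CL:0000000"
        fixed.append(row)
    return fixed

rows = [
    # Row reported in this issue (heart).
    {"Organ_ID": "UBERON:0000948", "Annotation_Label": "hepatocyte",
     "CL_Label": "hepatocyte", "CL_ID": "CL:0000182"},
    # Illustrative liver row, which should stay untouched.
    {"Organ_ID": "UBERON:0002107", "Annotation_Label": "hepatocyte",
     "CL_Label": "hepatocyte", "CL_ID": "CL:0000182"},
]
fixed = remap_to_generic_cell(rows, "UBERON:0000948", "hepatocyte")
for row in fixed:
    print(row["Organ_ID"], row["Annotation_Label"], "=>", row["CL_Label"], row["CL_ID"])
```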

@andreasbueckle
Author

Update: @bherr2 pushed a commit to remove heart popV cell summaries

@emquardokus
Collaborator

@andreasbueckle @axdanbol @bherr2 Tabula Sapiens version 2 just came out on cellxgene, along with a bioRxiv paper, although it looks like popV is still testing changes, based on the branches in their repo.

Old cellxgene for 2022 published tabula sapiens using popV
Screenshot 2024-12-12 at 3 49 06 PM

New cellxgene for tabula sapiens 2 Dec 4, 2024 (no more hepatocytes showing up)
Screenshot 2024-12-12 at 3 47 27 PM
