Hello!
OTN is a great project, thank you all for it.
This issue aims to document a possible error in the "resolved" categorization.
While using the dataset, Thiago @thiago-goncalves-souza and I noticed a possible categorization error on the try dataset (https://opentraits.org/datasets/try).
If we filter OTN to keep only rows that come from the try dataset AND belong to the kingdom Animalia (resolveKingdomName == "Animalia"), we get more than 5,000 rows (5,311).
# download data from
# https://github.com/open-traits-network/otn-taxon-trait-summary/blob/main/traits.csv.gz
otn_raw <- readr::read_csv("traits.csv")

otn_dataset_try <- otn_raw |>
  # filter only the animal kingdom
  dplyr::filter(resolveKingdomName == "Animalia") |>
  dplyr::filter(datasetId == "https://opentraits.org/datasets/try")

dplyr::glimpse(otn_dataset_try)
# Rows: 5,311
# Columns: 31
# $ taxonIdVerbatim        <chr> "1669", "1669", "1669", "1669", "1669", "1…
# $ scientificNameVerbatim <chr> "Agathis philippinensis", "Agathis philipp…
# $ resolvedTaxonId        <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
# $ resolvedTaxonName      <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
# $ parentTaxonId          <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
# $ family                 <chr> "Araucariaceae", "Araucariaceae", "Araucar…
# $ phylum                 <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
# $ traitIdVerbatim        <dbl> 37, 3400, 759, 98, 3401, 43, 22, 17, 4, 38…
# $ traitNameVerbatim      <chr> "Leaf phenology type", "Plant growth form …
# $ bucketId               <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
# $ bucketName             <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
# $ counts                 <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
# $ datasetId              <chr> "https://opentraits.org/datasets/try", "ht…
# $ numberOfRecords        <dbl> 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 1, 1, 3, …
# $ curator                <chr> "https://opentraits.org/members/brian-s-ma…
# $ accessDate             <date> 2022-08-19, 2022-08-19, 2022-08-19, 2022-…
# $ comment                <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
# $ relationName           <chr> "HAS_ACCEPTED_NAME", "HAS_ACCEPTED_NAME", …
# $ resolvedExternalId     <chr> "COL:6635V", "COL:6635V", "COL:6635V", "CO…
# $ resolvedName           <chr> "Agathis philippinensis", "Agathis philipp…
# $ resolvedRank           <chr> "species", "species", "species", "species"…
# $ resolvedCommonNames    <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
# $ resolvedPath           <chr> "Biota | Animalia | Arthropoda | Insecta |…
# $ resolvedPathIds        <chr> "COL:5T6MX | COL:N | COL:RT | COL:H6 | COL…
# $ resolvedPathNames      <chr> "unranked | kingdom | phylum | class | ord…
# $ resolvedExternalUrl    <chr> "https://www.catalogueoflife.org/data/taxo…
# $ resolveKingdomName     <chr> "Animalia", "Animalia", "Animalia", "Anima…
# $ resolvedPhylumName     <chr> "Arthropoda", "Arthropoda", "Arthropoda", …
# $ resolvedFamilyName     <chr> "Braconidae", "Braconidae", "Braconidae", …
# $ providedTraitName      <chr> "Leaf phenology type", "Plant growth form …
# $ resolvedTraitName      <chr> "Phenology", "Morphology", "UNCATEGORIZED_…
But some of the traits seem to be plant traits (for example, "Leaf phenology type" and "Plant growth form"):
otn_dataset_try |>
  dplyr::count(datasetId, resolveKingdomName, providedTraitName, sort = TRUE) |>
  head()
Here are some of the most frequent categories that appear in resolvedPhylumName/resolvedName from this query:
otn_dataset_try |>
  dplyr::count(datasetId, resolveKingdomName, resolvedPhylumName, resolvedName, sort = TRUE) |>
  head()
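In the rows shown by glimpse(), the verbatim family column ("Araucariaceae") disagrees with resolvedFamilyName ("Braconidae"). Assuming the verbatim family is reliable for these records, one possible way to flag suspect rows is to compare the two columns. The sketch below uses a small toy table rather than the real file; the second row and the `toy` and `suspect` names are made up for illustration only.

```r
library(dplyr)

# Toy rows that mimic four of the columns in traits.csv.
# Row 1 copies the values visible in the glimpse() output;
# row 2 is a hypothetical, correctly resolved record added for contrast.
toy <- tibble::tibble(
  scientificNameVerbatim = c("Agathis philippinensis", "Quercus robur"),
  family                 = c("Araucariaceae", "Fagaceae"),
  resolvedFamilyName     = c("Braconidae", "Fagaceae"),
  resolveKingdomName     = c("Animalia", "Plantae")
)

# Keep rows where both family values are present but differ,
# which hints that the name was resolved to a different taxon.
suspect <- toy |>
  filter(
    !is.na(family),
    !is.na(resolvedFamilyName),
    family != resolvedFamilyName
  )

print(suspect)
```

The same filter() call could be applied to otn_dataset_try instead of the toy table, since both columns appear in the real data. It would not catch rows where the verbatim family is missing.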