Major upgrades for new version of AusTraits (#64)
* Testing R CMD check on other machines via GitHub Actions

* Bumped version

* Removed ubuntu #40

* Using austraits_lite in vignettes

* Using austraits_lite in structure vignette #40

* Implementing a new solution so dictionary can be precompiled before turning into a vignette #40

* Rebuilt site, vignettes are building now #40

* Changing global options in load-austraits #41

* Bug fix for parsing definitions data in dictionary.rmd #43

* Rebuilt pkgdown .htmls

* Bumped version number to minor release and minor edit to load_austraits

* xlab was not being imported properly, fixed

* Bumped the version number for dev version

* Updated README to make it clear that users need to install all dependencies and may need to upgrade existing packages for a smooth install

* Updated pkgdown vignettes

* Moved janitor to Imports #47

* Rebuilt website using precompiled vignettes, new dir for figures

* More tests for plots #48

* Recreated austraits_lite so class is austraits, all extracted subsets are class austraits too, tweaked relevant tests #45

* Since australia_map_raster is a df, we can remove the raster import function in the plot_site_location function

* Bumped minor patch version number for PR

* Removed docs/ from PR #49

* Revert docs to master

* Rm new files

* Rm new files in man

* Precompiled code for vignettes, figures for .RMD, removed old figs from README

* Updated README for lightweight install #49

* Updated file path to man/figures

* Updated file path to man/figures

* Added a test to detect #52

* ifelse for converting to numeric

* ifelse for converting to numeric #52

* Updated news following fix #52

* Added site changes in NEWS and changed call to site_name to loc_name #54

* Fixed typo in wide_table, added to news, austraits$locations #54

* site_property to location_property #54

* Changed sites to locations in as_wide_table #54

* Renames plot_site_locations #54, added switch for join_taxonomy

* Convert versions to character strings

* Amended Join sites with version switch #55, #54

* Notes in join for changes

* Expanded join taxonomy to all versions

* Join methods tweaks for all versions

* Added alias for join_location and join_site, both functions for all versions

* Fixing contexts join

* updated and saved green .svg

* Conditional switch to print.austraits

* Upgrades to extract_dataset id, works with print.austraits #57

* #55 Join contexts working for multiple link vals

* Added two versions of join_methods

* Added switch for extract_trait #57

* Updated plot beeswarm to work with new and old versions of Austraits

* Added version switch for plot_locations with deprecation warning for old plot_sites #59

* Added distinct to join_methods

* Added deprecation statement in join_sites #59

* Collapsed dataset ids in taxonomic updates, beginnings of switches for trait_pivot_wider and internalised some switch functions

* Started to switch for extract_taxa

* Implemented taxon_name for extract_taxa #51

* Allowing extract_taxa to accept a string of family/genus/taxon_names #51

* Added switch for trait pivot

* changed to str_subset for more readable code

* Plot_locations, accepts austraits or traits table

* Plot_locations, accepts austraits or traits table

* Added first pass as trait_pivot_wider #60

* Added switch for trait pivot longer #60

* Started to tinker with as_wide_table, removed vars option for join_contexts

* Fixed typo in as_wide_table1 #61

* Small changes to as_wide_table

* Added arg to collapse contexts for join_contexts, added as_wide_table upgrades #61

* Minor tweaks for methods #61

* Minor tweaks to roxygen tags #61

* Upgraded extract_taxa, and minor fix to extract_trait

* Minor tweaks

* Enabled lifecycle badges for soft deprecation for some functions, updated version numbers and news.md

* Minor fixes to rd files to rid warnings

* Created another austraits lite for testing new release

* Tests for as_wide_table

* Tweaked imports and removes .data in wide_table and added to tests

* rm .data$

* Rm import from rlang

* Commented out mem heavy test

* Removed files not needed and updated documentation

* Checked v number for testcoverage github actions

* Added missing topics and search bar

* Updated version number for GHA

* Rm formatting lines from Rstudio

* Bumped v number

* Fixing up vignettes and adding to NEWS

* Tweaks to NAMESPACE, fixing vignettes so new version of package works with austraits lite, new what_versions function

* Added establishment_means to join_taxonomy

* Changes species to taxa in print

* Added dplyr dependency
fontikar authored Nov 27, 2022
1 parent 5f9775f commit 3a5df92
Showing 75 changed files with 1,539 additions and 615 deletions.
4 changes: 2 additions & 2 deletions .github/workflows/R-CMD-check.yml
@@ -26,8 +26,8 @@ jobs:
GITHUB_PAT: ${{ secrets.GITHUB_TOKEN }}
steps:
- uses: actions/checkout@v2
- uses: r-lib/actions/setup-r@v1
- uses: r-lib/actions/setup-pandoc@v1
- uses: r-lib/actions/setup-r@v2
- uses: r-lib/actions/setup-pandoc@v2
- name: Install dependencies
run: |
install.packages(c("remotes", "rcmdcheck"))
4 changes: 2 additions & 2 deletions .github/workflows/pkgdown_deploy.yml
@@ -17,9 +17,9 @@ jobs:
steps:
- uses: actions/checkout@v2

- uses: r-lib/actions/setup-r@v1
- uses: r-lib/actions/setup-r@v2

- uses: r-lib/actions/setup-pandoc@v1
- uses: r-lib/actions/setup-pandoc@v2

- name: Query dependencies
run: |
4 changes: 2 additions & 2 deletions .github/workflows/test-coverage.yml
@@ -18,9 +18,9 @@ jobs:
steps:
- uses: actions/checkout@v2

- uses: r-lib/actions/setup-r@master
- uses: r-lib/actions/setup-r@v2

- uses: r-lib/actions/setup-pandoc@master
- uses: r-lib/actions/setup-pandoc@v2

- name: Query dependencies
run: |
1 change: 1 addition & 0 deletions .gitignore
@@ -51,3 +51,4 @@ data/austraits/

#
*.pdf
*.svg
7 changes: 4 additions & 3 deletions DESCRIPTION
@@ -1,6 +1,6 @@
Package: austraits
Title: Helpful functions to access, summarise and wrangle austraits data
Version: 1.1.1
Version: 2.1.1
Authors@R:
c(person(given = "Daniel",
family = "Falster",
@@ -22,7 +22,7 @@ Encoding: UTF-8
Language: en
LazyData: true
Roxygen: list(markdown = TRUE)
RoxygenNote: 7.2.0
RoxygenNote: 7.2.1
Depends:
R (>= 4.0.0),
RefManageR
@@ -38,7 +38,8 @@ Imports:
jsonlite,
utils,
magrittr,
janitor
janitor,
lifecycle
Suggests:
ggplot2,
knitr,
13 changes: 11 additions & 2 deletions NAMESPACE
@@ -11,6 +11,7 @@ export(get_version_latest)
export(get_versions)
export(join_all)
export(join_contexts)
export(join_locations)
export(join_methods)
export(join_sites)
export(join_taxonomy)
@@ -20,6 +21,7 @@ export(lookup_trait)
export(my_kable_styling_html)
export(my_kable_styling_markdown)
export(my_kable_styling_pdf)
export(plot_locations)
export(plot_site_locations)
export(plot_trait_distribution_beeswarm)
export(separate_trait_values)
@@ -28,8 +30,15 @@ export(summarise_trait_means)
export(trait_pivot_longer)
export(trait_pivot_wider)
import(RefManageR)
importFrom(dplyr,arrange)
importFrom(dplyr,filter)
importFrom(dplyr,group_by)
importFrom(dplyr,select)
importFrom(lifecycle,deprecated)
importFrom(magrittr,"%>%")
importFrom(rlang,.data)
importFrom(rlang,abort)
importFrom(stats,family)
importFrom(stringr,str_detect)
importFrom(tidyr,pivot_longer)
importFrom(tidyr,pivot_wider)
importFrom(tidyselect,all_of)
importFrom(utils,methods)
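
The exports above keep both the old names (`join_sites`, `plot_site_locations`) and the new ones (`join_locations`, `plot_locations`). A minimal sketch of how an old name can forward to its replacement with a warning follows. It uses base R's `.Deprecated()` rather than the lifecycle package this commit imports, and the `_sketch` function names and the toy list are hypothetical, not the package's code.

```r
# Hypothetical stand-in for the new function: tags the object as joined.
join_locations_sketch <- function(x) {
  x$joined <- "locations"
  x
}

# Old name kept as a thin wrapper that warns, then forwards to the new one.
join_sites_sketch <- function(x) {
  .Deprecated("join_locations_sketch")
  join_locations_sketch(x)
}

# Toy object standing in for an austraits list (assumption for illustration).
toy <- list(traits = data.frame(dataset_id = "d1"))

new_result <- join_locations_sketch(toy)
old_result <- suppressWarnings(join_sites_sketch(toy))

# Both entry points should give the same object.
identical(new_result, old_result)
```

Keeping the old name as a wrapper means existing scripts keep working while users are nudged toward the new name.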
29 changes: 23 additions & 6 deletions NEWS.md
@@ -1,7 +1,24 @@
# austraits 2.1.1

# austraits 1.1.1

* Improved vignette building process
* Improved extract_ functions
* Rebuilt website

- Minor bug fix in `extract_trait`
- Minor bug fix in `as_wide_table`
- `site` & `site_name` renamed to `location` & `location_name` within functions
- `austraits$sites` renamed to `austraits$locations`
- `plot_site_locations` is deprecated, use `plot_locations`
- `join_sites` is deprecated, use `join_locations`
- The following functions have been updated to work with AusTraits > v3.0.2, as well as <= 3.0.2:
  - `join_taxonomy`
  - `join_locations`
  - `join_contexts`
  - `join_methods`
  - `join_all`
  - `trait_pivot_wider`
  - `trait_pivot_longer`
  - `as_wide_table`
  - `extract_taxa`
  - `extract_trait`
  - `extract_dataset`
  - `plot_locations`
  - `plot_trait_distribution_beeswarm`
  - `print.austraits`

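The NEWS entry above says the listed functions work both with AusTraits releases after 3.0.2 and with 3.0.2 or earlier. One way to sketch that dispatch, assuming the release is available as a character string, is shown here. `what_version_sketch()` and `pick_wide_table_impl()` are hypothetical names, and the package's real `what_version()` helper is not shown in this part of the diff, so its logic may differ.

```r
# Classify a release string as "new" (> 3.0.2) or "old" (<= 3.0.2).
# utils::compareVersion() compares version components numerically.
what_version_sketch <- function(version) {
  if (utils::compareVersion(as.character(version), "3.0.2") > 0) "new" else "old"
}

# Pick the implementation name for a release.
# The final unnamed argument is the default: it stops on an unknown label.
pick_wide_table_impl <- function(version) {
  switch(what_version_sketch(version),
    new = "as_wide_table2",
    old = "as_wide_table1",
    stop("Unrecognised version label")
  )
}

pick_wide_table_impl("3.0.2")
pick_wide_table_impl("4.0.0")
```

Routing every exported function through one small classifier keeps the version cut-off in a single place.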
194 changes: 147 additions & 47 deletions R/as_wide_table.R
@@ -2,21 +2,120 @@
#'
#' @param austraits austraits data object
#'
#' @return A single wide table with collapsed contexts and sites text and with
#' some cols renamed for alignment with other reosurces
#' @return A single wide table with collapsed contexts and locations text and with
#' some cols renamed for alignment with other resources
#' @export
#'
#' @examples
#' \dontrun{
#' data <- austraits
#' data %>% as_wide_table()
#' }
#' @importFrom rlang .data
#' @importFrom stats family
#' @importFrom utils methods

as_wide_table <- function(austraits){
# Switch for different versions
version <- what_version(austraits)

switch (version,
'new' = as_wide_table2(austraits),
'old' = as_wide_table1(austraits),
)

}

#' Turning entire AusTraits object into wide table >3.0.2
#' @noRd
#' @keywords internal
as_wide_table2 <- function(austraits){

# Function to collapse columns in locations and contexts into single column
process_table2 <- function(data) {
data %>%
tidyr::pivot_wider(names_from = property, values_from = value) %>%
tidyr::nest(data=-dplyr::any_of(c("dataset_id", "location_id", "latitude (deg)", "longitude (deg)"))) %>%
dplyr::mutate(location = purrr::map_chr(data, collapse_cols)) %>%
dplyr::select(-data)
}

################################################################################
# Define and adapt each table in the list of austraits to prepare for the wide table format

# The contexts table needs the contexts collapsed to one context name per site
austraits %>%
join_contexts(collapse_context = TRUE) -> austraits

# Getting rid of the columns that will soon be deleted in the next austraits release and renaming the description column
austraits$methods <-
austraits$methods %>%
# -----------
# TODO: this section can be removed for next release
# Some studies have multiple records per traits. This breaks things when joining
# For now select first
# dplyr::group_by(dataset_id, trait_name) %>%
# dplyr::slice(1) %>%
# dplyr:: ungroup() %>%
#----------
dplyr::rename(c("dataset_description" = "description"))

# collapse into one column
austraits$locations <-
austraits$locations %>%
dplyr::filter(value!="unknown") %>%
dplyr::rename(c("property" = "location_property")) %>%
split(., .$dataset_id) %>%
purrr::map_dfr(process_table2)

# rename taxonomic_reference field to reflect the APC/APNI name matching process better
austraits$taxa <-
austraits$taxa %>%
dplyr::rename(c("taxonNameValidation" = "taxonomic_reference"))

austraits_wide <-
austraits$traits %>%
dplyr::left_join(by=c("dataset_id", "location_id"), austraits$locations) %>%
dplyr::left_join(by=c("dataset_id", "trait_name"), austraits$methods) %>%
dplyr::left_join(by=c("taxon_name"), austraits$taxa)

# reorder the names to be more intuitive
austraits_wide %>% dplyr::select(

# The most useful (if you are filtering for just one taxon_name)
dataset_id, observation_id, trait_name, taxon_name, value, unit,
entity_type, population_id, individual_id,
value_type, basis_of_value,
replicates,
# tissue, trait_category, # Add after new zenodo release

# More stuff you can filter on
collection_date, basis_of_record, life_stage, sampling_strategy,
treatment_id, temporal_id,

#stuff relating to locations
`latitude (deg)`, `longitude (deg)`, location, plot_id,

#stuff relating to contexts and methods
context, methods, method_id, original_name,

#the citations
dataset_description, source_primary_citation, source_secondary_citation,

#the taxa details
taxonomic_status, taxon_distribution,
taxon_rank, genus, family, #accepted_name_usage_id,
scientific_name_authorship
)

austraits_wide
}

#' Turning entire AusTraits object into wide table <=3.0.2
#' @noRd
#' @keywords internal
as_wide_table1 <- function(austraits){


################################################################################
# TODO: this updated with next zenodo release
# Load the trait classification doc - classifies the tissue type and type of trait based on the trait_name data field
@@ -34,27 +133,17 @@ as_wide_table <- function(austraits){
#
# Function to collapse columns in sites and contexts into single column
process_table <- function(data) {

# worker function called intop worklfow below
# for a df, combine all column names and values
collapse_cols <- function(data) {

if(ncol(data) ==0) return(NA_character_)

data %>% purrr::imap_dfr(~ sprintf("%s='%s'",.y,.x)) %>%
tidyr::unite("text", sep="; ") %>% dplyr::pull(.data$text)
}


data %>%
tidyr::pivot_wider(names_from = .data$property, values_from = .data$value) %>%
tidyr::pivot_wider(names_from = property, values_from = value) %>%
tidyr::nest(data=-dplyr::any_of(c("dataset_id", "site_name", "context_name", "latitude (deg)", "longitude (deg)"))) %>%
dplyr::mutate(site = purrr::map_chr(data, collapse_cols)) %>%
dplyr::select(-data)
}

################################################################################
# Define and adapt each table in the list of austraits to prepare for the wide table format

# the trait table needs little prep. Rename the value columns as value
austraits$traits <-
austraits$traits %>%
@@ -67,31 +156,31 @@ as_wide_table <- function(austraits){
split(austraits$contexts$dataset_id) %>%
purrr::map_dfr(process_table) %>%
dplyr::rename(c("context" = "site"))

# Getting rid of the columns that will soon be deleted in the next austraits release and renaming the description column
austraits$methods <-
austraits$methods %>%
# -----------
# TODO: this section can be removed for next release
# Some studies have multiple records per traits. This breaks things when joining
# For now select first
dplyr::group_by(.data$dataset_id, .data$trait_name) %>%
# TODO: this section can be removed for next release
# Some studies have multiple records per traits. This breaks things when joining
# For now select first
dplyr::group_by(dataset_id, trait_name) %>%
dplyr::slice(1) %>%
dplyr:: ungroup() %>%
#------------
dplyr::select(-.data$year_collected_start, -.data$year_collected_end) %>%
dplyr::select(-year_collected_start, -year_collected_end) %>%
dplyr::rename(c("dataset_description" = "description"))

# collapse into one column
austraits$sites <-
austraits$sites %>%
dplyr::filter(.data$value!="unkown") %>%
dplyr::filter(value!="unknown") %>%
# next line is a fix -- one dataset in 3.0.2 has value "site_name"
dplyr::mutate(site_property = gsub("site_name", "name", .data$site_property)) %>%
dplyr::mutate(site_property = gsub("site_name", "name", site_property)) %>%
dplyr::rename(c("property" = "site_property")) %>%
split(., .$dataset_id) %>%
purrr::map_dfr(process_table)

# rename source data field to reflect the APC/APNI name matching process better
austraits$taxa <-
austraits$taxa %>%
@@ -103,32 +192,43 @@ as_wide_table <- function(austraits){
dplyr::left_join(by=c("dataset_id", "site_name"), austraits$sites) %>%
dplyr::left_join(by=c("dataset_id", "trait_name"), austraits$methods) %>%
dplyr::left_join(by=c("taxon_name"), austraits$taxa) %>%

# reorder the names to be more intuitive
dplyr::select(

# The most useful (if you are filtering for just one taxon_name)
.data$dataset_id, .data$observation_id, .data$trait_name, .data$taxon_name, .data$trait_value, .data$unit,
.data$value_type, .data$replicates,
# tissue, trait_category, # Add after new zenodo release

# More stuff you can filter on
.data$date, .data$collection_type, .data$sample_age_class, .data$sampling_strategy,

#stuff relating to sites
.data$`latitude (deg)`, .data$`longitude (deg)`, .data$site_name, .data$site,

#stuff relating to contexts and methods
.data$context_name, .data$context, .data$methods, .data$original_name,

#the citations
.data$dataset_description, .data$source_primary_citation, .data$source_secondary_citation,

#the taxa details
.data$taxonomicStatus, .data$taxonDistribution,
.data$taxonRank, .data$genus, .data$family, .data$acceptedNameUsageID,
.data$scientificNameAuthorship, .data$ccAttributionIRI
# The most useful (if you are filtering for just one taxon_name)
dataset_id, observation_id, trait_name, taxon_name, trait_value, unit,
value_type, replicates,
# tissue, trait_category, # Add after new zenodo release
# More stuff you can filter on
date, collection_type, sample_age_class, sampling_strategy,
#stuff relating to sites
`latitude (deg)`, `longitude (deg)`, site_name, site,
#stuff relating to contexts and methods
context_name, context, methods, original_name,
#the citations
dataset_description, source_primary_citation, source_secondary_citation,
#the taxa details
taxonomicStatus, taxonDistribution,
taxonRank, genus, family, acceptedNameUsageID,
scientificNameAuthorship, ccAttributionIRI
)

austraits_wide
}

#' Collapse columns into text string
#' @keywords internal
#' @noRd
collapse_cols <- function(data) {

if(ncol(data) ==0) return(NA_character_)

data %>% purrr::imap_dfr(~ sprintf("%s='%s'",.y,.x)) %>%
tidyr::unite("text", sep="; ") %>% dplyr::pull(text)
}
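
To make the output of `collapse_cols` concrete, here is a base-R sketch of the same idea for a one-row data frame: each column becomes `name='value'` and the pieces are joined with `; `. The function name `collapse_cols_sketch` and the example columns are hypothetical, and the package version above uses purrr and tidyr instead.

```r
# Collapse a one-row data frame into a single "name='value'; ..." string.
collapse_cols_sketch <- function(data) {
  # Mirror the guard in the package helper: no columns means no text.
  if (ncol(data) == 0) return(NA_character_)

  # Take the first (only) value of each column as text.
  values <- vapply(data, function(col) as.character(col[[1]]), character(1))

  paste(sprintf("%s='%s'", names(data), values), collapse = "; ")
}

# Hypothetical location properties for one site.
row <- data.frame(soil = "sandy", `elevation (m)` = 120, check.names = FALSE)
collapse_cols_sketch(row)
```

Collapsing the per-location properties into one string is what lets the wide table carry them in a single `location` column.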