From d8ef12c44c8bb9b9641e0e73e2606256b04fcbd5 Mon Sep 17 00:00:00 2001 From: rociojoo Date: Thu, 16 May 2019 12:52:35 -0400 Subject: [PATCH] README edits on format --- README.Rmd | 77 +++++++++++++++++++++++++++++------------------------- README.md | 26 +++++++++++------- 2 files changed, 58 insertions(+), 45 deletions(-) diff --git a/README.Rmd b/README.Rmd index e61f1e2..a150f5c 100644 --- a/README.Rmd +++ b/README.Rmd @@ -1,7 +1,7 @@ --- title: "Navigating through the R packages for movement: Supporting information" -author: "Rocio Joo, Matthew E. Boone, Thomas A. Clay, Samantha C. Patrick, Susana Clusella-Trullas, and Mathieu Basille" -date: "May 15, 2019" +author: "Rocio Joo, Matthew E. Boone, Thomas A. Clay, Samantha C. Patrick, Susana Clusella-Trullas, and Mathieu Basille." +date: "May 16, 2019" output: github_document: toc: true @@ -52,21 +52,24 @@ pkg_info <- read.csv(paste0(data_dir, "pkg-info.csv"), stringsAsFactors = FALSE) ``` +## Overview + This repository is a companion to the manuscript "*Navigating through the R packages for movement: a review for users and developers*", from Rocio Joo, Matthew E. Boone, Thomas A. Clay, Samantha C. Patrick, Susana Clusella-Trullas, and Mathieu Basille (pre-print available on [arXiv.org](https://arxiv.org/abs/1901.05935)). This document is -actually a dynamic R report, which RMarkdown sources are available +actually a dynamic R report, for which RMarkdown sources are available [here](README.Rmd) with full code. The repository also serves to store data about: 1. Information for [74 R packages](data/pkg-info.csv) related to tracking data processing and analysis. Information was collected - between March and August 2018. 59 of the packages were described in - the review, and 72 of those packages were the focus of a survey on - their users about their use, relevant and quality of their - documentation. The information was collected between March and + between March and August 2018. 
**59** of the packages were described in + the review, and **72** of those packages were the focus of a survey on + their users about their use, relevance and quality of their + documentation (see [packages included in the survey](#packages-included-in-the-survey) + for more details). The information was collected between March and August 2018. Additional details about this data file are available [here](data/README_pkg-info.md). 2. [Responses to an anonymous survey](data/survey-responses.csv) about @@ -76,6 +79,7 @@ data about: [here](data/README_survey-responses.md). + ## A large amount of R packages for movement The manuscript presents a review of R packages for movement. R is one @@ -126,14 +130,16 @@ ggplot(theTable, aes(x = Year, y = Total)) + ``` -Even worse, many packages are actually not connected to each others, -showing a very fragmented landscape of tracking packages in R. Here we -show a network representation of the dependency and suggestion between -tracking packages (this is **Figure 4** of the manuscript). The arrows -go towards the package the others suggest (dashed arrows) or depend on -(solid arrows). Bold font corresponds to active packages. The size of -the circle is proportional to the number of packages that suggest or -depend on this one. +Since the packages were reviewed between March and August 2018, this +last year was incomplete and not included in the graph. + +Many packages are actually not connected to each others, showing a very +fragmented landscape of tracking packages in R. Here we show a network +representation of the dependency and suggestion between tracking packages +(this is **Figure 4** of the manuscript). The arrows go towards the package +the others suggest (dashed arrows) or depend on (solid arrows). Bold font +corresponds to active packages. The size of the circle is proportional to +the number of packages that suggest or depend on this one. 
```{r ms-fig-4, fig.width = 12, fig.height = 12} # loading the import + suggest information for each package @@ -317,7 +323,8 @@ of the participants and no probabilistic sampling was involved. The survey was advertised by Twitter, mailing lists (r-sig-geo and r-sig-ecology), individual emails to researchers and the [lab's website](https://mablab.org/post/2018-08-31-r-movement-review/). -The survey got exemption from the Institutional Review Board aqt University of Florida (IRB02 Office, Box 112250, University of Florida, Gainesville, FL 32611-2250). +The survey got exemption from the Institutional Review Board at University of Florida +(IRB02 Office, Box 112250, University of Florida, Gainesville, FL 32611-2250). A total of `r data %>% filter(!is.na(completion)) %>% nrow()` people participated in the survey, and `r data_all %>% nrow()` answered all @@ -425,8 +432,8 @@ ggplot(data = use_counts, aes(x = levels, y = total)) + ``` Most participants considered themselves in an intermediate level -(`r prop[2]`), meaning that they could write functions in R. Some others -were beginners (`r prop[1]`) and advanced (`r prop[3]`) R users. +(`r prop[2]`%), meaning that they could write functions in R. Some others +were beginners (`r prop[1]`%) and advanced (`r prop[3]`%) R users. ### Package use @@ -585,8 +592,8 @@ use_per$counts <- apply(df2[, c("Good", "Excellent")], 1, sum) Remember that participants could only give their opinion on documentation regarding the packages they had used. Hence, the -packages with many users got many documentation answers (Fig. 1 and -4). Figure 5 allows for a closer look at the proportion of type of +packages with many users got many documentation answers. The figure +above allows for a closer look at the proportion of type of response for each package. 
To identify some packages with remarkably good documentation, let's @@ -594,31 +601,31 @@ first only consider those packages with at least 10 responses on the quality of documentation (regardless of the "Don't remember"). These are 27 (you can see the table of responses below). Among them, `momentuHMM` had more than 50% of the responses -(`r round(use_per["momentuHMM","Excellent"],2)`; +(`r round(use_per["momentuHMM","Excellent"],2)`%; `r use_counts["momentuHMM","Excellent"]`) as "excellent documentation", meaning that the documentation was so good that thanks to it, more than half of its users discovered additional features of the package and were able to do more analyses than what they initially planned. Moreover, 11 packages had more than 75% of the responses as either "good" or "excellent": -`momentuHMM` (`r use_per["momentuHMM","good_excellent"]`; +`momentuHMM` (`r use_per["momentuHMM","good_excellent"]`%; `r use_per["momentuHMM","counts"]`), -`moveHMM` (`r use_per["moveHMM","good_excellent"]`; +`moveHMM` (`r use_per["moveHMM","good_excellent"]`%; `r use_per["moveHMM","counts"]`), -`adehabitatLT` (`r use_per["adehabitatLT","good_excellent"]`; +`adehabitatLT` (`r use_per["adehabitatLT","good_excellent"]`%; `r use_per["adehabitatLT","counts"]`), -`adehabitatHS` (`r use_per["adehabitatHS","good_excellent"]`; +`adehabitatHS` (`r use_per["adehabitatHS","good_excellent"]`%; `r use_per["adehabitatHS","counts"]`), -`adehabitatHR` (`r use_per["adehabitatHR","good_excellent"]`; +`adehabitatHR` (`r use_per["adehabitatHR","good_excellent"]`%; `r use_per["adehabitatHR","counts"]`), -`EMbC` (`r use_per["EMbC","good_excellent"]`; `r use_per["EMbC","counts"]`), -`wildlifeDI` (`r use_per["wildlifeDI","good_excellent"]`; +`EMbC` (`r use_per["EMbC","good_excellent"]`%; `r use_per["EMbC","counts"]`), +`wildlifeDI` (`r use_per["wildlifeDI","good_excellent"]`%; `r use_per["wildlifeDI","counts"]`), -`ctmm` (`r use_per["ctmm","good_excellent"]`; `r use_per["ctmm","counts"]`), 
-`GeoLight` (`r use_per["GeoLight","good_excellent"]`; +`ctmm` (`r use_per["ctmm","good_excellent"]`%; `r use_per["ctmm","counts"]`), +`GeoLight` (`r use_per["GeoLight","good_excellent"]`%; `r use_per["GeoLight","counts"]`), -`move` (`r use_per["move","good_excellent"]`; `r use_per["move","counts"]`), -`recurse` (`r use_per["recurse","good_excellent"]`; +`move` (`r use_per["move","good_excellent"]`%; `r use_per["move","counts"]`), +`recurse` (`r use_per["recurse","good_excellent"]`%; `r use_per["recurse","counts"]`). The two leading packages, `momentuHMM` and `moveHMM`, focus on the use of Hidden Markov models which allow identifying different patterns of behavior called states. @@ -773,10 +780,10 @@ that were highly relevant for their users, considering only those packages with at least 10 responses. Among these 33 packages, three were regarded as either "Important" or "Essential" for more than 75% of their users: -`bsam` (`r use_per['bsam','good_excellent']`; `r use_per['bsam','counts']`), -`adehabitatHR` (`r use_per['adehabitatHR','good_excellent']`; +`bsam` (`r use_per['bsam','good_excellent']`%; `r use_per['bsam','counts']`), +`adehabitatHR` (`r use_per['adehabitatHR','good_excellent']`%; `r use_per['adehabitatHR','counts']`), and -`adehabitatLT` (`r use_per['adehabitatLT','good_excellent']`; +`adehabitatLT` (`r use_per['adehabitatLT','good_excellent']`%; `r use_per['adehabitatLT','counts']`). `bsam` allows fitting Bayesian state-space models to animal tracking data. diff --git a/README.md b/README.md index dd775d4..d71b701 100644 --- a/README.md +++ b/README.md @@ -1,8 +1,9 @@ Navigating through the R packages for movement: Supporting information ================ -Rocio Joo, Matthew E. Boone, Thomas A. Clay, Samantha C. Patrick, Susana Clusella-Trullas, and Mathieu Basille -May 15, 2019 +Rocio Joo, Matthew E. Boone, Thomas A. Clay, Samantha C. Patrick, Susana Clusella-Trullas, and Mathieu Basille. 
+May 16, 2019 +- [Overview](#overview) - [A large amount of R packages for movement](#a-large-amount-of-r-packages-for-movement) - [The survey](#the-survey) - [Packages included in the survey](#packages-included-in-the-survey) @@ -15,9 +16,12 @@ May 15, 2019 - [Package Relevance](#package-relevance) - [Summary](#summary) -This repository is a companion to the manuscript "*Navigating through the R packages for movement: a review for users and developers*", from Rocio Joo, Matthew E. Boone, Thomas A. Clay, Samantha C. Patrick, Susana Clusella-Trullas, and Mathieu Basille (pre-print available on [arXiv.org](https://arxiv.org/abs/1901.05935)). This document is actually a dynamic R report, which RMarkdown sources are available [here](README.Rmd) with full code. The repository also serves to store data about: +Overview +-------- -1. Information for [74 R packages](data/pkg-info.csv) related to tracking data processing and analysis. Information was collected between March and August 2018. 59 of the packages were described in the review, and 72 of those packages were the focus of a survey on their users about their use, relevant and quality of their documentation. The information was collected between March and August 2018. Additional details about this data file are available [here](data/README_pkg-info.md). +This repository is a companion to the manuscript "*Navigating through the R packages for movement: a review for users and developers*", from Rocio Joo, Matthew E. Boone, Thomas A. Clay, Samantha C. Patrick, Susana Clusella-Trullas, and Mathieu Basille (pre-print available on [arXiv.org](https://arxiv.org/abs/1901.05935)). This document is actually a dynamic R report, for which RMarkdown sources are available [here](README.Rmd) with full code. The repository also serves to store data about: + +1. Information for [74 R packages](data/pkg-info.csv) related to tracking data processing and analysis. Information was collected between March and August 2018. 
**59** of the packages were described in the review, and **72** of those packages were the focus of a survey on their users about their use, relevance and quality of their documentation (see [packages included in the survey](#packages-included-in-the-survey) for more details). The information was collected between March and August 2018. Additional details about this data file are available [here](data/README_pkg-info.md). 2. [Responses to an anonymous survey](data/survey-responses.csv) about the use, relevance and quality of the documentation of 72 packages related to movement. The survey was executed in the Fall of 2018. Additional details about this data file are available [here](data/README_survey-responses.md). A large amount of R packages for movement @@ -27,7 +31,9 @@ The manuscript presents a review of R packages for movement. R is one of the mos ![](figures/ms-fig-2-1.png) -Even worse, many packages are actually not connected to each others, showing a very fragmented landscape of tracking packages in R. Here we show a network representation of the dependency and suggestion between tracking packages (this is **Figure 4** of the manuscript). The arrows go towards the package the others suggest (dashed arrows) or depend on (solid arrows). Bold font corresponds to active packages. The size of the circle is proportional to the number of packages that suggest or depend on this one. +Since the packages were reviewed between March and August 2018, this last year was incomplete and not included in the graph. + +Many packages are actually not connected to each others, showing a very fragmented landscape of tracking packages in R. Here we show a network representation of the dependency and suggestion between tracking packages (this is **Figure 4** of the manuscript). The arrows go towards the package the others suggest (dashed arrows) or depend on (solid arrows). Bold font corresponds to active packages. 
The size of the circle is proportional to the number of packages that suggest or depend on this one. ![](figures/ms-fig-4-1.png) @@ -54,7 +60,7 @@ A total of 72 packages were included in this survey: `acc`, `accelerometry`, `ad The survey was designed to be completely anonymous, meaning that we had no way to know who participated. There was no previous selection of the participants and no probabilistic sampling was involved. The survey was advertised by Twitter, mailing lists (r-sig-geo and r-sig-ecology), individual emails to researchers and the [lab's website](https://mablab.org/post/2018-08-31-r-movement-review/). -The survey got exemption from the Institutional Review Board aqt University of Florida (IRB02 Office, Box 112250, University of Florida, Gainesville, FL 32611-2250). +The survey got exemption from the Institutional Review Board at University of Florida (IRB02 Office, Box 112250, University of Florida, Gainesville, FL 32611-2250). A total of 446 people participated in the survey, and 225 answered all four questions. To answer all questions the participant had to have tried at least one of the packages. In the following sections, we analyze only completed surveys. @@ -83,7 +89,7 @@ Let's see first the level of use in R of the participants. The options were: ![Level of R use](figures/user-experience-1.png) -Most participants considered themselves in an intermediate level (60.9), meaning that they could write functions in R. Some others were beginners (18.7) and advanced (20.4) R users. +Most participants considered themselves in an intermediate level (60.9%), meaning that they could write functions in R. Some others were beginners (18.7%) and advanced (20.4%) R users. 

 ### Package use

@@ -187,9 +193,9 @@ In this survey we asked the participants how helpful was the documentation provi

 ![Bar plots of absolute frequency of each category of package documentation](figures/documentation-1.png)

-Remember that participants could only give their opinion on documentation regarding the packages they had used. Hence, the packages with many users got many documentation answers (Fig. 1 and 4). Figure 5 allows for a closer look at the proportion of type of response for each package.
+Remember that participants could only give their opinion on documentation regarding the packages they had used. Hence, the packages with many users got many documentation answers. The figure above allows for a closer look at the proportion of type of response for each package.

-To identify some packages with remarkably good documentation, let's first only consider those packages with at least 10 responses on the quality of documentation (regardless of the "Don't remember"). These are 27 (you can see the table of responses below). Among them, `momentuHMM` had more than 50% of the responses (59.38; 19) as "excellent documentation", meaning that the documentation was so good that thanks to it, more than half of its users discovered additional features of the package and were able to do more analyses than what they initially planned. Moreover, 11 packages had more than 75% of the responses as either "good" or "excellent": `momentuHMM` (93.75; 30), `moveHMM` (89.47; 51), `adehabitatLT` (88.57; 124), `adehabitatHS` (86.14; 87), `adehabitatHR` (83.23; 139), `EMbC` (81.82; 18), `wildlifeDI` (81.25; 13), `ctmm` (80; 32), `GeoLight` (77.78; 21), `move` (76.56; 49), `recurse` (76.47; 13). The two leading packages, `momentuHMM` and `moveHMM`, focus on the use of Hidden Markov models which allow identifying different patterns of behavior called states.
+To identify some packages with remarkably good documentation, let's first only consider those packages with at least 10 responses on the quality of documentation (regardless of the "Don't remember"). These are 27 (you can see the table of responses below). Among them, `momentuHMM` had more than 50% of the responses (59.38%; 19) as "excellent documentation", meaning that the documentation was so good that thanks to it, more than half of its users discovered additional features of the package and were able to do more analyses than what they initially planned. Moreover, 11 packages had more than 75% of the responses as either "good" or "excellent": `momentuHMM` (93.75%; 30), `moveHMM` (89.47%; 51), `adehabitatLT` (88.57%; 124), `adehabitatHS` (86.14%; 87), `adehabitatHR` (83.23%; 139), `EMbC` (81.82%; 18), `wildlifeDI` (81.25%; 13), `ctmm` (80%; 32), `GeoLight` (77.78%; 21), `move` (76.56%; 49), `recurse` (76.47%; 13). The two leading packages, `momentuHMM` and `moveHMM`, focus on the use of Hidden Markov models which allow identifying different patterns of behavior called states.

 One way to visualize the quality of documentation is to relate the rating to the number of respondents who declared using each package (this is **Figure 3** of the manuscript). This figure shows the proportion of good and excellent documentation for packages with at least 10 respondents; light green corresponds to packages with standard documentation only, blue is for packages with vignettes, and purple is for packages that also have peer-reviewed articles published:

@@ -277,7 +283,7 @@ Participants were asked how relevant was each of the packages they use for their

 ![Bar plots of absolute frequency of each category of package relevance](figures/importance-1.png)

-The two barplots show the absolute and relative frequency of the answers for each package, respectively. We identified the packages that were highly relevant for their users, considering only those packages with at least 10 responses. Among these 33 packages, three were regarded as either "Important" or "Essential" for more than 75% of their users: `bsam` (81.48; 22), `adehabitatHR` (81.08; 150), and `adehabitatLT` (75.16; 118). `bsam` allows fitting Bayesian state-space models to animal tracking data.
+The two barplots show the absolute and relative frequency of the answers for each package, respectively. We identified the packages that were highly relevant for their users, considering only those packages with at least 10 responses. Among these 33 packages, three were regarded as either "Important" or "Essential" for more than 75% of their users: `bsam` (81.48%; 22), `adehabitatHR` (81.08%; 150), and `adehabitatLT` (75.16%; 118). `bsam` allows fitting Bayesian state-space models to animal tracking data.

 ![Bar plots of relative frequency of each category of package relevance (for packages with more than 5 users)](figures/importance-percentage-1.png)