Use base pipe (#76)
* Use base pipe.

Closes #75.

I also updated the GitHub Actions check to test all the way back to R 4.1 (for now) to make sure we don't use the pipe in a way that wasn't initially supported.

* Namespace all dplyr function calls.
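
As a rough illustration of the compatibility concern in this commit message, here is a minimal sketch of base pipe usage that stays within what R 4.1 supports. It is not taken from the package, and every variable name in it is made up. The `_` placeholder (R 4.2) and placeholder extraction such as `_$name` (R 4.3) arrived later, so code that has to pass a check on R 4.1 would need to avoid them.

```r
# Base R pipe, restricted to forms available since R 4.1.0.
# All names here are illustrative, not part of gutenbergr.

squares <- c(1, 4, 9)

# The left-hand side is inserted as the first argument of the call.
roots <- squares |> sqrt()

# A namespaced function works as long as it is written as a call.
total <- roots |> base::sum()

# An anonymous function must be wrapped in parentheses and then called.
shifted <- roots |> (\(x) x + 1)()

# Avoided on purpose: `x |> f(y = _)` needs R >= 4.2,
# and `x |> _$name` needs R >= 4.3.
print(total)
print(shifted)
```

Unlike `%>%`, the base pipe is syntax rather than a function, so it needs no import and adds no dependency, which is presumably why `magrittr` can be dropped from `DESCRIPTION` here.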
jonthegeek authored Sep 5, 2024
1 parent 9d71290 commit a39d2e5
Showing 15 changed files with 57 additions and 63 deletions.
2 changes: 2 additions & 0 deletions .github/workflows/R-CMD-check.yaml
@@ -25,6 +25,8 @@ jobs:
           - {os: windows-latest, r: 'release'}
           - {os: ubuntu-latest, r: 'devel', http-user-agent: 'release'}
           - {os: ubuntu-latest, r: 'oldrel-1'}
+          - {os: ubuntu-latest, r: 'oldrel-2'}
+          - {os: ubuntu-latest, r: 'oldrel-3'}

     env:
       GITHUB_PAT: ${{ secrets.GITHUB_TOKEN }}
1 change: 0 additions & 1 deletion DESCRIPTION
@@ -22,7 +22,6 @@ Imports:
     cli,
     dplyr,
     glue,
-    magrittr,
     purrr,
     readr,
     rlang,
4 changes: 0 additions & 4 deletions NAMESPACE
@@ -5,8 +5,4 @@ export(gutenberg_get_all_mirrors)
 export(gutenberg_get_mirror)
 export(gutenberg_strip)
 export(gutenberg_works)
-importFrom(dplyr,count)
-importFrom(dplyr,distinct)
-importFrom(dplyr,filter)
-importFrom(magrittr,"%>%")
 importFrom(rlang,"%||%")
5 changes: 3 additions & 2 deletions NEWS.md
@@ -1,8 +1,9 @@
 # gutenbergr (development version)

-* `gutenberg_download()` tries the `.txt` version of files when the `.zip` is unavailable (@jrdnbradford, #55).
+* `gutenberg_download()` tries the `.txt` version of files when the `.zip` is unavailable (@jrdnbradford, #55, #70).
 * New function `gutenberg_get_all_mirrors()` retrieves all mirror data (@jrdnbradford, #58).
-* The package infrastructure has been updated to make the package more robust and maintainable.
+* The package infrastructure has been updated to make the package more robust and maintainable (#60, #64, #69).
+* We now use the base R pipe (`|>`) in code and examples, not the magrittr pipe (`%>%`) (@jonthegeek, #75).

 # gutenbergr 0.2.4

16 changes: 8 additions & 8 deletions R/data.R
@@ -37,19 +37,19 @@
 #'
 #' gutenberg_metadata
 #'
-#' gutenberg_metadata %>%
+#' gutenberg_metadata |>
 #'   count(author, sort = TRUE)
 #'
 #' # look for Shakespeare, excluding collections (containing "Works") and
 #' # translations
-#' shakespeare_metadata <- gutenberg_metadata %>%
+#' shakespeare_metadata <- gutenberg_metadata |>
 #'   filter(
 #'     author == "Shakespeare, William",
 #'     language == "en",
 #'     !str_detect(title, "Works"),
 #'     has_text,
 #'     !str_detect(rights, "Copyright")
-#'   ) %>%
+#'   ) |>
 #'   distinct(title)
 #'
 #' \donttest{
@@ -101,17 +101,17 @@
 #' library(dplyr)
 #' library(stringr)
 #'
-#' gutenberg_subjects %>%
-#'   filter(subject_type == "lcsh") %>%
+#' gutenberg_subjects |>
+#'   filter(subject_type == "lcsh") |>
 #'   count(subject, sort = TRUE)
 #'
-#' sherlock_holmes_subjects <- gutenberg_subjects %>%
+#' sherlock_holmes_subjects <- gutenberg_subjects |>
 #'   filter(str_detect(subject, "Holmes, Sherlock"))
 #'
 #' sherlock_holmes_subjects
 #'
-#' sherlock_holmes_metadata <- gutenberg_works() %>%
-#'   filter(author == "Doyle, Arthur Conan") %>%
+#' sherlock_holmes_metadata <- gutenberg_works() |>
+#'   filter(author == "Doyle, Arthur Conan") |>
 #'   semi_join(sherlock_holmes_subjects, by = "gutenberg_id")
 #'
 #' sherlock_holmes_metadata
2 changes: 1 addition & 1 deletion R/gutenberg_download.R
@@ -35,7 +35,7 @@
 #' dplyr::count(books, title)
 #'
 #' # download all books from Jane Austen
-#' austen <- gutenberg_works(author == "Austen, Jane") %>%
+#' austen <- gutenberg_works(author == "Austen, Jane") |>
 #'   gutenberg_download(meta_fields = "title")
 #' austen
 #' dplyr::count(austen, title)
34 changes: 17 additions & 17 deletions R/gutenberg_works.R
@@ -50,16 +50,16 @@
 #'
 #' # language specifications
 #'
-#' gutenberg_works(languages = "es") %>%
+#' gutenberg_works(languages = "es") |>
 #'   count(language, sort = TRUE)
 #'
-#' gutenberg_works(languages = c("en", "es")) %>%
+#' gutenberg_works(languages = c("en", "es")) |>
 #'   count(language, sort = TRUE)
 #'
-#' gutenberg_works(languages = c("en", "es"), all_languages = TRUE) %>%
+#' gutenberg_works(languages = c("en", "es"), all_languages = TRUE) |>
 #'   count(language, sort = TRUE)
 #'
-#' gutenberg_works(languages = c("en", "es"), only_languages = FALSE) %>%
+#' gutenberg_works(languages = c("en", "es"), only_languages = FALSE) |>
 #'   count(language, sort = TRUE)
 #' }
 #' @export
@@ -82,37 +82,37 @@ gutenberg_works <- function(..., languages = "en",
       )
     }
   )
-  ret <- filter(gutenberg_metadata, ...)
+  ret <- dplyr::filter(gutenberg_metadata, ...)

   if (!is.null(languages)) {
-    lang_filt <- gutenberg_languages %>%
-      filter(language %in% languages) %>%
-      count(gutenberg_id, total_languages)
+    lang_filt <- gutenberg_languages |>
+      dplyr::filter(language %in% languages) |>
+      dplyr::count(gutenberg_id, total_languages)

     if (all_languages) {
-      lang_filt <- lang_filt %>%
-        filter(n >= length(languages))
+      lang_filt <- lang_filt |>
+        dplyr::filter(n >= length(languages))
     }
     if (only_languages) {
-      lang_filt <- lang_filt %>%
-        filter(total_languages <= n)
+      lang_filt <- lang_filt |>
+        dplyr::filter(total_languages <= n)
     }

-    ret <- ret %>%
-      filter(gutenberg_id %in% lang_filt$gutenberg_id)
+    ret <- ret |>
+      dplyr::filter(gutenberg_id %in% lang_filt$gutenberg_id)
   }

   if (!is.null(rights)) {
     .rights <- rights
-    ret <- filter(ret, rights %in% .rights)
+    ret <- dplyr::filter(ret, rights %in% .rights)
   }

   if (only_text) {
-    ret <- filter(ret, has_text)
+    ret <- dplyr::filter(ret, has_text)
   }

   if (distinct) {
-    ret <- distinct(ret, title, gutenberg_author_id, .keep_all = TRUE)
+    ret <- dplyr::distinct(ret, title, gutenberg_author_id, .keep_all = TRUE)
     # in older versions of dplyr, distinct_ didn't need .keep_all
     if (any(colnames(ret) == ".keep_all")) {
       ret$.keep_all <- NULL # nocov
4 changes: 0 additions & 4 deletions R/gutenbergr-package.R
@@ -2,10 +2,6 @@
 "_PACKAGE"

 ## usethis namespace: start
-#' @importFrom dplyr count
-#' @importFrom dplyr distinct
-#' @importFrom dplyr filter
-#' @importFrom magrittr %>%
 #' @importFrom rlang %||%
 ## usethis namespace: end
 NULL
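
With the `importFrom(dplyr, ...)` directives gone, bare `filter()`, `count()`, and `distinct()` no longer resolve to dplyr inside the package, which is presumably why the calls in `R/gutenberg_works.R` are now written as `dplyr::filter()` and so on. The sketch below is illustrative only: it assumes a session with just the default packages attached, and the `toy` data frame is hypothetical.

```r
# Illustrative only; `toy` is a made-up data frame, not package data.

# With only the default packages attached, the bare name `filter`
# is found in stats (a time-series filter), not in dplyr.
bare_filter_home <- environmentName(environment(filter))
print(bare_filter_home)

# An explicit namespace prefix removes the ambiguity.
# Guarded so the sketch still runs where dplyr is not installed.
if (requireNamespace("dplyr", quietly = TRUE)) {
  toy <- data.frame(gutenberg_id = 1:3, has_text = c(TRUE, FALSE, TRUE))
  kept <- toy |> dplyr::filter(has_text)
  print(nrow(kept))
}
```

Prefixing every call is more verbose than importing, but it keeps `NAMESPACE` small and makes the origin of each function visible at the call site.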
6 changes: 3 additions & 3 deletions README.Rmd
@@ -66,7 +66,7 @@ Suppose we wanted to download Emily Bronte's "Wuthering Heights." We could find
 library(dplyr)
 library(gutenbergr)

-gutenberg_works() %>%
+gutenberg_works() |>
   filter(title == "Wuthering Heights")

 # or just:
@@ -87,14 +87,14 @@ wuthering_heights
 books <- gutenberg_download(c(768, 1260), meta_fields = "title")
 books

-books %>%
+books |>
   count(title)
 ```

 It can also take the output of `gutenberg_works` directly. For example, we could get the text of all Aristotle's works, each annotated with both `gutenberg_id` and `title`, using:

 ```{r}
-aristotle_books <- gutenberg_works(author == "Aristotle") %>%
+aristotle_books <- gutenberg_works(author == "Aristotle") |>
   gutenberg_download(meta_fields = "title")

 aristotle_books
6 changes: 3 additions & 3 deletions README.md
@@ -71,7 +71,7 @@ could find the book’s ID by filtering:
 library(dplyr)
 library(gutenbergr)

-gutenberg_works() %>%
+gutenberg_works() |>
   filter(title == "Wuthering Heights")
 #> # A tibble: 1 × 8
 #> gutenberg_id title author gutenberg_author_id language
@@ -137,7 +137,7 @@ books
 #> 10 768 "" Wuthering Heights
 #> # ℹ 33,333 more rows

-books %>%
+books |>
   count(title)
 #> # A tibble: 2 × 2
 #> title n
@@ -151,7 +151,7 @@ we could get the text of all Aristotle’s works, each annotated with both
 `gutenberg_id` and `title`, using:

 ``` r
-aristotle_books <- gutenberg_works(author == "Aristotle") %>%
+aristotle_books <- gutenberg_works(author == "Aristotle") |>
   gutenberg_download(meta_fields = "title")

 aristotle_books
2 changes: 1 addition & 1 deletion man/gutenberg_download.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

6 changes: 3 additions & 3 deletions man/gutenberg_metadata.Rd

10 changes: 5 additions & 5 deletions man/gutenberg_subjects.Rd

8 changes: 4 additions & 4 deletions man/gutenberg_works.Rd

14 changes: 7 additions & 7 deletions vignettes/intro.Rmd
@@ -46,7 +46,7 @@ gutenberg_metadata
 For example, you could find the Gutenberg ID(s) of Jane Austen's _Persuasion_ by doing:

 ```{r filter}
-gutenberg_metadata %>%
+gutenberg_metadata |>
   filter(title == "Persuasion")
 ```

@@ -107,7 +107,7 @@ books
 Notice that the `meta_fields` argument allows us to add one or more additional fields from the `gutenberg_metadata` to the downloaded text, such as title or author.

 ```{r count books}
-books %>%
+books |>
   count(title)
 ```

@@ -122,10 +122,10 @@ gutenberg_subjects
 This is useful for extracting texts from a particular topic or genre, such as detective stories, or a particular character, such as Sherlock Holmes. The `gutenberg_id` column can then be used to download these texts or to link with other metadata.

 ```{r filter subjects}
-gutenberg_subjects %>%
+gutenberg_subjects |>
   filter(subject == "Detective and mystery stories")

-gutenberg_subjects %>%
+gutenberg_subjects |>
   filter(grepl("Holmes, Sherlock", subject))
 ```

@@ -140,13 +140,13 @@ gutenberg_authors
 What's next after retrieving a book's text? Well, having the book as a data frame is especially useful for working with the [tidytext](https://github.com/juliasilge/tidytext) package for text analysis.

 ```{r tidytext}
-words <- books %>%
+words <- books |>
   unnest_tokens(word, text)

 words

-word_counts <- words %>%
-  anti_join(stop_words, by = "word") %>%
+word_counts <- words |>
+  anti_join(stop_words, by = "word") |>
   count(title, word, sort = TRUE)

 word_counts