Use base pipe #76

Merged
merged 2 commits into from
Sep 5, 2024
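
For readers unfamiliar with the native pipe, here is a minimal sketch of the kind of rewrite applied throughout this PR. It uses only base R functions as stand-ins, so none of the names below come from gutenbergr or dplyr. The native pipe is available from R 4.1.0 onward.

```r
# Illustrative only: base R functions stand in for the dplyr verbs used in the package.
values <- c(1, 4, 9)

# Nested call form.
nested <- sum(sqrt(values))

# Native pipe form: the left-hand side becomes the first argument of the
# call on the right-hand side.
piped <- values |>
  sqrt() |>
  sum()

# Unlike magrittr, the right-hand side must be a call, so the parentheses are
# required: `values |> sqrt` is a syntax error.
# An anonymous function has to be wrapped in parentheses and then called.
shifted <- values |> (\(x) x + 1)()

stopifnot(identical(nested, piped))
stopifnot(identical(shifted, c(2, 5, 10)))
```

Because `|>` is part of the language itself, code written this way does not need magrittr at all.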
2 changes: 2 additions & 0 deletions .github/workflows/R-CMD-check.yaml
@@ -25,6 +25,8 @@ jobs:
 - {os: windows-latest, r: 'release'}
 - {os: ubuntu-latest, r: 'devel', http-user-agent: 'release'}
 - {os: ubuntu-latest, r: 'oldrel-1'}
+- {os: ubuntu-latest, r: 'oldrel-2'}
+- {os: ubuntu-latest, r: 'oldrel-3'}

 env:
 GITHUB_PAT: ${{ secrets.GITHUB_TOKEN }}
1 change: 0 additions & 1 deletion DESCRIPTION
@@ -22,7 +22,6 @@ Imports:
 cli,
 dplyr,
 glue,
-magrittr,
 purrr,
 readr,
 rlang,
4 changes: 0 additions & 4 deletions NAMESPACE
@@ -5,8 +5,4 @@ export(gutenberg_get_all_mirrors)
 export(gutenberg_get_mirror)
 export(gutenberg_strip)
 export(gutenberg_works)
-importFrom(dplyr,count)
-importFrom(dplyr,distinct)
-importFrom(dplyr,filter)
-importFrom(magrittr,"%>%")
 importFrom(rlang,"%||%")
5 changes: 3 additions & 2 deletions NEWS.md
@@ -1,8 +1,9 @@
 # gutenbergr (development version)

-* `gutenberg_download()` tries the `.txt` version of files when the `.zip` is unavailable (@jrdnbradford, #55).
+* `gutenberg_download()` tries the `.txt` version of files when the `.zip` is unavailable (@jrdnbradford, #55, #70).
 * New function `gutenberg_get_all_mirrors()` retrieves all mirror data (@jrdnbradford, #58).
-* The package infrastructure has been updated to make the package more robust and maintainable.
+* The package infrastructure has been updated to make the package more robust and maintainable (#60, #64, #69).
+* We now use the base R pipe (`|>`) in code and examples, not the magrittr pipe (`%>%`) (@jonthegeek, #75).

 # gutenbergr 0.2.4

16 changes: 8 additions & 8 deletions R/data.R
@@ -37,19 +37,19 @@
 #'
 #' gutenberg_metadata
 #'
-#' gutenberg_metadata %>%
+#' gutenberg_metadata |>
 #' count(author, sort = TRUE)
 #'
 #' # look for Shakespeare, excluding collections (containing "Works") and
 #' # translations
-#' shakespeare_metadata <- gutenberg_metadata %>%
+#' shakespeare_metadata <- gutenberg_metadata |>
 #' filter(
 #' author == "Shakespeare, William",
 #' language == "en",
 #' !str_detect(title, "Works"),
 #' has_text,
 #' !str_detect(rights, "Copyright")
-#' ) %>%
+#' ) |>
 #' distinct(title)
 #'
 #' \donttest{
@@ -101,17 +101,17 @@
 #' library(dplyr)
 #' library(stringr)
 #'
-#' gutenberg_subjects %>%
-#' filter(subject_type == "lcsh") %>%
+#' gutenberg_subjects |>
+#' filter(subject_type == "lcsh") |>
 #' count(subject, sort = TRUE)
 #'
-#' sherlock_holmes_subjects <- gutenberg_subjects %>%
+#' sherlock_holmes_subjects <- gutenberg_subjects |>
 #' filter(str_detect(subject, "Holmes, Sherlock"))
 #'
 #' sherlock_holmes_subjects
 #'
-#' sherlock_holmes_metadata <- gutenberg_works() %>%
-#' filter(author == "Doyle, Arthur Conan") %>%
+#' sherlock_holmes_metadata <- gutenberg_works() |>
+#' filter(author == "Doyle, Arthur Conan") |>
 #' semi_join(sherlock_holmes_subjects, by = "gutenberg_id")
 #'
 #' sherlock_holmes_metadata
2 changes: 1 addition & 1 deletion R/gutenberg_download.R
@@ -35,7 +35,7 @@
 #' dplyr::count(books, title)
 #'
 #' # download all books from Jane Austen
-#' austen <- gutenberg_works(author == "Austen, Jane") %>%
+#' austen <- gutenberg_works(author == "Austen, Jane") |>
 #' gutenberg_download(meta_fields = "title")
 #' austen
 #' dplyr::count(austen, title)
34 changes: 17 additions & 17 deletions R/gutenberg_works.R
@@ -50,16 +50,16 @@
 #'
 #' # language specifications
 #'
-#' gutenberg_works(languages = "es") %>%
+#' gutenberg_works(languages = "es") |>
 #' count(language, sort = TRUE)
 #'
-#' gutenberg_works(languages = c("en", "es")) %>%
+#' gutenberg_works(languages = c("en", "es")) |>
 #' count(language, sort = TRUE)
 #'
-#' gutenberg_works(languages = c("en", "es"), all_languages = TRUE) %>%
+#' gutenberg_works(languages = c("en", "es"), all_languages = TRUE) |>
 #' count(language, sort = TRUE)
 #'
-#' gutenberg_works(languages = c("en", "es"), only_languages = FALSE) %>%
+#' gutenberg_works(languages = c("en", "es"), only_languages = FALSE) |>
 #' count(language, sort = TRUE)
 #' }
 #' @export
@@ -82,37 +82,37 @@ gutenberg_works <- function(..., languages = "en",
 )
 }
 )
-ret <- filter(gutenberg_metadata, ...)
+ret <- dplyr::filter(gutenberg_metadata, ...)

 if (!is.null(languages)) {
-lang_filt <- gutenberg_languages %>%
-filter(language %in% languages) %>%
-count(gutenberg_id, total_languages)
+lang_filt <- gutenberg_languages |>
+dplyr::filter(language %in% languages) |>
+dplyr::count(gutenberg_id, total_languages)

 if (all_languages) {
-lang_filt <- lang_filt %>%
-filter(n >= length(languages))
+lang_filt <- lang_filt |>
+dplyr::filter(n >= length(languages))
 }
 if (only_languages) {
-lang_filt <- lang_filt %>%
-filter(total_languages <= n)
+lang_filt <- lang_filt |>
+dplyr::filter(total_languages <= n)
 }

-ret <- ret %>%
-filter(gutenberg_id %in% lang_filt$gutenberg_id)
+ret <- ret |>
+dplyr::filter(gutenberg_id %in% lang_filt$gutenberg_id)
 }

 if (!is.null(rights)) {
 .rights <- rights
-ret <- filter(ret, rights %in% .rights)
+ret <- dplyr::filter(ret, rights %in% .rights)
 }

 if (only_text) {
-ret <- filter(ret, has_text)
+ret <- dplyr::filter(ret, has_text)
 }

 if (distinct) {
-ret <- distinct(ret, title, gutenberg_author_id, .keep_all = TRUE)
+ret <- dplyr::distinct(ret, title, gutenberg_author_id, .keep_all = TRUE)
 # in older versions of dplyr, distinct_ didn't need .keep_all
 if (any(colnames(ret) == ".keep_all")) {
 ret$.keep_all <- NULL # nocov
4 changes: 0 additions & 4 deletions R/gutenbergr-package.R
@@ -2,10 +2,6 @@
 "_PACKAGE"

 ## usethis namespace: start
-#' @importFrom dplyr count
-#' @importFrom dplyr distinct
-#' @importFrom dplyr filter
-#' @importFrom magrittr %>%
 #' @importFrom rlang %||%
 ## usethis namespace: end
 NULL
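
With these `@importFrom` tags gone, unqualified `filter()`, `count()` and `distinct()` are no longer resolved through the package's imports. That is why the calls in `R/gutenberg_works.R` are now written as `dplyr::filter()` and so on. The native pipe accepts namespace-qualified calls on its right-hand side, as this small sketch shows. It uses only packages that ship with R, so `utils::head()` and `base::rev()` are stand-ins, not functions the package calls.

```r
# Illustrative only: `utils::head()` and `base::rev()` stand in for dplyr verbs.
letters_sample <- letters |>
  utils::head(5) |>
  base::rev()

stopifnot(identical(letters_sample, c("e", "d", "c", "b", "a")))
```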
6 changes: 3 additions & 3 deletions README.Rmd
@@ -66,7 +66,7 @@ Suppose we wanted to download Emily Bronte's "Wuthering Heights." We could find
 library(dplyr)
 library(gutenbergr)

-gutenberg_works() %>%
+gutenberg_works() |>
 filter(title == "Wuthering Heights")

 # or just:
@@ -87,14 +87,14 @@ wuthering_heights
 books <- gutenberg_download(c(768, 1260), meta_fields = "title")
 books

-books %>%
+books |>
 count(title)
 ```

 It can also take the output of `gutenberg_works` directly. For example, we could get the text of all Aristotle's works, each annotated with both `gutenberg_id` and `title`, using:

 ```{r}
-aristotle_books <- gutenberg_works(author == "Aristotle") %>%
+aristotle_books <- gutenberg_works(author == "Aristotle") |>
 gutenberg_download(meta_fields = "title")

 aristotle_books
6 changes: 3 additions & 3 deletions README.md
@@ -71,7 +71,7 @@ could find the book’s ID by filtering:
 library(dplyr)
 library(gutenbergr)

-gutenberg_works() %>%
+gutenberg_works() |>
 filter(title == "Wuthering Heights")
 #> # A tibble: 1 × 8
 #> gutenberg_id title author gutenberg_author_id language
@@ -137,7 +137,7 @@ books
 #> 10 768 "" Wuthering Heights
 #> # ℹ 33,333 more rows

-books %>%
+books |>
 count(title)
 #> # A tibble: 2 × 2
 #> title n
@@ -151,7 +151,7 @@ we could get the text of all Aristotle’s works, each annotated with both
 `gutenberg_id` and `title`, using:

 ``` r
-aristotle_books <- gutenberg_works(author == "Aristotle") %>%
+aristotle_books <- gutenberg_works(author == "Aristotle") |>
 gutenberg_download(meta_fields = "title")

 aristotle_books
2 changes: 1 addition & 1 deletion man/gutenberg_download.Rd
6 changes: 3 additions & 3 deletions man/gutenberg_metadata.Rd
10 changes: 5 additions & 5 deletions man/gutenberg_subjects.Rd
8 changes: 4 additions & 4 deletions man/gutenberg_works.Rd

Some generated files are not rendered by default.

14 changes: 7 additions & 7 deletions vignettes/intro.Rmd
@@ -46,7 +46,7 @@ gutenberg_metadata
 For example, you could find the Gutenberg ID(s) of Jane Austen's _Persuasion_ by doing:

 ```{r filter}
-gutenberg_metadata %>%
+gutenberg_metadata |>
 filter(title == "Persuasion")
 ```

@@ -107,7 +107,7 @@ books
 Notice that the `meta_fields` argument allows us to add one or more additional fields from the `gutenberg_metadata` to the downloaded text, such as title or author.

 ```{r count books}
-books %>%
+books |>
 count(title)
 ```

@@ -122,10 +122,10 @@ gutenberg_subjects
 This is useful for extracting texts from a particular topic or genre, such as detective stories, or a particular character, such as Sherlock Holmes. The `gutenberg_id` column can then be used to download these texts or to link with other metadata.

 ```{r filter subjects}
-gutenberg_subjects %>%
+gutenberg_subjects |>
 filter(subject == "Detective and mystery stories")

-gutenberg_subjects %>%
+gutenberg_subjects |>
 filter(grepl("Holmes, Sherlock", subject))
 ```

@@ -140,13 +140,13 @@ gutenberg_authors
 What's next after retrieving a book's text? Well, having the book as a data frame is especially useful for working with the [tidytext](https://github.com/juliasilge/tidytext) package for text analysis.

 ```{r tidytext}
-words <- books %>%
+words <- books |>
 unnest_tokens(word, text)

 words

-word_counts <- words %>%
-anti_join(stop_words, by = "word") %>%
+word_counts <- words |>
+anti_join(stop_words, by = "word") |>
 count(title, word, sort = TRUE)

 word_counts