Merge branch 'master' into fix-numeric-type
YaoxiangLi authored Oct 1, 2024
2 parents d1f1a20 + 9ddf213 commit e54c155
Showing 6 changed files with 43 additions and 54 deletions.
6 changes: 5 additions & 1 deletion DESCRIPTION
@@ -2,8 +2,12 @@ Package: medrxivr
Title: Access and Search MedRxiv and BioRxiv Preprint Data
Version: 0.0.5.9000
Authors@R: c(
+ person("Yaoxiang", "Li",
+ role = c("aut", "cre"),
+ email = "[email protected]",
+ comment = c(ORCID="0000-0001-9200-1016")),
person("Luke", "McGuinness",
- role = c("aut", "cre"),
+ role = c("aut"),
email = "[email protected]",
comment = c(ORCID = "0000-0001-8730-9761")),
person("Lena", "Schmidt",
2 changes: 0 additions & 2 deletions LICENSE

This file was deleted.

5 changes: 1 addition & 4 deletions R/mx_api.R
@@ -56,11 +56,8 @@ mx_api_content <- function(from_date = "2013-01-01",
details_link <- api_link(server, from_date, to_date, "0")
details <- api_to_df(details_link)

# Ensure 'count' is numeric
count <- as.numeric(details$messages[1, 6])
if (is.na(count)) {
stop("Count value is not numeric.")
}

pages <- floor(count / 100)

message("Estimated total number of records as per API metadata: ", count)
3 changes: 1 addition & 2 deletions README.Rmd
@@ -18,7 +18,7 @@ knitr::opts_chunk$set(
library(medrxivr)
```

- # medrxivr <img src="man/figures/hex-medrxivr.png" align="right" width="20%" height="20%" />
+ # medrxivr <img src="man/figures/logo.png" align="right" width="20%" height="20%" />


<!-- badges: start -->
@@ -28,7 +28,6 @@ library(medrxivr)
[![CRAN Downloads.](https://cranlogs.r-pkg.org/badges/grand-total/medrxivr)](https://CRAN.R-project.org/package=medrxivr)
<br>
[![R build status](https://github.com/ropensci/medrxivr/workflows/R-CMD-check/badge.svg)](https://github.com/ropensci/medrxivr/actions)
- [![Travis build status](https://travis-ci.com/ropensci/medrxivr.svg?branch=master)](https://travis-ci.com/ropensci/medrxivr)
[![Codecov test coverage](https://codecov.io/gh/ropensci/medrxivr/branch/master/graph/badge.svg)](https://codecov.io/gh/ropensci/medrxivr?branch=master)

<!-- badges: end -->
81 changes: 36 additions & 45 deletions README.md
@@ -1,7 +1,7 @@

<!-- README.md is generated from README.Rmd. Please edit that file -->

- # medrxivr <img src="man/figures/hex-medrxivr.png" align="right" width="20%" height="20%" />
+ # medrxivr <img src="man/figures/logo.png" align="right" width="20%" height="20%" />

<!-- badges: start -->

@@ -15,8 +15,6 @@ Badge](https://badges.ropensci.org/380_status.svg)](https://github.com/ropensci/
Downloads.](https://cranlogs.r-pkg.org/badges/grand-total/medrxivr)](https://CRAN.R-project.org/package=medrxivr)
<br> [![R build
status](https://github.com/ropensci/medrxivr/workflows/R-CMD-check/badge.svg)](https://github.com/ropensci/medrxivr/actions)
- [![Travis build
- status](https://travis-ci.com/ropensci/medrxivr.svg?branch=master)](https://travis-ci.com/ropensci/medrxivr)
[![Codecov test
coverage](https://codecov.io/gh/ropensci/medrxivr/branch/master/graph/badge.svg)](https://codecov.io/gh/ropensci/medrxivr?branch=master)

@@ -66,27 +64,23 @@ library(medrxivr)

`medrixvr` provides two ways to access medRxiv data:

- - `mx_api_content(server = "medrxiv")` creates a local copy of all
- data available from the medRxiv API at the time the function is run.
-
- <!-- end list -->
+ - `mx_api_content(server = "medrxiv")` creates a local copy of all data
+ available from the medRxiv API at the time the function is run.

``` r
# Get a copy of the database from the live medRxiv API endpoint
preprint_data <- mx_api_content()
```

- - `mx_snapshot()` provides access to a static snapshot of the medRxiv
- database. The snapshot is created each morning at 6am using
- `mx_api_content()` and is stored as CSV file in the [medrxivr-data
- repository](https://github.com/mcguinlu/medrxivr-data). This method
- does not rely on the API (which can become unavailable during peak
- usage times) and is usually faster (as it reads data from a CSV
- rather than having to re-extract it from the API). Discrepancies
- between the most recent static snapshot and the live database can be
- assessed using `mx_crosscheck()`.
-
- <!-- end list -->
+ - `mx_snapshot()` provides access to a static snapshot of the medRxiv
+ database. The snapshot is created each morning at 6am using
+ `mx_api_content()` and is stored as CSV file in the [medrxivr-data
+ repository](https://github.com/mcguinlu/medrxivr-data). This method
+ does not rely on the API (which can become unavailable during peak
+ usage times) and is usually faster (as it reads data from a CSV rather
+ than having to re-extract it from the API). Discrepancies between the
+ most recent static snapshot and the live database can be assessed
+ using `mx_crosscheck()`.

``` r
# Get a copy of the database from the daily snapshot
@@ -102,13 +96,10 @@ summarised in the figure below:

Only one data source exists for the bioRxiv repository:

- - `mx_api_content(server = "biorxiv")` creates a local copy of all
- data available from the bioRxiv API endpoint at the time the
- function is run. **Note**: due to it’s size, downloading a complete
- copy of the bioRxiv repository in this manner takes a long time (\~
- 1 hour).
-
- <!-- end list -->
+ - `mx_api_content(server = "biorxiv")` creates a local copy of all data
+ available from the bioRxiv API endpoint at the time the function is
+ run. **Note**: due to it’s size, downloading a complete copy of the
+ bioRxiv repository in this manner takes a long time (~ 1 hour).

``` r
# Get a copy of the database from the live bioRxiv API endpoint
@@ -125,12 +116,12 @@ advanced search strategy.
``` r
# Import the medrxiv database
preprint_data <- mx_snapshot()
- #> Using medRxiv snapshot - 2021-01-28 09:31
+ #> Using medRxiv snapshot - 2022-07-06 01:09

# Perform a simple search
results <- mx_search(data = preprint_data,
query ="dementia")
- #> Found 192 record(s) matching your search.
+ #> Found 427 record(s) matching your search.

# Perform an advanced search
topic1 <- c("dementia","vascular","alzheimer's") # Combined with Boolean OR
@@ -139,7 +130,7 @@ myquery <- list(topic1, topic2) # Combined with Boolean AND

results <- mx_search(data = preprint_data,
query = myquery)
- #> Found 70 record(s) matching your search.
+ #> Found 143 record(s) matching your search.
```

You can also explore which search terms are contributing most to your
@@ -149,15 +140,15 @@ search by setting `report = TRUE`:
results <- mx_search(data = preprint_data,
query = myquery,
report = TRUE)
- #> Found 70 record(s) matching your search.
- #> Total topic 1 records: 1078
- #> dementia: 192
- #> vascular: 917
+ #> Found 143 record(s) matching your search.
+ #> Total topic 1 records: 2272
+ #> dementia: 427
+ #> vascular: 1918
#> alzheimer's: 0
- #> Total topic 2 records: 203
- #> lipids: 74
- #> statins: 25
- #> cholesterol: 136
+ #> Total topic 2 records: 410
+ #> lipids: 157
+ #> statins: 61
+ #> cholesterol: 255
```

## Further functionality
@@ -222,14 +213,14 @@ and then search medRxiv and bioRxiv data. Below are a list of
complementary packages that provide distinct but related functionality
when working with medRxiv and bioRxiv data:

- - [`rbiorxiv`](https://github.com/nicholasmfraser/rbiorxiv) by
- [Nicholas Fraser](https://github.com/nicholasmfraser) provides
- access to the same medRxiv and bioRxiv *content* data as `medrxivr`,
- but also provides access to the *usage* data (e.g. downloads per
- month) that the Cold Spring Harbour Laboratory API offers. This is
- useful if you wish to explore, for example, [how the number of PDF
- downloads from bioRxiv has grown over
- time.](https://github.com/nicholasmfraser/rbiorxiv#pdf-downloads-over-time)
+ - [`rbiorxiv`](https://github.com/nicholasmfraser/rbiorxiv) by [Nicholas
+ Fraser](https://github.com/nicholasmfraser) provides access to the
+ same medRxiv and bioRxiv *content* data as `medrxivr`, but also
+ provides access to the *usage* data (e.g. downloads per month) that
+ the Cold Spring Harbour Laboratory API offers. This is useful if you
+ wish to explore, for example, [how the number of PDF downloads from
+ bioRxiv has grown over
+ time.](https://github.com/nicholasmfraser/rbiorxiv#pdf-downloads-over-time)

## Code of conduct

@@ -242,4 +233,4 @@ project, you agree to abide by its terms.
This package and the data it accesses/returns are provided “as is”, with
no guarantee of accuracy. Please be sure to check the accuracy of the
data yourself (and do let me know if you find an issue so I can fix it
- for everyone\!)
+ for everyone!)
Binary file added man/figures/logo.png