-
Notifications
You must be signed in to change notification settings - Fork 87
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Initial switch to httr2 #738
Conversation
Merge branch 'main' of https://github.com/DOI-USGS/dataRetrieval into httr2 # Conflicts: # R/constructNWISURL.R # man/constructNWISURL.Rd
Merge branch 'main' of https://github.com/DOI-USGS/dataRetrieval into httr2 # Conflicts: # NEWS # R/getWebServiceData.R # man/getWebServiceData.Rd
@jesse-ross @ehinman @mikejohnson51 @dblodgett-usgs @mikemahoney218-usgs A fair number of tests were updated, but mostly it was URL attributes that the arguments got shifted around (so, same URL, different order). If any of you have any comments, feedback, questions, let me know. I think generally everything looks to be behaving the same, except for a few cases where we had non-200 results returning without error. At the moment, those cases now automatically error. Maybe that should be the case (I think a few of our old behaviors were mostly because the service would come back with useful information even though it was a 400 or something like that). Anyway, if it's a problem, we can add a tryCatch, but I kind of think this is a more expected behavior anyway. We're not planning an update to CRAN anytime soon. I expect this will stay in the "develop" branch for awhile. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These httr2 changes look good. I walked through each of the changed files and ran at least some of the examples for each function with the interactive debugger. I also ran R CMD CHECK and didn't run into issues.
I've made some comments, but I don't think any are required to address in this PR. Approved.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can't comment on the line, but the @return field could be more precise. As it reads now, it sounds like the returned object is a character string object. Maybe something like the following:
@return An HTTP request URL: an S3 list with class httr2_request.
#' custom user agent, create an environmental variable: CUSTOM_DR_UA | ||
#' This function accepts a url parameter, and returns the raw data. | ||
#' | ||
#' To add a custom user agent, create an environmental variable: CUSTOM_DR_UA | ||
#' | ||
#' @param obs_url character containing the url for the retrieval | ||
#' @param \dots information to pass to header request |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again, I can't comment on the @return field but perhaps the specific data type can be mentioned for getWebServiceData
. Is it always going to be an xml_document?
@return xml_document object containing raw data from web services
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As it is written now, it would be XML for WaterML calls and characters for everything else (double checking that now). I believe since the WQP stopped recommending (or allowing in the new services) zip, we aren't returning raw results either. ANYWHO, this is a pattern I do really want to reconsider as we move into the new APIs.
Side note...not sure why you can't comment on the specific lines....I'll double check permissions....
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good. Regarding commenting, I don't think GitHub allows comments on unchanged lines (thread)
} else if (is.raw(input)) { | ||
returnedDoc <- xml2::read_xml(input) | ||
raw <- TRUE | ||
} | ||
|
||
response <- xml2::xml_name(returnedDoc) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Likely out of scope for this PR, but this line throws an error when input
is a URL as in the doc example. Perhaps a character input
could be cast as a httr2_request object earlier on? This seemed to work for me:
obs_url <- paste("https://cida.usgs.gov/ngwmn_cache/sos?request=GetObservation",
"service=SOS", "version=2.0.0",
"observedProperty=urn:ogc:def:property:OGC:GroundWaterLevel",
"responseFormat=text/xml",
"featureOfInterest=VW_GWDP_GEOSERVER.USGS.403836085374401",
sep = "&"
)
importNGWMN(httr2::request(obs_url))
R/importNGWMN_wml2.R
Outdated
#' | ||
#' timesereies <- importWaterML2(URL, asDateTime = TRUE, tz = "UTC") | ||
#' timesereies <- importWaterML2(baseURL, asDateTime = TRUE, tz = "UTC") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Small typo and baseURL needs to be converted to a httr2_request object
#' timesereies <- importWaterML2(baseURL, asDateTime = TRUE, tz = "UTC") | |
#' timeseries <- importWaterML2(httr2::request(baseURL), asDateTime = TRUE, tz = "UTC") |
R/readNWISpCode.R
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can't comment on the line, but line 3 should have measured
instead of meaured
.
R/whatWQPsites.R
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Line 11 has an incomplete sentence. Also, I'd change the @return
tag on line 31 to something slightly more descriptive, although it'd be great to provide more information on the difference between samples vs metrics vs sites
@return data frame containing basic metadata on WQP sites
DESCRIPTION
Outdated
@@ -39,14 +39,14 @@ Copyright: This software is in the public domain because it contains materials | |||
Depends: | |||
R (>= 3.5.0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For what it's worth, httr2 (and the tidyverse) now require R >= 4.0, and in the spring will require R >= 4.1 . If you bump this to 4.1 (which, depending when you're planning on launching things, will be the minimum version to use the package anyway) then you can use the base pipe, which might make some of the query-building more fluid
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🥳I've been avoiding the 4.1 requirement, but if httr2 has it already....HELLO PIPES
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could also re-export the magrittr pipe from httr2 (either directly or by adding magrittr as a direct dependency) if this is going out before the spring. Here's how httr2 does it:
https://github.com/r-lib/httr2/blob/main/R/utils-pipe.R
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When magrittr first came out, we very firmly came to the conclusion that we would never use it within a package, so I've very much avoided re-exporting it. However, my understanding is that the base R pipe does a better job of allowing the "traceback" when an error occurs. So, I'm happy to start using native pipes - but would not introduce magittr at this stage of the game.
R/constructNWISURL.R
Outdated
format = "rdb" | ||
) | ||
url <- httr2::req_url_query(baseURL, | ||
site_no = siteNumbers,.multi = "comma") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
site_no = siteNumbers,.multi = "comma") | |
site_no = siteNumbers, .multi = "comma") |
Sorry, this is such a small nit but I saw it 😆
R/findNLDI.R
Outdated
#' get_nldi(paste0(base, "comid/101"), type = "feature", use_sf = TRUE) | ||
#' get_nldi(url = paste0(base, "nwissite/USGS-11120000"), type = "feature", use_sf = TRUE) | ||
#' get_nldi(paste0(base, "nwissite/USGS-11120000"), type = "feature", use_sf = TRUE) | ||
#' dataRetrieval:::get_nldi(paste0(base, "comid/101"), type = "feature", use_sf = FALSE) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think CRAN gets mad at :::
even if it's never actually run -- believe this gets caught by their automated checks. Maybe not an issue due to @noRd
, but I'm not 100% sure.
If it is an issue, the workaround is to use getFromNamespace()
to assign the non-exported object into the current environment, which defeats the static analysis checks. (If it's not an issue with CRAN, well, sorry for the distraction!)
@@ -132,7 +115,7 @@ check_non_200s <- function(returnedList){ | |||
default_ua <- function() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
have I mentioned how much I like this, by the way? This approach to tagging user agents is great
R/whatWQPdata.R
Outdated
sites <- values[["siteid"]] | ||
sites <- paste0(sites, collapse = ";") | ||
baseURL <- httr2::req_url_query(baseURL, | ||
siteid = sites) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sites <- values[["siteid"]] | |
sites <- paste0(sites, collapse = ";") | |
baseURL <- httr2::req_url_query(baseURL, | |
siteid = sites) | |
baseURL <- httr2::req_url_query(baseURL, | |
siteid = values[["siteid"]], | |
.multi = function(x) paste0(x, collapse = ";")) |
I don't know if that's actually any more clear, but this could be handled inside of the httr2 function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice! Didn't notice we could add custom functinos.
This looks awesome 😄 Left a few minor comments (apologies everyone for the email spam -- I didn't actually intend to review the full PR tonight 😆) but this is great. |
🤗 |
… of Joe's comments.
… the site and extra articles)
Not sure if you already know this, but just in case -- that qpdf error is an error in Homebrew, and (according to Gabor) will fix itself when the next version of the GHA runner image is deployed (but no ETA for when that will be finished) |
Unfortunately we're not allowed to call github actions by a version (like v2 which the r-lib folks really want us to use), so we need to occasionally include the actual commit hash in the GH action yaml file. It makes sense from a security standpoint - since the version tag is sliding - they could any anything in there and we wouldn't know. At one point I had a script that would pull the latest commit hash from a version number, but I cannot find that anymore...if anyone on this list remembers how to do that, it would save a bit of fiddling when these things need to be updated. |
No description provided.