missing dependencies #75

Closed
msuchard opened this issue May 6, 2023 · 28 comments · Fixed by #106
Labels
sos-challenge Issues raised during the Save Our Sisyphus challenge
Milestone

Comments

@msuchard
Member

msuchard commented May 6, 2023

All modules AFAIK require keyring, but this package is not included in the modules' renv.lock files. As a result, keyring is missing when trying to execute in environments behind firewalls and all (?) modules fail.

What other dependencies (maybe Strategus itself) are missing? @pbr6cornell

@msuchard
Member Author

msuchard commented May 8, 2023

please fix in all modules ASAP if you're looking for SOS participation from data sources behind firewalls. @anthonysena @pbr6cornell

@anthonysena
Collaborator

@msuchard Strategus will attempt to install keyring as part of the module instantiation process:

script <- "
renv::restore(prompt = FALSE)
if (!require('ParallelLogger', quietly = TRUE)) {
  install.packages('ParallelLogger')
}
if (!require('keyring', quietly = TRUE)) {
  install.packages('keyring')
}
"
tempScriptFile <- tempfile(fileext = ".R")
fileConn <- file(tempScriptFile)
writeLines(script, fileConn)
close(fileConn)

That said, I can add this explicitly to the modules to ensure that keyring and other dependencies are declared and installed.

@anthonysena
Collaborator

From our discussion @msuchard, just noting here that the current implementation will not work for data partners that are executing behind a firewall where the use of install.packages is not possible. Instead, it is better to record the dependencies in the renv.lock file so that they are explicitly available to the module and there is no dependence on the globally installed version.

@schuemie
Member

schuemie commented May 9, 2023

@msuchard : could you help me understand how you're running Strategus behind the firewall? I would have assumed the process would be to install all Strategus modules using ensureAllModulesInstantiated() on a machine connected to the internet, and then moving all instantiated modules to the machine not connected to the internet. If so, then perhaps the solution is to make sure ensureAllModulesInstantiated() also installs keyring, instead of requiring each module to have it in its renv lock file.

@msuchard
Member Author

msuchard commented May 9, 2023

@schuemie -- each module needs to have all of its dependencies in its renv/library, i.e. a module should not depend on a package in R's global or user-specific library. currently, Strategus assumes that keyring is in the global/user library (and maybe other packages as well). handling this in ensureAllModulesInstantiated() is fine, but having it in the module renv.lock will better help debugging, i.e. one can try renv::restore() etc.
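
To make the global-versus-module library distinction concrete, here is a rough sketch (not Strategus code) of how one might see which library a package currently resolves from. The helper names `packageLibrary` and `outsideModuleLibrary` are hypothetical, and the module library path is passed in because it is platform-specific. Only base R is used.

```r
# Hypothetical helper: for each package, report the library folder it
# currently resolves from (NA if it cannot be found at all).
packageLibrary <- function(packages) {
  vapply(packages, function(pkg) {
    path <- find.package(pkg, quiet = TRUE)
    if (length(path) == 0) NA_character_ else dirname(normalizePath(path[1]))
  }, character(1))
}

# Hypothetical helper: TRUE for packages that are missing or that resolve
# from somewhere other than the module's own library folder.
outsideModuleLibrary <- function(packages, moduleLib) {
  libs <- packageLibrary(packages)
  is.na(libs) | libs != normalizePath(moduleLib, mustWork = FALSE)
}

# Example (run from inside an activated module project; the path below is
# a placeholder, not a real location):
# outsideModuleLibrary(c("keyring", "ParallelLogger"), "path/to/module/library")
```

Any package flagged TRUE here would be one the module is silently borrowing from the global or user library, or not finding at all.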

@msuchard
Member Author

msuchard commented May 9, 2023

@anthonysena -- re: the script in ModuleInstantiation.R. on my clean system (no packages in the global/user library), this does not install keyring (or ParallelLogger) into the module-specific renv/library. there's probably a conflict with the renv environment in which require gets executed. also, please use renv::install() (instead of install.packages). The shim that renv adds over install.packages often (in my hands) fails.

@msuchard
Member Author

msuchard commented May 9, 2023

also, perhaps the missing ParallelLogger is why some peeps during the office-hours were reporting no log files.

@msuchard
Member Author

msuchard commented May 9, 2023

further, @schuemie , i am not sure that ensureAllModulesInstantiated() works as intended:

> ensureAllModulesInstantiated(analysisSpecifications = analysisSpecifications)
# A tibble: 7 x 4
  module                         version remoteRepo remoteUsername
  <chr>                          <chr>   <chr>      <chr>         
1 CohortGeneratorModule          0.1.0   github.com ohdsi         
2 CohortDiagnosticsModule        0.0.7   github.com ohdsi         
3 CharacterizationModule         0.3.2   github.com ohdsi         
4 CohortIncidenceModule          0.0.6   github.com ohdsi         
5 CohortMethodModule             0.1.0   github.com ohdsi         
6 SelfControlledCaseSeriesModule 0.1.3   github.com ohdsi         
7 PatientLevelPredictionModule   0.1.0   github.com ohdsi    

but i still get dependency errors (even after I manually add keyring via renv::install("keyring") into the renv/library for each module):

* start target CohortGeneratorModule_1
Loading required package: CohortGenerator
Loading required package: DatabaseConnector
Loading required package: R6
Error: package or namespace load failed for 'CohortGenerator' in loadNamespace(i, c(lib.loc, .libPaths()), versionCheck = vI[[i]]):
 namespace 'rlang' 1.0.5 is already loaded, but >= 1.1.0 is required
Error: package 'CohortGenerator' could not be loaded
Execution halted
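
The error above says an older rlang was already loaded in the session when CohortGenerator asked for a newer one. As a minimal sketch (assuming you know the minimum versions you need), a check like the following could flag such conflicts before loading a module's packages. The function name `findLoadedVersionConflicts` is hypothetical. It relies only on base R's `loadedNamespaces()` and `getNamespaceVersion()`.

```r
# Hypothetical helper: report namespaces that are already loaded in this
# session at a version older than the required minimum.
# `minimumVersions` is a named character vector, e.g. c(rlang = "1.1.0").
findLoadedVersionConflicts <- function(minimumVersions) {
  # Only packages that are already loaded can cause this kind of conflict.
  loaded <- intersect(names(minimumVersions), loadedNamespaces())
  conflicts <- Filter(function(pkg) {
    current <- package_version(unname(getNamespaceVersion(pkg)))
    current < package_version(minimumVersions[[pkg]])
  }, loaded)
  data.frame(
    package = conflicts,
    loaded = vapply(conflicts, function(pkg) unname(getNamespaceVersion(pkg)),
                    character(1), USE.NAMES = FALSE),
    required = unname(minimumVersions[conflicts]),
    stringsAsFactors = FALSE
  )
}

# Example: findLoadedVersionConflicts(c(rlang = "1.1.0"))
```

A non-empty result suggests restarting R (or unloading the namespace) before the module's own library is used.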

@anthonysena
Collaborator

Working out these changes on the sos-challenge branch. I'm also making an update to the CohortGeneratorModule to address open issues there, including adding keyring and Strategus. I will let you know when it is ready. Thanks!

@msuchard
Member Author

msuchard commented May 9, 2023

simultaneously, am also going to try again from a clean install of R v4.2 (was using R v4.1).

@anthonysena
Collaborator

anthonysena commented May 9, 2023

Running GHA on a newer version of the CohortGeneratorModule (https://github.com/OHDSI/Strategus/actions/runs/4926598482) and the updated version of Strategus. If this works properly, I'll work through updating the other Strategus modules to include keyring, Strategus and to also ensure that the renv::restore is clean.

For my own reference, I've set some RENV specific environment variables so that I can clear the cache for each module:

RENV_PATHS_CACHE="E:/renv/cache"

I found this helpful when reviewing this issue.

UPDATE: don't set this environment variable:

RENV_PATHS_LIBRARY_ROOT="E:/renv"

RENV_PATHS_LIBRARY_ROOT will override the renv/library path that is relative to the module. This then causes issues later since Strategus cannot verify if the module was actually initialized (since the renv library is not present in the relative directory).
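
A small guard along these lines could catch the problem before a run. This is only a sketch: the function name `libraryRootIsUnset` is hypothetical and nothing like it is confirmed to exist in Strategus.

```r
# Hypothetical pre-run check: RENV_PATHS_LIBRARY_ROOT moves the renv library
# out of the module folder, so warn and return FALSE when it is set.
libraryRootIsUnset <- function() {
  value <- Sys.getenv("RENV_PATHS_LIBRARY_ROOT", unset = "")
  if (nzchar(value)) {
    warning(
      "RENV_PATHS_LIBRARY_ROOT is set to '", value,
      "'; module libraries will not be created relative to the module folder.",
      call. = FALSE
    )
    return(FALSE)
  }
  TRUE
}

# Example: stop early rather than fail later during module verification.
# if (!libraryRootIsUnset()) stop("Unset RENV_PATHS_LIBRARY_ROOT and retry.")
```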

@msuchard
Member Author

msuchard commented May 9, 2023

documenting here: module instantiation (renv::restore() called from Strategus::execute()) fails:

  • with DatabaseConnector for R v4.1 and
  • with Rcpp for R v4.2

@msuchard
Member Author

msuchard commented May 9, 2023

it appears that i can fix the dependency issues by:

  • manually opening each module, and
  • running renv::restore() a couple of times, followed by renv::install("keyring")

@anthonysena
Collaborator

@msuchard - thanks for your continued work on this and for reporting these issues.

If you have an opportunity: could you install the sos-challenge branch of Strategus and attempt to run the example study on Eunomia? This new version of the study specification uses v0.1.1-1 of CohortGeneratorModule which now explicitly declares the keyring and Strategus dependencies in the renv.lock file and also bumps up some of the other dependencies (i.e. DatabaseConnector v6.2). Hoping that this will work better - assuming it does, I'll work on updating the other modules accordingly.

@msuchard
Member Author

msuchard commented May 9, 2023

@anthonysena -- are we missing a tag?

trying URL 'https://github.com/ohdsi/CohortGeneratorModule/archive/refs/tags/v0.1.1-1.zip'
Error in utils::download.file(url = moduleUrl, destfile = moduleFile) :
  cannot open URL 'https://github.com/ohdsi/CohortGeneratorModule/archive/refs/tags/v0.1.1-1.zip'
In addition: Warning message:
In utils::download.file(url = moduleUrl, destfile = moduleFile) :
  cannot open URL 'https://github.com/ohdsi/CohortGeneratorModule/archive/refs/tags/v0.1.1-1.zip': HTTP status was '404 Not Found'

@msuchard
Member Author

msuchard commented May 9, 2023

ok ... i fixed the tag for you @anthonysena -- it was missing a v

@msuchard
Member Author

@anthonysena -- the example study on Eunomia (using sos-challenge) runs through to the PLP and then errors out. Nonetheless, this does appear to fix the keyring-missing issue.

Also, having to manually run renv::restore() in each module to ensure that all packages are installed only occurred for me the first time I downloaded (and cached) all of the packages. Once my cache was already populated, I had no issues. This suggests that when renv times out downloading a package from the web, things break.

It would be really, really helpful to have a function that checks to make sure that all dependencies are actually installed in their renv/library directories!!!
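
As a rough sketch of what such a check might look like (this is not the function that ended up in Strategus), the packages recorded in a module's renv.lock can be compared with the DESCRIPTION files found in its library folder. The function names below are hypothetical. Reading the lockfile assumes the jsonlite package is installed; the comparison itself uses only base R. `libPath` is assumed to be the platform-specific folder under the module's renv/library that directly contains the package folders.

```r
# Read package names and versions recorded in a module's renv.lock.
# Assumes jsonlite is available; the lockfile is JSON with a "Packages" entry.
readLockedVersions <- function(lockfilePath) {
  lock <- jsonlite::fromJSON(lockfilePath, simplifyVector = FALSE)
  vapply(lock$Packages, function(record) record$Version, character(1))
}

# Version of `pkg` installed in `libPath`, or NA if it is not there.
installedVersion <- function(pkg, libPath) {
  descFile <- file.path(libPath, pkg, "DESCRIPTION")
  if (!file.exists(descFile)) {
    return(NA_character_)
  }
  unname(read.dcf(descFile, fields = "Version")[1, 1])
}

# Compare locked versions (named character vector) with what is installed.
# Returns one row per package that is missing or at a different version.
findUnsyncedPackages <- function(lockedVersions, libPath) {
  installed <- vapply(names(lockedVersions), installedVersion, character(1),
                      libPath = libPath)
  status <- ifelse(is.na(installed), "missing",
                   ifelse(installed == unname(lockedVersions), "ok",
                          "version mismatch"))
  result <- data.frame(
    package = names(lockedVersions),
    locked = unname(lockedVersions),
    installed = unname(installed),
    status = unname(status),
    stringsAsFactors = FALSE
  )
  result[result$status != "ok", , drop = FALSE]
}

# Example (paths are placeholders):
# locked <- readLockedVersions("path/to/module/renv.lock")
# findUnsyncedPackages(locked, "path/to/module/library")
```

An empty result would mean every locked package is present at the recorded version. Anything else points at the packages to restore before executing the module.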

@anthonysena
Collaborator

Thanks @msuchard for fixing my tag error and for testing this out. I'll work to update all of the modules now and also work on having a mechanism to check to make sure that all dependencies are actually installed in the renv/library directory.

@msuchard
Member Author

Sounds great! FYI, here's another fail example installing on the clean system:

• start target CohortGeneratorModule_1
ℹ Using R 4.1.3 (lockfile was generated with R 4.2.1)
Loading required package: CohortGenerator
Loading required package: DatabaseConnector
Loading required package: R6
Validating inputs
An error report has been created at  /Users/msuchard/Dropbox/Projects/FluoroquinoloneAorticAneurysm/output/Build/strategusOutput/CohortGeneratorModule_1/errorReport.R
Error in loadNamespace(x) : there is no package called ‘CirceR’
Calls: execute ... loadNamespace -> withRestarts -> withOneRestart -> doWithOneRestart
Execution halted
✖ error target CohortGeneratorModule_1
• end pipeline [10.35 seconds]

To help with writing a function, renv seems to be able to see that things are out-of-sync:

* Project '~/Dropbox/StrategusInstantiatedModules/CohortGeneratorModule_0.1.0' loaded. [renv 0.15.5]
* The project library is out of sync with the lockfile.
* Use `renv::restore()` to install packages recorded in the lockfile.

@anthonysena
Collaborator

I'm also thinking of using ParallelLogger to capture the output of renv::restore() in the module itself since I do not think such output is captured and then it becomes difficult to find errors.

Looking at your output, it looks like you may still be using the older CG Module:

Project '~/Dropbox/StrategusInstantiatedModules/CohortGeneratorModule_0.1.0' loaded. [renv 0.15.5]

I would have expected it to be CohortGeneratorModule_0.1.1-1 which should have addressed the problem with CirceR (which for some reason it cannot retrieve v1.2.1 so should not reference v1.3.0)

@msuchard
Member Author

yes, the example i just posted is using CG Module v0.1.0; it's the FQ_AA study (and not the Eunomia test). CirceR was not the only missing package; manual renv.lock revealed about 30 missing packages.

@anthonysena
Collaborator

OK - I'm confirming that I've addressed this in the current development build of CG's Module (v0.1.1-1). I'll also see if I can use renv::status() as a pre-flight check to make sure the modules do not have any obvious problems with renv ahead of running things.

@msuchard
Member Author

renv:::renv_project_synchronized_check() does a pretty good job of checking and could be used pre-flight.

@hmorgancooper

Hey :) I am getting the same issue trying to run the AntiVegF SOS study.

It seems like the modules aren't able to find the packages I have in my r library - here are two examples.

Error: package or namespace load failed for ‘DatabaseConnector’ in loadNamespace(i, c(lib.loc, .libPaths()), versionCheck = vI[[i]]):
 there is no package called ‘DBI’
Execution halted
✖ error target CohortGeneratorModule_1
▶ end pipeline [2.555 seconds]

Error: package or namespace load failed for ‘dplyr’ in loadNamespace(j <- i[[1L]], c(lib.loc, .libPaths()), versionCheck = vI[[j]]):
 there is no package called ‘utf8’
Execution halted
✖ error target CohortGeneratorModule_1
▶ end pipeline [2.218 seconds]

I have previously been able to run DatabaseConnector without errors, but haven't yet been able to run an entire SOS study.
I can also load both of the 'missing' r packages outside the study.

I have the latest version of Strategus (v0.06) and have tried to run with the main and the develop branch. I also updated the CohortGenerator version in the analysisSpecification.json to be 0.02 to solve this error https://github.com/OHDSI/Strategus/issues/26.

Any help would be appreciated! I am very stuck :)

@anthonysena
Collaborator

Hi @hmorgancooper sorry to hear of these challenges. @ablack3 also mentioned you were facing some challenges.

Just to try and address a few items you raised:

"I have the latest version of Strategus (v0.06) and have tried to run with the main and the develop branch."

Installing the 'develop' branch of Strategus should bump you to version v0.1.0 and this is the one I'd recommend you use as many of these issues should be addressed in that build. You will want to restart your R session after installing Strategus just to be safe. Additionally, you'll want to remove any previous execution folders associated with running the study - it may still have references to the old study specification (and the old module). In the case of the AntiVegF study, the execution folder I'm referring to is specified here.

"I also updated the CohortGenerator version in the analysisSpecification.json to be 0.02 to solve this error"

I think you mean 0.2.0 of CohortGeneratorModule which is great - please try to use that along with the latest version of all of the modules. You can find those by running Strategus::getModuleList() as shown here:

> Strategus::getModuleList()
# A tibble: 8 x 7                                                                                                                                                                                                                          
  module                         version remoteRepo remoteUsername moduleType mainPackage              mainPackageTag
  <chr>                          <chr>   <chr>      <chr>          <chr>      <chr>                    <chr>         
1 CharacterizationModule         v0.4.0  github.com OHDSI          cdm        Characterization         v0.1.1        
2 CohortDiagnosticsModule        v0.1.0  github.com OHDSI          cdm        CohortDiagnostics        v3.2.3        
3 CohortGeneratorModule          v0.2.0  github.com OHDSI          cdm        CohortGenerator          v0.8.0        
4 CohortIncidenceModule          v0.2.0  github.com OHDSI          cdm        CohortIncidence          v3.2.0        
5 CohortMethodModule             v0.2.0  github.com OHDSI          cdm        CohortMethod             v5.1.0        
6 PatientLevelPredictionModule   v0.2.0  github.com OHDSI          cdm        PatientLevelPrediction   v6.3.4        
7 SelfControlledCaseSeriesModule v0.2.0  github.com OHDSI          cdm        SelfControlledCaseSeries v4.2.0        
8 EvidenceSynthesisModule        v0.2.1  github.com OHDSI          results    EvidenceSynthesis        v0.5.0    

Hoping this helps to get you moving along :-)

@hmorgancooper

Hey @anthonysena thanks for getting back to me!

This is my output from Strategus::getModuleList():

[image: screenshot of the Strategus::getModuleList() output]

I removed and reinstalled CohortIncidence, the DESCRIPTION file says it's 3.2.0 but only v3.1.5 is showing up here...
not sure what's going on there.

I double checked Strategus and I have version 0.1.0, removed the output from StrategusExecution and tried rerunning.
I got this output:

[image: screenshot of the resulting output]

I definitely have the package rappdirs in my r_env.

anthonysena added the sos-challenge label (Issues raised during the Save Our Sisyphus challenge) Oct 2, 2023
anthonysena added this to the v0.1.1 milestone Dec 4, 2023
@anthonysena
Collaborator

I need to add a function to Strategus that will check to ensure all dependencies for each module are available. @schuemie shared this function that should help:

https://github.com/ohdsi-studies/ScyllaEstimation/blob/master/R/Main.R#L150-L169


4 participants