Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

dockerization guide clarity #37

Open
wenyunie opened this issue Nov 23, 2023 · 14 comments
Open

dockerization guide clarity #37

wenyunie opened this issue Nov 23, 2023 · 14 comments
Labels
documentation Improvements or additions to documentation

Comments

@wenyunie
Copy link
Collaborator

image

Teammates, the dockerization part is done, but there is this part about guidelines that we should add, so I invite you guys to walk through the procedures given by these guidelines, and see if the project computational environment is reproduced.

@wenyunie wenyunie added the documentation Improvements or additions to documentation label Nov 23, 2023
@monazhu
Copy link
Collaborator

monazhu commented Nov 23, 2023

Thanks for doing that @wenyunie! I tried testing out the docker container and it worked on my computer (although it was kind of slow to run, maybe because I'm on a windows machine 🥲 )

One thing I did notice was that the package Hmsic doesn't seem to be installed (I was given this prompt):
Screenshot 2023-11-22 232249

I was looking through the renv file and noticed that Hmsic wasn't installed there - but I seemed to have some issues when I tried to take a snapshot on my end (Hmsic wasn't one of the packages being added as shown in the screenshot here). Just as a note, I didn't install anything by the way, and pruned the local branch I created to try and look into this issue.

image

I'm kind of at a loss as to why this is the case 🤔

@wenyunie
Copy link
Collaborator Author

I replicated this phenomenon in the container. The strange thing though, is that the analysis.rmd and analysis_report.rmd can be rendered properly without any prompts popping out saying we need to install some more packages. Only when I separately ran this code block, it is popped out:
image

@wenyunie
Copy link
Collaborator Author

OMG, the rendered html was actually off the track! I did not know markdown package is able to skip the unrunnable part of one piece of code and have the file rendered anyway!
image

@wenyunie
Copy link
Collaborator Author

wenyunie commented Nov 23, 2023

@monazhu I will add Hmisc to our Dockerfile first, so that our Dockerization Path could go correctly.

The case with renv is indeed very strange. You mean you tried to renv::snapshot() on top of the current renv.lock, but Hmisc is still not added right?

@monazhu
Copy link
Collaborator

monazhu commented Nov 23, 2023

Okay sounds good - thanks!!

Yeah, what I did was I installed then loaded the Hmsic in the 03-analysis.Rmd file, but when I tried to run renv::snapshot(), Hmsic doesn't show up 🥲

@wenyunie
Copy link
Collaborator Author

Okay sounds good - thanks!!

Yeah, what I did was I installed then loaded the Hmsic in the 03-analysis.Rmd file, but when I tried to run renv::snapshot(), Hmsic doesn't show up 🥲

It seems it is because Hmisc is not explicitly 'library(Hmisc)'ed even though we used functions from it. When I tried to explicitly load it, it shows up:
image

renv::snapshot() looks for packages in the code of the project, not the loaded packages in the environment:
image

@monazhu
Copy link
Collaborator

monazhu commented Nov 23, 2023

That's super strange, because I had explicitly loaded the package before running renv::snapshot() but Hmisc wasn't showing up. I'm not sure doing this in the parent file should matter (in theory, I don't think it should make a difference, since renv was able to pick up the other packages when you did the snapshot before)

@wenyunie
Copy link
Collaborator Author

That's super strange, because I had explicitly loaded the package before running renv::snapshot() but Hmisc wasn't showing up. I'm not sure doing this in the parent file should matter (in theory, I don't think it should make a difference, since renv was able to pick up the other packages when you did the snapshot before)

Did you load it in the console or write it in a file within the project (it seems you do not need to load it in the environment but adding the line 'library(Hmisc)' and saving the code is important here).

Indirect dependencies are also included in the snapshot. I guess the case with Hmisc is, Hmisc is loaded directly or indirectly in our environment but 'library(Hmisc)' is not in the code within the scope of the project and it is also not an indirect dependency of those that are within the scope.

@monazhu
Copy link
Collaborator

monazhu commented Nov 23, 2023

Yup, library(Hmisc) was called explicitly (that what I meant by having loaded it explicitly in my message early - sorry if that was confusing!)

I just tried running things again (I thought maybe I needed to run renv::restore() first just in case) but that didn't seem to matter:

image

@wenyunie
Copy link
Collaborator Author

That's indeed very strange. I tried to replicate your exact behavior, and what I get is within expectation. Before saving the code, nothing, after saving the new line, Hmisc is detected:
image
image

@monazhu
Copy link
Collaborator

monazhu commented Nov 23, 2023

Yeah I also initially thought it was because I wasn't saving the code - but even after saving the code and re-running renv::snapshot(), that didn't seem to change

Here's what I'm doing for full reproducibility:

  1. restarting R session (at this point, the file is saved with 'library(Hmisc)` included)
  2. ran the chunk of code that contains the libraries
  3. ran renv::snapshot() and was prompted with the following:

image

  • I am getting a message about speed_dating_analysis and testthat not being installed, but in theory this shouldn't affect what Hmisc is doing if I select (1)
  1. This ends up being the output I get:

image

🥲

It's getting quite late for me (I need to wake up early since my commute is quite long) but at the very least, we can get Hmisc in the environment one way or another

The other thing to note is that given the prompt I saw, we might want to take out the following from the testthat.R file (and include testthat in the snapshot)

library(speed_dating_analysis)
test_check("speed_dating_analysis")

@monazhu
Copy link
Collaborator

monazhu commented Nov 23, 2023

One last message before I log off: I tried installing testthat just now, and I'm prompted with the expected behaviour:
image

I'm not sure if Hmisc just doesn't like my computer 😂

@wenyunie
Copy link
Collaborator Author

wenyunie commented Nov 23, 2023

Yeah I also initially thought it was because I wasn't saving the code - but even after saving the code and re-running renv::snapshot(), that didn't seem to change

Here's what I'm doing for full reproducibility:

  1. restarting R session (at this point, the file is saved with 'library(Hmisc)` included)
  2. ran the chunk of code that contains the libraries
  3. ran renv::snapshot() and was prompted with the following:

image

  • I am getting a message about speed_dating_analysis and testthat not being installed, but in theory this shouldn't affect what Hmisc is doing if I select (1)
  1. This ends up being the output I get:

image

🥲

It's getting quite late for me (I need to wake up early since my commute is quite long) but at the very least, we can get Hmisc in the environment one way or another

The other thing to note is that given the prompt I saw, we might want to take out the following from the testthat.R file (and include testthat in the snapshot)

library(speed_dating_analysis)
test_check("speed_dating_analysis")

Thanks for the elaboration Mona, then I seriously do not have an idea why Hmisc does not show in the renv.lock now. But just now I realized that renv::snapshot(type='all') could produce a more exhaustive list of dependencies which includes Hmisc even if it is not explicitly written in package loading codes. I think I am going to update the renv.lock file using this new command tmr.

The issue with 'testthat' is strange though. 'testthat' is already in the current version of renv.lock. It is also installed in the Dockerized version. Is the screenshot your computer host environment where testthat is not installed? Then it would not have anything to do with the renv.lock or the Dockerized container environment.

The thing with 'speed_dating_analysis' is supposed to be OK though. It is indeed not an installed package because it is our project itself. Tiff had this in her demo (although she deleted it on lecture 4, not because it is wrong, but beyond our current knowledge scope). I find this testthat.R very useful acutally, because it is possible to run all the tests globally all at once with one command. I am also using this as the trigger for all unit tests in my CI workflow:

image image image

@monazhu
Copy link
Collaborator

monazhu commented Nov 23, 2023

@wenyunie yeah, I was running it on my local environment initially to see if I could just add Hmisc to the renv.lock file (if I were to run it on docker I don't believe you would be able to update the renv.lock file right?). I actually wasn't sure why I was prompted to install testtthat on my local environment (because I did see it in the renv.lock file), so that was also weird...

Also, thanks for clarifying your re: testthat.R file - I understand that it was for test development but didn't realize you were using it for the automated checks on github! I'm not sure if this is correct, but my understanding was that you only needed the file itself and the package loaded in order to automate the testthat. I got curious and checked the documentation for testthat::test_check() and 'testthat::test_dir()- it seems that we might not actually be runningtest_check("speed_dating_analysis")based on the documentation (because it should calltest_dir()` through the function). But I'm not sure if my understanding is correct.

image

That said, I think you're absolutely right - we don't need to take out those 2 lines since we should be able to bypass it when we take a snapshot

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
documentation Improvements or additions to documentation
Projects
None yet
Development

No branches or pull requests

2 participants