-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[ENH] Run dataget as non-root user UID 1000 #13
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @surchs! This is fancy stuff. Left a few 🍒 and clarification comments, but overall looks good👍
Dockerfile
Outdated
RUN git config --global user.email "[email protected]" && \ | ||
git config --global user.name "Bagel User" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🍒 for clarity:
RUN git config --global user.email "bageluser@neuroabgel.org" && \ | |
git config --global user.name "Bagel User" | |
RUN git config --global user.email "user@neurobagel.org" && \ | |
git config --global user.name "Neurobagel User" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like we also configure these variables above in the datalad install RUN
step - maybe we can remove one set of these git config steps to minimize redundancy?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OH, good catch! I didn't realize this and added the git config below because I got an error that the config was missing. However the config is scoped to the user, and because we now change the user to USERNAME it likely fails.
I can fix this by moving the git config below the user change line. But it makes me wonder if this container is going to work when I change the user at runtime with the -u
flag
Dockerfile
Outdated
@@ -14,4 +22,13 @@ RUN apt-get update -qq && \ | |||
datalad wtf | |||
|
|||
COPY ./entrypoint.sh /entrypoint.sh | |||
|
|||
# Change ownership of the entrypoint script | |||
RUN chown $USERNAME:$USERNAME /entrypoint.sh |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just to confirm, even after changing ownership here, the script should still be executable by all users, correct?
i.e., if we override the user in the container at runtime, they should still be able to run entrypoint.sh
despite it still being owned by neurobagel_user
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, and that's a good point! I will add execution permission for everyone instead
Dockerfile
Outdated
RUN groupadd --gid $USER_GID $USERNAME \ | ||
&& useradd --uid $USER_UID --gid $USER_GID -m $USERNAME |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is cool, but I don't find it super intuitive. To make it easier for ourselves if we need to revisit these settings, how about a comment briefly describing the type of user/user group you are trying to create? e.g., Creating a user in the container which is the same as the first created user on the host system, with a home directory (?), and adding it to a new group with the same name as the user.
Also, I encountered the option --system
or -r
for groupadd
/useradd
, which seems to be specifically creating users for running services rather than login/interactive use (no home directory) - see https://serverfault.com/a/350932 and https://www.linuxquestions.org/questions/linux-newbie-8/useradd-r-option-and-system-account-question-892978/ for example. I believe system users are created by default with UIDs < 1000, so it would be an alternative to explicitly setting UID/GID. Is this something worth looking into? Or would having a home directory be actually important in our case, since we are presumably doing some git operations during the datalad install step? (If so, I think the current approach still makes sense)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added the comment, agree that this can help. I don't see a benefit of using the -r flag you looked up. On a cursory read it seems that it might rather make things more complicated as it seems to be a user for deamons, not actual users. An extra home directory is cheap as well, so I don't think we should do more things more complicated than necessary
Co-authored-by: Alyssa Dai <[email protected]>
So the tricky thing with this PR is that it will now always run the container as the user with UID 1000. Unless of course the user manually changes the ID. But there won't be a nice warning message if they don't change the ID and also don't happen to be the user with UID 1000, e.g. on a shared system. If the user with UID does not have write permission to the output directory specified for dataget, then the container will just fail with a "Permission denied" error. That may not be a huge problem, because most likely they wouldn't be allowed to user docker on a shared system anyway. But the question is: how useful is this change going to be if we:
So I'm wondering a bit how clever it is to merge this change here - or if we should just leave it at changing the default instructions so users always specify the uid:gid they want to run the container as |
After discussion we decided that it makes more sense to keep the container run as root by default, as most users will either override the user now or be root on the host anyway. For shared systems, we'll add a singularity command example in the query tool |
root
#7Changes proposed in this pull request:
Checklist
This section is for the PR reviewer
[ENH]
,[FIX]
,[REF]
,[TST]
,[CI]
,[MNT]
,[INF]
,[MODEL]
,[DOC]
) (see our Contributing Guidelines for more info)Closes #XXXX
For new features:
For bug fixes: