Update desc-pyspark to use Spark3 #79
I just tried the image on interactive nodes (from cori08): `salloc -N 5 -t 30 -C haswell -q debug --image=lsstdesc/desc-python:spark-v3.1.1 --volume='/global/cscratch1/sd/plaszczy/tmpfiles:/tmp:perNodeCache=size=200G'`, but I get stuck on `salloc: Granted job allocation 41422941` in a really bad way (not even C^c nor C^z can help me!).
Thanks for giving this a try @plaszczy
Then I went back and tried the debug queue... it took 5-10 minutes, but it finally started up.
Maybe try again and see if it works after a few minutes? The queue may have been busy.
Yes, you are right, I could log in now. But conda does not seem to be there, and it does not seem to be the right python version.
Actually, it should be run within shifter. But if we forget about the setup and just run (within the session) `shifter pyspark`, it seems that within the image the user does not have access rights to the conda setup.
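A quick way to confirm this kind of permission problem is to inspect the install location from inside the image. The sketch below assumes the conda install lives under /root (as the next comment suggests); adjust the path if it differs.

```bash
# Sketch: check ownership and permissions of the conda install from inside
# the image. The /root/anaconda3 path is an assumption, not confirmed here.
shifter --image=lsstdesc/desc-python:spark-v3.1.1 /bin/bash -c \
  'id; ls -ld /root /root/anaconda3; which python'
```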
Ah yes... this is due to how the NERSC spark docker image is set up to install under `/root`. Let me go back to that NERSC ticket with Lisa and ask about that to get their suggestion. If I were setting up this image myself, I would have installed everything in another part of the directory tree to avoid that problem.
@plaszczy I see Lisa isn't available this week, so I went ahead and updated the docker image to reinstall Anaconda outside `/root`.
Once Lisa is back and I get a new NERSC image we can clean up a bit, but I think this new image should work in the meantime.
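The exact new install prefix is not preserved in this thread. As a hedged illustration only, if Anaconda were reinstalled under /opt/anaconda3, activating it from inside the image would look something like this:

```bash
# Hypothetical activation steps; /opt/anaconda3 is a placeholder prefix,
# not the actual path used in the updated image.
shifter --image=lsstdesc/desc-python:spark-v3.1.1 /bin/bash
source /opt/anaconda3/etc/profile.d/conda.sh
conda activate
python --version   # should now report the image's conda python
```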
OK, that's interesting, I could reproduce. I avoid using -E, which is very general; we found with Lisa that using: [...] So after the source+activate conda, I ran a basic test in pyspark (which is an "ang2pix" in spark),
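The test code itself was not captured in the thread. A minimal sketch of what such an "ang2pix" test in pyspark might look like, assuming healpy, pandas, and pyarrow are available in the image (the column names and nside value are illustrative, not from the original):

```python
# Minimal "ang2pix"-style pyspark test (sketch; not the original code).
import pandas as pd
import healpy as hp
from pyspark.sql import SparkSession
from pyspark.sql.functions import pandas_udf
from pyspark.sql.types import LongType

spark = SparkSession.builder.getOrCreate()

@pandas_udf(LongType())
def ang2pix(theta: pd.Series, phi: pd.Series) -> pd.Series:
    # nside=2048 is an arbitrary choice for this sketch
    return pd.Series(hp.ang2pix(2048, theta.to_numpy(), phi.to_numpy()))

df = spark.createDataFrame([(0.5, 1.0), (1.2, 2.3)], ["theta", "phi"])
df.withColumn("ipix", ang2pix("theta", "phi")).show()
```

If healpy is not importable inside the session, this is exactly where an import failure of the kind described next would surface.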
but we are back to the case where pyspark does not find the libs:
Hi @plaszczy,
Let me see if I can figure out a way to get it to work without relying on that -E option.
Hi @plaszczy & @JulienPeloton, coming back to this and following the NERSC ticket: I have updated the image. I think it would help to have some explicit use case I can try out, to see whether things work or not.
I think I messed with the tickets, so I recopy my message here: I still can't make it run without the -E shifter option. Here is what I do: [...]
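The copied steps were lost here. As a hedged reconstruction from the commands mentioned elsewhere in the thread, the flow being tested was presumably along these lines (the conda prefix is again a placeholder):

```bash
# Presumed flow (sketch): allocate nodes, enter the image without -E,
# activate conda, launch pyspark. Paths are placeholders.
salloc -N 5 -t 30 -C haswell -q debug \
  --image=lsstdesc/desc-python:spark-v3.1.1
shifter /bin/bash                              # no -E: host environment kept
source /opt/anaconda3/etc/profile.d/conda.sh   # placeholder prefix
conda activate
pyspark                                        # reportedly fails without -E
```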
Since it is OK with the -E shifter option, I guess something else remains... cheers,
Hi @plaszczy, I haven't forgotten this - just became very busy. I am planning to look at this again tomorrow morning!
Thanks, no hurry. If you manage to run it in your own env, we'll have to compare our full lists of env vars (otherwise...?). The other option is to clean everything (-E), but then we must still find a way to transmit some env vars to shifter.
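One hedged way to do that, assuming shifter's -E (clear the environment) behaves as described in this thread: start from a clean environment and re-export only the variables pyspark needs inside the container. The values below are illustrative, not verified settings.

```bash
# Sketch: clean environment (-E), then set just what pyspark needs.
shifter -E /bin/bash -c '
  source /opt/anaconda3/etc/profile.d/conda.sh   # placeholder prefix
  conda activate
  export PYSPARK_PYTHON=$(which python)          # point workers at conda python
  pyspark
'
```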
As discussed on Slack: we have a new shifter image, `lsstdesc/desc-python:spark-v3.1.1`, based on NERSC's most recent Spark docker image. I would like to update the `desc-pyspark` jupyter kernel to use this version. Additionally, we will pull in Stephane's desc-spark scripts and install them along with the typical desc-python installations at NERSC.