Supplementary material for arXiv:1807.03078

abualia4/1807.03078

 
 

Analysing billion-objects catalogue interactively: Apache Spark for physicists

This repository contains supplementary material for arXiv:1807.03078.

How to run the notebook

You need Apache Spark and Jupyter Notebook installed on your machine or on your cluster. The other Python dependencies are listed in the notebook.
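Before launching, it can help to confirm that both prerequisites are on your PATH. The following is a minimal sketch; the helper name `check_tools` is hypothetical and not part of this repository, and it only reports which commands cannot be found.

```shell
#!/bin/sh
# Hypothetical helper (not part of the repository): report missing commands.
check_tools() {
    missing=""
    for tool in "$@"; do
        # command -v is the POSIX way to test whether a command is available
        if ! command -v "$tool" >/dev/null 2>&1; then
            missing="$missing $tool"
        fi
    done
    if [ -n "$missing" ]; then
        echo "Missing:$missing"
        return 1
    fi
    echo "All tools found"
}

# pyspark ships with Apache Spark; jupyter provides the notebook server.
check_tools pyspark jupyter || echo "Install the missing tools before continuing."
```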

On a local machine

PACK="com.github.astrolabsoftware:spark-fits_2.11:0.7.2"
PYSPARK_DRIVER_PYTHON_OPTS="jupyter-notebook" pyspark \
     --master "local[*]" \
     --packages "$PACK"

On a cluster

Standalone mode:

PACK="com.github.astrolabsoftware:spark-fits_2.11:0.7.2"
PYSPARK_DRIVER_PYTHON_OPTS="jupyter-notebook --debug --no-browser --port=$PORT1" pyspark \
     --master "$SPARKURL" \
     --packages "$PACK" \
     --driver-memory "$MEMDRIVER" \
     --executor-memory "$MEMEXEC" \
     --executor-cores "$EXECCORES" \
     --total-executor-cores "$TOTALCORES"
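The standalone command above relies on several shell variables that must be defined beforehand. Here is a minimal sketch of how they might be set. Every value is an illustrative assumption (the host name, port, and resource sizes are placeholders to adapt to your cluster), and `require_vars` is a hypothetical helper, not something the repository provides.

```shell
#!/bin/sh
# All values below are illustrative placeholders, not recommendations.
PORT1=24500                          # port for the Jupyter server (hypothetical)
SPARKURL="spark://master-host:7077"  # hypothetical host; 7077 is the default standalone master port
MEMDRIVER=4g                         # memory for the driver process
MEMEXEC=8g                           # memory per executor
EXECCORES=4                          # cores per executor
TOTALCORES=16                        # total cores across all executors

# Hypothetical helper: fail early if any required variable is empty or unset.
require_vars() {
    for name in "$@"; do
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "Variable $name is not set" >&2
            return 1
        fi
    done
}

require_vars PORT1 SPARKURL MEMDRIVER MEMEXEC EXECCORES TOTALCORES \
    && echo "Configuration looks complete"
```

Checking the variables first avoids launching `pyspark` with an empty `--master` or memory argument, which would otherwise only show up as a less obvious error at startup.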

DESC members: working at NERSC

Source your DESC environment. Then go to the Jupyter Lab web interface, and execute the notebook with the desc-pyspark kernel.
