From 6964773e62928e9dd7ee8d3b6016616aa4ca920d Mon Sep 17 00:00:00 2001
From: Bouwe Andela
Date: Mon, 6 Nov 2023 14:30:14 +0100
Subject: [PATCH] Remove nlesc specific chapter

---
 _sidebar.md                               |   3 -
 nlesc_specific/e-infrastructure/das5.md   | 175 ------------------
 .../e-infrastructure/e-infrastructure.md  | 100 ----------
 3 files changed, 278 deletions(-)
 delete mode 100644 nlesc_specific/e-infrastructure/das5.md
 delete mode 100644 nlesc_specific/e-infrastructure/e-infrastructure.md

diff --git a/_sidebar.md b/_sidebar.md
index 66eec35..08d48c9 100644
--- a/_sidebar.md
+++ b/_sidebar.md
@@ -19,6 +19,3 @@
 * [C and C++](/best_practices/language_guides/ccpp.md)
 * [Fortran](/best_practices/language_guides/fortran.md)
 * [Contributing to this Guide](/CONTRIBUTING.md)
-* NLeSC specific
-  * [Access to (Dutch) e-Infrastructure](/nlesc_specific/e-infrastructure/e-infrastructure.md)
-  * [DAS-5](/nlesc_specific/e-infrastructure/das5.md)

diff --git a/nlesc_specific/e-infrastructure/das5.md b/nlesc_specific/e-infrastructure/das5.md
deleted file mode 100644
index 50c9120..0000000
--- a/nlesc_specific/e-infrastructure/das5.md
+++ /dev/null
@@ -1,175 +0,0 @@

# DAS-5

*This text is about DAS-5. However, most of the advice should also be applicable to its successor [DAS-6](https://www.cs.vu.nl/das/home.shtml).*

This text gives a couple of practical hints to get you started using the DAS-5 quickly. It is intended for people with little to no experience using compute clusters.

First of all, and this is the most important point in this text: read the usage policy and make sure you understand every word of it: https://www.cs.vu.nl/das5/usage.shtml

The DAS-5 consists of multiple cluster sites; the largest one is located at the VU, which you can reach using the hostname `fs0.das5.cs.vu.nl`. The firewall requires that your IP is whitelisted, which means you will be able to access the DAS from the eScience Center office, but not directly when you are somewhere else. To use the DAS from anywhere, you can use eduVPN.

When you log in, you are logged into the head node; this node should not be used for any computational work. The cluster uses a reservation system: if you want to use any node other than the head node, you must reserve a compute node through that system. The reservation system on DAS-5 is called Slurm. You can see all running jobs on the cluster using `squeue` and cancel any of your running jobs with `scancel <jobid>`.

The files in your home directory `/home/username/` are backed up automatically; if you accidentally delete an important file, you can email the maintainer and kindly request that an old version of the file be put back. If you have to store large data sets, put them under `/var/scratch/username/`; the scratch space is not backed up.

You can use the command `module` to gain access to a large set of preinstalled software. Use `module list` to see which modules are currently loaded and `module avail` to see all available modules. You can load or unload modules with `module load` and `module unload`. You may want to add some of the modules you use frequently to your `.bashrc`. Note that all these modules do is add entries to, or remove them from, your `PATH` and `LD_LIBRARY_PATH` environment variables. If you need software that is not preinstalled, you can install it into your home directory. For installing Python packages, use Anaconda or `pip install --user`.
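
For example, a typical sequence of commands might look like the sketch below; the module name `cuda10.1/toolkit` and the package name `somepackage` are placeholders, so check `module avail` for what is actually installed:
```
module list                      # show currently loaded modules
module avail                     # list all available modules
module load cuda10.1/toolkit     # load a module (placeholder name)
module unload cuda10.1/toolkit   # unload it again
pip install --user somepackage   # install a Python package into your home directory
```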

If you want an interactive login on one of the compute nodes through the reservation system, you can use `srun -N 1 --pty bash`. The `srun` command runs a program on a compute node: `-N` specifies the number of nodes, `--pty` makes it an interactive job, and `bash` is the name of the program being launched. The reservation is only cancelled when you log out of the interactive session, so please observe the rules regarding reservation lengths.

To access the nodes you have reserved quickly, it is a good idea to generate an SSH key pair and add your own public key to your `authorized_keys` file. This allows you to SSH to nodes you have reserved without password prompts.

To reserve a node with a particular GPU, you have to tell `srun` what kind of node you want. I have the following alias in my `.bashrc`, because I use it all the time:
`alias gpurun="srun -N 1 -C TitanX --gres=gpu:1"`
If you prefix any command with `gpurun`, the command will be executed on one of the compute nodes with an NVIDIA GTX Titan X GPU. You can also type `gpurun --pty bash` to get an interactive login on such a node.

## Running Jupyter Notebooks on DAS-5 nodes

If you have a Jupyter notebook that needs a powerful GPU, it can be useful to run the notebook not on your laptop, but on a GPU-equipped DAS-5 node instead.

### How to set it up

It can be a bit tricky to get this to work. In short, you need to install Jupyter, for example using the following command:
```
pip install jupyter
```
It is also recommended that you add this alias to your `.bashrc` file:
```
alias notebook-server="srun -N 1 -C TitanX --gres=gpu:1 bash -c 'hostname; XDG_RUNTIME_DIR= jupyter notebook --ip=* --no-browser'"
```
Now you can start the server with the command `notebook-server`.

After this, you just need to connect to your Jupyter notebook server. The easiest way is to start Firefox on the head node (fs0) and connect to the node whose name was printed by the `notebook-server` command. Depending on which node you got from the scheduler, you can go to the address `http://node0XX:8888/`. For more details and different ways of connecting to the server, see the longer explanation below.

### More detailed explanation

First of all, you need to install Jupyter into your DAS-5 account. I recommend using Miniconda, but any Python environment works. If you are using the native Python 2 installation on the DAS, do not forget to add the `--user` option to the following pip command. You can install Jupyter using `pip install jupyter`.

Now comes the tricky bit: we are going to connect to the head node of the DAS-5, reserve a node through the reservation system, and start a notebook server on that node. You can use the following alias for that; I suggest storing it in your `.bashrc` file:
`alias notebook-server="srun -N 1 -C TitanX --gres=gpu:1 bash -c 'hostname; XDG_RUNTIME_DIR= jupyter notebook --ip=* --no-browser'"`

Let's first explain what this alias actually does for you. The first part of the command is similar to the `gpurun` alias explained above. If you do not require a GPU in your node, remove the `-C TitanX --gres=gpu:1` part. Now let's take a look at what the rest of this command is doing.

On the node that we reserve through `srun`, we execute the following bash command:
`hostname; XDG_RUNTIME_DIR= jupyter notebook --ip=* --no-browser`
These are actually two commands. The first only prints the name of the host, which is important because you will need to connect to that node later. The second command starts by unsetting the environment variable `XDG_RUNTIME_DIR`.

On the DAS, we normally do not have access to the default directory pointed to by the environment variable `XDG_RUNTIME_DIR`. The Jupyter notebook server wants to use this directory for storing temporary files; if `XDG_RUNTIME_DIR` is not set, it falls back to a location such as `/tmp` that it does have permission to access.

The notebook server that we start would normally only listen to connections from localhost, i.e. the node on which the notebook server is running. That is why we pass the `--ip=*` option: it configures the notebook server to also listen to incoming connections from the head node. Be warned that this is highly insecure and should only be used within trusted environments with strict access control, like the DAS-5 system.

We also need the `--no-browser` option, because we do not want to run the browser on the DAS node.

You can now type `notebook-server` to actually reserve a node and start the Jupyter notebook server.

Now that we have a running Jupyter notebook server, there are two different approaches to connecting to it:
 1. run your browser locally and set up a SOCKS proxy to forward your HTTP traffic to the head node of the DAS
 2. start a browser on the head node of the DAS and use X-forwarding to access that browser

Approach 1 is very much recommended, but if you cannot get it to work, you can fall back to option 2.

### Using a SOCKS proxy

In this step, we will create an SSH tunnel that we will use to forward our HTTP traffic, effectively turning the head node of the DAS into your private proxy server. Make sure that you can connect to the head node of the DAS, for example using a VPN. If you are using another SSH host in between, it makes sense to configure your SSH client with `ProxyJump` or `ProxyCommand`. The following command is rather handy; you might want to save it in your `.bashrc`:
`alias dasproxy="ssh -fNq -D 8080 <username>@fs0.das5.cs.vu.nl"`
Do not forget to replace `<username>` with your own username on the DAS.

Option `-f` stands for background mode, which means the process started with this command will keep running in the background, `-N` means there is no command to be executed on the remote host, and `-q` stands for quiet mode, meaning that most output will be suppressed.

After executing the above SSH command, start your local browser and configure it to use the proxy server. Manually configure the proxy as a "SOCKS v5" proxy with the address `localhost` and port 8080. Do not forget to tick the box to also proxy DNS traffic over this proxy.

After changing these settings, navigate to the page `http://node0XX:8888/`, where `node0XX` should be replaced with the hostname of the node you are running the notebook server on. Now open your notebook in the browser and get started using notebooks on a remote server!
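
If you do go through an intermediate SSH host, the same setup can also live in your `~/.ssh/config` instead of an alias. The entry below is only a sketch: `gateway.example.org` and `<username>` are placeholders for your own jump host and account.
```
Host das5
    HostName fs0.das5.cs.vu.nl
    User <username>
    # Only needed when fs0 is not directly reachable:
    ProxyJump <username>@gateway.example.org
    # Equivalent to the -D 8080 option: open a local SOCKS proxy on port 8080
    DynamicForward 8080
```
With such an entry, `ssh -fNq das5` sets up the jump and the proxy in one go.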

### Using X-Forwarding

Using another terminal, create an `ssh -X` connection to the head node of the DAS-5. Note that it is very important that you use `ssh -X` for the whole chain of connections to the node, including the connection to the head node of the DAS and any intermediate servers you are using. This also requires an X server on your local machine; if you are running Windows, I recommend installing VirtualBox with a Linux guest OS.

On the head node, type `firefox http://node0XX:8888/`, where `node0XX` should be replaced with the hostname of the node you are running the notebook server on. Now open your notebook in the browser and get started using notebooks on a remote server!

diff --git a/nlesc_specific/e-infrastructure/e-infrastructure.md b/nlesc_specific/e-infrastructure/e-infrastructure.md
deleted file mode 100644
index 0e24742..0000000
--- a/nlesc_specific/e-infrastructure/e-infrastructure.md
+++ /dev/null
@@ -1,100 +0,0 @@

# Access to (Dutch) e-Infrastructure

To successfully run a project and to make sure the project is sustainable after it has ended, it is important to choose the e-Infrastructure carefully. Examples of e-Infrastructure used by eScience Center projects are High Performance Computing machines (supercomputers, grids, clusters), clouds, data storage infrastructure, and web application servers.

In general, PIs will already have access to (usually local) e-Infrastructure, and they are encouraged to think about what e-Infrastructure they need in the project proposal. Still, many also request our help in finding suitable e-Infrastructure during the project.

Which infrastructure is best very much depends on the project, so we will not attempt to describe the optimal infrastructure here. Instead, we describe what is most commonly used, and how to gain access to this e-Infrastructure.

Lack of e-Infrastructure should never be a reason for not being able to do a project (well). If you ever find yourself without proper e-Infrastructure, come talk to the Efficient Computing team. We should be able to get you going quickly.

## SURF

SURF is the most obvious supplier of e-Infrastructure for Netherlands eScience Center projects. For all e-Infrastructure needs we usually look to SURF first. This does not mean SURF is our exclusive e-Infrastructure provider: we use whatever infrastructure is best for the project, provided by SURF or otherwise.

### Getting access to SURF infrastructure

In general, access to SURFsara resources is free of charge for scientists in the Netherlands. For most infrastructure, gaining access is a matter of filling in a simple web form, which you can do yourself on behalf of the scientists in the project. Exceptions are Cartesius and Lisa, for which a more involved process is required. For these machines, only the PI of a project can submit a request (or anyone else with an NWO Iris account).

The Netherlands eScience Center also has access to the infrastructure provided by SURFnet. Access is normally arranged on a per-organization basis, so it may vary from one project partner to the next.

### Available systems at SURF

Here we list some of the SURF resources most likely to be used. See the [overview of SURF services and products](https://www.surf.nl/en/research-it) and the [detailed information on the SURFsara infrastructure](https://userinfo.surfsara.nl/systems).

SURFsara:

- **Snellius**: The Dutch national supercomputer. Snellius is a general-purpose capability system designed to be well balanced. If you need one or more of many cores, large symmetric multiprocessing nodes, high memory, a fast interconnect, a lot of work space on disk, or a fast I/O subsystem, then Snellius is the machine of choice.
- **Lisa**: National cluster. Machines similar to Cartesius (the previous Dutch national supercomputer), but without the fast interconnect (about 8000 cores in total) and with more limited storage. Lisa is designed to run lots of small (1 to 16 core) applications at the same time.
- **Grid**: The same machines again, now with grid middleware. Not recommended for use in eScience Center projects.
- **HPC Cloud**: On-demand computing infrastructure. Nice if you need longer-running services or have a lot of special software requirements.
- **Hadoop**: Big Data analytics framework.
- **BeeHub**: Lots of storage with a WebDAV interface.
- **Elvis**: Remote rendering cluster. Creates a remote desktop session to a Linux machine with powerful NVIDIA graphics cards installed.
- **Data Archive**: Secure, long-term storage of research data on tape. Access to the archive is included with Cartesius and Lisa project accounts.

SURFnet:

- **SURFconext**: Federated identity management. Allows scientists to log in to services using their home organization account. The best-known example is SURFspot. Can be added to custom services as well.
- **SURFdrive**: Dropbox-like service hosted by SURF.

For questions, contact helpdesk@surfsara.nl.

## DAS-5

The Netherlands eScience Center participates in the [DAS-5 (Distributed ASCI Supercomputer)](http://www.cs.vu.nl/das5), a system for experimental computer science. Though not intended for production work, it is great for developing software on, especially HPC, parallel, and/or distributed software.

DAS-5 consists of six clusters at five different locations in the Netherlands, with a total of about 200 machines, over 3000 cores, and about 800 TB of total storage. The clusters are connected with dedicated lightpaths, and internally each cluster has a fast interconnect. DAS-5 also contains an ever-increasing number of accelerators (mostly GPUs).

DAS-5 is explicitly meant as an experimentation platform: any job should be able to start instantly, and long queue times should be avoided. Running long jobs is therefore not allowed during working hours; during nights and weekends these rules do not apply. See [the usage policy](http://www.cs.vu.nl/das5/usage.shtml).

Any eScience Center employee can get a DAS-5 account, usually available within a few hours.

## Security and convenience when committing code to GitHub from a cluster

When accessing a cluster, it is generally [safer to use a pair of keys than to log in using a username and password](https://superuser.com/questions/303358/why-is-ssh-key-authentication-better-than-password-authentication). There is a [guide on how to set up those keys](https://www.cyberciti.biz/faq/how-to-set-up-ssh-keys-on-linux-unix/). Make sure you encrypt your private key and that it is not automatically decrypted when you log in to your local machine. Make a separate pair of keys to access your GitHub account, following [GitHub's instructions](https://help.github.com/articles/generating-a-new-ssh-key-and-adding-it-to-the-ssh-agent/). This involves [uploading your public key to your GitHub account](https://help.github.com/articles/adding-a-new-ssh-key-to-your-github-account/) and [testing your connection](https://help.github.com/articles/testing-your-ssh-connection/).
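
For example, on your local machine this could look like the following sketch, where the key file name and the email-address comment are placeholders:
```
ssh-keygen -t ed25519 -C "you@example.com" -f ~/.ssh/id_ed25519_github
# upload ~/.ssh/id_ed25519_github.pub to your GitHub account, then test the connection:
ssh -T git@github.com
```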

When committing code from a cluster to GitHub, one needs to store an encrypted private key in the `$HOME/.ssh` directory on the cluster. This is inconvenient, because it requires entering a passphrase to unlock the private key, and this passphrase has to be entered again when SSHing from the head node to another node. To bypass this inconvenience, [SSH agent forwarding](https://developer.github.com/guides/using-ssh-agent-forwarding/) is recommended. It is very simple. On your local machine, create a `$HOME/.ssh/config` file containing the following:
```
Host example.com
    ForwardAgent yes
```
Replace `example.com` with the head node of your cluster, i.e. the node you log in to.
Next, run:
```
chmod 600 $HOME/.ssh/config
```
Done!

The only remaining problem is that SSH keys cannot be used when the repository was cloned using HTTPS instead of SSH, but that can be [corrected](http://stackoverflow.com/questions/6565357/git-push-requires-username-and-password):
```
git remote set-url origin git@github.com:username/repo.git
```

## Commercial Clouds

If needed, a project can use commercial cloud resources, normally only if the SURF resources do not meet the requirements. As long as the costs are within limits, these can come out of the eScience Center general project budget; for larger amounts, the PI will need to provide funding.

We do not have an official standard commercial cloud provider, but we have the most experience with Amazon AWS.

## Procolix

If longer-term infrastructure is needed that cannot be provided by SURF, the default company we use for managed hosting is [Procolix](https://www.procolix.com/). Procolix hosts our eduroam/SURFconext authentication machines.

In principle, the eScience Center will not pay for infrastructure needed by projects. In these cases, the PIs will have to pay the bill.

## GitHub Pages

If a project needs a website or web app with only static content (JavaScript, HTML, etc.), it is also possible to host this on GitHub. See https://pages.github.com/.

## Local Resources

A scientist may have access to locally available infrastructure.

## Other

This list does not include resources from Nikhef, CWI, RUG, Target, etc., as these are (as far as we know) not open to all scientists.

## Avoid if possible

Try to avoid using self-managed resources (the proverbial machine under the postdoc's desk). This may seem an easy solution at first, but it will most probably require significant effort over the course of the project. It also increases the chances of the infrastructure disappearing at some random moment after the project has finished.