diff --git a/pages/docs/community-docs/watcloud/user-requests.mdx b/pages/docs/community-docs/watcloud/user-requests.mdx new file mode 100644 index 0000000..c23c1f6 --- /dev/null +++ b/pages/docs/community-docs/watcloud/user-requests.mdx @@ -0,0 +1,57 @@ +# User Requests + +Sometimes, users may request access to additional software packages, more disk quota, or longer SLURM job time limits. +This document outlines the process of handling such requests. + +Prior to performing changes for the request, WATcloud admins should understand the request fully and consider the implications of the change. +The following questions are examples of what the admins should consider: + +**Does the request make use of cluster resources in a fair and efficient manner?** + +For example, if a user requests more disk quota, the WATcloud admin should consider whether the type of data +the user is storing is appropriate for the storage location. +For instance, long-term storage of large datasets on SSD-backed storage may not be the most efficient use of resources. + +**Can the request be fulfilled without the help of WATcloud admins?** + +Sometimes, the requested work can be done by the users themselves. +The WATcloud admin should point the user to the relevant documentation or resources, and improve the documentation if necessary. +For example, if a user requests a software package that can be installed in user-space (e.g. Conda), +the WATcloud admin should point the user to some options (e.g. [miniconda](https://docs.anaconda.com/miniconda/)). + +In subsequent sections, we will discuss specific user requests and how to handle them. + +## SLURM Job Time Limit Extension + +Users may request an extension to the time limit of their SLURM[^slurm-docs] jobs. +The time limits are in place to ensure that the cluster remains maintainable. +For example, if the maximum time limit jobs is 7 days, then in the worst case scenario, +we need to wait 7 days for a node to be drained before performing maintenance[^slurm-maintenance] on it. + +If a user requests an extension to the time limit and the request does not conflict with upcoming maintenance, +then the WATcloud admin can extend the time limit as follows: + +```bash copy +scontrol update job= TimeLimit= +``` + +[^slurm-docs]: See the [SLURM documentation](/docs/compute-cluster/slurm) for more information on the use of SLURM in the WATcloud cluster. +[^slurm-maintenance]: See the [Maintenance Manual](./maintenance-manual#slurm) for information on performing maintenance on SLURM nodes. + +## Software Package Installation + +Users may request the installation of software packages that are not already available on the cluster. +If the software package is required to be installed system-wide and the demand for the software outweighs +the cost (in terms of storage space and maintenance complexity) of installing the software, +then the WATcloud admin can add the software package to the cluster Ansible configuration[^ansible-config]. + +[^ansible-config]: At the time of writing, the Ansible configuration can be found [here](https://github.com/WATonomous/infra-config/blob/7d59786e61ce779f20e7cdb8c29b4864fe1b6c31/ansible/roles/ubuntu_dev_vm/tasks/main.yml#L94-L144). + Note that removing software packages is a separate step [here](https://github.com/WATonomous/infra-config/blob/7d59786e61ce779f20e7cdb8c29b4864fe1b6c31/ansible/roles/ubuntu_dev_vm/tasks/main.yml#L27-L65). + +{ +// Separate footnotes from the main content +} + +import { Separator } from "@/components/ui/separator" + +