This repository has been archived by the owner on Aug 9, 2024. It is now read-only.

initialize new slurmd charm project
Bring forward core components from the monorepo for slurmd.
jamesbeedy committed Jan 4, 2023
1 parent 9aa777e commit fb7b4ca
Showing 15 changed files with 1,617 additions and 0 deletions.
9 changes: 9 additions & 0 deletions .gitignore
@@ -0,0 +1,9 @@
venv/
build/
*.charm
.tox/
.coverage
__pycache__/
*.py[cod]
.idea
.vscode/
103 changes: 103 additions & 0 deletions actions.yaml
@@ -0,0 +1,103 @@
version:
  description: Return version of installed software.
node-configured:
  description: Remove a node from DownNodes when the reason is `New node`.
get-node-inventory:
  description: Return node inventory.
set-node-inventory:
  description: Modify node inventory.
  params:
    real-memory:
      type: integer
      description: Total amount of memory of the node, in MB.

show-nhc-config:
  description: Display the currently used `nhc.conf`.

get-infiniband-repo:
  description: >
    Display the currently configured repository for Infiniband drivers.
set-infiniband-repo:
  description: >
    Set the new Infiniband repository.
    Note: The repository file must be base64 encoded when using this action.
    Example usage:
    $ juju run-action slurmd/leader set-infiniband-repo repo="$(cat repo.file | base64)"
  params:
    repo:
      type: string
      description: >
        Base64 encoded string that holds all information about the repository.
  required:
    - repo
  description: >
    Overrides the repository file with a custom repository for Infiniband
    installation.
    Note: This file should be base64 encoded.
    On CentOS, the file is placed at `/etc/yum.repos.d/infiniband.repo`, while
    on Ubuntu it is at `/etc/apt/sources.list.d/infiniband.list`.
install-infiniband:
  description: >
    Install Mellanox Infiniband drivers. This might take a few minutes to
    complete.
    If no custom repository was specified before, this action will set the
    Mellanox repository as the default and install the latest drivers from it.
uninstall-infiniband:
  description: Uninstall Mellanox Infiniband drivers.
start-infiniband:
  description: Start Infiniband systemd service.
enable-infiniband:
  description: Enable Infiniband systemd service.
stop-infiniband:
  description: Stop Infiniband systemd service.
is-active-infiniband:
  description: Check if Infiniband systemd service is active.

nvidia-repo:
  description: >
    Get or set the repository used to install Nvidia drivers.
    This value must be set **before** installing the drivers. Changing it
    afterwards has no impact on the system.
    Note: The repository file must be base64 encoded when using this action.
  params:
    repo:
      type: string
      description: >
        If specified, set the repository to the value specified.
nvidia-package:
  description: >
    Get or set the Nvidia driver package name.
    This value must be set **before** installing the drivers. Changing it
    afterwards has no impact on the system.
  params:
    package:
      type: string
      description: >
        If specified, set the package name to the value specified.
nvidia-install:
  description: >
    Install Nvidia GPU drivers. This might take a few minutes to complete.
    If no custom repository was specified before, this action will set the
    Nvidia repository as the default and install the latest drivers from it.
singularity-install:
  description: >
    Install Singularity. This might take a few minutes to complete.
    This action will install Singularity using the official .deb (Ubuntu)
    or .rpm (CentOS) packages retrieved from GitHub Releases.
    Note: The .deb or .rpm files must be supplied as Juju resources.
mpi-install:
  description: >
    Install MPI (`mpich`). This might take a few minutes to complete.
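
Several of the actions above (`set-infiniband-repo`, `nvidia-repo`) expect the repository file as a base64 string. The following is a minimal sketch of that round trip, not the charm's own code: the helper names `encode_repo` and `decode_repo` and the sample repository line are hypothetical, and how the charm actually decodes the value is not shown in this commit.

```python
import base64


def encode_repo(text: str) -> str:
    """Return the base64 form of a repository file's contents."""
    return base64.b64encode(text.encode("utf-8")).decode("ascii")


def decode_repo(encoded: str) -> str:
    """Decode a base64 string, tolerating line-wrapped input."""
    # Drop whitespace first: GNU `base64` wraps long output across lines.
    compact = "".join(encoded.split())
    # validate=True raises binascii.Error on characters outside the alphabet.
    return base64.b64decode(compact, validate=True).decode("utf-8")


# Hypothetical apt source line, used only for illustration.
sample = "deb https://example.com/infiniband focal main\n"
encoded = encode_repo(sample)
assert decode_repo(encoded) == sample
```

Stripping whitespace before decoding matters because the shell form in the action description, `$(cat repo.file | base64)`, yields wrapped output for longer files.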
16 changes: 16 additions & 0 deletions charmcraft.yaml
@@ -0,0 +1,16 @@
type: charm
bases:
  - build-on:
      - name: ubuntu
        channel: "20.04"
    run-on:
      - name: ubuntu
        channel: "20.04"
        architectures: [amd64]
      - name: centos
        channel: "7"
        architectures: [amd64]
parts:
  charm:
    build-packages: [git]
    charm-python-packages: [setuptools]
48 changes: 48 additions & 0 deletions config.yaml
@@ -0,0 +1,48 @@
options:
  custom-slurm-repo:
    type: string
    default: ""
    description: >
      Use a custom repository for Slurm installation.
      This can be set to the organization's local mirror/cache of packages and
      supersedes the Omnivector repositories. Alternatively, it can be used to
      track a `testing` Slurm version, e.g. by setting to
      `ppa:omnivector/osd-testing` (on Ubuntu), or
      `https://omnivector-solutions.github.io/repo/centos7/stable/$basearch`
      (on CentOS).
      Note: The configuration `custom-slurm-repo` must be set *before*
      deploying the units. Changing this value after deploying the units will
      not reinstall Slurm.
  partition-name:
    type: string
    default:
    description: >
      Name by which the partition may be referenced (e.g. `Interactive`).
      Note: The partition name should only contain letters, numbers, and
      hyphens. Spaces are not allowed.
  partition-config:
    type: string
    default: ""
    description: >
      Extra partition configuration, specified as space-separated `key=value`
      pairs on a single line.
      Example usage:
      $ juju config slurmd partition-config="DefaultTime=45:00 MaxTime=1:00:00"
  partition-state:
    type: string
    default: "UP"
    description: >
      State of partition or availability for use. Possible values are `UP`,
      `DOWN`, `DRAIN` and `INACTIVE`. The default value is `UP`. See also the
      related `Alternate` keyword.
  nhc-conf:
    default: ""
    type: string
    description: >
      Custom extra configuration to use for Node Health Check.
      These lines are appended to a basic `nhc.conf` provided by the charm.
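
The option descriptions above imply a few simple input rules: `partition-config` is a single line of space-separated `key=value` pairs, `partition-name` allows only letters, numbers, and hyphens, and `partition-state` is one of four values. The sketch below shows one way such values could be checked. It is an illustration only: the function names are hypothetical, and this commit does not show how the charm itself validates these options.

```python
import re
from typing import Dict

# Values listed in the `partition-state` description.
VALID_STATES = {"UP", "DOWN", "DRAIN", "INACTIVE"}


def parse_partition_config(raw: str) -> Dict[str, str]:
    """Split a space-separated `key=value` line into a dictionary."""
    pairs = {}
    for token in raw.split():
        key, sep, value = token.partition("=")
        if not sep or not key or not value:
            raise ValueError(f"expected key=value, got {token!r}")
        pairs[key] = value
    return pairs


def is_valid_partition_name(name: str) -> bool:
    """Accept only letters, numbers, and hyphens (no spaces)."""
    return re.fullmatch(r"[A-Za-z0-9-]+", name) is not None


def is_valid_partition_state(state: str) -> bool:
    """Check the state against the values named in config.yaml."""
    return state in VALID_STATES


config = parse_partition_config("DefaultTime=45:00 MaxTime=1:00:00")
print(config)
```

Splitting each token on the first `=` only keeps values such as `1:00:00` intact and would also preserve a value that itself contains `=`.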
44 changes: 44 additions & 0 deletions dispatch
@@ -0,0 +1,44 @@
#!/bin/bash
# On first run, this script installs the OS packages the charm needs (on
# Ubuntu or CentOS), records that in a `.installed` marker file, and then
# kicks off the operator framework.

set -e

# Source the os-release information into the env.
. /etc/os-release

if ! [[ -f '.installed' ]]
then
    # Determine whether we are running on Ubuntu or CentOS and
    # provision the prerequisites for that OS.
    if [[ $ID == 'ubuntu' ]]
    then
        echo "Running Ubuntu."
        # necessary to compile and install NHC
        apt-get install --assume-yes make automake
    elif [[ $ID == 'centos' ]]
    then
        # Determine the CentOS version and install prereqs accordingly
        major=$(cat /etc/centos-release | tr -dc '0-9.'|cut -d \. -f1)
        echo "Running CentOS$major, installing prereqs."
        if [[ $major == "7" ]]
        then
            yum -y install epel-release
            yum -y install yum-priorities python3 make automake yum-utils
        elif [[ $major == "8" ]]
        then
            dnf -y install epel-release
            dnf -y install yum-priorities python3 make automake yum-utils
        else
            echo "Running unsupported version of CentOS: $major"
            exit -1
        fi
    else
        echo "Running unsupported OS: $ID"
        exit -1
    fi
    touch .installed
fi

JUJU_DISPATCH_PATH="${JUJU_DISPATCH_PATH:-$0}" PYTHONPATH=lib:venv ./src/charm.py
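
The branching in `dispatch` can also be read as a small decision table: the OS ID and major version map to the install commands to run. The sketch below restates that logic in Python for illustration only. The names `parse_os_release` and `prereq_commands` are hypothetical, and it takes the major version from `VERSION_ID` in os-release, which is an assumption; the script itself reads `/etc/centos-release`.

```python
import shlex

# Package manager used per CentOS major version, as in the script.
CENTOS_TOOLS = {"7": "yum", "8": "dnf"}


def parse_os_release(text):
    """Parse os-release style KEY=value lines into a dict."""
    info = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        parts = shlex.split(value)  # strips shell-style quotes
        info[key] = parts[0] if parts else ""
    return info


def prereq_commands(os_id, major=""):
    """Return the install commands the script would run for this OS."""
    if os_id == "ubuntu":
        return [["apt-get", "install", "--assume-yes", "make", "automake"]]
    if os_id == "centos" and major in CENTOS_TOOLS:
        tool = CENTOS_TOOLS[major]
        return [
            [tool, "-y", "install", "epel-release"],
            [tool, "-y", "install", "yum-priorities", "python3",
             "make", "automake", "yum-utils"],
        ]
    raise ValueError(f"unsupported OS: {os_id} {major}".strip())


# Hypothetical os-release excerpt for a CentOS 7 machine.
sample = 'NAME="CentOS Linux"\nVERSION_ID="7"\nID="centos"\n'
info = parse_os_release(sample)
major = info.get("VERSION_ID", "").split(".")[0]
for command in prereq_commands(info["ID"], major):
    print(" ".join(command))
```

Returning the commands as argument lists rather than running them keeps the sketch free of side effects and makes the supported combinations easy to check.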