# This repository has been archived by the owner on Aug 9, 2024. It is now read-only.
# Copyright 2020 Omnivector Solutions, LLC
# See LICENSE file for licensing details.

name: slurmctld

summary: |
  Slurmctld, the central management daemon of Slurm.

description: |
  This charm provides slurmctld, munged, and the bindings to other utilities
  that make lifecycle operations a breeze.

  slurmctld is the central management daemon of SLURM. It monitors all other
  SLURM daemons and resources, accepts work (jobs), and allocates resources
  to those jobs. Given the critical functionality of slurmctld, there may be
  a backup server to assume these functions in the event that the primary
  server fails.

links:
  contact: https://matrix.to/#/#hpc:ubuntu.com
  issues:
    - https://github.com/charmed-hpc/slurmctld-operator/issues
  source:
    - https://github.com/charmed-hpc/slurmctld-operator

requires:
  slurmd:
    interface: slurmd
  slurmdbd:
    interface: slurmdbd
  slurmrestd:
    interface: slurmrestd

assumes:
  - juju

type: charm

bases:
  - build-on:
      - name: ubuntu
        channel: "22.04"
    run-on:
      - name: ubuntu
        channel: "22.04"
        architectures: [amd64]
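# A deployment sketch for the relation endpoints declared under `requires`
# above. The application names are hypothetical, and `juju integrate` is
# left to infer the endpoint pair; adjust to your deployment:
#
#   $ juju deploy slurmctld
#   $ juju deploy slurmd
#   $ juju deploy slurmdbd
#   $ juju integrate slurmctld slurmd
#   $ juju integrate slurmctld slurmdbd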
config:
  options:
    cluster-name:
      type: string
      default: osd-cluster
      description: |
        Name to be recorded in the database for jobs from this cluster.

        This is important if a single database is used to record information
        from multiple Slurm-managed clusters.
    default-partition:
      type: string
      default: ""
      description: |
        Default Slurm partition. This is only used if defined, and must match
        an existing partition.
    slurm-conf-parameters:
      type: string
      default: ""
      description: |
        User-supplied Slurm configuration as a multiline string.

        Example usage:
        $ juju config slurmctld slurm-conf-parameters="$(cat additional.conf)"
    cgroup-parameters:
      type: string
      default: |
        CgroupAutomount=yes
        ConstrainCores=yes
      description: |
        User-supplied configuration for `cgroup.conf`.
    health-check-params:
      default: ""
      type: string
      description: |
        Extra parameters for the NHC command.

        This option can be used to customize how NHC is called, e.g. to send
        an e-mail to an admin when NHC detects an error, set this value to
        `-M [email protected]`.
    health-check-interval:
      default: 600
      type: int
      description: Interval in seconds between executions of the health check.
    health-check-state:
      default: "ANY,CYCLE"
      type: string
      description: Only run the health check on nodes in this state.
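# An illustrative sketch of setting the options above (the values shown are
# placeholders, not recommendations):
#
#   $ juju config slurmctld cluster-name="test-cluster"
#   $ juju config slurmctld default-partition="compute"
#   $ juju config slurmctld health-check-interval=300
#   $ juju config slurmctld cgroup-parameters="$(cat cgroup.conf)"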
actions:
  show-current-config:
    description: |
      Display the currently used `slurm.conf`.

      Example usage:

      ```bash
      juju run slurmctld/leader show-current-config \
        --quiet --format=json | jq .[].results.slurm.conf | xargs -I % -0 python3 -c 'print(%)'
      ```
  drain:
    description: |
      Drain specified nodes.

      Example usage:
      $ juju run slurmctld/leader drain nodename="node-[1,2]" reason="Updating kernel"
    params:
      nodename:
        type: string
        description: The nodes to drain, using the Slurm format, e.g. `"node-[1,2]"`.
      reason:
        type: string
        description: The reason for draining the nodes.
    required:
      - nodename
      - reason
  resume:
    description: |
      Resume specified nodes.

      Note: Newly added nodes will remain in the `down` state until configured
      with the `node-configured` action.

      Example usage:
      $ juju run slurmctld/leader resume nodename="node-[1,2]"
    params:
      nodename:
        type: string
        description: |
          The nodes to resume, using the Slurm format, e.g. `"node-[1,2]"`.
    required:
      - nodename