Slurm Code Usage Analysis, SCUA

The SCUA tool queries the Slurm accounting database, extracts data on job steps and uses this data to match the executable names to a known set of applications. Once it has done this, it analyses the data and can produce output on various facets of usage.

You can use the scua command on a system with Slurm available to extract the data from the Slurm database and run the analyses or you can use the scua.py command to analyse a data file that contains a dump from the Slurm database in the correct format - this option is useful if you want to run multiple, separate analyses as it runs much quicker (extracting data from the Slurm database is typically the most time consuming step) and you can move the data dump to a different system to perform the analysis.

Note: at the moment, if graphical graphical plots are requested, they are only produced for analysis broken down by software use. Tables of data (as CSV and/or markdown) are available for all analysis breakdowns.

Requirements

scua is a bash script and so has no requirements other than it must be run on the system where the Slurm sacct command is available.

scua.py uses Pandas, numpy, matplotlib, tabulate and seaborn on top of a standard Python 3 installation.

### Environment setup

You must set the environment variable SCUA_BASE to point to the location of the downloaded repository.

Analyses available

The type of analysis you wish to run on the data is specified by command line options (see Usage, below). The following analysis types are available:

Analysis	`scua` Argument Combination	`scua.py` Argument Combination	Description
Software + Size	(default, always performed)	(default, always performed)	Analyses usage by parallel job step size and software package
Software + Node Power	`-w`	`--power`	Analyses job step mean node power draw and software package.
Motif + Size	`-t`	`--motif`	Analyses usage by parallel job step size and computational motif.
Motif + Power	`-t -w`	`--motif --power`	Analyses job step mean node power draw and computational motif.
Area + Size	`-a <project list CSV>`	`--projects=<project list CSV>`	Analyses usage by parallel job step size and research area. Requires a CSV file linking account codes to research areas.
Area + Power	`-a <project list CSV> -w`	`--projects=<project list CSV> --power`	Analyses job step mean node power draw and research area. Requires a CSV file linking account codes to research areas.

`scua` Usage

Usage: scua [options]
Options:
 -a account_csv  Perform analysis by research area. account_csv is a CSV file with a mapping of account codes to research areas
 -A account      Limit to specified account code, e.g. z01
 -c              Save tables of data (as csv)
 -E date/time    End date/time as YYYY-MM-DDTHH:MM, e.g. 2021-02-01T00:00
 -k              Keeps the intermediate output from sacct in `scua_sacct.csv`
 -g              Produce graphs of usage data (as png)
 -h              Show this help
 -m              Save tables of data (as markdown)
 -p prefix       Prefix for output file names
 -S date/time    Start date/time as YYYY-MM-DDTHH:MM, e.g. 2021-02-01T00:00
 -t              Perform analysis by computational motif
 -u user         Limit to specific user
 -w              Perform analysis of mean node power draw

`scua.py` Usage

Usage: scua.py [options] input
Options:
 -projects=account_csv  Perform analysis by research area. account_csv is a CSV file with a mapping of account codes to research areas
 -A account             Limit to specified account code, e.g. z01
 --csv                  Save tables of data (as csv)
 --plots                Produce graphs of usage data (as png)
 --help                 Show this help
 --md                   Save tables of data (as markdown)
 --prefix=prefix        Prefix for output file names
 --motif                Perform analysis by computational motif
 --power                Perform analysis of mean node power draw
 --dropnan              Drop all rows that contain NaN. Useful for strict comparisons between usage and energy use as some job steps may be missing energy use data

Output

SCUA prints the usage statistics to Markdown-formatted tables on STDOUT. If you specify the -c option, it will save the same tables as CSV files and if you specify the -m option it will save the same tables as markdown files. All files will be prefixed with the specified prefix or scua if no prefix is supplied.

If you specify the -g option (to produce graphs) you will also obtain additional image files:

${prefix}_codes_usage.png: Bar chart of CU usage broken down by code
${prefix}_overall_boxplot.png: Boxplot representing overall job size statistics weighted by CU usage per job
${prefix}_top15_boxplot.png: Boxplot representing job size statistics for the top 15 codes by usage weighted by CU usage per job

Name		Name	Last commit message	Last commit date
Latest commit History 175 Commits
app-data/code-definitions		app-data/code-definitions
bin		bin
python-modules		python-modules
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Slurm Code Usage Analysis, SCUA

Requirements

Analyses available

`scua` Usage

`scua.py` Usage

Output

About

Releases 1

Packages

Contributors 5

Languages

License

ARCHER2-HPC/usage-analysis

Folders and files

Latest commit

History

Repository files navigation

Slurm Code Usage Analysis, SCUA

Requirements

Analyses available

scua Usage

scua.py Usage

Output

About

Resources

License

Stars

Watchers

Forks

Releases 1

Packages 0

Contributors 5

Languages

`scua` Usage

`scua.py` Usage

Packages