
BUDA job submission

David Anderson edited this page Nov 26, 2024 · 3 revisions

BUDA science apps and variants

We call BUDA applications 'science apps'. Each science app has a name, like 'worker' or 'autodock'. A science app can have 'variants' that use different types of computer hardware. The name of a variant is 'cpu' if it uses a single CPU. Otherwise it's the name of a plan class. There might be variants for 1 CPU, for N CPUs, and for various GPU types.

User file sandbox

The BUDA tools use the user file sandbox for uploading files to the BOINC server. To access it, go to Computing / File sandbox in the project menu bar.

Managing science apps and variants

In the menu bar of the BOINC project's web site, select Computing / Job Submission, then click on BUDA. This shows a list of existing science apps and their variants. You can:

  • add or delete a science app;
  • add or delete a variant;
  • submit jobs to a variant.

Adding a variant

The form for adding a variant includes:

  • a plan class name (leave blank for 1-CPU variants);
  • a set of 'app files', selected from your file sandbox. This includes:
    • a Dockerfile
    • a main program to run in the container
    • other app files if needed
  • a list of input file names;
  • a list of output file names.

The Dockerfile should specify the directory /app as its working directory (WORKDIR). For example:

FROM debian
WORKDIR /app
CMD ./main_2.sh

This specifies an image based on the latest Debian image from Docker Hub, with main_2.sh (in this case a shell script) as the main program. This file and any executables it runs are included in the set of app files.

Submitting jobs

The form for submitting a batch of jobs has you select (from the file sandbox) a 'batch file'. The batch file is a zip file containing one directory per job. Each job directory contains the input files for that job, and an optional file cmdline containing command-line arguments to be passed to the main program.

If there are input files shared among all jobs, these can be put at the top level of the batch file. This can save disk space in some cases.

For example, suppose that the app takes input files file1, file2, and file3, and that all jobs use the same file1 but different file2 and file3. In this case the batch file could contain


file1
jobname1/
    [cmdline]
    file2
    file3
    ...
jobname2/
    [cmdline]
    file2
    file3
    ...
...
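
A batch file with this layout can be assembled with any zip tool. As a minimal sketch, the Python snippet below builds one in memory using the standard zipfile module. The helper name build_batch_zip, the file contents, and the "--seed 1" arguments are hypothetical placeholders, not part of BUDA. The sketch writes only file entries (no explicit directory entries), on the assumption that the server accepts such a zip.

```python
import io
import zipfile


def build_batch_zip(shared_files, jobs):
    """Return the bytes of a batch zip file (hypothetical helper).

    shared_files: dict of file name -> bytes, stored at the top level.
    jobs: dict of job name -> dict with
          "files": dict of file name -> bytes for that job's own inputs,
          "cmdline": optional str of command-line arguments.
    """
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        # Files shared by all jobs go at the top level of the zip.
        for name, data in shared_files.items():
            zf.writestr(name, data)
        # Each job gets its own directory, named after the job.
        for job_name, job in jobs.items():
            cmdline = job.get("cmdline")
            if cmdline is not None:
                zf.writestr(f"{job_name}/cmdline", cmdline)
            for name, data in job["files"].items():
                zf.writestr(f"{job_name}/{name}", data)
    return buf.getvalue()


# Placeholder contents; "--seed 1" is an invented example argument.
batch_bytes = build_batch_zip(
    shared_files={"file1": b"shared input\n"},
    jobs={
        "jobname1": {"cmdline": "--seed 1",
                     "files": {"file2": b"a\n", "file3": b"b\n"}},
        "jobname2": {"files": {"file2": b"c\n", "file3": b"d\n"}},
    },
)

# Write the result to disk so it can be uploaded to the file sandbox.
with open("batch.zip", "wb") as f:
    f.write(batch_bytes)
```

The resulting batch.zip can then be uploaded to the file sandbox and selected in the job submission form.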

The file names at the top level and in each job directory must match the variant's list of input file names. Each job must have all input files.
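
It can be convenient to check these rules locally before uploading. The following is a rough sketch of such a check, not the validation the server performs. The function check_batch and its messages are hypothetical. It assumes that a shared top-level file counts toward every job's inputs, and that cmdline is the only extra file allowed in a job directory.

```python
import io
import zipfile


def check_batch(zip_bytes, input_names):
    """Return a list of problems found in a batch zip; empty if none."""
    expected = set(input_names)
    shared, jobs, problems = set(), {}, []
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
        for info in zf.infolist():
            parts = info.filename.rstrip("/").split("/")
            if info.is_dir():
                # An explicit top-level directory entry names a job.
                if len(parts) == 1:
                    jobs.setdefault(parts[0], set())
            elif len(parts) == 1:
                shared.add(parts[0])          # shared top-level file
            elif len(parts) == 2:
                jobs.setdefault(parts[0], set()).add(parts[1])
            else:
                problems.append(f"unexpected nested path: {info.filename}")
    for name in sorted(shared - expected):
        problems.append(f"unexpected top-level file: {name}")
    for job, files in sorted(jobs.items()):
        own = files - {"cmdline"}             # cmdline is optional, not an input
        for name in sorted(own - expected):
            problems.append(f"{job}: unexpected file {name}")
        for name in sorted(expected - shared - own):
            problems.append(f"{job}: missing input file {name}")
    return problems


def make_zip(entries):
    """Build a small in-memory zip for demonstration purposes."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for path, data in entries.items():
            zf.writestr(path, data)
    return buf.getvalue()


good = make_zip({"file1": b"", "jobname1/cmdline": b"",
                 "jobname1/file2": b"", "jobname1/file3": b""})
bad = make_zip({"file1": b"", "jobname1/file2": b""})

print(check_batch(good, ["file1", "file2", "file3"]))
print(check_batch(bad, ["file1", "file2", "file3"]))
```

The first call reports no problems; the second reports that jobname1 is missing file3.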

Monitoring a batch

After submitting a batch of jobs, you're taken to a web page for the batch. This shows, among other things, how many of the jobs have completed. Reload it to update this information.

You can click on a job to see its status (and if it failed, the stderr output). You can view or download its input files.

On the batch page, you can click to download a zip file of the output files of all completed jobs. These filenames have the form

batch_<batchid>__job_<jobname>__file_<filename>

where

  • batchid is the (integer) ID of the batch;
  • jobname is the job name (the directory name from the batch file);
  • filename is the name of the output file as written by the app.
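
When post-processing the downloaded zip file, these names can be split back into their parts. The sketch below does this with a regular expression. The function parse_output_name and the example file name are illustrative, not part of BUDA. It assumes that the job name does not itself contain the string "__file_", since the format would otherwise be ambiguous.

```python
import re

# batch_<batchid>__job_<jobname>__file_<filename>
OUTPUT_NAME = re.compile(
    r"^batch_(?P<batchid>\d+)__job_(?P<jobname>.+?)__file_(?P<filename>.+)$"
)


def parse_output_name(name):
    """Split an output file name into (batchid, jobname, filename).

    Returns None if the name does not have the expected form.
    """
    m = OUTPUT_NAME.match(name)
    if m is None:
        return None
    return int(m["batchid"]), m["jobname"], m["filename"]


# Hypothetical example name, for illustration only.
print(parse_output_name("batch_17__job_jobname1__file_out.txt"))
```

For the example name this yields the batch ID 17, the job name jobname1, and the file name out.txt.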

When you're done with the batch, you can 'retire' it. This removes its input and output files from the server.
