
Goal

The goal of the production scripts (located in analyzer/batch/jobscripts) is to help us keep track of files as they are processed by updating a production database. Additionally, we would like it to be relatively easy to start a new production.

Location

All scripts discussed here are located at analyzer/batch/jobscripts, except for the SGE scripts, which are located at analyzer/batch/scripts.

Minimum requirements

You must have your netrc file set up (see FTP Settings below).

run_production.py

If run with no arguments, you get this help message:

Usage: ./run_production.py --production=PROD_TYPE [optional arguments...]
----------------------------------------------------------------------------
This program checks the production database located at
$HOME/data/production.db and takes care of downloading, submitting,
and cleanup, and updates the database accordingly so multiple people can
process at the same time.
----------------------------------------------------------------------------
Required:
    --production=PROD_TYPE
                    The production type (alcapana or rootana).
Optional:
    -h, --help        Print this help then exit.
    --usage           Print this help then exit.
    --version=#       Version number to produce. Default is highest
                      production version number in production table. If
                      issued with --new and --production=rootana, this flag
                      tells us which alcapana version to produce from. If
                      issued with --new and --production=alcapana, an
                      exception is thrown.
    --new=TAG         Start a new production, with the version incremented
                      once from the highest number present in the
                      productions table. The software tag in GitHub must
                      be provided for bookkeeping.
    --pause=#         Wait # seconds between the event loop, which checks
                      for finished jobs and runs to download.
                      (Default: 5)
    --nproc=#         Maximum number of processes to submit to the grid
                      simultaneously.
                      (Default: 10)
    --nprep=#         Maximum number of MIDAS files to have downloaded in
                      addition to the ones running on the grid. This means
                      having, at most, NPROC + NPREP MIDAS files at once.
                      (Default: 5)
    --spacelim=#      The maximum percentage of available space to use up
                      (as returned from the mmlsquota command).
                      (Default: 90)
    --display         Use an ncurses display.
    --modules=FILE    Use modules file FILE for rootana production.
                      For alcapana still uses hardcoded MODULES file.
                      (Default: production.cfg)
    --dataset=SET     Only process files from dataset SET. Valid values are
                      Al100, Al50awithoutNDet2, Al50awithNDet2, Al50b, SiR2,
                      SiR21pct, SiR23pct, Si16P, and Common. Multiple can be
                      specified with multiple --dataset=SET.
                      (Default: All datasets.)
    --database=DB     SQLite3 database to use for production. If it doesn't
                      exist, it will not be created.
                      (Default: ~/data/production.db)

The production type tells the script whether to submit alcapana or rootana jobs, and should be a string indicating such. The version number serves two purposes. If there are multiple productions of a certain type going on, it specifies which one; if omitted, the production matching PROD_TYPE with the highest version number is used. To start a new production, pair the new argument with the GitHub software tag. The new argument changes the behavior of the version argument: each production is based on a version of something else, so for rootana you need to tell it which version of alcapana to run on. The version number for a new production is incremented from the highest version number in the database for the requested production type.

The nproc and nprep arguments control how much production happens at once. The MIDAS files for an alcapana run are downloaded ahead of job submission, and nprep sets how many to have ready (default 5); nproc caps the number of simultaneous grid jobs (default 10). Whether these defaults are unreasonably small or large is not known; we could probably bump nproc up to 100, as the job submission system should take care of prioritizing the jobs if we submit too many.

There is a loop that runs every pause seconds, checking for things to download, submit, finish, etc. The pause is there so we don't waste resources spinning through the loop, but a small number shouldn't cause problems since the script itself is light. The default is 5 seconds.
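A minimal sketch of that loop (the function and its body are illustrative, not the script's actual code):

import time

def event_loop(pause=5):
    """Sketch of the polling loop: each pass checks for finished jobs,
    stages downloads, and submits new jobs, then sleeps."""
    while True:
        # The real script talks to the grid, the database, and the
        # download area here; this placeholder just marks the spot.
        print("checking for finished jobs, files to download, jobs to submit")
        time.sleep(pause)  # the --pause interval keeps the loop cheap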

There is a finite amount of space on the Merlin cluster. While this script runs, if available space starts dwindling, no new files are downloaded and only the jobs already submitted to the grid are carried through. If space opens up as old files are deleted, the script downloads more. Once per cycle, the script checks that disk usage (as returned by the mmlsquota command) is below spacelim percent; in other words, roughly 100-spacelim percent of the space is kept free. The default is 90 (percent).
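The check itself amounts to comparing a usage percentage against spacelim. A sketch, using shutil.disk_usage as a portable stand-in for parsing mmlsquota output:

import shutil

def space_ok(path=".", spacelim=90):
    """True if disk usage on `path` is below spacelim percent.
    The real script parses mmlsquota output; shutil.disk_usage is
    used here only so the sketch runs anywhere."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total < spacelim

# The event loop skips new downloads whenever space_ok() is False.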

The modules argument applies only to rootana production, and tells the submitted jobs which modules file to load. For alcapana, the MODULES file is a compilation list whose location is hardcoded elsewhere anyway.

The dataset argument restricts job submission to the given dataset. If a new production is started with this argument, the production itself still contains all datasets; the argument only affects this instance of the job scripts and which files it runs on.

The database argument indicates the run database to use; the default is the shared ~/data/production.db. A private database can be useful if you want to do your own production without messing around with the main database (though extraneous tables in ~/data/production.db shouldn't cause problems), and also for getting a feel for how the run scripts work. WARNING: Since your own copy of the database will, at the time of this writing, use the same data directory as the main one, you could have filename clashes. Be wary.
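For example, a private production might look like this (the copy step matters because the script will not create a missing database; the file name private.db is just an example):

$ cp ~/data/production.db ~/data/private.db
$ ./run_production.py --production=alcapana --database=$HOME/data/private.db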

Finally, the display argument uses a curses display for a nice look at the status of the production. It looks funky if lots of runs fail in a row for some reason.

An example of how to run this and start a new production is as follows (notice that the version in the production table does not have to line up with the tagged code name):

./run_production.py --production=rootana --new=rootana_v2.1 --version=3

And you'll see something like

INFO: Starting new production, version 8
Claimed run: 2091
Claimed run: 2092
Claimed run: 2093
Claimed run: 2094
Claimed run: 2095
Claimed run: 2096
Claimed run: 2097
Claimed run: 2098
Claimed run: 2099
Claimed run: 2100
Claimed run: 2101
Claimed run: 2102
Claimed run: 2103
Claimed run: 2119
Claimed run: 2120
Staging runs: [2091, 2092, 2093, 2094, 2095, 2096, 2097, 2098, 2099, 2100, 2101, 2102, 2103, 2119, 2120]
Downloading run: 2091
... Success!
Submitting run: 2091
Downloading run: 2092

Error Handling

You can hit Ctrl-C if you need to cancel the jobs for some reason. Doing this:

  1. Requests deletion of all jobs on the grid
  2. Marks all runs that were claimed and unfinished as never having been claimed
  3. Deletes most transient files and partially completed files (except for the running programs' ODBs).

All finished runs are okay. If some other error happens because there's a problem with the tape archive or something, the script tries to do the above. You'll see something like

ERROR: There was an uncaught exception. Aborting...
Cancelling job 1025406
Cancelling job 1025407
Cancelling job 1025408
Cancelling job 1025409
Cancelling job 1025410
Cancelling job 1025411
Cancelling job 1025412
Cancelling job 1025413
Cancelling job 1025414
Cancelling job 1025415
Cancelling job 1025417
Aborting runs: [2102, 2103, 2119, 2120, 2091, 2092, 2093, 2094, 2095, 2096, 2097, 2098, 2099, 2100, 2101]
Traceback (most recent call last):

And then you'll see the rest of a normal output from a crash.
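Internally, the cleanup amounts to a try/except around the main loop. A rough sketch, assuming an SGE-style grid where qdel cancels jobs (the function and argument names are hypothetical):

import subprocess

def run_with_cleanup(loop, job_ids, claimed_runs):
    """Run the event loop and clean up on Ctrl-C or a crash: cancel
    grid jobs, report the aborted runs, and re-raise so the usual
    traceback still prints."""
    try:
        loop()
    except (KeyboardInterrupt, Exception):
        for job in job_ids:
            print("Cancelling job", job)
            subprocess.call(["qdel", str(job)])  # SGE job deletion
        print("Aborting runs:", claimed_runs)
        # The real script also marks claimed-but-unfinished runs as
        # unclaimed in the database and deletes partial output files.
        raise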

FTP Settings

This script requires that you have a netrc file at $HOME/.netrc. The layout should be:

machine archivftp.psi.ch
        login (FTPLOGIN)
        password (FTPPASSWORD)

where you replace the parenthesized placeholders with the correct information. Also make sure this file is readable only by you:

$ chmod 600 $HOME/.netrc
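Python's standard netrc module can then read the credentials back; for example:

import netrc

# authenticators() returns (login, account, password) for the machine
# entry, or None if the host is not in the file; netrc.netrc() raises
# if $HOME/.netrc is missing.
auth = netrc.netrc().authenticators("archivftp.psi.ch")
if auth is not None:
    login, account, password = auth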

Production Database

The production database is located at /gpfs/home/quirk_j/data/production.db. Everyone in our group has write access to this. You can make a link to it:

$ mkdir ~/data
$ ln -s ~quirk_j/data/production.db ~/data/production.db

The database is composed of two important tables:

datasets

run   dataset  qual
2091  SiR2     gold
2092  SiR2     gold
2093  SiR2     gold
2094  SiR2     gold
...   ...      ...

productions

type      version  software     start                stop
alcapana  1        alcapana_v1  2014-04-30 00:00:00  2014-05-05 00:00:00
alcapana  2        alcapana_v2  2014-05-29 00:00:00  2014-06-02 00:00:00
alcapana  3        alcapana_v3  2014-06-16 00:00:00  2014-06-20 00:00:00

The timestamps for these productions are only rough, accurate to about the week; future ones will have timestamps accurate to the microsecond, in PSI local time.
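You can inspect these tables yourself with Python's sqlite3 module, for example (this assumes the ~/data/production.db link from above):

import os
import sqlite3

conn = sqlite3.connect(os.path.expanduser("~/data/production.db"))
# Gold-quality runs in the SiR2 dataset
runs = [r[0] for r in conn.execute(
    "SELECT run FROM datasets WHERE dataset = ? AND qual = ?",
    ("SiR2", "gold"))]
# Every production, newest version first
for ptype, version, software in conn.execute(
        "SELECT type, version, software FROM productions "
        "ORDER BY version DESC"):
    print(ptype, version, software)
conn.close()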

Then, for each production, there is its own table (here, alcapana_v4):

alcapana_v4

run   status  user     start        stop         tree         hist         odb          olog         elog         modules
2091  F       quirk_j  (timestamp)  (timestamp)  (long path)  (long path)  (long path)  (long path)  (long path)  (a lot of text)
2092  R       quirk_j  (timestamp)
2093  R       quirk_j  (timestamp)
2094  C       quirk_j
2095  C       quirk_j
2096  N
2097  N
...

Here, most of the actual text has been replaced with (parenthesized) descriptors because it doesn't fit well here.

The status can be N (not yet claimed), C (claimed), R (running), or F (finished). When a run is claimed, the user column is filled in so we know who claimed it. When the job is submitted to the grid, the start column is filled. Finally, when a run is finished, the locations of several files are filled in; they have the form

  • tree: /gpfs/home/[user]/data/tree/v[version]/tree[run].root
  • hist: /gpfs/home/[user]/data/hist/v[version]/hist[run].root
  • odb : /gpfs/home/[user]/data/dump/v[version]/dump[run].root
  • olog: /gpfs/home/[user]/data/log/v[version]/alcapana.run[run].out
  • elog: /gpfs/home/[user]/data/log/v[version]/alcapana.run[run].err

The last two are the log files, which are just the standard output and standard error of the grid scripts.
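As an illustration, the expected locations can be generated from the templates above. Note that the five-digit zero-padding of the run number (as in tree02091.root in the Output files section below) is an assumption here:

def output_paths(user, version, run):
    """Build the expected output locations for a finished alcapana run,
    following the path templates above."""
    base = "/gpfs/home/%s/data" % user
    v, r = "v%d" % version, "%05d" % run
    return {
        "tree": "%s/tree/%s/tree%s.root" % (base, v, r),
        "hist": "%s/hist/%s/hist%s.root" % (base, v, r),
        "odb":  "%s/dump/%s/dump%s.root" % (base, v, r),
        "olog": "%s/log/%s/alcapana.run%s.out" % (base, v, r),
        "elog": "%s/log/%s/alcapana.run%s.err" % (base, v, r),
    }

print(output_paths("quirk_j", 4, 2091)["tree"])
# -> /gpfs/home/quirk_j/data/tree/v4/tree02091.root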

The modules column is a bit less nice to show. The entirety of the MODULES file in analyzer/work/production/ is stored here (storing whole files, as BLOBs, in a database is apparently not uncommon). There is a fair amount of repetition here; we could later store a single instance in a [user]/data/modules/alcapana/ directory and just point to it, but because the MODULES file doesn't have such a natural resting place, this is where it lives for now. The rootana production tables are similar, except they have an out column and no tree, hist, or odb columns.
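Storing the file is straightforward with sqlite3; a sketch using the table above:

import sqlite3

conn = sqlite3.connect("production.db")
with open("analyzer/work/production/MODULES") as f:
    modules_text = f.read()
# Attach the full MODULES text to the run that used it.
conn.execute("UPDATE alcapana_v4 SET modules = ? WHERE run = ?",
             (modules_text, 2091))
conn.commit()
conn.close()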

Production executable

This is a natural place to mention that the batch scripts have been modified to run the alcapana executable in $DAQdir/analyzer/work/production/. This is the same thing we've been doing before with the data_quality directory; the name change is entirely cosmetic, the motivation being that production is a more general name.

Output files

The new location for output files is the same as before, except that they sit one level deeper, in a version directory: data/tree/tree02091.root becomes data/tree/v9/tree02091.root, and the same goes for the histograms, logs, and ODB dumps. The motivation is that we can later more easily delete files we know are old, and it means we can tell rootana which version of alcapana output to run on. We'll probably never do the latter, but the option is there.

Also, the log files have been moved out of any version-controlled directory.

Race conditions

The database has a locking feature. Whenever a user needs to make an update based on information currently in the database, the database is first locked, the information is read, and then the database is updated. For instance, when you read the list of unclaimed runs, you lock the database so that no one can claim a run between your read and your write. The Python SQLite library has a default 5-second timeout, but all of our operations take on the order of tenths of a second, so even if every single member of AlCap has a lock request queued ahead of yours, the program should not crash; the total wait will still be less than the timeout.
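In Python's sqlite3 module, the claim step might look like this sketch (BEGIN IMMEDIATE takes the write lock before the read; the 5-second timeout is the library default):

import sqlite3

conn = sqlite3.connect("production.db", timeout=5.0,  # 5 s is the default
                       isolation_level=None)          # explicit transactions
cur = conn.cursor()
cur.execute("BEGIN IMMEDIATE")  # lock the database before reading
row = cur.execute(
    "SELECT run FROM alcapana_v4 WHERE status = 'N' LIMIT 1").fetchone()
if row is not None:
    # No one else can claim this run between our read and this write.
    cur.execute("UPDATE alcapana_v4 SET status = 'C', user = ? WHERE run = ?",
                ("quirk_j", row[0]))
cur.execute("COMMIT")
conn.close()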

However, if the database gets locked through another interactive session, the production will crash.

Backups

A script to dump and commit a text version of the database to the AlcapDAQ repository may work, but I have not made it yet.
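If someone does write it, a minimal sketch might look like this (the dump file name and commit message are made up):

import sqlite3
import subprocess

# Dump the database as SQL text...
conn = sqlite3.connect("/gpfs/home/quirk_j/data/production.db")
with open("production.sql", "w") as f:
    for line in conn.iterdump():
        f.write(line + "\n")
conn.close()

# ...then commit the dump to the repository.
subprocess.call(["git", "add", "production.sql"])
subprocess.call(["git", "commit", "-m", "Back up production database"])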
