Metadata
The pilot maintains, parses, and trims a number of metadata files and objects. The list below gives details about each of them.
The jobReport.json file is a metadata file created by most ATLAS payloads. If it exists (all production jobs create it but not all user analysis jobs), the pilot will add it to the final server update. Currently, it re-uses the 'metaData' field in the server update to send the JSON information as text.
The payload / transform in a production job is expected to create a job report (JSON dictionary) containing several fields that are needed by the pilot and by Harvester (on HPCs). In ATLAS, it contains many additional fields that are not used by the pilot or Harvester but are used by other components, so the pilot sends the entire file along with the final server update ('metaData' field). The default file name is "jobReport.json", but it can be changed in the pilot configuration file (pilot/util/default.cfg, "jobreport"). The pilot expects to find the following fields:
exitCode - the payload exit code
exitMsg - the payload exit message
(the fields expected by Harvester may be documented elsewhere).
For ATLAS there are several other fields used, including:
nevents - the number of processed events
Avg - average memory info
Max - max memory info
dbData
dbTime
cpuTime - consumed CPU time
Furthermore, the dictionary format (relevant for the above fields) is:
{
..
"exitCode": [integer],
"exitMsg": "[string]",
"resource": {"executor": {"nevents": <int>, "memory": {"Avg": .., "Max": ..}, "cpuTime": <int>, "dbData": <int>, "dbTime": <float>}}
}
Note: on (at least) Titan, the "logfileReport" is reset before jobReport is uploaded.
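As a rough illustration of the job report layout described above, the following Python sketch pulls the listed fields out of an already parsed report. The function name `extract_job_report_fields`, the flattened output keys, and all sample values are hypothetical and are not taken from the pilot code; the structure of the "Avg" and "Max" entries is not specified here, so plain numbers are used as placeholders.

```python
import json


def extract_job_report_fields(report: dict) -> dict:
    """Collect the fields listed above from a parsed job report.

    Hypothetical helper: missing fields are returned as None.
    """
    executor = report.get("resource", {}).get("executor", {})
    memory = executor.get("memory", {})
    return {
        "exitCode": report.get("exitCode"),
        "exitMsg": report.get("exitMsg"),
        "nevents": executor.get("nevents"),
        "memoryAvg": memory.get("Avg"),
        "memoryMax": memory.get("Max"),
        "cpuTime": executor.get("cpuTime"),
        "dbData": executor.get("dbData"),
        "dbTime": executor.get("dbTime"),
    }


# Made-up sample report, for illustration only.
sample_text = json.dumps({
    "exitCode": 0,
    "exitMsg": "OK",
    "resource": {
        "executor": {
            "nevents": 100,
            "memory": {"Avg": 1000, "Max": 2000},  # placeholder values
            "cpuTime": 3600,
            "dbData": 12345,
            "dbTime": 1.5,
        }
    },
})

fields = extract_job_report_fields(json.loads(sample_text))
print(fields)
```

Using `dict.get` with empty defaults keeps the sketch tolerant of user analysis jobs whose report lacks some of the ATLAS-specific fields.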
Metadata information for log and output files is reported to the server using the 'xml' field (reused; there is no 'json' field). The Pilot creates this metadata (internally, it uses the 'fileinfo' job object field).
The (older) metadata.xml file is optionally produced by the transform but should now be replaced by the job report. If it exists it is uploaded to the server with the 'metaData' field, assuming that the more modern job report does not exist. Internally used filename: metadata-<jobId>.xml. Example:
<?xml version="1.0" encoding="UTF-8" standalone="no" ?>
<!-- ATLAS file meta-data catalog -->
<!DOCTYPE POOLFILECATALOG SYSTEM "InMemory">
<POOLFILECATALOG>
<File ID="cfe820f3-e08b-48f0-9d7e-965463696c6d">
<logical>
<lfn name="79d9d938-09e1-4986-af64-ae4f4fc0909e_84526.1.job.log.tgz"/>
</logical>
<metadata att_name="surl" att_value="srm://someurl.org:8443/srm/managerv2?SFN=/pnfs/aglt2.org/atlasdatadisk/rucio/hc_test/c5/94/79d9d938-09e1-4986-af64-ae4f4fc0909e_84526.1.job.log.tgz"/>
<metadata att_name="fsize" att_value="203545"/>
<metadata att_name="adler32" att_value="493331fd"/>
</File>
</POOLFILECATALOG>
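To show how the file metadata in the example above could be read back, here is a minimal Python sketch based on the standard library's `xml.etree.ElementTree`. The function name `parse_file_metadata` and the output keys `file_id` and `lfn` are assumptions made for this illustration and do not come from the pilot code; the sample is a shortened copy of the example (the surl entry is left out).

```python
import xml.etree.ElementTree as ET


def parse_file_metadata(xml_text: str) -> list:
    """Return one dictionary per <File> element (hypothetical helper)."""
    root = ET.fromstring(xml_text.strip())
    files = []
    for file_elem in root.findall("File"):
        lfn_elem = file_elem.find("logical/lfn")
        entry = {
            "file_id": file_elem.get("ID"),
            "lfn": lfn_elem.get("name") if lfn_elem is not None else None,
        }
        # Each <metadata> element carries one name/value pair.
        for meta in file_elem.findall("metadata"):
            entry[meta.get("att_name")] = meta.get("att_value")
        files.append(entry)
    return files


sample_xml = """
<?xml version="1.0" encoding="UTF-8" standalone="no" ?>
<!-- ATLAS file meta-data catalog -->
<!DOCTYPE POOLFILECATALOG SYSTEM "InMemory">
<POOLFILECATALOG>
<File ID="cfe820f3-e08b-48f0-9d7e-965463696c6d">
<logical>
<lfn name="79d9d938-09e1-4986-af64-ae4f4fc0909e_84526.1.job.log.tgz"/>
</logical>
<metadata att_name="fsize" att_value="203545"/>
<metadata att_name="adler32" att_value="493331fd"/>
</File>
</POOLFILECATALOG>
"""

parsed_files = parse_file_metadata(sample_xml)
print(parsed_files)
```

The attribute values are kept as strings; converting fsize to an integer is left to the caller.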
The Pilot creates a metadata file for the input files called PoolFileCatalog.xml. It is used by the transform and contains the file ID (GUID) and PFN of each input file. Example:
<?xml version="1.0" ?>
<!-- Edited By the PanDA Pilot -->
<!DOCTYPE POOLFILECATALOG SYSTEM "InMemory">
<POOLFILECATALOG>
<File ID="57FEAAE3-3E5F-224C-B281-6182329FB27E">
<physical>
<pfn filetype="ROOT_All" name="AOD.11106916._005324.pool.root.1"/>
</physical>
<logical/>
</File>
..
</POOLFILECATALOG>
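A catalog of the shape shown above could be generated along the following lines. This is only a sketch under stated assumptions: the function name `build_pool_file_catalog` and its GUID-to-PFN mapping argument are hypothetical, the "ROOT_All" file type is simply copied from the example, and the pilot's actual implementation may differ.

```python
import xml.etree.ElementTree as ET


def build_pool_file_catalog(files: dict) -> str:
    """Build catalog text from a {GUID: PFN} mapping (hypothetical helper)."""
    root = ET.Element("POOLFILECATALOG")
    for guid, pfn in files.items():
        file_elem = ET.SubElement(root, "File", ID=guid)
        physical = ET.SubElement(file_elem, "physical")
        ET.SubElement(physical, "pfn", filetype="ROOT_All", name=pfn)
        ET.SubElement(file_elem, "logical")  # left empty, as in the example
    # The header lines mirror the example; ElementTree does not emit them.
    header = (
        '<?xml version="1.0" ?>\n'
        "<!-- Edited By the PanDA Pilot -->\n"
        '<!DOCTYPE POOLFILECATALOG SYSTEM "InMemory">\n'
    )
    return header + ET.tostring(root, encoding="unicode")


catalog = build_pool_file_catalog(
    {"57FEAAE3-3E5F-224C-B281-6182329FB27E": "AOD.11106916._005324.pool.root.1"}
)
print(catalog)
```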
.. (agis_ddmendpoints.json)
.. (agis_schedconf.json)
The pilot sends detailed information about file transfers to Rucio. A list with the different fields contained in the trace report can be found in the Pilot 2 wiki.
The traces are sent by the Pilot directly to the Rucio server.
.. (memory_monitor_summary.json)
A Harvester job definition file (pandaJobData.out) is copied from the Pilot's home directory and renamed to job_definition.json, and placed in the job's work directory for later reference (i.e. it gets stored in the log).
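The copy-and-rename step described above can be sketched in a few lines of Python. The function name `stage_job_definition` and its two directory arguments are hypothetical; only the two file names come from the text, and the demonstration uses temporary directories with a made-up file content.

```python
import shutil
import tempfile
from pathlib import Path


def stage_job_definition(pilot_home: str, workdir: str) -> Path:
    """Copy pandaJobData.out into the work directory as job_definition.json."""
    source = Path(pilot_home) / "pandaJobData.out"
    destination = Path(workdir) / "job_definition.json"
    shutil.copy(source, destination)  # the original file is left in place
    return destination


# Demonstration with temporary directories standing in for the real paths.
with tempfile.TemporaryDirectory() as home, tempfile.TemporaryDirectory() as work:
    (Path(home) / "pandaJobData.out").write_text('{"PandaID": "placeholder"}')
    staged = stage_job_definition(home, work)
    staged_name = staged.name
    staged_text = staged.read_text()
    source_still_exists = (Path(home) / "pandaJobData.out").exists()

print(staged_name)
```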
.. (worker_attributes.json)
? (HPCJobs.json)
.. (worker_pandaids.json)