Limit System JQ

Mark Feit edited this page Jun 27, 2018 · 10 revisions

NOTE: This is legacy content that has been migrated into the perfSONAR documentation.

Introduction

pScheduler's limit system includes several features that use the jq language to add programmability to the system. jq's forte is processing the JSON data used internally by pScheduler, and it is applied in much the same way awk(1) would be used to process text.

IMPORTANT NOTE: Some of the scripts in the examples on this page contain line breaks. These were added for readability and would not be considered valid JSON. Scripts need to be a single line when used in a limit configuration.

Resources for Learning About jq

  • Home page
  • JQPlay - A place to experiment with jq.
  • Reference Material
    • Manual - For version 1.5, which is what is currently available on systems that run pScheduler.
    • Hyperpolyglot - Brief examples using the jq command-line interface.

Formatting jq scripts as JSON

Because pScheduler's limit configuration is itself a JSON file, certain characters used within jq scripts, notably double quotation marks ("), backslashes (\) and newlines must be escaped with a backslash (e.g., \", \\ and \n) to be considered valid.

Beginning in perfSONAR 4.1, multi-line scripts can be specified as an array of strings, which makes editing and reading them much easier.

Examples

Quoting literal strings:

"transform": {
    "script": "if 1 > 2 then \"Not bloody likely.\" else \"Okay.\" end"
}

Evaluation of expressions in literal strings:

"transform": {
    "script": "1234 as $value | \"Value is \\($value)\""
}

Specification of the script with multiple lines:

"transform": {
    "script": [
        "1234 as $value",
        "| \"Value is \\($value)\""
    ]
}
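
The escaping rules above can be applied mechanically rather than by hand. The following is a minimal Python sketch, not part of pScheduler, that takes a jq script written with real line breaks and unescaped quotes and emits it both as a single escaped string and as an array of strings. The function names `as_single_string` and `as_string_array` are hypothetical, chosen only for this illustration.

```python
import json

# A jq script written the natural way: real line breaks, unescaped quotes.
SCRIPT = r'''1234 as $value
| "Value is \($value)"'''


def as_single_string(script: str) -> str:
    """Return a JSON object holding the whole script as one escaped string."""
    return json.dumps({"script": script})


def as_string_array(script: str) -> str:
    """Return a JSON object with one array element per script line.

    The array form is the one described above for perfSONAR 4.1 and later.
    """
    return json.dumps({"script": script.splitlines()}, indent=4)


if __name__ == "__main__":
    print(as_single_string(SCRIPT))
    print(as_string_array(SCRIPT))
```

Either result can be pasted in as the value of a transform pair. Letting a JSON library do the escaping avoids the most common mistake, which is forgetting to double the backslash in `\(...)` string interpolation.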

Identifiers

jq

This identifier allows decisions to be made based on hints about the task provided by the system.

Input

The script is provided with a single JSON object containing pairs for each of the hints pScheduler provides. For example:

{
    "requester": "198.51.100.19",    IP making the request
    "server": "192.0.2.202"          IP on which the request arrived
}

Output

The script should return a single boolean value, true to indicate that an identification was made or false otherwise. Return of non-boolean values will be treated as false.

Examples

Check to see if the requesting IP is a single IP that should not be allowed to use the system. (Note that the ip-cidr-list identifier is a better choice for this example.)

{
    "name": "do-not-want",
    "description": "One IP we really, really dislike.",
    "type": "jq",
    "data": {
        "transform": {
            "script": ".requester == \"198.51.100.86\""
        }
    }
}

Identify requests arriving on an address that is not one of the management interfaces:

{
    "name": "non-management-if",
    "description": "Requests not arriving on a management interface",
    "type": "jq",
    "data": {
        "transform": {
            "script": "[.server == $management_ips[]] | any | not",
            "args": {
                "management_ips": ["127.0.0.1", "198.51.100.46"]
            }
        }
    }
}
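
For readers less familiar with jq, the logic of the script above can be modeled in Python. This is only an illustrative sketch of what the expression computes, not how pScheduler evaluates it; the function name `is_non_management` is hypothetical.

```python
def is_non_management(hints: dict, management_ips: list) -> bool:
    """Model of the jq expression: [.server == $management_ips[]] | any | not

    The jq script builds one boolean per management IP, asks whether any
    of them is true, and negates the answer.
    """
    matches = [hints.get("server") == ip for ip in management_ips]
    return not any(matches)


if __name__ == "__main__":
    hints = {"requester": "198.51.100.19", "server": "192.0.2.202"}
    print(is_non_management(hints, ["127.0.0.1", "198.51.100.46"]))
```

With the sample hints shown in the Input section, the server address is not in the list, so the identifier would make an identification.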

ip-cidr-list-url

The ip-cidr-list-url identifier can be instructed to pre-process the data it fetches before it is interpreted. This allows the identifier to understand JSON data without having to rely on additional software by converting it into a plain-text, newline-separated list of CIDR blocks.

The transformation is accomplished by adding a transform section:

{
    "name": "example",
    "description": "Example",
    "type": "ip-cidr-list-url",
    "data": {
        "source": "https://www.example.com/our-cidr-blocks.json",
        "transform": {
            "script": "...JQ Script...",
            "raw-output": true
        }
    }
}

The value returned by the script must be a plain-text list of IPv4 and/or IPv6 addresses or CIDR blocks (e.g., 198.51.100.46 or dead::beef/64) separated by newlines. For that reason, the raw-output parameter must be true. Output not matching the required format will be treated as a retrieval failure.

Examples

Identify addresses in IP blocks named in the list published by Amazon Web Services:

{
    "name": "aws",
    "description": "Amazon Web Services",
    "type": "ip-cidr-list-url",
    "data": {
        "source": "https://ip-ranges.amazonaws.com/ip-ranges.json",
        "transform": {
            "script": ".prefixes[].ip_prefix, .ipv6_prefixes[].ipv6_prefix",
            "raw-output": true
        },
        "update": "P1D",
        "retry": "PT4H",
        "exclude": [ "10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16" ],
        "fail-state": false
    }
}
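
To make the expected output format concrete, the sketch below models the transform in Python and checks that every resulting line parses as an address or CIDR block. It is an illustration only: the sample document is made up (using documentation address ranges), and the validation shown is an assumption about what "matching the required format" means, not pScheduler's actual check.

```python
import ipaddress

# Hypothetical sample shaped like the published list; not real AWS data.
SAMPLE = {
    "prefixes": [{"ip_prefix": "198.51.100.0/24"}, {"ip_prefix": "203.0.113.0/24"}],
    "ipv6_prefixes": [{"ipv6_prefix": "2001:db8::/32"}],
}


def transform(document: dict) -> str:
    """Model of: .prefixes[].ip_prefix, .ipv6_prefixes[].ipv6_prefix (raw output)."""
    v4 = [entry["ip_prefix"] for entry in document["prefixes"]]
    v6 = [entry["ipv6_prefix"] for entry in document["ipv6_prefixes"]]
    return "\n".join(v4 + v6)


def is_valid_cidr_list(text: str) -> bool:
    """Return True if every line is an IPv4/IPv6 address or CIDR block."""
    for line in text.splitlines():
        try:
            ipaddress.ip_network(line.strip(), strict=False)
        except ValueError:
            return False
    return True


if __name__ == "__main__":
    output = transform(SAMPLE)
    print(output)
    print(is_valid_cidr_list(output))
```

Running a candidate script through jq on the command line and checking its output in a similar way is a reasonable precaution before deploying it, since malformed output is treated as a retrieval failure.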

Limits

jq

The jq limit uses a jq script to decide whether a proposed task or run passes.

Input

Input to the script is a single JSON object containing two or three pairs:

  • type - A string that names the type of test being proposed
  • spec - A JSON object containing the test's parameters
  • schedule - An optional JSON object containing an ISO8601 timestamp (start) and ISO8601 duration (duration) specifying when the run is proposed to start and how much time it will spend running. (Note that the latter is usually greater than the test's duration parameter if it has one.) This object will not be present if a new task is being evaluated but will be for evaluation of runs.

For example:

{
    "type": "throughput",
    "spec": {
        "dest": "ps.example.com",
        "bandwidth": "200M",
        "duration": "PT1M"
    },
    "schedule": {
        "start": "2017-08-19T12:34:56",
        "duration": "PT1M8S"
    }
}

Output

The script should produce one of the following values:

  • Boolean (true or false) - Signifies that the proposed task passes or does not pass the limit. If the value is false, the limit system's diagnostic output will indicate an unspecified reason for the failure.
  • String - Signifies that the proposed task does not pass the limit and uses the contents of the string as the reason for the failure in diagnostic output.

Output that is neither a boolean nor a string will be treated as if the limit did not pass, and a suitable diagnostic message will be provided.
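
The output rules in this section can be summarized as a small decision function. The Python below is a sketch of those rules only; the function name `interpret` and the wording of the diagnostic messages are hypothetical and are not pScheduler's.

```python
def interpret(result):
    """Map a limit script's output to (passed, reason) per the rules above."""
    if result is True:
        return (True, None)
    if result is False:
        # A bare false carries no explanation.
        return (False, "Unspecified reason")
    if isinstance(result, str):
        # A string means failure, with the string as the reason.
        return (False, result)
    # Anything else (number, null, array, object) is treated as a failure.
    return (False, "Script returned an unsupported type: " + type(result).__name__)


if __name__ == "__main__":
    for value in (True, False, "Packets are limited to 256 bytes", 42, None):
        print(repr(value), "->", interpret(value))
```

Because a string always means failure, returning a descriptive string is generally more helpful to users than returning false.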

Examples

Limit the length parameter of any test to 256:

{
    "name": "big-packets",
    "description": "Limit packet size for all tests",
    "type": "jq",
    "data": {
        "transform": {
            "script": "256 as $max_length
                       | if .spec.length > $max_length
                         then \"Packets are limited to \\($max_length) bytes\"
                         else true
                         end"
        }
    }
}

Limit the number of hops in any trace test to 20:

{
    "name": "trace-hops",
    "description": "Limit trace hops",
    "type": "jq",
    "data": {
        "transform": {
            "script": "20 as $max_hops
                       | if .type == \"trace\" and .spec.hops > $max_hops
                         then \"No more than \\($max_hops) hops allowed.\"
                         else true
                         end"
        }
    }
}

Limit the bandwidth of throughput tests to 500 Mb/s:

{
    "name": "throughput-low-bandwidth",
    "description": "Limit throughput test bandwidth",
    "type": "jq",
    "data": {
        "transform": {
            "script": "import \"pscheduler/si\" as si;
                       \"500M\" as $max_bandwidth
                       | if .type == \"throughput\"
                           and si::as_integer(.spec.bandwidth) > si::as_integer($max_bandwidth)
                         then \"Bandwidth is limited to \\($max_bandwidth)\"
                         else true
                         end"
        }
    }
}
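
The comparison in the bandwidth example depends on converting SI-suffixed strings such as 500M into integers. The Python below sketches that idea under stated assumptions: `si_as_integer` is a hypothetical stand-in for illustration and is not the implementation of pScheduler's `si::as_integer`, it recognizes only decimal k, M, G and T suffixes, and it assumes the bandwidth parameter is present.

```python
from decimal import Decimal

# Assumed decimal multipliers; binary suffixes are not handled in this sketch.
SI_MULTIPLIERS = {"k": 10**3, "K": 10**3, "M": 10**6, "G": 10**9, "T": 10**12}


def si_as_integer(value) -> int:
    """Convert a value such as "500M" or 1000 to an integer."""
    if isinstance(value, int):
        return value
    text = str(value).strip()
    if text and text[-1] in SI_MULTIPLIERS:
        return int(Decimal(text[:-1]) * SI_MULTIPLIERS[text[-1]])
    return int(Decimal(text))


def check_bandwidth(proposal: dict, max_bandwidth: str = "500M"):
    """Model of the limit script: a reason string on failure, else True."""
    if (proposal.get("type") == "throughput"
            and si_as_integer(proposal["spec"]["bandwidth"]) > si_as_integer(max_bandwidth)):
        return "Bandwidth is limited to " + max_bandwidth
    return True


if __name__ == "__main__":
    proposal = {"type": "throughput", "spec": {"dest": "ps.example.com", "bandwidth": "200M"}}
    print(check_bandwidth(proposal))
```

Comparing the converted integers rather than the raw strings matters: as strings, "1G" would sort before "500M".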

Task Rewriting

NOTE: This feature will be available with the release of perfSONAR 4.1.

If a rewrite pair is present in a limit configuration where the schema is 2 or later and the submission is on a system that is the lead participant, it specifies a jq transform (see JQTransformSpecification in the JSON dictionary) applied to the task immediately after initial validation and prior to limit enforcement and tool selection. NOTE: Because the rewriter inserts a set of functions into the script, all import and include statements are extracted and relocated, in their original order, to the top of the script to maintain correct jq syntax.

Input to the transform's script is a JSON object containing the contents of the task as it was submitted to the server. The rewriter adds a private pair for its own internal use (currently named __REWRITER_PRIVATE__) which should not be examined or modified.

Changes to the task are made by modifying the JSON in place (e.g., .test.spec.bandwidth = 100000000) and must be followed by a call to the change() function (described below) with a message that will be meaningful to the end user (e.g., Limited bandwidth to 100 Mb/s).

Conditions that would require that the incoming task be rejected may be dealt with by calling the reject() function (described below) with a message that will be meaningful to the end user (e.g., Cannot use tools whose names contain the letter T). Tasks rejected in this way will not be screened by other limits that might have allowed them to proceed, so use this feature carefully. Also note that rewriting takes place only on the node that is the lead participant, so this mechanism should not be used as a way of enforcing limits.

Should the script fail when it is run, the incoming task will be rejected with a suitable diagnostic message.

Rewriter Built-In Functions

The following functions will be made available to rewriting scripts:

change(message) - Signals that a change has been made to the task and adds the string message to the set of diagnostics added to the task's details. This function must be called at least once if the script modifies the JSON in any way. Any non-string value for message will be passed through jq's tostring function. A value of null will result in no message being appended to the diagnostics, although this is strongly discouraged.

classifiers - Returns an array of the classifiers into which the node requesting the task was grouped (e.g., [ "friendlies", "partners" ]).

classifiers_has(value) - Returns a boolean indicating whether or not the string value is one of the classifiers.

reject(message) - Signals that the task should be rejected for the reason described by message. Any non-string value for message will be passed through jq's tostring function.

Examples

Force certain tests to operate from a specific interface:

{
    ...
    "rewrite": {
        "script": [
            "import \"pscheduler/iso8601\" as iso;",                                                                                

            "# Recommended so the pipeline statements all begin with |.",
            ".",

            "# Hold this in a variable for use where it's not in-context",
            "| .task.type as $tasktype",

            "# Force latency onto a specific interface",
            "| if ( [\"latency\", \"latencybg\" ] | contains([$tasktype]) )",
            "  then",
            "    .task.spec.source = \"ps7-latency.example.org\"",
            "    | change(\"Forced use of interface reserved for latency\")",
            "  else",
            "    .",
            "  end",

            "# The end.  (This takes care of the no-comma-at-end problem)"
        ]
    },
    ...
}

Throttle the bandwidth parameter of throughput tests for all but certain groups to 50 Mb/s:

{
    ...
    "rewrite": {
        "script": [
            ".",

            "# Throttle non-friendlies to 50 Mb/s for throughput",
            "| if .task.type == \"throughput\"",
            "    and (",
            "      (.task.spec.bandwidth == null)",
            "      or (.task.spec.bandwidth > 50000000)",
            "    )",
            "    and (classifiers | contains([\"friendlies\"]) | not)",
            "  then",
            "    .task.spec.bandwidth = 50000000",
            "    | change(\"Throttled bandwidth to 50 Mb/s\")",
            "  else",
            "    .",
            "  end",

            "# The end."
        ]
    },
    ...
}

Force the minimum duration for certain tests that specify one to 5 seconds:

{
    ...
    "rewrite": {
        "script": [
            "import \"pscheduler/iso8601\" as iso;",                                                                                

            ".",

            "# Hold this in a variable for use where it's not in-context",
            "| .task.type as $tasktype",

            "# Make some tests run a minimum of 5 seconds",
            "| if ( [\"idle\", \"idlebgm\", \"idleex\", \"latency\", \"latencybg\", \"throughput\" ]",
            "       | contains([$tasktype]) )",
            "    and iso::duration_as_seconds(.task.spec.duration) < 5",
            "  then",
            "    .task.spec.duration = \"PT5S\"",
            "    | change(\"Bumped duration to 5-second minimum\")",
            "  else",
            "    .",
            "  end",

            "# The end."
        ]
    },
    ...
}
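
The duration comparison in the example above relies on turning an ISO 8601 duration into a number of seconds. The Python below sketches that conversion and the "raise to a minimum" decision. It is an illustration under stated assumptions, not the implementation of pScheduler's `iso::duration_as_seconds`: it accepts only weeks, days, hours, minutes and seconds (years and months have no fixed length and are rejected), and `enforce_minimum` is a hypothetical helper.

```python
import re

_DURATION = re.compile(
    r"^P(?:(?P<weeks>\d+)W)?(?:(?P<days>\d+)D)?"
    r"(?:T(?:(?P<hours>\d+)H)?(?:(?P<minutes>\d+)M)?"
    r"(?:(?P<seconds>\d+(?:\.\d+)?)S)?)?$"
)

_SECONDS_PER = {"weeks": 604800, "days": 86400, "hours": 3600, "minutes": 60, "seconds": 1}


def duration_as_seconds(text: str) -> float:
    """Convert a duration such as "PT1M8S" to seconds."""
    match = _DURATION.match(text)
    fields = match.groupdict() if match else {}
    if not any(value is not None for value in fields.values()):
        raise ValueError("Unsupported or invalid duration: " + repr(text))
    return sum(
        float(value) * _SECONDS_PER[name]
        for name, value in fields.items()
        if value is not None
    )


def enforce_minimum(duration: str, minimum: str = "PT5S") -> tuple:
    """Return (duration to use, whether it was changed)."""
    if duration_as_seconds(duration) < duration_as_seconds(minimum):
        return minimum, True
    return duration, False


if __name__ == "__main__":
    print(enforce_minimum("PT2S"))
    print(enforce_minimum("PT1M"))
```

The second element of the returned tuple plays the role that the call to change() plays in the rewrite script: it records that the task was modified so the user can be told why.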

Force the repeat interval, if specified, to a minimum of one minute:

{
    ...
    "rewrite": {
        "script": [
            "import \"pscheduler/iso8601\" as iso;",                                                                                

            ".",
            "| if .schedule.repeat != null",
            "    and iso::duration_as_seconds(.schedule.repeat) < 60",
            "  then",
            "    .schedule.repeat = \"PT1M\"",
            "    | change(\"Bumped repeat to one-minute minimum\")",
            "  else",
            "    .",
            "  end",

            "# The end."
        ]
    },
    ...
}