Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Polyhedral Contract Syntax #129

Closed
2 tasks done
NicolasRouquette opened this issue Feb 9, 2023 · 46 comments · Fixed by #312
Closed
2 tasks done

Polyhedral Contract Syntax #129

NicolasRouquette opened this issue Feb 9, 2023 · 46 comments · Fixed by #312
Assignees
Labels
enhancement New feature or request user experience
Milestone

Comments

@NicolasRouquette
Copy link
Collaborator

NicolasRouquette commented Feb 9, 2023

With:

from pacti.utils.string_contract import StrContract
from pacti.terms.polyhedra.loaders import string_to_polyhedra_contract

We can write contracts in a human-friendly way like this:

# PWR:
# Parameters:
# - s: start index of the timeline variables
# - tstart: scheduled task instance start time
# - duration: scheduled task instance duration
# - generation: rate of battery charge during the task instance
def PWR_contract(s: int, tstart: float, duration: float, generation: float) -> tuple[int, list[IoContract]]:
  e = s+1
  spec = StrContract(
    inputs = [
      f"t{s}",    # Scheduled start time
      f"soc{s}",  # initial battery SOC
      f"d{s}",    # initial data volume
      f"e{s}",    # initial trajectory error
      f"r{s}",    # initial relative distance
    ],
    outputs = [
      f"t{e}",    # Scheduled end time
      f"soc{e}",  # final battery SOC
      f"d{e}",    # final data volume
      f"e{e}",    # final trajectory error
      f"r{e}",    # final relative distance
    ],
    assumptions = [
      # Scheduled task instance start time
      # t{s} == tstart
      f"t{s} <= {tstart}",
      f"-t{s} <= -{tstart}",
    ],
    guarantees = [
      # Scheduled task instance end time
      # t{e} - t{s} = duration
      f"t{e} - t{s} <= {duration}",
      f"t{s} - t{e} <= -{duration}",
    
      # Battery SOC charge
      # soc{e} - soc{s} >= duration*generation 
      f"soc{e} - soc{s} >= {duration*generation}",

      # no change to data volume
      # d{e} = d{s}
      f"d{e} - d{s} <= 0",
      f"d{s} - d{e} <= 0",

      # no change to trajectory error
      # e{e} = e{s}
      f"e{e} - e{s} <= 0",
      f"e{s} - e{e} <= 0",

      # no change to relative distance
      # r{e} = r{s}
      f"r{e} - r{s} <= 0",
      f"r{s} - r{e} <= 0",
    ])
  return e, string_to_polyhedra_contract(spec)

This approach delegates parsing the string expressions to Sympy, which accepts expressions that are not Polyhedral.
The resulting error messages can be difficult for users to understand.

This issue is about defining the syntax of Polyhedra constraints for parsing and rendering sketched as follows:

  1. LHS <= RHS

where:

  • RHS is a numerical constant
  • LHS is of the form: \sigma c V where c is a constant, V is a variable.

For convenience, the following forms would be supported by rewrite as conjunctions of (1) above:

  1. | LHS | <= RHS

Rewrite as the conjunction of:

  • LHS <= RHS
  • -(LHS) <= RHS
  1. | LHS | = 0

Rewrite as the conjunction of:

  • LHS <= 0
  • -(LHS) <= 0
  1. LHS = RHS

Rewrite as the conjunction of:

  • LHS <= RHS
  • - (LHS) <= -(RHS)

In the above:

  • -(RHS) means negating the sign of the RHS constant
    e.g. -(4) becomes -4 and -(-3) becomes 3 and -(+2) becomes -2
  • -(LHS) means negating the effective sign of each atomic c V term in the LHS expression
    e.g., -(x-y) becomes -x+y

The parser would reject any string expression that does not match any of (1,2,3,4).

This ticket entails two parts:

  • Parser

    This involves updating pacti.terms.polyhedra.loaders.string_to_polyhedra_contract to implement Regex-based parsing for the forms (1,2,3,4) above instead of using Sympy.

  • Pretty-printer

    Mapping the pacti.iocontract.IoContract internal representation of a Polyhedra contract according to the forms (1,2,3,4) above with a new function: pacti.terms.polyhedra.loaders.polyhedra_contract_to_string

@NicolasRouquette
Copy link
Collaborator Author

FYI: I made a branch with WIP for this issue: https://github.com/FormalSystems/pacti/tree/polyhedra-contracts

Currently, it handles the canonical polyhedral term syntax: LHS <= RHS

I tried running the test suite but I got this error:

(pacti-3.9) nfr@nfr-desktop:/opt/local/github.formalsystems/pacti$ make test
pdm run duty test
Inside an active virtualenv /opt/local/github.formalsystems/pacti/.venv, reusing it.
✗ Running tests (4)
  > pytest -c config/pytest.ini -n auto -k "" tests
  ERROR: usage: pytest [options] [file_or_dir] [file_or_dir] [...]
  pytest: error: unrecognized arguments: --cov --cov-config -n auto tests
    inifile: /opt/local/github.formalsystems/pacti/config/pytest.ini
    rootdir: /opt/local/github.formalsystems/pacti/config


make: *** [Makefile:80: test] Error 4

Any suggestions?

@iincer
Copy link
Collaborator

iincer commented Feb 11, 2023

Hi Nicolas,
I just ran make test in this wip branch and got a successful output:

$ make test
pdm run duty test 
✓ Running tests

The error you get seems to indicate that coverage support for pytest is not installed. I don't know how that could be the case since pyproject includes directives to install coverage support:

tests = [
    "pytest>=6.2",
    "pytest-cov>=3.0",
    "pytest-randomly>=3.10",
    "pytest-xdist>=2.4",
]

Would rerunning pdm install make sense? My local installation has the folder pacti\.venv\Lib\site-packages\pytest_cov, which is, I guess, where the coverage support goes. I find this error strange because the rest of the installation seems to be okay for you.

@NicolasRouquette
Copy link
Collaborator Author

Thanks Inigo!

pdm install && make test

Success!

So far, I believe that this achieves the 1st part of the ticket: parser support.

The next step is the pretty-printer...

@iincer
Copy link
Collaborator

iincer commented Feb 12, 2023

Great to hear that 😃
And thanks so much for improving the parser!

@NicolasRouquette
Copy link
Collaborator Author

NicolasRouquette commented Feb 12, 2023

In testing against the space_mission case study, I found that there were some bugs in the regex patterns that I'm fixing.

I also noticed that the previous pt_from_string would effectively map x > 0 to x = 0 to x <= 0 because it ignored the comparison operator!

@iincer
Copy link
Collaborator

iincer commented Feb 12, 2023

I am not surprised... The original intent was to only enter expressions of the form 'ax + by <= c' (which is what the low-level representation supports), so the '<=' was assumed to be fixed and was probably ignored.

@iincer iincer added the enhancement New feature or request label Feb 12, 2023
@NicolasRouquette
Copy link
Collaborator Author

I pushed a commit that implements the pretty printer described above.

It would be nice if pretty-printing the parsing of an input contract string would result in the same output string as the input string.

Unfortunately, this is not the case for our rules.

Rule Pattern Pos constraint Neg constraint
1 LHS <= RHS Not applicable Not applicable
2 | LHS | <= RHS LHS <= RHS -(LHS) <= RHS
3 | LHS | = 0 LHS <= 0 -(LHS) <= 0
4 LHS = RHS LHS <= RHS -(LHS) <= -(RHS)

Rule 1 is the canonical pattern; there is no rewrite.
The patterns of rules 2,3,4 lead to a rewrite as a pair of positive and negative canonical patterns.

Consider the input string: x = 0.

For parsing, we apply rule 4 and rewrite it as: [ x <= 0, -x <= 0 ]

For pretty-printing, we then have an ambiguity, as this pair could match the pos/neg results of rules 2,3,4 depending on the order in which we look for them.

Is this really a big deal? In principle no because x = 0 is equivalent to |x| <= 0 and to |x| = 0.

What do you think?

@NicolasRouquette
Copy link
Collaborator Author

FYI: Here is an example of the pretty-printer:

https://github.com/FormalSystems/pacti/blob/polyhedra-contracts/case_studies/polyhedra_contracts/polyhedral-contract-example.py

It produces this:

IoContract rendering:
InVars: [<Var t1>, <Var soc1>, <Var d1>, <Var e1>, <Var r1>]
OutVars:[<Var t2>, <Var soc2>, <Var d2>, <Var e2>, <Var r2>]
A: 1.0*t1 <= 0.0, -1.0*t1 <= -0.0, -1.0*soc1 <= -375.0, -1.0*d1 <= -1.0
G: -1.0*t1 + 1.0*t2 <= 10.0, 1.0*t1 + -1.0*t2 <= -10.0, -1.0*soc1 + 1.0*soc2 <= 300.0, 1.0*d2 <= 0.0, -1.0*d2 <= 0.0, -1.0*e1 + 1.0*e2 <= 0.0, 1.0*e1 + -1.0*e2 <= 0.0, -1.0*r1 + 1.0*r2 <= 0.0, 1.0*r1 + -1.0*r2 <= 0.0

Polhedra Contract pretty-printing
Inputs: [t1, soc1, d1, e1, r1]
Outputs: [t2, soc2, d2, e2, r2]
A: [
        | t1 | = 0,
        -soc1 <= -375.0,
        -d1 <= -1.0
   ]
G: [
        -t1 + t2 = 10.0,
        -soc1 + soc2 <= 300.0,
        | d2 | = 0,
        | -e1 + e2 | = 0,
        | -r1 + r2 | = 0
   ]

@NicolasRouquette
Copy link
Collaborator Author

I am wondering whether we need to augment IoContract with a pretty-printer function: IoContract --> str.

There are several places where IoContract raises a ValueError that requires formatting an IoContract; e.g.:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[17], line 1
----> 1 c3 = c2.compose(tcm1)
      2 print("Contract c2.compose(tcm1):")
      3 print(polyhedra_contract_to_string(c3))

File /opt/local/github.formalsystems/pacti/src/pacti/iocontract/iocontract.py:458, in IoContract.compose(self, other)
    456     new_a = other.a.abduce_with_context(self.a | self.g, assumptions_forbidden_vars)
    457     if list_intersection(new_a.vars, assumptions_forbidden_vars):
--> 458         raise ValueError(
    459             "The guarantees \n{}\n".format(self.g)
    460             + "were insufficient to abduce the assumptions \n{}\n".format(other.a)
    461             + "by eliminating the variables \n{}".format(assumptions_forbidden_vars)
    462         )
    463     assumptions = new_a | self.a
    464 elif other_helps_self and not self_helps_other:

ValueError: The guarantees 
-1.0*t1 + 1.00000000000000*t4 <= 9.00000000000000, 1.0*t1 + -1.00000000000000*t4 <= -9.00000000000000, -1.00000000000000*d4 <= -50.0000000000000, 1.0*e1 + -1.00000000000000*e4 <= -40.0000000000000, -1.0*r1 + 1.00000000000000*r4 <= 0, 1.0*r1 + -1.00000000000000*r4 <= 0
were insufficient to abduce the assumptions 
1.0*t4 <= 9.0, -1.0*t4 <= -9.0, -1.0*soc4 <= -40.0, 1.0*e4 <= 4.0
by eliminating the variables 
[<Var t4>, <Var soc4>, <Var d4>, <Var e4>, <Var r4>, <Var t5>, <Var soc5>, <Var d5>, <Var e5>, <Var r5>]

This is hard to read...

So I'm proposing to change the IoContract constructor from this:

    def __init__(
        self, assumptions: TermList, guarantees: TermList, input_vars: List[Var], output_vars: List[Var]
    ) -> None:

to add optional pretty-printer functions based on this:
https://docs.python.org/3/library/typing.html#callable

For IoContract, we need 3 kinds of optional printer functions:

    def __init__(
        self, assumptions: TermList, guarantees: TermList, input_vars: List[Var], output_vars: List[Var],
        pretty_printer: Callable[[IoContract], str] = lambda x: x.__str__(),
        termlist_printer: Callable[[TermList], str] = lambda x: x.__str__(),
        varlist_printer: Callable[[list[Var]], str] = lambda x: x.__str__()
    ) -> None:

Now, I get something more readable:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[15], line 1
----> 1 c3 = c2.compose(tcm1)
      2 print("Contract c2.compose(tcm1):")
      3 print(polyhedra_contract_to_string(c3))

File /opt/local/github.formalsystems/pacti/src/pacti/iocontract/iocontract.py:465, in IoContract.compose(self, other)
    463     new_a = other.a.abduce_with_context(self.a | self.g, assumptions_forbidden_vars)
    464     if list_intersection(new_a.vars, assumptions_forbidden_vars):
--> 465         raise ValueError(
    466             "The guarantees \n{}\n".format(self.termlist_printer(self.g))
    467             + "were insufficient to abduce the assumptions \n{}\n".format(self.termlist_printer(other.a))
    468             + "by eliminating the variables \n{}".format(self.varlist_printer(assumptions_forbidden_vars))
    469         )
    470     assumptions = new_a | self.a
    471 elif other_helps_self and not self_helps_other:

ValueError: The guarantees 
[
-t1 + t4 = 9.00000000000000,
 soc1 - soc4 <= -96.0000000000000,
 -d4 <= -50.0000000000000,
 e1 - e4 <= -40.0000000000000,
 | -r1 + r4 | = 0
]
were insufficient to abduce the assumptions 
[
t4 = 9.0,
 -soc4 <= -40.0,
 e4 <= 4.0
]
by eliminating the variables 
[d4, d5, e4, e5, r4, r5, soc4, soc5, t4, t5]

@iincer
Copy link
Collaborator

iincer commented Feb 12, 2023

Thanks, Nicolas, for implementing this! It makes the tool more user-friendly. It just struck me that reporting results (including plotting) is an important application of a tool that aims to help users to deal with specifications. Regarding your questions:

For pretty-printing, we then have an ambiguity, as this pair could match the pos/neg results of rules 2,3,4 depending on the order in which we look for them.
Is this really a big deal? In principle no because x = 0 is equivalent to |x| <= 0 and to |x| = 0.
What do you think?

I agree that there is ambiguity. I would say that our higher priority rule to check in this case is equality (rule 4). It is better to output x = 0 than |x| = 0.

It produces this: [...] | t1 | = 0

The output looks great. It would be nice to remove spaces after an opening | and before a closing |, e.g., |t1| = 0 instead of | t1 | = 0

I'm proposing to change the IoContract constructor from...

Right now, we print contracts by defining a __str__(self) function in the IoContract class. This function calls the str function on the input and output variables and assumptions and guarantees termlists:

https://github.com/FormalSystems/pacti/blob/75d0c05d6ca40e8c2c9c68d95a23a503d8f193ad/src/pacti/iocontract/iocontract.py#L332-L343

I agree that the output you get is the way to go (it seems the constants need to be floating-point formatted--there are some long strings: 9.00000000000000). Instead of modifying the constructor of IoContract, however, we could just modify the __str__ routines of the Term class. These routines would invoke your pretty print. What do you think? Do you see advantages to doing this by extending the signature of the constructor with the optional printer functions?

Something to think about is that someone at some point may be using small numbers in constraints. Then scientific notation would be appropriate. We could decide for the user and output all numbers in scientific notation with a couple of fractional digits by default, or we could give the user the ability of passing a format string to Pacti.

@iincer iincer added this to the v0.0.0 milestone Feb 13, 2023
@NicolasRouquette
Copy link
Collaborator Author

I think it would be preferable to have the __str__ functions produce human-friendly output.

I started w/ the optional pretty-printing functions because it was simple and I could see how this works in my case study. After running into a few errors from PolyhedralTermList functions, I realized that this approach is too awkward: better make sure that every __str__ does the right thing.

I agree about the higher priority of rule 4 for printing and about removing the spaces for the | signs.

Can you elaborate about formatting floating-point numbers?

For the pattern-based rewriting, I used numpy.isclose(..) to approximate the notion of floating point equality up to "unit in the last place" (ULP). It is unclear to me whether we should use hardcoded constants or make that tunable in the API.

@iincer
Copy link
Collaborator

iincer commented Feb 13, 2023

Can you elaborate about formatting floating-point numbers?

What I had in mind is that the internal representation of a contract may be {"coefficients":{"i":1.0000001}, "constant":0.0000002}.
If we go for a fixed fractional length for a floating-point number, say length = 2, we would get 1.00 i \le 0.00 as the output.
In this case, we do violence to the constant term. Alternatively, we could always write scientific notation with a couple of fractional digits: 1.00E0 i <= 1.00 E -7. Doing this would allow us to always return something sensible, but in the cases when scientific notation is not needed, the output would not look ideal, e.g., 1.00E0 i <= 1.00E0 instead of i \le 1.

For the pattern-based rewriting, I used numpy.isclose(..) to approximate the notion of floating point equality up to "unit in the last place" (ULP). It is unclear to me whether we should use hardcoded constants or make that tunable in the API.

This is tricky, and may depend on the application...

@iincer
Copy link
Collaborator

iincer commented Feb 13, 2023

From previous discussion, we have to

  • Modify __str__ of the PolyhedralTermList class to use your pretty print functionality
  • Better understand whether it is okay to use numpy.isclose(..) to approximate the notion of floating point equality using hardcoded constants

Something else that would be useful here is to

  • Consider getting rid of the notion of an StrContract. We could overload the constructor of PolyhedralTermList to either take a list of the form {"coefficients":{"i":1.0000001} or a list of strings that we parse with your parser.

@NicolasRouquette
Copy link
Collaborator Author

Regarding numpy.isclose(...), the reason for this approximate operations stems from matching LHS with -(LHS) and RHS with -(RHS). In either case, we need to compare whether two numeric values are equal, which is a well-known source of problems. In Pacti's case, algebraic operations on PolyhedralTerm can introduce slight differences numbers due to floating point approximation and rounding phenomena. This, we cannot use strict equality, we instead need to use some kind of approximate equality criterion. There are two approximate equality criteria: relative error and 'units in the last place' (aka ULP). See section 1.2 here: https://citeseer.ist.psu.edu/viewdoc/download?doi=10.1.1.102.244&rep=rep1&type=pdf

In Python, there are at least two different libraries providing this kind of support:
https://docs.python.org/3/library/math.html#math.isclose
https://numpy.org/doc/stable/reference/generated/numpy.isclose.html

Which one should we use?

@NicolasRouquette
Copy link
Collaborator Author

Regarding isClose, I put some constants in polyhedra:

RelativeTolerance: float = 1e-05
AbsoluteTolerance: float = 1e-08

def areNumbersApproximativelyEqual(v1: numeric, v2: numeric) -> bool:
   if isinstance(v1, int) & isinstance(v2, int):
      return v1 == v2
   else:
      f1=float(v1)
      f2=float(v2)
      return np.isclose(f1, f2, rtol=RelativeTolerance, atol=AbsoluteTolerance, equal_nan=True)

Since Polyhedra uses numpy instead of math, then perhaps it makes sense to use np.isclose.

@NicolasRouquette
Copy link
Collaborator Author

Overloaded PolyhedralTermList constructor to handle parsing a list of strings according to the 4 patterns.

This allows making as significant simplification of string_to_polyhedra_contract.
Wondering whether to eliminate StrContract as Inigo suggests...

@iincer
Copy link
Collaborator

iincer commented Feb 13, 2023

This issue is related to #144, which talks about improving the user experience of importing Pacti's functionality.
Our discussion has centered around (a) pretty printing and (b) nice constructors. We should talk more about where to place this functionality. I believe it would make sense to leave polyhedra.py focused on low-level operations and have the pretty print and user-friendly constructors in external files. The exception is the function __str__, which we need to modify to support nicer error messages.

We should probably chat about the use of numpy.isclose(...) and alternatives...

@NicolasRouquette
Copy link
Collaborator Author

Following a discussion with Inigo, there are some API-breaking changes due to the polyhedral contract syntax.

The unit tests pass and show examples of the new API usage.

In the spirit of #144, it should suffice to have the following import:

from pacti.terms.polyhedra import *

The biggest breaking change stems from this kind of imports/code that will not work:

from pacti.terms.polyhedra.loaders import read_contract, write_contract

Previously, read_contract could read either a single dict or a list of dict.
Now, one can use the PolyhedralContract API instead in one of 3 ways:

  • Construct PolyhedralTermList for A/G and invoke the constructor directly.
  • Call the static method: PolyhedralContract.readFromString(assumptions,guarantees,InputVars,OutputVars) where all arguments are lists of strings. Inigo says there's Python syntax to call such a method using a dict, something like PolyhedralContract.readFromString(** dict) -- not tested.
  • Call the static method: PolyhedralContract.readFromString(contract: dict). Note that unlike the old read_contract, this accepts a single dict object!

The above is just a sketch of a doc that we should write somewhere...

Finally, I added unit tests for the 4 variants of PolyhedralContract string syntax. See test_polyhedral_contract_syntax.py.

@NicolasRouquette
Copy link
Collaborator Author

Made a fix to the serialization of numbers -- it turns out that there were two kinds of numbers:

  • numpy.float64

    These typically have a compact str(...) representation.

  • sympy.core.numbers.Float

    These have an str(...) representation that can be quite long, e.g. 6.000000000
    There is a more compact representation if we use the .num property.

These cases are handled by a new serializer.number2string method:

def number2string(n: numeric) -> str:
    if isinstance(n, sympy.core.numbers.Float):
        f: sympy.core.numbers.Float = n
        return str(f.num)
    else:
        return str(n)

@iincer
Copy link
Collaborator

iincer commented Feb 14, 2023

Per Google's naming conventions, the method readFromString should probably be called from_string. It makes sense that PolyhedralContract.from_string should yield a contract.

I think the ideal import statement would be

from pacti import PolyhedralContract

Then the user could build contracts by executing PolyhedralContract.from_string(args).

@NicolasRouquette
Copy link
Collaborator Author

I've applied some of the naming conventions:

  • snake_names instead of CamelCase
  • Added internal_ prefix as appropriate.
  • Renamed PolyhedralContract.readFromString => PolyhedralContract.from_string
  • Renamed PolyhedralContract.readFromDict => PolyhedralContract.from_dict
  • Renamed various functions/properties pertaining to the 4 syntaxes using the following terminology in anticipation of using it for generated term IDs (see Traceability of computed contracts #146)
    • canonical
    • absolute_less_than
    • absolute_zero
    • term_equality

@iincer iincer changed the title Polyhedra Contract Syntax Polyhedral Contract Syntax Feb 14, 2023
@NicolasRouquette
Copy link
Collaborator Author

It is unclear to me what kind of refactoring we would need to do to achieve the ideal import objective:

from pacti import PolyhedralContract

Currently, we can do this:

from pacti.terms.polyhedra import *

It works because of the ./src/pacti/terms/polyhedra/__init__.py file:

image

Can you elaborate on what it would take to achieve this ideal import?

@NicolasRouquette
Copy link
Collaborator Author

Fixed the serialization bug that Inigo and I found on case_studies/polyhedra_contracts/polyhedral-contract-example3.py.

@ayush9pandey
Copy link
Collaborator

Just catching up with this thread. The changes to the string API look great. To address #144 and have import statements like from pacti import PolyhedralContract, we may need to create parent classes as user-friendly API under src\pacti\ and then modify the src\pacti\__init__.py to include this new API..

@iincer
Copy link
Collaborator

iincer commented Feb 17, 2023

Hi @ayush9pandey, completely agree. We made some changes to the API, and now we say

from pacti.terms.polyhedra import PolyhedralContract

What used to be read_contract is now PolyhedralContract.from_dict(). Maybe we will do a further reorganization of the import statement to compress a bit further. We also have PolyhedralContract.from_string()

@iincer
Copy link
Collaborator

iincer commented Feb 17, 2023

In the notation below, LHS, MHS, and RHS refer to a linear term (i.e., sum of coefficient multipliers for variables)

The table below maps an input pattern syntax into pairs of positive and negative constraints (i.e., the internal polyhedral representation).

For serialization, if prefixes are present, then matching positive/negative constraints are mapped to the pattern form; otherwise, the constraints are scanned to find corresponding positive/negative constraint pairs that can be mapped to the pattern form.

Optional Prefix [ID:] [ID(RuleLeft):] [ID(RuleRight):]
Rule Pattern Pos constraint Neg constraint
1a LHS <= number Not applicable Not applicable
1b LHS <= RHS LHS -(RHS) <= 0 Not applicable
2a | LHS | <= number LHS <= number -(LHS) <= number
2b | LHS | = RHS LHS -(RHS) <= 0 -(LHS) -(RHS) <= 0
3a LHS = number LHS <= number -(LHS) <= -(number)
3b LHS = RHS LHS -(RHS) <= 0 -(LHS) + RHS <= 0
4a LHS >= number -LHS <= -(number) Not applicable
4b LHS >= RHS -LHS -(RHS) <= 0 Not applicable
5a number1 <= MHS <= number2 number1 -(MHS) <= 0 MHS -(number2) <= 0
5b LHS >= MHS >= RHS -(LHS) + MHS <= 0 RHS -(MHS) <= 0
6a |LHS1| + ... + |LHSn| <= number a1LHS1 + ... + anLHSn <= number Not applicable
6b |LHS1| + ... + |LHSn| <= RHS a1LHS1 + ... + anLHSn -(RHS) <= 0 Not applicable

For 7a and 7b, the positive constraint expansion produces $2^n$ constraints where a1...an correspond to all possible combinations of -1 and 1.

Rules 1b, 2b, 3b, 4b, 5, and 6b could have optional numbers on either or both sides of the comparison.

It also makes sense to use similar rules for pretty print, i.e., output x >= 0 instead of -x <= 0. What do you think?

Yes; see rule 5a.

Finally, I don't see why we should not support a general inequality: 3x + 4y + 3 <= 5z + 2x - 2w + 2. We should be able to rewrite this expression (using sympy, perhaps) and convert into something Pacti accepts.

Yes, see rule 1b. Moreover, in the above, whenever there is a combination of 2 linear terms (e.g., LHS -(RHS)), this will require simplifying the linear terms. In this example: 3x - 2x + 4y - 5z + 2w will simply to: x + 4y - 5z + 2w.

@iincer
Copy link
Collaborator

iincer commented Feb 17, 2023

#161 is relevant to this discussion. The idea is to also support

|LHS1| + |LHS2| + ... + |LHSN| <= RHS

This would translate into $2^n$ constraints

a_1 * LHS1 + a_2 * LHS2 + ... + a_n * LHSN <= RHS

where a_i are all $2^n$ combinations of -1 or 1.

@ayush9pandey

This comment was marked as resolved.

@iincer
Copy link
Collaborator

iincer commented Feb 22, 2023

Yes, could you please run pdm add typing_extensions? This will add the dependency to the project, and we should all have it when you merge to main.
I'm surprised I didn't get the error. Maybe this package is standard in recent python versions? I am running 3.10 and 3.11.

@ayush9pandey

This comment was marked as off-topic.

@iincer
Copy link
Collaborator

iincer commented Feb 22, 2023

Thanks for the additional details. It sounds like we should add it as a dependency.

@NicolasRouquette
Copy link
Collaborator Author

This issue requires a parser capability well beyond what is reasonable to tackle with the current regular expression approach.

Defining the syntax of Pacti's Polyhedral terms as a formal grammar makes sense, which means defining a PEG parser and looking for available options here:
https://wiki.python.org/moin/LanguageParsing

The top two PEG libraries are pyparsing and parsimonious.

They both have very high stars on GitHub: 1.8K and 1.6K, respectively; however, there are a few reasons to choose pyparsing over parsimonious:

  • parsimonious lacks a dedicated documentation website
  • parsimonious' claimed speed is unlikely to be a significant factor because Polyhedral terms are reasonably short texts.
  • pyparsing can generate railroad diagrams for the grammar, which can be helpful to understand/debug the grammar.

For these reasons, I propose using pyparsing to specify the Polyhedral term grammar and implement a corresponding parser.

@iincer
Copy link
Collaborator

iincer commented Apr 10, 2023

Thanks for doing this research, @NicolasRouquette!
Implementing our own parser sounds very attractive, particularly as we support additional syntax in the future. This could allow us to show more consistency to the user.
Moving forward with this would be great. Do you think the time investment would be manageable?

@NicolasRouquette
Copy link
Collaborator Author

Creating the grammar to parse a generalized version of the above table is easy with pyparsing, which allows defining grammar in a similar way to a PEG parser.

If you are interested in following progress, see this branch:
https://github.com/pacti-org/pacti/tree/issue-129

The remaining work involves the following:

  • During parsing, reduce the intermediate representation
    e.g., adding constants and factors for the same variable within a list of terms
  • Mapping the intermediate representation to PolyhedralTerm

The time investment is about a day of work.

@iincer
Copy link
Collaborator

iincer commented Apr 10, 2023

This will be a great addition. Thanks, @NicolasRouquette!

@NicolasRouquette
Copy link
Collaborator Author

NicolasRouquette commented Apr 10, 2023

FYI: I generated the railroad diagrams:

https://github.com/pacti-org/pacti/blob/issue-129/docs/expression.html

Open this file in a browser; it should look like this:

image

Generating this diagram is, unfortunately, awkward; I tried to import the grammar from a separate file that would create the diagram, but this resulted in circular Python import errors. Also, the generated HTML file is missing a style that shows shapes with a different color than the background; without the style, we see black rounded rectangles without the internal details.

Despite these difficulties, using this diagram as a reference for the Polyhedral term grammar might be helpful.

@NicolasRouquette
Copy link
Collaborator Author

This commit concludes the first step described above.

The remaining step involves mapping the grammar.Expression intermediate representation to construct corresponding pacti.terms.polyhedra.PolyhedralContract.

@NicolasRouquette
Copy link
Collaborator Author

This commit is WIP for the last step.

The functionality for mapping grammar.Expression to polyhedra.PolyhedralContract is complete; however, 37 tests passed and 41 tests failed. This needs additional work...

@NicolasRouquette
Copy link
Collaborator Author

This commit fixes all unit tests; however, I realize that we need to review the mapping. The table is incomplete because it does not address the general case supported by the grammar, which, in a compact form, is as follows:

term = only_variable | number_and_variable | only_number
signed_term = pp.Optional(symbol, default="+") + term

terms = signed_term + pp.ZeroOrMore(symbol + term)

abs_term = pp.Optional(floating_point_number) + "|" + terms + "|"

abs_or_term = abs_term | term

equality_expression = abs_or_terms + equality_operator + abs_or_terms
leq_expression = abs_or_terms + pp.OneOrMore("<=" + abs_or_terms)
geq_expression = abs_or_terms + pp.OneOrMore(">=" + abs_or_terms)
expression = equality_expression | leq_expression | geq_expression

The mapping is defined in https://github.com/pacti-org/pacti/blob/issue-129/src/pacti/terms/polyhedra/serializer.py

There are some examples here: https://github.com/pacti-org/pacti/blob/issue-129/examples/grammar/from_string.py

@NicolasRouquette
Copy link
Collaborator Author

Based on a recent discussion/review with @iincer, we concluded that:

  1. For documentation purposes, we will make a markdown file with just the grammar rules instead of the railroad diagrams.

  2. For the grammar rules:

  • Add support for parentheses
  • Restrict the syntax of absolute values to positive coefficients.

@NicolasRouquette
Copy link
Collaborator Author

The absolute value restriction may be too restrictive; consider this inequality:

|x+y| - |x-y| <= k

Geometrically, the solutions are in either 1 or 2 regions.
For example:

image

image

image

We can calculate the number of solution regions using without creating a plot:

https://docs.scipy.org/doc/scipy/reference/generated/scipy.ndimage.label.html#scipy.ndimage.label

Here is an example:

import numpy as np
from scipy import ndimage

# Create a meshgrid for x and y values
x = np.linspace(-10, 10, 100)
y = np.linspace(-10, 10, 100)
X, Y = np.meshgrid(x, y)

k = -2

# Calculate the inequality
Z = np.abs(X + Y) - np.abs(X - Y) - k

# Create a new Z array for the colorscale
# 0 = green
# 1 = red
Z_color = np.where(Z <= 0, 0, 1)

# Label contiguous regions in the Z_color array
labeled_array, num_features = ndimage.label(Z_color == 0)

# Print the number of contiguous green regions
print("Number of contiguous green regions:", num_features)

This prints:

Number of contiguous green regions: 2

We could incorporate this test to ensure that the solutions to each PolyhedralTerm are within a single contiguous geometrical region. What do you think @iincer ?

@iincer
Copy link
Collaborator

iincer commented Apr 15, 2023

I like the use of ndimage.label to get the regions :-)
I agree that requiring the coefficients of absolute values to be nonnegative is restrictive. However, it is a way in which we are guaranteed get convex polyhedra from formulas. The examples you show give us nonconvex regions. This is equivalent to having specifications with logical ORs in them, which is currently not fully supported.

@NicolasRouquette
Copy link
Collaborator Author

Ah, if the requirement is a single convex region, then we should verify that the resulting set of linear inequalities yields a single convex region after expanding all absolute terms. We can use scipy.spatial.ConvexHull to iteratively keep track of the convexity of the inequalities as we add them one at a time.

@iincer
Copy link
Collaborator

iincer commented Apr 15, 2023

I am pretty sure that requiring the nonnegativity of the coefficients multiplying the absolute values is sufficient, since the only place where non-convexity pops up is when we have |exp| >= exp2. This is evaluated to (exp >= exp2) or (exp <= -exp2).

@NicolasRouquette
Copy link
Collaborator Author

NicolasRouquette commented Apr 15, 2023

So it is OK to have something like this:

$$ \Sigma_i k_i| a_i X + b_i Y + c_i Z | <= d $$

for arbitrary values of $$a_i,b_i,c_i$$

as long as $$d&gt;=0$$ and all $$k_i&gt;=0$$

@iincer
Copy link
Collaborator

iincer commented Apr 15, 2023

The $a_i$, $b_i$, $c_i$, and $d$ can be arbitrary. We only need $k_i \ge 0$.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request user experience
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants