-
-
Notifications
You must be signed in to change notification settings - Fork 40
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Pull in updated version of SANY to support Unicode specs #2995
Comments
Oh, we have a bunch of those :) apalache/tla-io/src/main/scala/at/forsyte/apalache/tla/imp/OpApplTranslator.scala Lines 627 to 667 in 34bdc61
|
Do you know how stable this table of constants going to be? It's autogenerated. I assume that the names will not change most likely? |
@ahelwer do I understand it right that the unicode operators have their own names such as |
It has been stable for about 10 years as far as I know, no plans to change it in the future.
No, they do not. They are all wrapped up into the same |
If you want a quick shortcut you could also just add all the different text-based unicode variants, defined here: https://github.com/tlaplus/tlaplus-standard/blob/main/unicode/tla-unicode.csv Although this might make it more work if any of these mappings are ever changed. |
I find I wrote a simple Python script to do lookups in this table (see below). Surprisingly, the table contains codes both for The Python script #!/usr/bin/env python
import sys
pattern = sys.argv[1]
all_tokens = ["<EOF>", "<BEGIN_MODULE>", "\"--->\"", "<_BM1>", "<token of kind 4>", "<CASE0>", "<CASE1>", "<CASE1b>", "<CASE1c>", "<CASE2>", "<CASE2b>", "<CASE2c>", "<CASE3>", "<CASE6>", "<CASE6b>", "<CASE6c>", "<CASEN>", "<LETTER>", "<DIGIT>", "<NUMBER>", "<_BM2>", "<token of kind 21>", "\" \"", "\"\\t\"", "\"\\n\"", "\"\\r\"", "<token of kind 26>", "\"\\\\*\"", "<token of kind 28>", "\"*)\"", "\"*)\"", "<token of kind 31>", "<token of kind 32>", "<token of kind 33>", "<_BM0>", "<SEPARATOR>", "<END_MODULE>", "<ACTION>", "\"ASSUME\"", "\"[]ASSUME\"", "<ASSUMPTION>", "\"CASE\"", "\"CHOOSE\"", "<CONSTANT>", "\"ELSE\"", "\"EXCEPT\"", "<EXISTS>", "\"EXTENDS\"", "<FORALL>", "\"IF\"", "\"INSTANCE\"", "\"LET\"", "\"IN\"", "\"LOCAL\"", "\"MODULE\"", "\"NEW\"", "\"OTHER\"", "<PROPOSITION>", "\"SF_\"", "\"\\\\EE\"", "\"\\\\AA\"", "\"THEN\"", "\"BY\"", "\"ONLY\"", "\"DEFINE\"", "<DF>", "\"THEOREM\"", "\"USE\"", "\"HIDE\"", "\"HAVE\"", "\"OBVIOUS\"", "\"OMITTED\"", "\"LAMBDA\"", "\"TAKE\"", "\"PROOF\"", "\"PROVE\"", "\"[]PROVE\"", "\"QED\"", "\"RECURSIVE\"", "\"STATE\"", "<TEMPORAL>", "\"PICK\"", "\"WITNESS\"", "\"SUFFICES\"", "<VARIABLE>", "\"WF_\"", "\"WITH\"", "\",\"", "\":\"", "\"::\"", "\".\"", "\"_\"", "\"==\"", "\"(\"", "\")\"", "\"-|-\"", "\"[\"", "\"]_\"", "\"]\"", "\"{|\"", "\"|}\"", "\"{\"", "\"}\"", "\"<<\"", "\">>_\"", "\">>\"", "\"!\"", "\"->\"", "\"<-\"", "\"|->\"", "<NUMBER_LITERAL>", "<STRING_LITERAL>", "<BAND>", "<BOR>", "\"\\'\"", "\"^+\"", "\"^*\"", "\"^#\"", "\"-.\"", "\"\\\\lnot\"", "\"\\\\neg\"", "\"~\"", "\"[]\"", "\"<>\"", "\"ENABLED\"", "\"UNCHANGED\"", "\"SUBSET\"", "\"UNION\"", "\"DOMAIN\"", "\"//\"", "\"/\\\\\"", "\"/=\"", "\"/\"", "\"\\\\/\"", "\"\\\\approx\"", "\"\\\\asymp\"", "\"\\\\bigcirc\"", "\"\\\\bullet\"", "\"\\\\cap\"", "\"\\\\cdot\"", "\"\\\\circ\"", "\"\\\\cong\"", "\"\\\\cup\"", "\"\\\\div\"", "\"\\\\doteq\"", "\"\\\\equiv\"", "\"\\\\geq\"", "\"\\\\gg\"", "\"\\\\in\"", "\"\\\\intersect\"", "\"\\\\union\"", "\"\\\\land\"", "\"\\\\leq\"", "\"\\\\ll\"", "\"\\\\lor\"", "\"\\\\o\"", "\"\\\\odot\"", "\"\\\\ominus\"", "\"\\\\oplus\"", "\"\\\\oslash\"", "\"\\\\otimes\"", "\"\\\\prec\"", "\"\\\\preceq\"", "\"\\\\propto\"", "\"\\\\sim\"", "\"\\\\simeq\"", "\"\\\\sqcap\"", "\"\\\\sqcup\"", "\"\\\\sqsubset\"", "\"\\\\sqsupset\"", "\"\\\\sqsubseteq\"", "\"\\\\sqsupseteq\"", "\"\\\\star\"", "\"\\\\subset\"", "\"\\\\subseteq\"", "\"\\\\succ\"", "\"\\\\succeq\"", "\"\\\\supset\"", "\"\\\\supseteq\"", "\"\\\\uplus\"", "\"\\\\wr\"", "\"\\\\\"", "\"~>\"", "\"=>\"", "\"=<\"", "\"=|\"", "\"=\"", "\"##\"", "\"#\"", "\"^^\"", "\"^\"", "\"--\"", "\"-|\"", "\"-+->\"", "\"-\"", "\"**\"", "\"*\"", "\"++\"", "\"+\"", "\"<=>\"", "\"<:\"", "\"<=\"", "\"<\"", "\">=\"", "\">\"", "\"...\"", "\"..\"", "\"||\"", "\"|\"", "\"|-\"", "\"|=\"", "\"&&\"", "\"&\"", "\"$$\"", "\"$\"", "\"??\"", "\"%%\"", "\"%\"", "\"@@\"", "\"!!\"", "\":>\"", "\":=\"", "\"::=\"", "\"(+)\"", "\"(-)\"", "\"(.)\"", "\"(/)\"", "\"(\\\\X)\"", "\"\\\\notin\"", "\"\\\\times\"", "\"\\\\X\"", "<IDENTIFIER>", "<ProofStepLexeme>", "<ProofImplicitStepLexeme>", "<ProofStepDotLexeme>", "<BareLevelLexeme>", "<UnnumberedStepLexeme>"]
for (position, token) in enumerate(all_tokens):
if pattern in token:
print("Position %d: token '%s'" % (position, token)) |
This looks like the easiest path! Thanks! |
Ugh, the latest package published on There is no package on maven. So it seems there is no automatic way for us to consume a newer version of SANY via |
I don't have time to resurrect publishing to maven central or Github's equivalent -- PR welcome. Can't you grab the tla2tools.jar from https://github.com/tlaplus/tlaplus/releases/tag/v1.8.0? |
@konnov does Apalache pull in SANY using the maven package? |
Apalache uses sbt, which by default pulls a package from maven. We are also using sonatype as a secondary repo: Line 26 in 169d142
I am not sure whether Github provides their own maven-compatible repo. |
Tricky thing is that I see maven has the ability to use sha1 as build version (see https://maven.apache.org/maven-ci-friendly.html) which we would need to use to distinguish the continually-updated 1.8.0 pre-release, but I can't find any details on whether various public maven repositories support that feature. |
I think this practice of updating a release file without increasing the patch number does not work well with open source software engineering practices we see on Github, Linux, etc. You could add the build number or the current date to distinguish between the builds. If you keep overwriting releases, it does not work with maven (since it caches the packages), nix, you name it. Some projects produce nightly builds and then garbage collect them, e.g., Foundry. |
Well, we're operating in the Java packaging ecosystem here which I'm unfamiliar with - personally I'm a big fan of content-addressed versioning like with nix. I agree there are drawbacks to the SNAPSHOT way of doing things but it does map well to how we are currently just uploading tla2tools.jar builds to the 1.8.0 release and overwriting the current ones. What do you think we should do, @konnov? I do think this is a problem we should solve because I like the idea of other projects being able to pull in the tla2tools.jar as a dependency. Currently there's Apalache and also tlaplus-formatter by @FedericoPonzi. |
I understand it's not cool, but the tla2tools.jar file is only 4MB in size (could be trimmed). Since it's a stable dependency for Apalache (with SANY being rarely updated), the simplest solution might be to commit the jar directly to the Apalache Git repository. |
I would also be happy, if there was a simple way to pull Alternatively, we could do something like z3-turnkey. |
Related to apalache-mc/apalache#2995 [Build] Signed-off-by: Markus Alexander Kuppe <[email protected]>
This would be even more work but in the future I think it would be nice to split out the SANY and PlusCal parsers into separate packages. They are self-contained with minimal dependencies, and projects consuming the tools as a dependency will probably be using them alone instead of pulling in the rest of the interpreter and model-checking code. |
Related to apalache-mc/apalache#2995 [Build] Signed-off-by: Markus Alexander Kuppe <[email protected]>
…ADME) instead of authentication-only Github maven packages. Related to apalache-mc/apalache#2995 [Build] Signed-off-by: Markus Alexander Kuppe <[email protected]>
Painful, but tla2tools.jar will from now on be published at https://oss.sonatype.org/content/repositories/snapshots/org/lamport/tla2tools/1.8.0-SNAPSHOT/ |
Thanks Markus! Wrangling the github CI is never a fun day. |
I recommend using the new snapshot dependency from Sonatype until the version of SANY with Unicode support is fully integrated into Apalache. Once this integration is complete, we can manually tag the TLA+ Git repository and create a permanent release on Sonatype. |
Thanks a lot @lemmy! |
Does this mean cutting a final release for 1.8.0 then doing a version bump for snapshot versions? If Unicode is going to be part of that release perhaps we should consider adding ℕ, ℤ, and ℝ to the standard modules. I will create an issue to consider the best way to do that. |
No, I didn't mean to suggest a full-scale tla2tools.jar and Toolbox release. What I had in mind was more like a monthly or quarterly code snapshot (not -SNAPSHOT) that would be permanently stored in the Maven repository. |
Now that SANY supports parsing Unicode TLA+ specs, it should be fairly straightforward to support Unicode here by updating to a newer version of SANY. Possible bugs include if your translation code looks at actual string representation (for example,
"\\E"
instead ofTLAplusParserConstants.EXISTS
) which was the case in SANY itself.The text was updated successfully, but these errors were encountered: