golf tool should offer import/export #16

Open
Artoria2e5 opened this issue Feb 6, 2021 · 1 comment

Comments


Artoria2e5 commented Feb 6, 2021

I am sorry to post this here, but I can't find a repo for the regex golf tool. I am trying to find a regex that matches all ~420 valid Mandarin syllables (those actually used in the language, as listed at http://pinyin.info/rules/initials_finals.html) and nothing else out of the 22*35=770 possible combinations. This task has historically led to monster regexes, and I am quite interested in how RG handles it.

From the descriptions, the golf tool seems to be the best fit for this use case. I can manually construct a "dataset", but that's a bit weird.
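For reference, the task as stated (match every valid syllable, reject every other combination) can be checked mechanically for any candidate regex. The following is only a minimal sketch: `golf_check` is a hypothetical helper, and the initials, finals, and valid list are a toy subset chosen for illustration, not the real table from the linked page.

```python
import re
from itertools import product


def golf_check(pattern, valid, candidates):
    """Return (missed, false_hits) for a candidate regex.

    missed     -- valid syllables the pattern fails to match in full
    false_hits -- other candidate combinations the pattern wrongly matches
    """
    rx = re.compile(pattern)
    valid = set(valid)
    missed = sorted(s for s in valid if not rx.fullmatch(s))
    false_hits = sorted(s for s in set(candidates) - valid if rx.fullmatch(s))
    return missed, false_hits


# Toy subset (assumption): 3 initials x 4 finals instead of the full 22 x 35 grid.
initials = ["b", "p", "f"]
finals = ["a", "o", "e", "u"]
candidates = [i + f for i, f in product(initials, finals)]

# Illustrative "valid" list; the real one would come from the pinyin table.
valid = ["ba", "bo", "bu", "pa", "po", "pu", "fa", "fo", "fu"]

missed, false_hits = golf_check(r"[bpf][aou]", valid, candidates)
print(missed, false_hits)
```

A regex solves the task only when both returned lists are empty; the same check would apply unchanged to the full 770-combination grid.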

Owner

MaLeLabTs commented Feb 15, 2021

The Regex Golf application is undocumented code that was not designed for release; it is also quite old and would surely need a refresh to work properly. I am sorry, but we have no plans to release it or to update it with new features.

I suppose you can try the RegexGenerator code (this project) by framing your problem as a text-extraction problem and generating a dataset accordingly. Either build some examples in which one correct syllable is extracted in full and other examples from which nothing is extracted, or, alternatively, mix Mandarin syllables (extracted) with non-Mandarin syllables (not extracted) within the same examples. Please note that the Regex Golf algorithm is quite different from the RegexGenerator one; for the latter, refer to the article "Inference of Regular Expressions for Text Extraction from Examples". For dataset structure and usage, please refer to the RegexGenerator documentation.
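The "mixed" variant described above could be prepared roughly as follows. This is a sketch under assumptions: `build_example` is a hypothetical helper, and the field names `string`, `match`, `unmatch`, `start`, and `end` are placeholders that should be checked against the RegexGenerator documentation before use.

```python
import json


def build_example(tokens, valid):
    """Join tokens with spaces and record the [start, end) span of each one.

    Spans of valid syllables go to "match" (to be extracted); spans of
    all other tokens go to "unmatch" (not to be extracted).
    """
    valid = set(valid)
    text = " ".join(tokens)
    match, unmatch = [], []
    pos = 0
    for tok in tokens:
        span = {"start": pos, "end": pos + len(tok)}
        (match if tok in valid else unmatch).append(span)
        pos += len(tok) + 1  # step over the separating space
    return {"string": text, "match": match, "unmatch": unmatch}


# Toy data (assumption): two valid and two invalid combinations in one example.
example = build_example(["ba", "be", "po", "fe"], {"ba", "po"})
print(json.dumps(example, indent=2))
```

The "one syllable per example" variant is the same call with a single-token list, which yields either one full-string match or none.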
