I am sorry to post this here, but I can't find a repo for the regex golf tool. I am trying to find a regex that matches all ~420 valid Mandarin syllables (those actually used in the language, as listed at http://pinyin.info/rules/initials_finals.html) and nothing else out of the 22*35=770 possible initial/final combinations. This task has historically led to monster regexes, and I am quite interested in how RG handles it.
Judging by its description, the golf tool seems to be the best fit for this use case. I could manually construct a "dataset", but that feels a bit awkward.
The Regex Golf application is undocumented code that was not designed for release. It is also quite old and would surely need a refresh to work properly. I am sorry, but we have no plans to release it or to update it with new features.
I suppose you can try the RegexGenerator code (this project) by framing your problem as a text-extraction problem and generating a dataset accordingly. Either build examples where a correct syllable is extracted in full and nothing is extracted from the other examples, or, alternatively, mix Mandarin syllables (extracted) with non-Mandarin syllables (not extracted) within the same examples. Please note that the Regex Golf algorithm is quite different from the RegexGenerator one; for the latter, refer to the article "Inference of Regular Expressions for Text Extraction from Examples". For the dataset structure and usage, please refer to the RegexGenerator documentation.
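To make the two dataset layouts above more concrete, here is a minimal Python sketch of how the examples could be generated. Everything in it is an assumption for illustration: the `INITIALS`, `FINALS`, and `VALID` tables are tiny placeholder subsets, and the `{"string": ..., "match": ...}` layout is a hypothetical intermediate structure, not the actual RegexGenerator dataset schema (check the RegexGenerator documentation for that).

```python
from itertools import product

# Tiny illustrative subsets; the full tables would come from the pinyin chart.
INITIALS = ["b", "p", "m", "f"]
FINALS = ["a", "o", "e", "ai"]
# Placeholder set of syllables treated as valid in this sketch.
VALID = {"ba", "bo", "bai", "pa", "po", "pai",
         "ma", "mo", "me", "mai", "fa", "fo"}


def build_examples(initials, finals, valid):
    """One example per initial/final combination.

    Valid syllables are marked as extracted in full; invalid ones
    have an empty list of spans (nothing to extract).
    """
    examples = []
    for initial, final in product(initials, finals):
        syllable = initial + final
        spans = [(0, len(syllable))] if syllable in valid else []
        examples.append({"string": syllable, "match": spans})
    return examples


def build_mixed_example(syllables, valid, sep=" "):
    """A single example mixing valid and invalid syllables.

    Spans are (start, end) character offsets, end exclusive, covering
    only the valid syllables inside the joined string.
    """
    spans = []
    pos = 0
    for syllable in syllables:
        if syllable in valid:
            spans.append((pos, pos + len(syllable)))
        pos += len(syllable) + len(sep)
    return {"string": sep.join(syllables), "match": spans}


examples = build_examples(INITIALS, FINALS, VALID)
mixed = build_mixed_example(["ba", "be", "mai"], VALID)
```

The resulting lists would still need to be converted into whatever format RegexGenerator expects before use.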