Use the regex library for extended syntax and flags #500

Open
slevithan opened this issue Aug 21, 2024 · 1 comment
slevithan commented Aug 21, 2024

Love the project! 😊

What do you think about adding the regex package to offer extended JS regex syntax (atomic groups, possessive quantifiers, subroutines, etc.)? A couple of options:

  • Use it by default. This might be a good approach since regex uses a strict superset of JS regex syntax, and all of its extended syntax is an error in native JS regexes.
    • Potentially add an option to enable free spacing mode (flag x), which is on by default in regex but could be disabled by default in RexReplace to avoid changing existing behavior.
  • Add an option (or potentially reuse the existing -E/--engine) to switch from native JS syntax to regex.
    • In this case, presumably you'd leave all of regex's implicit flags enabled.
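The "error in native JS" property can be checked directly against the built-in `RegExp` constructor. The following is a minimal sketch; the helper name `isNativeSyntaxError` and the sample patterns are made up for illustration and are not part of regex or RexReplace. It tests with flag u, since regex always uses flag u or v.

```javascript
// Hypothetical helper: true when native RegExp rejects the pattern.
function isNativeSyntaxError(pattern, flags = 'u') {
  try {
    new RegExp(pattern, flags);
    return false;
  } catch (err) {
    if (err instanceof SyntaxError) return true;
    throw err; // anything else is unexpected
  }
}

// Sample patterns using syntax that regex adds on top of native JS.
const extendedSyntax = {
  atomicGroup: String.raw`(?>a+)b`,
  possessiveQuantifier: String.raw`a++b`,
  subroutine: String.raw`(?<n>a)\g<n>`,
};

for (const [name, pattern] of Object.entries(extendedSyntax)) {
  // Each of these is rejected natively when flag u is set.
  console.log(name, isNativeSyntaxError(pattern));
}
```

One caveat: without flag u, native JS accepts `\g` as an escaped literal `g`, so the subroutine pattern above only throws in Unicode-aware mode.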

Note: Since RexReplace would need to call regex with dynamic input rather than using it with backticks as a template tag, that would work like this: regex({raw: [<pattern-string>]}) or regex(…)({raw: [<pattern-string>]}). Something like the following options might work best if regex was used by default:

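```js
import {regex} from 'regex';

// `pattern` is the pattern string supplied by the user
regex({
  flags: 'gim',
  subclass: true,
  disable: {
    n: true, // keep unnamed groups capturing, as in native JS
    x: true, // keep whitespace significant, as in native JS
  },
})({raw: [pattern]});
```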
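For context on why the `{raw: [pattern]}` form works: a template tag is an ordinary function whose first argument is an array of strings carrying a `raw` property, so passing a plain object with a `raw` array stands in for the template. The sketch below uses a hypothetical stand-in tag, `rawTag`, not regex itself, plus the built-in `String.raw`, which accepts the same shape.

```javascript
// Hypothetical stand-in for a template tag that reads `strings.raw`;
// it only joins the raw parts and is not the regex library.
function rawTag(strings, ...values) {
  return strings.raw
    .map((part, i) => part + (i < values.length ? String(values[i]) : ''))
    .join('');
}

// Used as a template tag with a literal pattern.
const viaTemplate = rawTag`\d+\.\d+`;

// Called as a function with dynamic input, e.g. a pattern from the CLI.
const pattern = String.raw`\d+\.\d+`;
const viaObject = rawTag({raw: [pattern]});

console.log(viaTemplate === viaObject); // true

// The built-in String.raw accepts the same object shape.
console.log(String.raw({raw: [pattern]}) === pattern); // true
```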

Edit: If you also wanted to disable flag v's changes to escaping rules within character classes (to avoid breaking changes), you could add the options disable: {v: true} and unicodeSetsPlugin: null. You can see more details about all of regex's options here, but essentially, option disable: {v: true} means to use flag u even in environments that support v natively, and unicodeSetsPlugin: null tells regex not to apply flag v's escaping rules when using flag u. Note that regex always uses flag u or v, so using the library would implicitly set RexReplace's -u/--unicode option. But I think it might be a good breaking change to always implicitly use Unicode-aware mode anyway, since Unicode-unaware mode can silently introduce many Unicode-related bugs, doesn't get the benefit of strict errors for weird legacy syntax, and doesn't support \u{…}, \p{…}, or \P{…}. (Flag u has been supported since ES6/ES2015.)
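To make the Unicode-unaware pitfalls concrete, here is a small sketch using only native `RegExp` (no regex or RexReplace involved). The helper `compiles` and the sample inputs are illustrative choices.

```javascript
const emoji = '\u{1F60A}'; // 😊, a code point outside the BMP

// Hypothetical helper: true when the pattern compiles natively.
function compiles(pattern, flags) {
  try {
    new RegExp(pattern, flags);
    return true;
  } catch {
    return false;
  }
}

const results = {
  // Without u, the dot sees the emoji as two UTF-16 code units.
  dotWithoutU: /^.$/.test(emoji), // false
  dotWithU: /^.$/u.test(emoji), // true
  // Without u, \p{L} silently matches the literal text "p{L}".
  propertyWithoutU: /^\p{L}$/.test('p{L}'), // true
  propertyWithU: /^\p{L}$/u.test('\u00E9'), // true (é is a letter)
  codePointWithU: /^\u{1F60A}$/u.test(emoji), // true
  // Legacy syntax such as a dangling "{" is only rejected with u.
  danglingBraceWithoutU: compiles('a{', ''), // true
  danglingBraceWithU: compiles('a{', 'u'), // false
};

console.log(results);
```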

mathiasrw (Owner) commented

It's a great idea. It was actually how I came across your regex lib. I have been considering changing the default flags and got sucked into the world of why regex is the way it is. However, it will require a version bump, as things might break. And that is OK.

And thank you so much for also showing how to keep the current default behaviour. I'll dive into this during September, after my current project.

mathiasrw self-assigned this Aug 22, 2024