Client side compilation of other flavours #273

Open
zikaari opened this issue Jul 7, 2018 · 13 comments

Comments

@zikaari

zikaari commented Jul 7, 2018

With WebAssembly support catching up pretty fast, the Oniguruma regex engine has been successfully ported to the web. And it's amazing.

WebAssembly port: NeekSandhu/onigasm

Here's a list of all the regex syntaxes it can handle in the browser itself:

  • Oniguruma (native)
  • POSIX
  • Grep
  • GNU Regex
  • Perl
  • Java
  • Ruby
  • Emacs

At the moment, every syntax other than native Oniguruma is "locked", but each one is really just a flip of a switch away.

Hope this interests the RegExr community 🙂

@zikaari
Author

zikaari commented Jul 7, 2018

Maybe extends #32

@gskinner
Owner

gskinner commented Jul 7, 2018

That's really interesting. In theory we could run it all in a worker as well. The wasm file is pretty large - about 600kb over the wire - or 163kb if our server is smart enough to compress it (I'm not sure if it is). It could completely get rid of the need to do server side execution though, and improve our ability to show live previews.
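
One way to settle the compression question is to look at the `Content-Encoding` header the server sends for the wasm file. This is a minimal sketch, not RegExr code: `isCompressed` and `savingPercent` are hypothetical helper names, and `headers` is assumed to be anything with a `get(name)` method, such as the `headers` of a fetch `Response`.

```javascript
// Hypothetical helper: reports whether a response was served compressed.
// `headers` is assumed to expose get(name), like fetch's Response.headers.
function isCompressed(headers) {
  const encoding = (headers.get('content-encoding') || '').toLowerCase();
  return ['gzip', 'br', 'deflate'].some((name) => encoding.includes(name));
}

// Transfer-size saving, in percent, for a raw size and a compressed size.
function savingPercent(rawSize, compressedSize) {
  return Math.round((1 - compressedSize / rawSize) * 100);
}

// Using the figures quoted above: 600kb raw, 163kb compressed.
console.log(savingPercent(600, 163) + '% smaller when compressed');
```

For a same-origin request, something like `fetch(url).then((res) => isCompressed(res.headers))` would report what the server actually does.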

Adding syntaxes would require more than just flipping a switch, since we need our lexer to support them for syntax highlighting, and we need to document them properly.

Conceivably, we could start by shifting PCRE to this, and then adding other flavours as time permits (or as people contribute pull requests).

@zikaari
Author

zikaari commented Jul 7, 2018

Although the atom/node-oniguruma repo seems pretty inactive, just to be safe I have proposed the "flipping a switch" change over there.

atom/node-oniguruma#81

If it doesn't get a reply, I'm gonna assume an implicit green light and move forward with this change.

@gskinner
Owner

gskinner commented Jul 8, 2018

Is it easy / possible to strip down the wasm file at all by removing features that aren't needed? For example, flavours we don't immediately support, or functionality we don't need (e.g. not sure we'd need the Scanner). Just curious.

@zikaari
Author

zikaari commented Jul 8, 2018

The wasm file is 100% pure, untouched* libonig.so, which is literally the source files of the original kkos/oniguruma repo run through a C compiler.

We can't control what goes in the final binary unless someone is willing to fork kkos/oniguruma and trim that one down instead.

But in my opinion, it's not really worth the effort. WASM can load asynchronously and while it is loading, we can have server driven regexing as usual. Once loaded, we switch to onigasm for subsequent operations.
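
The switch-over described above could look roughly like this. It is only a sketch under stated assumptions: `createSolver`, `serverSolve` and `loadWasmEngine` are hypothetical names, with `loadWasmEngine` assumed to return a promise for a solve function once the wasm has loaded.

```javascript
// Sketch: use a server-backed solver until the wasm engine has loaded.
// `serverSolve` and `loadWasmEngine` are hypothetical, injected by the caller.
function createSolver(serverSolve, loadWasmEngine) {
  let wasmSolve = null; // set once the wasm engine is ready

  // Start loading in the background; on failure keep using the server.
  const ready = loadWasmEngine().then(
    (solve) => {
      wasmSolve = solve;
      return true;
    },
    () => false
  );

  return {
    ready, // resolves to true if the wasm engine loaded, false otherwise
    solve: (pattern, text) =>
      wasmSolve ? wasmSolve(pattern, text) : serverSolve(pattern, text),
  };
}
```

Callers only ever use `solve`, so the hand-off from server to wasm needs no changes elsewhere in the app.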

* libonig.so + some glue code

@gskinner
Owner

gskinner commented Jul 8, 2018

I figured that was the case, but it never hurts to ask. We'd probably just toast the server side solving if we implemented this. The js for the whole app is about the same size, so we could just load the wasm when the user changes flavours, and keep it loaded. The first solve will be a bit slow, but shouldn't be too bad.
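
Loading on the first flavour change and keeping the result around amounts to caching a promise. A minimal sketch, assuming a hypothetical `fetchEngine` function that returns a promise for the loaded engine; `lazyOnce` is likewise a made-up name:

```javascript
// Sketch: run the loader on first use, then hand back the same promise.
// `fetchEngine` is a hypothetical loader supplied by the caller.
function lazyOnce(fetchEngine) {
  let pending = null;
  return () => {
    if (pending === null) {
      pending = fetchEngine().catch((err) => {
        pending = null; // allow a retry after a failed load
        throw err;
      });
    }
    return pending;
  };
}

// Usage: the loader runs once however often the flavour changes.
const getEngine = lazyOnce(() => Promise.resolve('engine'));
getEngine().then((engine) => console.log('loaded', engine));
```

Only the first call pays the download cost; later calls resolve from the cached promise.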

@zikaari
Author

zikaari commented Jul 8, 2018

Just noticed this:

... or 163kb if our server is smart enough to compress it (I'm not sure if it is)

If the RegExr server doesn't support gzip, you can pull the wasm from the jsDelivr CDN (159KB).

Pseudo implementation

import { loadWASM, OnigRegExp, OnigString } from 'onigasm'

(async () => {
    // Fetch and instantiate the wasm binary before using any onigasm class
    await loadWASM('https://cdn.jsdelivr.net/npm/[email protected]/lib/onigasm.wasm')

    let text = new OnigString(textInput.getValue())
    // Pseudo: the `syntax` option relies on the other syntaxes being unlocked
    let regexp = new OnigRegExp(regexpInput.getValue(), { syntax: 'perl' })

    const updateHighlights = () => {
        const match = regexp.searchSync(text)
        console.log(match)
        /*
        [
            {index: 0, start: 1, end: 4, match: 'abc', length: 3}, // entire match
            {index: 1, start: 2, end: 3, match: 'b', length: 1}    // first capture group
        ]
        */
    }

    // Rebuild the regexp or the text whenever either input changes
    regexpInput.on('change', () => {
        regexp = new OnigRegExp(regexpInput.getValue(), { syntax: 'perl' })
        updateHighlights()
    })
    textInput.on('change', () => {
        text = new OnigString(textInput.getValue())
        updateHighlights()
    })
})()
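
The pseudo implementation logs the first match only. Highlighting every match would mean searching again from the end of the previous one. The sketch below assumes `searchSync` accepts a start position as its second argument and returns `null` when nothing matches; `findAll` and `stubRegExp` are hypothetical names, and the stub stands in for onigasm so the example runs on its own.

```javascript
// Sketch: collect every match by searching again from the end of the last one.
// Assumption: regexp.searchSync(text, start) returns an array of captures
// shaped like the logged output above, or null when nothing matches.
function findAll(regexp, text, limit = 10000) {
  const matches = [];
  let start = 0;
  while (start <= text.length && matches.length < limit) {
    const captures = regexp.searchSync(text, start);
    if (!captures) break;
    matches.push(captures);
    const whole = captures[0];
    // Step past zero-length matches so the loop always advances.
    start = whole.end > whole.start ? whole.end : whole.end + 1;
  }
  return matches;
}

// Stand-in built on the native RegExp; it reports the whole match only.
function stubRegExp(source) {
  return {
    searchSync(text, start = 0) {
      const re = new RegExp(source, 'g');
      re.lastIndex = start;
      const m = re.exec(text);
      if (!m) return null;
      const end = m.index + m[0].length;
      return [{ index: 0, start: m.index, end, match: m[0], length: m[0].length }];
    },
  };
}

console.log(findAll(stubRegExp('b+'), 'abbcb').map((captures) => captures[0].match));
```

The `limit` argument is a guard against runaway loops on very large inputs and is not something the thread specifies.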

Regarding,

... We'd probably just toast the server side solving if we implemented this

I think you might wanna leave that in for another 2-3 years, until WebAssembly-supporting browsers become mainstream and you decide to stop supporting older browsers.
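
Keeping both paths implies a runtime check for WebAssembly support. A minimal sketch using the standard `WebAssembly` global; `supportsWasm` is a hypothetical helper name and the `'wasm'` / `'server'` labels are placeholders:

```javascript
// Sketch: detect WebAssembly support before choosing the client-side engine.
function supportsWasm() {
  try {
    if (typeof WebAssembly !== 'object' || typeof WebAssembly.instantiate !== 'function') {
      return false;
    }
    // Smallest valid module: the "\0asm" magic number followed by version 1.
    const emptyModule = new Uint8Array([0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00]);
    return WebAssembly.validate(emptyModule);
  } catch (err) {
    return false; // treat any unexpected error as "not supported"
  }
}

// Placeholder labels: older browsers stay on the server-side path.
const engine = supportsWasm() ? 'wasm' : 'server';
console.log('using', engine);
```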

@zikaari
Author

zikaari commented Jul 8, 2018

Queued up zikaari/onigasm#14

@Eitz

Eitz commented Aug 14, 2018

Any news on this, @NeekSandhu / @gskinner?

@gskinner
Owner

Not yet. I have one more update in the queue before I look at this in more detail.

@gskinner
Owner

We just pushed v3.5, and I plan to evaluate this for v3.6

@gskinner
Owner

Quick update: This is on my list to look at over the next little bit.

@goyalyashpal

Hi! Just curious if there are any updates on this. Thanks.
