This repository has been archived by the owner on Dec 15, 2022. It is now read-only.

first-mate not taking advantage of caching in Oniguruma #93

Closed
winstliu opened this issue Apr 18, 2017 · 2 comments

Comments

@winstliu
Contributor

winstliu commented Apr 18, 2017

I've recently been investigating why first-mate takes so long to tokenize files with very long lines. For reference, here's the current performance (in milliseconds):

Tokenizing jQuery v2.0.3: 1341 ms

Tokenizing jQuery v2.0.3 minified: 1403255 ms

Tokenizing Bootstrap CSS v3.1.1: 523 ms

Tokenizing Bootstrap CSS v3.1.1 minified: 20760 ms

As you can see, it takes around 23 minutes to fully tokenize jquery.min.js, which is absolutely unacceptable.

It turns out the reason for this is that we haven't been utilizing the caching that Oniguruma offers. Here's a breakdown of the history:

In order to enable caching, it appears that we need to send Oniguruma an OnigString of the line we want to tokenize, rather than a JavaScript String. Unfortunately, I have thus far been unable to make this work, as I get differing results depending on whether I pass in an OnigString or a String.
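To illustrate why a wrapper object matters for caching, here is a minimal sketch in plain JavaScript. It is not node-oniguruma's implementation: `LineString`, `CachingScanner`, and `tokenize` are hypothetical names, and it uses JavaScript's built-in `RegExp` instead of Oniguruma. The assumption it demonstrates is that per-pattern search results can be cached against the identity of a wrapper object and reused across calls on the same line, whereas a plain string gives the scanner nothing stable to key on.

```javascript
// Hypothetical sketch only: NOT node-oniguruma's real code or API.
// LineString wraps one line so caches can be keyed on object identity.
class LineString {
  constructor(text) {
    this.text = text;
  }
}

class CachingScanner {
  constructor(patterns) {
    // The 'g' flag lets lastIndex control where each search starts.
    this.regexes = patterns.map((source) => new RegExp(source, 'g'));
    this.cache = new WeakMap(); // LineString -> cached result per pattern
    this.searchCount = 0; // number of real regex searches performed
  }

  // Return the earliest match among all patterns at or after `start`.
  findNextMatch(line, start) {
    if (typeof line === 'string') {
      // A plain string is wrapped in a throwaway object, so every call
      // starts with a cold cache.
      line = new LineString(line);
    }
    let entries = this.cache.get(line);
    if (!entries) {
      entries = this.regexes.map(() => null);
      this.cache.set(line, entries);
    }
    let best = null;
    this.regexes.forEach((regex, i) => {
      let entry = entries[i];
      // A cached result is still valid if it was computed from a position
      // at or before `start` and its match (if any) begins at or after it.
      const reusable =
        entry !== null &&
        entry.from <= start &&
        (entry.match === null || entry.match.index >= start);
      if (!reusable) {
        regex.lastIndex = start;
        const m = regex.exec(line.text);
        this.searchCount++;
        entry = {
          from: start,
          match: m ? { index: m.index, length: m[0].length, patternIndex: i } : null,
        };
        entries[i] = entry;
      }
      if (entry.match && (best === null || entry.match.index < best.index)) {
        best = entry.match;
      }
    });
    return best;
  }
}

// Walk a line from left to right, collecting every match.
function tokenize(scanner, line) {
  const matches = [];
  let start = 0;
  let match;
  while ((match = scanner.findNextMatch(line, start)) !== null) {
    matches.push(match);
    start = match.index + Math.max(match.length, 1); // always advance
  }
  return matches;
}

const patterns = ['\\d+', 'function', 'zzz'];
const text = 'a1 b22 function c3';

const coldScanner = new CachingScanner(patterns);
tokenize(coldScanner, text); // plain string: no reuse between calls

const warmScanner = new CachingScanner(patterns);
tokenize(warmScanner, new LineString(text)); // wrapper: results are reused

console.log('searches with String:', coldScanner.searchCount);
console.log('searches with LineString:', warmScanner.searchCount);
```

In this sketch both calls return the same matches, and the wrapped version performs fewer regex searches because a pattern whose next match lies further along the line is not searched again. On a very long line with many patterns that difference grows quickly, which is the kind of saving the caching is meant to provide.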

/cc: @nathansobo

@nathansobo
Contributor

/cc @maxbrunsfeld @as-cii

@winstliu
Contributor Author

❤️ Thanks @maxbrunsfeld!
