esm: use wasm version of cjs-module-lexer #60663
Conversation
I am somewhat puzzled why we are using WASM for this. It seems there are a lot of hoops being jumped through in lexer.js that could have been avoided if we just parsed it natively (e.g. copying strings into a buffer as UTF-16), as far as I can tell @guybedford
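For context, a minimal sketch of the kind of detection cjs-module-lexer performs: it scans CJS source for named-export patterns so that ESM `import` statements can bind them. `detectNamedExports` is a hypothetical name for illustration; the real lexer is far more involved (it handles strings, comments, `Object.defineProperty`, reexports, and so on), and this regex toy only covers the simplest assignment form.

```javascript
// Toy illustration (not the real lexer): detect names assigned via
// `exports.name = ...` or `module.exports.name = ...`.
function detectNamedExports(source) {
  const names = new Set();
  const pattern = /(?:module\.)?exports\.([A-Za-z_$][\w$]*)\s*=/g;
  let match;
  while ((match = pattern.exec(source)) !== null) {
    names.add(match[1]);
  }
  return [...names];
}

const sample = 'exports.parse = parse;\nmodule.exports.init = init;';
console.log(detectNamedExports(sample)); // [ 'parse', 'init' ]
```

The discussion above is about where this scanning happens: in a WASM build of the lexer, in plain JS, or in a future native (C++/Rust) implementation.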
Codecov Report

✅ All modified and coverable lines are covered by tests.

@@            Coverage Diff             @@
##             main   #60663      +/-   ##
==========================================
- Coverage   88.55%   88.54%   -0.01%
==========================================
  Files         703      703
  Lines      208077   208082       +5
  Branches    40083    40086       +3
==========================================
- Hits       184254   184247       -7
- Misses      15841    15844       +3
- Partials     7982     7991       +9
Force-pushed from 31c1883 to 7465b8b
Updated the benchmark again to split the measurement: one part measures loading the lexer (which is now faster), the other measures doing the parsing after the lexer is already loaded (which is now slower, indicating the WASM version parses more slowly than the JS version).
The original benchmark uses a not very realistic fixture (it has a huge try-catch block that would throw on the first line and then export at the end, hardly representative of real-world code). Also, it measures the entire import including evaluation, not just parsing. This updates the name to import-cjs to be more accurate, and uses typescript.js as the fixture, which has been reported to be slow to import, leading users to use require() to work around the performance impact. It splits the measurement into two different types: parsing CJS for the first time (where the overhead of loading the lexer makes a difference) and parsing CJS after the lexer has been loaded.
The synchronous version has been available since 1.4.0.
Force-pushed from 7465b8b to 19b3154
This sounds correct to me; the Wasm gain is avoiding the warm-up, and that was always the story for es-module-lexer, which this came from. Doing a safe C++ or Rust port would be beneficial. @anonrig started some Rust work here previously in https://github.com/anonrig/commonjs-lexer. The code would need to be carefully vetted for safety properties for a full C++ inclusion, but that could also very much be a good approach to follow.
I'm extremely close to convincing @lemire to revive that work. Maybe we should do it sooner.
FWIW I think that if we want to rewrite it in native code, the native API should take UTF-16 (and/or Latin-1) for input and try not to assume the data comes in as UTF-8, to avoid the transcoding.
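A small illustration of the transcoding concern: V8 stores strings internally as Latin-1 or UTF-16, so a native lexer that only accepts UTF-8 forces a full copy/transcode of the source before parsing even starts. No real native API is called here; the UTF-8 buffer below is just a proxy for that up-front cost.

```javascript
// A non-ASCII character makes the UTF-8 encoding diverge in length from
// the engine's own code-unit view of the string.
const source = 'exports.naïve = 1;';

// What a UTF-8-only native API would require: an up-front transcode/copy.
const utf8Copy = Buffer.from(source, 'utf8');

// What a UTF-16-aware API could consume directly: the code units the
// engine already holds (charCodeAt reads them without transcoding).
const firstCodeUnit = source.charCodeAt(0);

console.log(utf8Copy.length, source.length, firstCodeUnit);
```

Accepting the engine's native representations at the boundary would let a future C++/Rust lexer skip this copy entirely.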
Landed in 2388991...04a086a |
benchmark: use typescript for import cjs benchmark

The original benchmark uses a not very realistic fixture (it has
a huge try-catch block that would throw on the first line and then
export at the end, hardly representative of real-world code).
Also, it measures the entire import including evaluation, not just
parsing. This updates the name to import-cjs to be more accurate,
and uses typescript.js as the fixture, which has been reported
to be slow to import, leading users to use require() to work around
the performance impact. It splits the measurement into two different
types: parsing CJS for the first time (where the overhead of
loading the lexer makes a difference) and parsing CJS after the
lexer has been loaded.

PR-URL: #60663
Refs: #59913
Reviewed-By: Geoffrey Booth <webadmin@geoffreybooth.com>
Reviewed-By: Yagiz Nizipli <yagiz@nizipli.com>
Reviewed-By: Luigi Pinca <luigipinca@gmail.com>
esm: use wasm version of cjs-module-lexer
The synchronous version has been available since 1.4.0.
Refs: #59913