esm: use wasm version of cjs-module-lexer #60663

joyeecheung · 2025-11-10T12:43:56Z

benchmark: use typescript for import cjs benchmark

The original benchmark uses a not very realistic fixture (it has
a huge try-catch block that throws on the first line and then
exports at the end, hardly representative of real-world code).
Also, it measures the entire import including evaluation, not just
parsing. This renames the benchmark to import-cjs to be more
accurate, and uses typescript.js as the fixture, which has been
reported to be slow to import, leading users to use require() to
work around the performance impact. It splits the measurement into
two types: parsing CJS for the first time (where the overhead of
loading the lexer makes a difference) and parsing CJS after the
lexer has been loaded.

esm: use wasm version of cjs-module-lexer

The synchronous version has been available since 1.4.0.

Refs: #59913

                              confidence improvement accuracy (*)   (**)  (***)
esm/import-cjs.js type='cold'        ***     22.09 %       ±3.65% ±5.00% ±6.81%
esm/import-cjs.js type='warm'        ***     -3.93 %       ±1.69% ±2.29% ±3.06%

nodejs-github-bot · 2025-11-10T12:44:01Z

Review requested:

@nodejs/loaders
@nodejs/performance

joyeecheung · 2025-11-10T12:59:25Z

I am somewhat puzzled why we are using WASM for this. As far as I can tell, lexer.js jumps through a lot of hoops (e.g. copying strings into a buffer as UTF16) that could have been avoided if we just parsed natively. @guybedford
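To make the "copying strings into a buffer as UTF16" step concrete, here is a minimal sketch of that kind of copy. It is not the actual code in lexer.js; `copyToUtf16` and its default buffer are hypothetical names used only for illustration, and the `ArrayBuffer` merely stands in for WASM memory.

```javascript
// Hypothetical helper, NOT the real lexer.js code: copy a JS string into a
// Uint16Array view, one UTF-16 code unit per element, the way a WASM-backed
// lexer has to before it can scan the source.
function copyToUtf16(source, memory = new ArrayBuffer(source.length * 2)) {
  // The ArrayBuffer stands in for WASM linear memory (an assumption).
  const view = new Uint16Array(memory, 0, source.length);
  for (let i = 0; i < source.length; i++) {
    view[i] = source.charCodeAt(i); // charCodeAt returns the UTF-16 code unit
  }
  return view;
}

const view = copyToUtf16('exports.a = 1;');
console.log(view.length, view.byteLength); // 14 code units, 28 bytes
```

The copy is linear in the source length and happens on every parse, which is the overhead a native parser reading the string's existing storage could skip.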

codecov · 2025-11-10T13:55:00Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 88.54%. Comparing base (bd3a202) to head (19b3154).
⚠️ Report is 19 commits behind head on main.

Additional details and impacted files

@@            Coverage Diff             @@
##             main   #60663      +/-   ##
==========================================
- Coverage   88.55%   88.54%   -0.01%     
==========================================
  Files         703      703              
  Lines      208077   208082       +5     
  Branches    40083    40086       +3     
==========================================
- Hits       184254   184247       -7     
- Misses      15841    15844       +3     
- Partials     7982     7991       +9

| Files with missing lines | Coverage Δ | |
|---|---|---|
| lib/internal/modules/esm/translators.js | `92.98% <100.00%> (+0.05%)` | ⬆️ |

... and 30 files with indirect coverage changes

joyeecheung · 2025-11-10T14:27:52Z

Updated the benchmark again to split the measurement: one type measures loading the lexer (which is now faster), and the other measures parsing after the lexer is already loaded (which is now slower, indicating that the WASM version parses more slowly than the JS version).
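The cold/warm split can be sketched roughly as follows. This is not the benchmark in the PR, which imports typescript.js through the ESM loader; `createLazyParser` and `measureColdAndWarm` are hypothetical stand-ins, and the regex-based "parser" only simulates a lexer with a one-time setup cost.

```javascript
// Hypothetical stand-in for a lexer that pays a one-time setup cost on first use.
function createLazyParser() {
  let initCount = 0;
  let parseImpl = null;
  function parse(source) {
    if (parseImpl === null) {
      initCount++;
      // Simulated one-time initialization (e.g. loading or compiling the lexer).
      const exportPattern = /exports\.(\w+)\s*=/g;
      parseImpl = (src) => Array.from(src.matchAll(exportPattern), (m) => m[1]);
    }
    return parseImpl(source);
  }
  return { parse, getInitCount: () => initCount };
}

// Time the first call (cold: includes initialization) separately from the
// average of later calls (warm: parsing only).
function measureColdAndWarm(parser, source, warmRuns = 100) {
  const coldStart = performance.now();
  parser.parse(source);
  const coldMs = performance.now() - coldStart;

  const warmStart = performance.now();
  for (let i = 0; i < warmRuns; i++) parser.parse(source);
  const warmMs = (performance.now() - warmStart) / warmRuns;

  return { coldMs, warmMs };
}

const parser = createLazyParser();
console.log(measureColdAndWarm(parser, 'exports.a = 1; exports.b = 2;'));
```

Keeping the two numbers apart is what lets a change show up as faster in one case and slower in the other, instead of the two effects cancelling out in a single figure.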

guybedford · 2025-11-10T17:51:48Z

This sounds correct to me, the Wasm gain is avoiding the warm up, and that was always the story for es module lexer this came from.

Doing a safe C++ or Rust port would be beneficial. @anonrig started some Rust work here previously in https://github.com/anonrig/commonjs-lexer.

The code would need to be carefully vetted for safety properties for a full C++ inclusion, but that could also very much be a good approach to follow.

anonrig · 2025-11-11T00:12:14Z

> This sounds correct to me, the Wasm gain is avoiding the warm up, and that was always the story for es module lexer this came from.
>
> Doing a safe C++ or Rust port would be beneficial. @anonrig started some Rust work here previously in https://github.com/anonrig/commonjs-lexer.
>
> The code would need to be carefully vetted for safety properties for a full C++ inclusion, but that could also very much be a good approach to follow.

I'm extremely close to convincing @lemire to revive that work. Maybe we should do it sooner.

joyeecheung · 2025-11-11T02:57:54Z

FWIW I think if we want to rewrite it to native, the native API should take UTF16 (or +Latin1) for input and try not to assume the data comes in UTF8 to avoid the transcoding.
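As a rough illustration of why the input encoding matters: V8 keeps strings in a one-byte (Latin-1) or two-byte (UTF-16) representation, so a native API that accepts those forms could read the contents as stored, whereas a UTF-8 API needs a conversion pass first. The sketch below only compares encoded sizes using Node's `Buffer.byteLength`; `encodedSizes` is a hypothetical helper and not part of any proposed native API.

```javascript
// Hypothetical helper: report how many bytes a source string occupies in each
// encoding a native lexer might accept.
function encodedSizes(source) {
  return {
    latin1: Buffer.byteLength(source, 'latin1'),   // one byte per code unit (lossy above U+00FF)
    utf16le: Buffer.byteLength(source, 'utf16le'), // two bytes per code unit
    utf8: Buffer.byteLength(source, 'utf8'),       // variable width, needs transcoding
  };
}

console.log(encodedSizes('exports.a = 1'));
// { latin1: 13, utf16le: 26, utf8: 13 }
console.log(encodedSizes('é'));
// { latin1: 1, utf16le: 2, utf8: 2 }
```

For ASCII-only sources the UTF-8 bytes match the Latin-1 bytes, but as soon as a non-ASCII character appears the UTF-8 form diverges from both in-memory layouts, so it can no longer be produced by a plain copy.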

nodejs-github-bot · 2025-11-11T03:40:17Z

CI: https://ci.nodejs.org/job/node-test-pull-request/70138/

nodejs-github-bot · 2025-11-12T18:35:01Z

Landed in 2388991...04a086a

The original benchmark uses a not very realistic fixture (it has a huge try-catch block that throws on the first line and then exports at the end, hardly representative of real-world code). Also, it measures the entire import including evaluation, not just parsing. This renames the benchmark to import-cjs to be more accurate, and uses typescript.js as the fixture, which has been reported to be slow to import, leading users to use require() to work around the performance impact. It splits the measurement into two types: parsing CJS for the first time (where the overhead of loading the lexer makes a difference) and parsing CJS after the lexer has been loaded.

PR-URL: #60663
Refs: #59913
Reviewed-By: Geoffrey Booth <webadmin@geoffreybooth.com>
Reviewed-By: Yagiz Nizipli <yagiz@nizipli.com>
Reviewed-By: Luigi Pinca <luigipinca@gmail.com>

The synchronous version has been available since 1.4.0.

PR-URL: #60663
Refs: #59913
Reviewed-By: Geoffrey Booth <webadmin@geoffreybooth.com>
Reviewed-By: Yagiz Nizipli <yagiz@nizipli.com>
Reviewed-By: Luigi Pinca <luigipinca@gmail.com>

nodejs-github-bot added the esm and needs-ci labels Nov 10, 2025

joyeecheung changed the title ~~Cjs lexer init~~ esm: use wasm version of cjs-module-lexer Nov 10, 2025

joyeecheung force-pushed the cjs-lexer-init branch from 31c1883 to 7465b8b November 10, 2025 14:26

joyeecheung added 2 commits November 10, 2025 15:30

esm: use wasm version of cjs-module-lexer

19b3154

The synchronous version has been available since 1.4.0.

joyeecheung force-pushed the cjs-lexer-init branch from 7465b8b to 19b3154 November 10, 2025 14:30

GeoffreyBooth approved these changes Nov 11, 2025

anonrig approved these changes Nov 11, 2025

joyeecheung added the request-ci label Nov 11, 2025

github-actions bot removed the request-ci label Nov 11, 2025

lpinca approved these changes Nov 11, 2025

joyeecheung added the commit-queue and commit-queue-rebase labels Nov 12, 2025

nodejs-github-bot removed the commit-queue label Nov 12, 2025

nodejs-github-bot closed this Nov 12, 2025

richardlau mentioned this pull request Dec 10, 2025

es-module/test-typescript timing out and failing on AIX on v25.x-staging #61017

Open
