esm: use undici/fetch data url parser #54748

KhafraDev · 2024-09-03T20:49:28Z

Using the fetch parser, rather than a regex, should fix most of these edge cases.

It would probably be better to export the parser directly so we can use it for the sync parsing too. Just wanted to gauge how correct or welcome this change would be.

fetch should also be used for blob: urls, whenever support is added for them, as it handles edge cases regarding those as well. If someone ever brings back http/https imports, fetch should probably be used there as well. :)

Fixes #53775
Fixes #42890
Closes #51324

nodejs-github-bot · 2024-09-03T20:49:33Z

Review requested:

@nodejs/loaders

KhafraDev · 2024-09-03T20:51:07Z

test/es-module/test-esm-data-urls.js

-    } catch (e) {
-      assert.strictEqual(e.code, 'ERR_INVALID_URL');
-    }
+    await assert.rejects(import(plainESMURL), { code: 'ERR_UNKNOWN_MODULE_FORMAT' })


data:invalid,null is a valid data url afaik
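A minimal sketch of that point at the URL level only, using the WHATWG `URL` class available in Node.js. The split on the first comma is a simplification for illustration, not the parser added in this PR; whether `invalid` maps to a known module format is a separate question, which is why the updated test expects `ERR_UNKNOWN_MODULE_FORMAT` rather than `ERR_INVALID_URL`.

```javascript
// The URL itself parses without throwing.
const url = new URL('data:invalid,null');

const protocol = url.protocol;    // 'data:'
const opaquePath = url.pathname;  // 'invalid,null'

// Simplified split: everything before the first comma is the media type
// part, everything after it is the body.
const commaIndex = opaquePath.indexOf(',');
const mediaTypePart = opaquePath.slice(0, commaIndex);
const bodyPart = opaquePath.slice(commaIndex + 1);

console.log({ protocol, mediaTypePart, bodyPart });
```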

aduh95 · 2024-09-03T20:52:45Z

One problem with this approach is that Undici is not written to be robust against prototype mutation, which seems to be a deal breaker when it comes to loading modules.
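To illustrate the concern, here is a userland sketch of how prototype mutation can change the result of parsing code, and how capturing the original method up front avoids it. `uncurryThis`, `fragileMediaType`, and `robustMediaType` are hypothetical names for this example; Node.js core uses its internal `primordials` object for the same purpose, which is not reachable from user code.

```javascript
// Capture the original methods before any user code can replace them.
// Hypothetical helper: turns a method into a function taking `this` first.
const uncurryThis = (fn) => Function.prototype.call.bind(fn);
const StringPrototypeSlice = uncurryThis(String.prototype.slice);
const StringPrototypeIndexOf = uncurryThis(String.prototype.indexOf);

// Looks up `slice` on the prototype at call time, so it can be hijacked.
function fragileMediaType(input) {
  return input.slice(0, input.indexOf(','));
}

// Uses the captured references, so later mutation has no effect.
function robustMediaType(input) {
  return StringPrototypeSlice(input, 0, StringPrototypeIndexOf(input, ','));
}

const originalSlice = String.prototype.slice;
let fragileResult;
let robustResult;
try {
  // Simulate user code tampering with a built-in prototype.
  String.prototype.slice = function() { return 'text/javascript'; };
  fragileResult = fragileMediaType('text/plain,hello');
  robustResult = robustMediaType('text/plain,hello');
} finally {
  // Always restore the built-in.
  String.prototype.slice = originalSlice;
}

console.log({ fragileResult, robustResult });
```

In this sketch the fragile version reports whatever the tampered method returns, while the robust version still reports the real prefix of the input.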

KhafraDev · 2024-09-03T20:54:41Z

That's true. Maybe I could rip the data url parser from undici and implement it here? It could be useful for node:util or something as well.

KhafraDev · 2024-09-04T01:47:45Z

I took undici's data url parser and added primordials; it's entirely possible I missed things or misused them. ~~It's also likely possible to replace the mimetype parsing with the built-in one.~~

mcollina · 2024-09-04T07:13:57Z

@KhafraDev can you also port the tests for the data url parser?

ljharb

feel free to ignore these, but they might make things simpler and faster

lib/internal/data_url.js

KhafraDev · 2024-09-04T16:45:47Z

I took the WPTs for fetching data urls. If there's a more complete dataset, or one specifically for imports, let me know.

lib/internal/data_url.js

test/parallel/test-data-url.js

mcollina

lgtm

codecov · 2024-09-04T19:44:55Z

Codecov Report

Attention: Patch coverage is 91.16022% with 32 lines in your changes missing coverage. Please review.

Project coverage is 87.61%. Comparing base (5949e16) to head (86d6382).
Report is 344 commits behind head on main.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| lib/internal/data_url.js | 91.76% | 26 Missing and 3 partials ⚠️ |
| lib/internal/modules/esm/load.js | 70.00% | 3 Missing ⚠️ |

Additional details and impacted files

@@           Coverage Diff            @@
##             main   #54748    +/-   ##
========================================
  Coverage   87.60%   87.61%            
========================================
  Files         650      651     +1     
  Lines      182829   183289   +460     
  Branches    35379    35434    +55     
========================================
+ Hits       160173   160591   +418     
- Misses      15928    15950    +22     
- Partials     6728     6748    +20

| Files with missing lines | Coverage Δ | |
| --- | --- | --- |
| lib/internal/modules/esm/load.js | `92.54% <70.00%> (-0.42%)` | ⬇️ |
| lib/internal/data_url.js | `91.76% <91.76%> (ø)` | |

... and 34 files with indirect coverage changes

jasnell · 2024-09-05T16:14:05Z

lib/internal/data_url.js

+  return encoder;
+}
+
+const ASCII_WHITESPACE_REPLACE_REGEX = /[\u0009\u000A\u000C\u000D\u0020]/g // eslint-disable-line


Doesn't need to be in this PR, but having a benchmark for parsing these would be good. Would like to revisit to see if there's a more efficient approach than using the regex.
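As a starting point for that comparison, here is a sketch of a loop-based alternative alongside the regex from the diff. `removeAsciiWhitespaceRegex` and `removeAsciiWhitespaceLoop` are hypothetical names, neither is claimed to be what the PR ships, and no performance result is implied; a real benchmark would be needed to tell which is faster.

```javascript
// Same character class as the regex in the diff: TAB, LF, FF, CR, SPACE.
const ASCII_WHITESPACE_REPLACE_REGEX = /[\u0009\u000A\u000C\u000D\u0020]/g;

function removeAsciiWhitespaceRegex(input) {
  return input.replace(ASCII_WHITESPACE_REPLACE_REGEX, '');
}

function isAsciiWhitespace(code) {
  return code === 0x09 || code === 0x0a || code === 0x0c ||
         code === 0x0d || code === 0x20;
}

// Copies the runs between whitespace characters instead of using a regex.
function removeAsciiWhitespaceLoop(input) {
  let out = '';
  let start = 0;
  for (let i = 0; i < input.length; i++) {
    if (isAsciiWhitespace(input.charCodeAt(i))) {
      out += input.slice(start, i);
      start = i + 1;
    }
  }
  // No whitespace found: return the input unchanged.
  return start === 0 ? input : out + input.slice(start);
}

const sample = ' aGVs\tbG8g\r\nd29y bGQ=\f';
console.log(removeAsciiWhitespaceLoop(sample) === removeAsciiWhitespaceRegex(sample));
```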

nodejs-github-bot · 2024-09-06T23:16:55Z

CI: https://ci.nodejs.org/job/node-test-pull-request/62084/ 🟡

aduh95 · 2024-09-07T08:22:05Z

Landed in 6c85d40

Fixes: #53775
PR-URL: #54748
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
Reviewed-By: Antoine du Hamel <duhamelantoine1995@gmail.com>
Reviewed-By: James M Snell <jasnell@gmail.com>

targos · 2024-09-30T10:15:28Z

I tried to include this in v20.x, but it broke HTTP imports (which were removed from main/v22.x)

See https://github.com/nodejs/node/actions/runs/11102823645/job/30843465228?pr=55170

KhafraDev · 2024-09-30T13:44:54Z

The test that failed is:

it('data: URL can always import other data:', async () => {
  const data = new URL('data:text/javascript,');
  data.searchParams.set('body',
                        'import \'data:text/javascript,import \'data:\''
  );
  // doesn't throw
  const empty = await import(data.href);
  assert.ok(empty);
});

Where data.href is 'data:text/javascript,?body=import+%27data%3Atext%2Fjavascript%2Cimport+%27data%3A%27'. The search gets parsed as part of the body (as data: urls cannot have search params). The test is faulty; there were some I changed in this PR that were similarly incorrect.
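The following sketch reproduces the URL from that test and shows where the `?body=...` text ends up. The split on the first comma and the use of `decodeURIComponent` are simplifications for illustration, not the parser from this PR.

```javascript
const data = new URL('data:text/javascript,');
data.searchParams.set('body',
                      'import \'data:text/javascript,import \'data:\'');

// The query string is appended after the comma, so it lands in the body.
const href = data.href;

// Simplified: the body is everything after the first comma.
const rawBody = href.slice(href.indexOf(',') + 1);
const decodedBody = decodeURIComponent(rawBody);

console.log(href);
console.log(decodedBody);
```

Because the inner comma is percent-encoded as `%2C`, the first literal comma is the one right after `text/javascript`, so the body starts with `?body=` rather than being empty.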

esm: use undici/fetch data url parser

113b17a

Fixes nodejs#53775

nodejs-github-bot added the errors, esm, and needs-ci labels Sep 3, 2024

RedYetiDev added the fetch label and removed the errors label Sep 3, 2024

fixup! take undici's data url parser

884152d

fixup! use MIMEType

59bafe8

ljharb reviewed Sep 4, 2024

aduh95 reviewed Sep 4, 2024

anonrig reviewed Sep 4, 2024

RedYetiDev reviewed Sep 4, 2024

KhafraDev added 2 commits September 4, 2024 11:58

fixup! apply suggestions

73d0399

fixup! add tests

0bcc730

KhafraDev marked this pull request as ready for review September 4, 2024 16:41

ljharb approved these changes Sep 4, 2024

RedYetiDev reviewed Sep 4, 2024

fixup! apply suggestions

02c4d5f

jasnell reviewed Sep 4, 2024

fixup! apply suggestions

9910b30

mcollina approved these changes Sep 4, 2024

aduh95 approved these changes Sep 4, 2024

jasnell reviewed Sep 5, 2024

jasnell approved these changes Sep 5, 2024

aduh95 merged commit 6c85d40 into nodejs:main Sep 7, 2024
38 of 57 checks passed

RedYetiDev mentioned this pull request Sep 14, 2024

Data URLs can't have query params #54944

Closed

RafaelGSS mentioned this pull request Sep 16, 2024

v22.9.0 proposal #54966

Merged

KhafraDev deleted the data-url-fetch-parser branch September 19, 2024 03:18

targos added the backport-requested-v20.x label Sep 30, 2024
