-
Notifications
You must be signed in to change notification settings - Fork 29.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
esm: use undici/fetch data url parser #54748
Conversation
Review requested:
|
test/es-module/test-esm-data-urls.js
Outdated
} catch (e) { | ||
assert.strictEqual(e.code, 'ERR_INVALID_URL'); | ||
} | ||
await assert.rejects(import(plainESMURL), { code: 'ERR_UNKNOWN_MODULE_FORMAT' }) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
data:invalid,null
is a valid data url afaik
One of the problem of this approach is that Undici is not at all written in a way to be robust against prototype mutation, which seems to be a deal breaker when it comes to loading modules. |
that's true, maybe I could rip the data url parser from undici and implement it here? Could be useful for the node:util or something as well. |
I took undici's data url parser and added primordials, it's entirely possible I missed things or misused them. |
@KhafraDev can you also port the tests for the tests for the data url parser? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
feel free to ignore these, but they might make things simpler and faster
I took the WPTs for fetching data urls, if there's a more complete dataset, or if there's one specifically for imports, let me know. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #54748 +/- ##
========================================
Coverage 87.60% 87.61%
========================================
Files 650 651 +1
Lines 182829 183289 +460
Branches 35379 35434 +55
========================================
+ Hits 160173 160591 +418
- Misses 15928 15950 +22
- Partials 6728 6748 +20
|
return encoder; | ||
} | ||
|
||
const ASCII_WHITESPACE_REPLACE_REGEX = /[\u0009\u000A\u000C\u000D\u0020]/g // eslint-disable-line |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doesn't need to be in this PR, but having a benchmark for parsing these would be good. Would like to revisit to see if there's a more efficient approach than using the regex.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as off-topic.
This comment was marked as off-topic.
This comment was marked as off-topic.
This comment was marked as off-topic.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as off-topic.
This comment was marked as off-topic.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as off-topic.
This comment was marked as off-topic.
This comment was marked as off-topic.
This comment was marked as off-topic.
Landed in 6c85d40 |
I tried to include this in v20.x, but it broke HTTP imports (which were removed from main/v22.x) See https://github.com/nodejs/node/actions/runs/11102823645/job/30843465228?pr=55170 |
The test that failed is: it('data: URL can always import other data:', async () => {
const data = new URL('data:text/javascript,');
data.searchParams.set('body',
'import \'data:text/javascript,import \'data:\''
);
// doesn't throw
const empty = await import(data.href);
assert.ok(empty);
}); Where |
Fixes: nodejs#53775 PR-URL: nodejs#54748 Reviewed-By: Matteo Collina <matteo.collina@gmail.com> Reviewed-By: Antoine du Hamel <duhamelantoine1995@gmail.com> Reviewed-By: James M Snell <jasnell@gmail.com>
Using the fetch parser, rather than a regex, should fix most of these edge cases.
It would probably be better to export the parser directly so we can use it for the sync parsing too. Just wanted to gauge how correct or welcome this change would be.
fetch should also be used for
blob:
urls, whenever support is added for them, as it handles edge cases regarding those as well. If someone ever brings back http/https imports, fetch should probably be used there as well. :)Fixes #53775
Fixes #42890
Closes #51324