feat: parser instead of regex for preprocessor #6611
I'm worried about using a parser for this, because we have no idea what sort of syntax people will be using in the original (unpreprocessed) code. For example, we have no idea whether the language that they're trying to preprocess in |
hmm.. fair enough, would it make sense to have an option on the preprocessors to decide whether to use regex or parser? for example:
const preprocessor = {
  style() {},
  markup() {},
  script() {},
  mode: 'parser' | 'regex' = 'regex',
};
with default as though, I'm not sure whether to use the word; an alternative may be to specify it as a version number:
const preprocessor = {
  style() {},
  markup() {},
  script() {},
  version: 1,
}; |
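A minimal sketch of how such an opt-in could be resolved, assuming the `mode` option proposed in the comment above. Both `mode` and the helper `resolveMode` are hypothetical and are not part of the existing Svelte preprocessor API; the point is only that a missing option would fall back to the current regex behaviour.

```javascript
// Hypothetical helper: `mode` is the option proposed in this thread,
// not something the Svelte preprocessor API currently accepts.
function resolveMode(preprocessor) {
  // Falling back to 'regex' keeps existing preprocessors working unchanged.
  const mode = preprocessor.mode ?? 'regex';
  if (mode !== 'regex' && mode !== 'parser') {
    throw new Error(`Unknown preprocessor mode: ${mode}`);
  }
  return mode;
}

// An existing preprocessor group with no option set.
const legacy = { style() {}, markup() {}, script() {} };
// The same group, explicitly opting in to the parser.
const optedIn = { ...legacy, mode: 'parser' };

console.log(resolveMode(legacy), resolveMode(optedIn));
```

A `version: 1` field could be resolved the same way; the trade-off is that a version number says less about what actually changes than a named mode does.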
I like this parser very very much, because it could help us with all sorts of tooling. Preprocessing would get easier, but the ESLint and Prettier plugins could also benefit from this, and language-tools could possibly get rid of some custom top level script extractor logic. So it would be great if the parser could be available for direct use.
Regarding the preprocessor enhancement: I propose to not enhance the existing API and instead try to make it part of the new API which is discussed in this RFC: sveltejs/rfcs#56
I therefore propose to split this PR up into two pieces: One which simply makes this new parser available publicly (whether that's under
Another thing that would help tooling: Add a
<script>
console.log('top level')
</script>
<div>
<script>console.log('not top')</script>
</div> |
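To illustrate the distinction between the two `<script>` tags in the example above, here is a rough sketch of a synchronous scanner that reports whether each script sits at the top level. It is not the parser from this PR: `findScripts` is a hypothetical name, and the sketch assumes well-formed markup without void elements, comments, or expressions containing `<`.

```javascript
// Minimal sketch: list each <script> block and whether it is at the top level.
// Assumes well-formed markup; void elements (<br>, <img>), comments and
// mustache expressions containing "<" are not handled here.
function findScripts(source) {
  const tag = /<(\/?)([a-zA-Z][\w:-]*)[^>]*?(\/?)>/g;
  const scripts = [];
  let depth = 0;
  let match;
  while ((match = tag.exec(source)) !== null) {
    const [, closing, name, selfClosing] = match;
    if (closing) {
      depth -= 1;
    } else if (name === 'script') {
      // Script content is treated as opaque: jump to the next closing tag.
      const end = source.indexOf('</script>', tag.lastIndex);
      if (end === -1) break;
      scripts.push({
        topLevel: depth === 0,
        content: source.slice(tag.lastIndex, end),
      });
      tag.lastIndex = end + '</script>'.length;
    } else if (!selfClosing) {
      depth += 1;
    }
  }
  return scripts;
}

const example = [
  '<script>',
  "console.log('top level')",
  '</script>',
  '<div>',
  "<script>console.log('not top')</script>",
  '</div>',
].join('\n');

const found = findScripts(example);
```

Because the scan is synchronous, something along these lines could in principle be called from tools that cannot await a result.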
@pngwn might have some thoughts about this, he wrote a language-agnostic parser, too, albeit with more sophisticated parsing mechanism for the contents of example |
Regarding the ESLint plugin benefiting from this: I'm not sure what that would look like exactly, but unless it were a synchronous API (which the existing preprocessor API is not), it couldn't be used in ESLint, where all plugins need to run synchronously. |
The parser itself is synchronous and could help ESLint etc. to, for example, find out where the top level script/style tags are. That is why I asked if we could split this PR into two: one for the parser itself, because it would be immediately useful in itself, and one for the preprocessing changes, independent of that. |
I may not understand something regarding this PR. I think this process could be made easier by adding a rule that
This way, perhaps only regular expressions, or a very simple parser that is completely language-independent, will be able to extract each part correctly. The case below (maybe the only one) isn't handled properly, but I think it's enough to put it in the documentation because this is a super edge case.
<script>
const str = `
<script>
const hoge = "hoge";
</script>
`;
</script>
It would be nice to have some kind of warning and a way to disable it (like
And to do this, we need to traverse the entire code before separating it for finding
By the way,
ul
  +if('posts && posts.length > 1')
    +each('posts as post')
      li
        a(rel="prefetch" href="blog/{post.slug}") {post.title}
    +else()
      span No posts :boom: |
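The template-literal edge case described in this comment can be reproduced in a few lines. This is only an illustration of a language-independent extractor that stops at the first closing tag; the regex here is an assumption for the demo and is not the one used in Svelte.

```javascript
// The edge case from the comment: a closing script tag inside a template literal.
const source = [
  '<script>',
  'const str = `',
  '<script>',
  'const hoge = "hoge";',
  '</script>',
  '`;',
  '</script>',
].join('\n');

// A language-independent extractor: content runs up to the FIRST closing tag.
const naive = /<script>([\s\S]*?)<\/script>/.exec(source)[1];

// The extracted script opens a template literal but never closes it,
// so it is no longer valid JavaScript.
const backticks = (naive.match(/`/g) || []).length;
console.log(JSON.stringify(naive), backticks);
```

An extractor that does not understand JavaScript string syntax has no way to tell that the inner closing tag is part of a string, which is why the comment suggests documenting the case instead of solving it.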
@baseballyama I'm not sure if checking for
Re pug, it shouldn't be an issue since the parser would always run on pure Svelte code. Pug would be preprocessed into HTML before Svelte parses it. |
Thank you for your comment!
Let me add an additional explanation of my idea.
Point 1: This is just IMO, but the parser doesn't know anything other than
According to GitHub issues, Svelte users sometimes write
Point 2: From another perspective, I think we can recognize
For these two reasons, it may make sense to introduce the rule of my idea,
For instance, I had understood that there is a use case where language-tools want to use this before pre-processing. |
@baseballyama I'm still not sure if it's possible with the idea you suggested. I agree that we can find the start tag easily with this assumption, but the closing tag is still hard to match.
Re point 2, determining whether they are a separator is another tricky one to solve. Markdown's
Re language tools, I'm not familiar with the code, but it looks like it's for stripping out the tags before
If you have any more questions (that might be off-topic), it'd be great to have them in the Svelte Discord contributing channel. Would be nice to see you there too! |
Closing Svelte 4 PRs as stale — thank you |
Fixes #5900, fixes #5292, fixes #4701
Replace the regex with a simple parser, which allows us to fix #5292 and support #4701
expression instead of reusing the existing script