-
Notifications
You must be signed in to change notification settings - Fork 65
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Adds top level initializer block. #73
Conversation
Interesting. How many other people want this, by show of hands, please? Please get the tests to pass, and add yourself to the AUTHORS file. |
@hildjj done and done. I also added some tests. I'll raise my hand and say it would be super useful to me for one. Also, if you look at all the examples in the repo itself, the initializer could (and probably should) be top-level. I honestly never understood why the initialiizer wasn't top-level by default and I was just eagerly waiting for pegjs' successor to make this PR (had it in mind for years now). |
We'll also need documentation updates, including some guidance over what sort of thing goes into each block. |
@hildjj I'll get to it tomorrow! |
@hildjj I put some documentation / guidance in the readme and the documentation.html document. Not sure the parse level initializer example is that good though and I'll take any feedback. |
I don't understand why you don't just put a singleton in the parse block |
because then it can intrude on other namespaces and conflict similar to wrapping it into an iife block |
Because it won't solve the problem of import statements ? Because it's what I have to do right now and it creates unnecessary clutter ?
You mean like you can screw everything up in the parse block with a single assignment or declaration ? |
I apologize. I don't understand this. I'm not saying you're wrong, I just mean I don't get it yet. Could you please explain? .
fair .
Yes, essentially. It's just a more severe case. But if it's useful, it shouldn't be something people can do by accident, so, sure why not |
oh wait. modular grammars. you're right |
Yes, that's what I meant :) My goal here is to make sure we have the whole parse-related code in the grammar itself. Without a top-level initializer, you have to create boilerplate code around the generated parser or awkward / convoluted logic in the parse block. With es-modules, the former is unavoidable without a top-level initializer. From what I've seen in the wild, most initializers just declare some functions and/or global objects that do not change from one parse to another. If anything, I suspect we'll see more top-level initializer than parse-level intializers, if that makes sense. |
Now that I understand you, I am in full agreement. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are there any interactions I'm not seeing with options.dependencies
, other than making them even less useful?
README.md
Outdated
is the ideal location to require, to import or to declare utility functions to | ||
be used in rule actions and semantic predicates. | ||
|
||
The _parse level initializer_ is called everytime and before the generated |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typo?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wanted to emphasize the fact the initializer is called repeatedly but I can live without the "everytime and" ;)
docs/documentation.html
Outdated
parser is loaded (through a <code>require</code> or an <code>import</code> statement for instance). It | ||
is the ideal location to require, to import or to declare utility functions to | ||
be used in rule actions and semantic predicates.</p> | ||
<p>The <em>parse level initializer</em> is called everytime and before the generated |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same potential typo as above, regardless, let's reword.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
see comment in the README.
@@ -21,7 +21,7 @@ | |||
// | |||
// [1] http://www.ecma-international.org/publications/standards/Ecma-262.htm | |||
|
|||
{ | |||
{{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you must have bootstrapped this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I generated the parser with a per-parse initializer first then transformed it into a global initializer and re-generated. It served as a first test ;)
initializer: extractOptional(initializer, 0), | ||
rules: extractList(rules, 0), | ||
location: location() | ||
}; | ||
} | ||
|
||
TopLevelInitializer | ||
= "{" code:CodeBlock "}" EOS { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hm. are there code blocks that could match this for parse-level initializers that happen to start with {
and end with }
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Technically you could have as many curly braces as you'd want so, yes, you could have a per-parse initializer that has double curly braces... but it serves no purpose whatsoever and would be very weird code formatting.
The only situation it would arise is if someone wanted to scope the entire code and do so without indentation... something like:
{{
let scoped;
// some code
}}
But:
- I expect such code to be indented
- I also expect unscoped code to exist after or before the scoped code
Something like:
{
{
let scoped;
// some code
}
// other code
}
The likelihood of a double curly braced initializer in the wild seems extremely low but, yes, it could happen.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nod. and there's a work-around if it does break something of putting a {{}}
in front if you really want.
None that I can see... |
let's mention both of those things, then. a) {{}} comes after dependencies, and b) don't use dependencies unless you really need output-format-independent imports. |
(sorry about the back and forth here! I appreciate your effort and willingness to take feedback.) |
I made a first batch of changes. Where should I put the new information? I think it could be confusing to do so where initiaizers are introduced. Maybe where options.dependencies is? |
Yes, that sounds right. I think once you get that change done, I'm ready to merge. Do we have opinions about if this is worth a minor version number change? |
None. |
semver.org says "MINOR version when you add functionality in a backwards compatible manner", which this is, so I think yes. Let's wait a few days before doing a release to see if we can get the rollup change in also -- unless you have an emergent need for a release, @jaubourg ? |
Nope, take your time. I did the dependencies bit as best as I could and rebased the PR (we were touching the same lines in the docs). I took the opportunity to fix a typo and back-port the "es" format where it belongs in documentation.html. |
The initializer is re-executed at each parse call.
This makes it inadequate and wasteful for code and data structures that are constant throughout multiple calls to parse and, if and when an es module output format is added, impossible to use import statements.
It is possible to use the options argument passed to parse as a workaround to the former but it is quite the anti-pattern.
This PR introduces an optional top-level initializer: a code block that is added at the very top of the generated file.
To distinguish the top-level initializer from the normal one, it is surrounded by double curly braces.
A grammar can have both a top-level initializer and a "classic" initializer. The top-level initializer always appears before the initializer.
Let me know what you think.