Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adds top level initializer block. #73

Merged
merged 7 commits into from
Apr 19, 2021
Merged

Adds top level initializer block. #73

merged 7 commits into from
Apr 19, 2021

Conversation

jaubourg
Copy link
Contributor

@jaubourg jaubourg commented Apr 18, 2021

The initializer is re-executed at each parse call.

This makes it inadequate and wasteful for code and data structures that are constant throughout multiple calls to parse and, if and when an es module output format is added, impossible to use import statements.

It is possible to use the options argument passed to parse as a workaround to the former but it is quite the anti-pattern.

This PR introduces an optional top-level initializer: a code block that is added at the very top of the generated file.

To distinguish the top-level initializer from the normal one, it is surrounded by double curly braces.

A grammar can have both a top-level initializer and a "classic" initializer. The top-level initializer always appears before the initializer.

{{
   // top level
}}
{
  // function level
}
rule1: <...>
rule2: <...>
<...>
ruleN: <...>

Let me know what you think.

@hildjj
Copy link
Contributor

hildjj commented Apr 18, 2021

Interesting. How many other people want this, by show of hands, please?
Is there anyone who feels strongly against it?

Please get the tests to pass, and add yourself to the AUTHORS file.

@jaubourg
Copy link
Contributor Author

@hildjj done and done. I also added some tests.

I'll raise my hand and say it would be super useful to me for one. Also, if you look at all the examples in the repo itself, the initializer could (and probably should) be top-level. I honestly never understood why the initialiizer wasn't top-level by default and I was just eagerly waiting for pegjs' successor to make this PR (had it in mind for years now).

@hildjj
Copy link
Contributor

hildjj commented Apr 19, 2021

We'll also need documentation updates, including some guidance over what sort of thing goes into each block.

@jaubourg
Copy link
Contributor Author

@hildjj I'll get to it tomorrow!

@jaubourg
Copy link
Contributor Author

@hildjj I put some documentation / guidance in the readme and the documentation.html document. Not sure the parse level initializer example is that good though and I'll take any feedback.

@StoneCypher
Copy link
Contributor

I don't understand why you don't just put a singleton in the parse block

@StoneCypher
Copy link
Contributor

I honestly never understood why the initialiizer wasn't top-level by default

because then it can intrude on other namespaces and conflict

similar to wrapping it into an iife block

@jaubourg
Copy link
Contributor Author

I don't understand why you don't just put a singleton in the parse block

Because it won't solve the problem of import statements ? Because it's what I have to do right now and it creates unnecessary clutter ?

I honestly never understood why the initialiizer wasn't top-level by default

because then it can intrude on other namespaces and conflict

similar to wrapping it into an iife block

You mean like you can screw everything up in the parse block with a single assignment or declaration ?

@StoneCypher
Copy link
Contributor

Because it won't solve the problem of import statements?

I apologize. I don't understand this. I'm not saying you're wrong, I just mean I don't get it yet.

Could you please explain?

.

it creates unnecessary clutter

fair

.

You mean like you can screw everything up in the parse block with a single assignment or declaration ?

Yes, essentially. It's just a more severe case.

But if it's useful, it shouldn't be something people can do by accident, so, sure why not

@StoneCypher
Copy link
Contributor

Because it won't solve the problem of import statements?

oh wait. modular grammars. you're right

@jaubourg
Copy link
Contributor Author

Because it won't solve the problem of import statements?

oh wait. modular grammars. you're right

Yes, that's what I meant :)

My goal here is to make sure we have the whole parse-related code in the grammar itself. Without a top-level initializer, you have to create boilerplate code around the generated parser or awkward / convoluted logic in the parse block. With es-modules, the former is unavoidable without a top-level initializer.

From what I've seen in the wild, most initializers just declare some functions and/or global objects that do not change from one parse to another. If anything, I suspect we'll see more top-level initializer than parse-level intializers, if that makes sense.

@StoneCypher
Copy link
Contributor

Now that I understand you, I am in full agreement.

Copy link
Contributor

@hildjj hildjj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are there any interactions I'm not seeing with options.dependencies, other than making them even less useful?

README.md Outdated
is the ideal location to require, to import or to declare utility functions to
be used in rule actions and semantic predicates.

The _parse level initializer_ is called everytime and before the generated
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wanted to emphasize the fact the initializer is called repeatedly but I can live without the "everytime and" ;)

parser is loaded (through a <code>require</code> or an <code>import</code> statement for instance). It
is the ideal location to require, to import or to declare utility functions to
be used in rule actions and semantic predicates.</p>
<p>The <em>parse level initializer</em> is called everytime and before the generated
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same potential typo as above, regardless, let's reword.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see comment in the README.

@@ -21,7 +21,7 @@
//
// [1] http://www.ecma-international.org/publications/standards/Ecma-262.htm

{
{{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you must have bootstrapped this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I generated the parser with a per-parse initializer first then transformed it into a global initializer and re-generated. It served as a first test ;)

initializer: extractOptional(initializer, 0),
rules: extractList(rules, 0),
location: location()
};
}

TopLevelInitializer
= "{" code:CodeBlock "}" EOS {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hm. are there code blocks that could match this for parse-level initializers that happen to start with { and end with }

Copy link
Contributor Author

@jaubourg jaubourg Apr 19, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Technically you could have as many curly braces as you'd want so, yes, you could have a per-parse initializer that has double curly braces... but it serves no purpose whatsoever and would be very weird code formatting.

The only situation it would arise is if someone wanted to scope the entire code and do so without indentation... something like:

{{
  let scoped;
  // some code
}}

But:

  1. I expect such code to be indented
  2. I also expect unscoped code to exist after or before the scoped code

Something like:

{
  {
    let scoped;
    // some code
  }
  // other code
}

The likelihood of a double curly braced initializer in the wild seems extremely low but, yes, it could happen.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nod. and there's a work-around if it does break something of putting a {{}} in front if you really want.

@jaubourg
Copy link
Contributor Author

Are there any interactions I'm not seeing with options.dependencies, other than making them even less useful?

None that I can see... options.dependencies are placed before the global initializer and accessible from it. So... yeah... apart from meking them obsolete, not much.

@hildjj
Copy link
Contributor

hildjj commented Apr 19, 2021

let's mention both of those things, then. a) {{}} comes after dependencies, and b) don't use dependencies unless you really need output-format-independent imports.

@hildjj
Copy link
Contributor

hildjj commented Apr 19, 2021

(sorry about the back and forth here! I appreciate your effort and willingness to take feedback.)

@jaubourg
Copy link
Contributor Author

let's mention both of those things, then. a) {{}} comes after dependencies, and b) don't use dependencies unless you really need output-format-independent imports.

I made a first batch of changes. Where should I put the new information? I think it could be confusing to do so where initiaizers are introduced. Maybe where options.dependencies is?

@hildjj
Copy link
Contributor

hildjj commented Apr 19, 2021

Yes, that sounds right. I think once you get that change done, I'm ready to merge. Do we have opinions about if this is worth a minor version number change?

@jaubourg
Copy link
Contributor Author

Do we have opinions about if this is worth a minor version number change?

None.

@hildjj
Copy link
Contributor

hildjj commented Apr 19, 2021

semver.org says "MINOR version when you add functionality in a backwards compatible manner", which this is, so I think yes. Let's wait a few days before doing a release to see if we can get the rollup change in also -- unless you have an emergent need for a release, @jaubourg ?

@jaubourg
Copy link
Contributor Author

semver.org says "MINOR version when you add functionality in a backwards compatible manner", which this is, so I think yes. Let's wait a few days before doing a release to see if we can get the rollup change in also -- unless you have an emergent need for a release, @jaubourg ?

Nope, take your time.

I did the dependencies bit as best as I could and rebased the PR (we were touching the same lines in the docs). I took the opportunity to fix a typo and back-port the "es" format where it belongs in documentation.html.

@hildjj hildjj merged commit a822ffe into peggyjs:main Apr 19, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants