feat: add support for templates #337

Closed
quadristan wants to merge 2 commits

@quadristan quadristan commented Feb 14, 2023

Add support for template rules

📘 Glossary

| Term | Definition |
| --- | --- |
| Template | Describes a rule that can be instantiated. I prefer "template" to "generic", but feel free to change my mind. |
| Parameter | The description of what is needed to instantiate a template |
| Argument | The value passed when instantiating a template |
| Awesome | This library and its maintainers |

ℹ️ Description

A template is a rule that is given parameters.

TupleLiteralLiteral = "(" _  Literal _  "," _  Literal _  ")"

can be turned into a template

Tuple<X,Y> = "(" _ X _ "," _ Y _ ")"

and then used

TupleLiteralLiteral = Tuple<Literal,Literal>

An example of a full templated grammar is in the MR.
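To make the idea concrete, here is a minimal sketch of what instantiating `Tuple<Literal,Literal>` could look like on an AST. The node shapes are simplified, the `templateParams` field and the `substitute`/`instantiate` helpers are hypothetical names chosen for illustration, and the `_` whitespace rule is left out for brevity. It is not the code from this MR.

```javascript
// Simplified, hypothetical AST for: Tuple<X,Y> = "(" X "," Y ")"
const tupleTemplate = {
  type: "rule",
  name: "Tuple",
  templateParams: ["X", "Y"], // assumed field name, not taken from the MR
  expression: {
    type: "sequence",
    elements: [
      { type: "literal", value: "(" },
      { type: "rule_ref", name: "X" },
      { type: "literal", value: "," },
      { type: "rule_ref", name: "Y" },
      { type: "literal", value: ")" },
    ],
  },
};

// Return a copy of `node` in which every reference to a template
// parameter is replaced by a reference to the matching argument.
function substitute(node, bindings) {
  if (node.type === "rule_ref") {
    return { ...node, name: bindings.get(node.name) ?? node.name };
  }
  if (node.type === "sequence") {
    return { ...node, elements: node.elements.map(e => substitute(e, bindings)) };
  }
  return { ...node };
}

// Build a concrete rule called `name` from `template` and argument rule names.
function instantiate(template, args, name) {
  const params = template.templateParams;
  if (args.length !== params.length) {
    throw new Error(
      `Template "${template.name}" expects ${params.length} argument(s), got ${args.length}`
    );
  }
  const bindings = new Map(params.map((p, i) => [p, args[i]]));
  return { type: "rule", name, expression: substitute(template.expression, bindings) };
}

const tupleLiteralLiteral = instantiate(
  tupleTemplate,
  ["Literal", "Literal"],
  "TupleLiteralLiteral"
);
```

The template itself is left untouched, so it can be instantiated again with other arguments.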

💨 Fast fast !

This MR was written with performance in mind.
Without templated rules, it only does a quick pass over the existing rule_ref nodes, in $O(N)$ time.
I compared parsing the example javascript.pegjs on main and on my branch, and found no noticeable difference.

With templated rules, it becomes a bit slower. Time and space complexity scale with the number of template instantiations times the complexity of parsing those rules.

⚙️ Generated code

Generated code must be valid JavaScript, so a rule name cannot be emitted as

Map<Key,Value>

Instead, according to the ECMAScript identifier rules, we can write

MapᐊKeyͺValueᐅ

Let me know if that is not acceptable for generated code. I could add IDs on top of symbols instead, but I believe that would hurt performance for little gain.
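As a rough illustration of the naming scheme, the sketch below builds such a name and checks it against the identifier grammar using Unicode property escapes. `mangleTemplateName` and `isValidIdentifier` are hypothetical helpers written for this example, the separator characters are copied from the name shown above, and the check ignores reserved words.

```javascript
// Build the generated-code name for a template instantiation,
// using the three separator characters shown in the description.
function mangleTemplateName(templateName, argNames) {
  return `${templateName}ᐊ${argNames.join("ͺ")}ᐅ`;
}

// Rough identifier check based on the Unicode ID_Start / ID_Continue
// properties that the JavaScript identifier grammar relies on.
// Reserved words are not excluded here.
function isValidIdentifier(name) {
  return /^[\p{ID_Start}$_][\p{ID_Continue}$\u200C\u200D]*$/u.test(name);
}

const mangled = mangleTemplateName("Map", ["Key", "Value"]);
console.log(mangled, isValidIdentifier(mangled));
console.log("Map<Key,Value>", isValidIdentifier("Map<Key,Value>"));
```

Doing the mangling in one helper keeps the choice of separator characters in a single place if they ever need to change.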

💖 Personal motivation

My personal motivation for this is to be able to create a grammar for https://github.com/structurizr/dsl/blob/master/docs/language-reference.md . There is a lot of repetition among blocks, so I want to use templates to handle it.

👓 Scope of this PR

The way templates are implemented leaves room for improvement. I may open additional MRs depending on my bandwidth:

  • There is no endless recursion check yet
  • I conditionally add fields to the grammar nodes to avoid breaking existing tests, but that may hurt performance.
  • We should be able to pass expressions instead of rule_refs as template arguments
  • We may want variadic parameters to allow some kind of recursion (this may become complex!)
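For the missing recursion check, one possible shape is sketched below: remember which instantiations were already produced (so `List<X> = X List<X>` terminates) and cap the nesting depth (so `Grow<X> = Grow<Box<X>>` is reported instead of looping forever). The data model, `MAX_DEPTH`, and all function names are assumptions made for this example only.

```javascript
const MAX_DEPTH = 32; // arbitrary limit chosen for this sketch

// Render { name: "Grow", args: [{ name: "A", args: [] }] } as "Grow<A>".
function key(ref) {
  return ref.args.length ? `${ref.name}<${ref.args.map(key).join(",")}>` : ref.name;
}

// Replace parameter names inside `ref` with the bound arguments.
function bind(ref, bindings) {
  if (ref.args.length === 0 && bindings.has(ref.name)) {
    return bindings.get(ref.name);
  }
  return { name: ref.name, args: ref.args.map(a => bind(a, bindings)) };
}

// Return the names of all template instantiations reachable from `root`.
// `templates` maps a template name to { params, body }, where `body`
// lists the references that appear in the template's expression.
function instantiateAll(templates, root) {
  const seen = new Set();
  const instances = [];
  const visit = (ref, depth) => {
    const id = key(ref);
    if (seen.has(id)) return; // already handled: finite recursion is fine
    if (depth > MAX_DEPTH) {
      throw new Error(`Template instantiation too deep at "${id}"`);
    }
    seen.add(id);
    ref.args.forEach(a => visit(a, depth + 1));
    const template = templates[ref.name];
    if (!template) return; // a plain rule, nothing to instantiate
    instances.push(id);
    const bindings = new Map(template.params.map((p, i) => [p, ref.args[i]]));
    template.body.forEach(r => visit(bind(r, bindings), depth + 1));
  };
  visit(root, 0);
  return instances;
}

const r = (name, ...args) => ({ name, args });
const templates = {
  List: { params: ["X"], body: [r("X"), r("List", r("X"))] },
  Box: { params: ["X"], body: [r("X")] },
  Grow: { params: ["X"], body: [r("Grow", r("Box", r("X")))] },
};
```

A depth cap is the bluntest possible guard; a real check could instead analyse how arguments grow between recursive uses.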

@Mingun
Member

Mingun commented Feb 15, 2023

This is the issue in the original pegjs repo: pegjs/pegjs#45
and a branch with my implementation: https://github.com/Mingun/pegjs/tree/template-rules

@quadristan
Author

> This is the issue in the original pegjs repo: pegjs/pegjs#45 and a branch with my implementation: https://github.com/Mingun/pegjs/tree/template-rules

I have looked at the issue.
This is actually quite a bit more complex. I did not wish to introduce symbol-lookup tables or special states.

@hildjj
Contributor

hildjj commented Feb 15, 2023

I'm very interested in this work. I'd really like to land @Mingun 's #291 and my #309, then do a release, before working on this, if possible.

Member

@Mingun Mingun left a comment

Introducing templates is a big task, and thank you for working on it! I left some comments.

I think we should first decide how templates should be implemented. Should we instantiate rules (then we need to think about recursion) or use higher-order functions to implement templates? We could also implement both variants.

Another question to discuss is where the template passes should be inserted in the current compilation pipeline. It seems valuable to work with templates after some basic checks, such as duplicate or undefined rules, while other checks, such as the one for infinite recursion, should run after templates are recognized.
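To illustrate the higher-order-function variant, here is a small hand-written sketch in which a template is simply a function from parsers to a parser. The combinators (`lit`, `seq`, `ws`, `digits`) are hypothetical and unrelated to the code the generator currently emits; `digits` stands in for a `Literal` rule.

```javascript
// Each parser takes (input, pos) and returns the new position, or -1 on failure.
const lit = text => (input, pos) =>
  input.startsWith(text, pos) ? pos + text.length : -1;

const seq = (...parsers) => (input, pos) => {
  for (const p of parsers) {
    pos = p(input, pos);
    if (pos < 0) return -1;
  }
  return pos;
};

// Optional whitespace: always succeeds.
const ws = (input, pos) => {
  while (input[pos] === " ") pos++;
  return pos;
};

// One or more ASCII digits.
const digits = (input, pos) => {
  const start = pos;
  while (pos < input.length && input[pos] >= "0" && input[pos] <= "9") pos++;
  return pos > start ? pos : -1;
};

// The "template": its parameters are parsers, its result is a parser.
const Tuple = (X, Y) =>
  seq(lit("("), ws, X, ws, lit(","), ws, Y, ws, lit(")"));

const TupleLiteralLiteral = Tuple(digits, digits);

// True when `parser` consumes the whole input.
const matches = (parser, input) => parser(input, 0) === input.length;
```

With this approach nothing is instantiated ahead of time, so no extra rules are generated, but every use of the template pays for a function call at parse time.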

Comment on lines +13 to +16
// Javascript does not let us have variables names such as Map<Key,Value>
// Instead, it accepts MapᐊKeyͺValueᐅ
// See https://es5.github.io/x7.html#x7.6
const name = `${template.name}ᐊ${templateArgs.map(a => a.name).join("ͺ")}ᐅ`;
Member

Such a replacement should be done in the generate-js pass

if (node.templateArgs) {
const targetRule = asts.findRule(ast, targetName);
if (!targetRule) {
throw new GrammarError(`Rule "${targetName}" is not defined`, node.location);
Member

You should prefer session.error() over throwing GrammarError directly, because that allows many errors to be collected at once. Be prepared for session.error() not to throw.
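A sketch of what that means for the pass: since reporting an error returns normally, the code has to skip the broken reference itself. `FakeSession` below is a stand-in written for this example and only models the "records instead of throwing" behaviour described above; `checkTemplateTargets` is a hypothetical helper.

```javascript
// Stand-in for the compiler session: error() records the problem and returns.
class FakeSession {
  constructor() {
    this.problems = [];
  }
  error(message, location) {
    this.problems.push({ message, location });
  }
}

// Report every reference to an undefined rule, then keep going.
function checkTemplateTargets(rules, refs, session) {
  const known = new Set(rules.map(r => r.name));
  const usable = [];
  for (const ref of refs) {
    if (!known.has(ref.name)) {
      session.error(`Rule "${ref.name}" is not defined`, ref.location);
      continue; // error() returned, so skip this reference explicitly
    }
    usable.push(ref); // safe to instantiate later
  }
  return usable;
}

const session = new FakeSession();
const usable = checkTemplateTargets(
  [{ name: "Tuple" }],
  [
    { name: "Tuple", location: { line: 1 } },
    { name: "Pair", location: { line: 2 } },
    { name: "Triple", location: { line: 3 } },
  ],
  session
);
```

Both missing rules end up in `session.problems`, whereas a direct `throw` would have stopped at the first one.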

lib/compiler/passes/instantiate-templates.js (outdated, resolved)
Comment on lines +108 to +111
const targetRule = asts.findRule(ast, node.name);
if (!targetRule) {
throw new GrammarError(`Rule "${node.name}" is not defined`, node.location);
}
Member

It is better to run template instantiation after checking for existence of all rules.

Comment on lines +106 to +110
GenericsArgumentsDeclaration
= "<" __
head:IdentifierName __
tail:("," __ @IdentifierName __ )*
">" {
Member

We now have a repeated expression, so it would be good to rewrite this using it. I would personally also love it if a dangling comma were supported, because this simplifies refactoring and auto-generation.

The location of each parameter is also worth saving, so we can report:

  • unused parameters
  • incorrect recursion via parameters
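As an example of the first report, a check for unused parameters could look roughly like this once each parameter carries its own location. The node shapes and the `params` field are simplified assumptions made for this sketch, not the structure used in the MR.

```javascript
// Collect the name of every rule_ref inside an expression (simplified nodes).
function referencedNames(node, out = new Set()) {
  if (node.type === "rule_ref") out.add(node.name);
  for (const child of node.elements ?? []) referencedNames(child, out);
  return out;
}

// Return the parameters of `template` that its expression never references.
function unusedParameters(template) {
  const used = referencedNames(template.expression);
  return template.params.filter(p => !used.has(p.name));
}

// Hypothetical template: Wrapped<X,Y> = "[" X "]"   (Y is never used)
const wrapped = {
  name: "Wrapped",
  params: [
    { name: "X", location: { line: 1, column: 9 } },
    { name: "Y", location: { line: 1, column: 11 } },
  ],
  expression: {
    type: "sequence",
    elements: [
      { type: "literal", value: "[" },
      { type: "rule_ref", name: "X" },
      { type: "literal", value: "]" },
    ],
  },
};

const unused = unusedParameters(wrapped);
```

Because each entry keeps its `location`, the warning can point at the exact parameter rather than at the whole rule.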

Comment on lines +288 to +295
templateArgs
= "<" __
head:RuleReferenceExpression __
tail:("," __ @RuleReferenceExpression)*
">" {
return [head,...tail]
}

Member

It would be worth allowing instantiation with any expression. At the very least, literals could be widely used.

Comment on lines +53 to +56
templates: [
instantiateTemplates,
removeNonInstantiatedTemplates,
],
Member

It would be better to work with templates after doing some basic checks, such as undefined or duplicate rules.


while (rulesToRemove.length) {
const name = rulesToRemove.pop();
session.info("Removing dangling generic rule", name);
Member

If we use the word "template", it should be used consistently.

Co-authored-by: Mingun <Alexander_Sergey@mail.ru>
@quadristan quadristan closed this May 29, 2023
@quadristan quadristan reopened this May 29, 2023
@quadristan quadristan closed this May 29, 2023