Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Painless as a JSON-friendly templating language #24529

Closed
clintongormley opened this issue May 6, 2017 · 16 comments
Closed

Painless as a JSON-friendly templating language #24529

clintongormley opened this issue May 6, 2017 · 16 comments
Labels
:Core/Infra/Scripting Scripting abstractions, Painless, and Mustache :Data Management/Indices APIs APIs to create and manage indices and templates >feature help wanted adoptme Meta Team:Core/Infra Meta label for core/infra team Team:Data Management Meta label for data/management team team-discuss

Comments

@clintongormley
Copy link

The problem

Today our only choice for a templating language is Mustache which, while fine for very simple templates, does not work well in the JSON context where we mostly use it (eg search templates, watches etc).

For instance, if you want to return an array of terms in a query, you can't do:

    "terms": "{{ #toJson }} my_terms {{ /toJson }}"

because that would render as:

    "terms": "[\"term_one\",\"term_two\"]"

Instead, the whole template has to be passed as one big string in order to remove the double quotes:

    "terms":  {{ #toJson }} my_terms {{ /toJson }}

Dealing with big queries as a string instead of as a JSON structure is complicated, finicky, and error prone.

On top of that, Mustache's limited support for defaults, conditionals, etc make anything but variable substitution difficult and hacky. Just see https://www.elastic.co/guide/en/elasticsearch/reference/5.4/search-template.html#_conditional_clauses as an example of how ugly it can get.

Making it JSON friendly

Instead, we need a templating solution where JSON is parsed as JSON, and any string element is treated as a potential template, the result of which can be any JSON-compatible data structure.

For instance:

"<< my_undefined_val >>"  →   null
"<< my_bool >>"           →   true
"<< my_num >>"            →   5.4
"<< my_string >>"         →   "some string"
"<< my_array >>"          →   [ "foo", "bar" ]
"<< my_map >>"            →   { "foo": { "bar": "baz" }}

Additionally, templates can be embedded in strings in combination with non-templated strings:

"query": "status:<< status >> title:( << title >>)"
→
"query": "status:active title:( some search keywords)"

Any JSON-compatible data structure could be stringified to its JSON equivalent when used as an embedded template, eg:

"msg": "Here are the broken nodes: << array_of_nodes >>"
→
"msg": "Here are the broken nodes: [\"first\", \"second\"]"

Templates in JSON keys would always be stringified.

Using Painless

Painless is a fast, safe, and powerful scripting language with strong typing; in other words, parameters retain their types instead of just being considered as strings, in the way Mustache does.

Painless can be used as is with a wrapper implementing the stringification mentioned above. By far the majority of templates use simple value substitution:

{
  "template": {
    "lang": "painless",
    "params": {
      "statuses": [
        "active",
        "pending"
      ]
    },
    "inline": {
      "query": {
        "terms": "<< params.statuses >>"
      }
    }
  }
}

Default values:

    "range": {
        "age": {
          "gte": "<< params.min_age ? params.min_age : 21 >>",
          "lte": "<< params.max_age ? params.max_age : 95 >>",
        }
    }

Conditionals:

    "must": [
        "<< if (params.only_active) { return [ 'term': [ 'active' : true ]] }; return; >>",
        ...
    ]

String manipulation:

    "emails": "<< params.emails.join(', ') >>"

It could also be extended to make common templating use cases more succinct, eg

  • easier default values for undefined variables (eg params.num || 10
  • easier map and filter functions for list manipulation
  • addition of functions to escape URI or HTML etc
  • Support for JSONpath or similar to do, eg:
def new_nodes = jpath(params.nodes_info, '$.nodes[?(@.version > \"5.2.0\")].name' );

Of course, these additions would be backwards compatible, so old templates will keep working while new templates can take advantage of any new syntax.

@clintongormley
Copy link
Author

@rjernst
Copy link
Member

rjernst commented May 6, 2017

I haven't thought through the syntax all the way (although << and >> bug me a little because {{ and }} or ${ and } is common in other languages), but as far as the "wrapper" part, I think this needs to actually be a separate scripting language (eg painless_template), since it will have a different grammar.

@clintongormley
Copy link
Author

when playing around with a new templating language, we started using << and >> because it stood out more than {{ and }} inside JSON, but I'm easy either way.

I think this needs to actually be a separate scripting language (eg painless_template), since it will have a different grammar.

why does it need a different grammar? users already use painless, why do they need to learn a new language for the templating? remember that 90% of use cases will be simple value substitution

@tlrx
Copy link
Member

tlrx commented May 19, 2017

+1, this is something I have in mind since Painless went out and I think @jdconrad thought about it too (it reminds me the JSP with <%...%>, <%=...%> etc).

We added some functions to Mustache like toJson but honestly it was a pain and the library we use does not help when it comes to add new functions for specific needs. I hope Painless might help on this too.

@clintongormley clintongormley added help wanted adoptme and removed discuss labels Jun 9, 2017
@tvernum
Copy link
Contributor

tvernum commented Nov 13, 2017

My concern with the proposal as it stands, is that it's a bit "magical" when it comes to dealing with a template-expression inside of quotes.
Assuming num is the integer 4, then I assume the following are true:

  • "<< num >>" = 4
  • " << num >>" = " 4"
  • "<< num >> " = "4 "
  • "<< num.toString() >>" = "4"

The spec would be something like:

If a string consists of a template-expression and no other text, then the string itself is replaced with the JSON representation of the expression value. Otherwise the template-expression is replaced within the string, by substituting in its String representation (*)

(*) Or should be that be the Stringification of its JSON representation?

It's user friendly 99% of the time, but it feels like a little bit too much magic to describe and explain well.

That said, I don't have an alternative that I like.
An option (that is more precise, but confusing to users) would be to have 2 different syntaxes, one for "this replaces the containing string with JSON representation" and another for "this does in place substitution"

  • "<< num >>" = 4
  • " << num >>" = Error
  • "<< num >> " = Error
  • "<< num.toString() >>" = "4"
  • "$< num >" = "4"
  • " $< num >" = " 4"
  • "$< num > " = "4 "
  • "$< num.toString() >" = "4" or "\"4\"" depending on whether we call toString() or toJson().toString(), I assume the former.

@clintongormley
Copy link
Author

  • " << num >>" = Error
  • "<< num >> " = Error

I thought this could be a solution that would force the user to do something like this instead:

"<< ' ' + num >>" = " 4"

But that makes it harder to write things like:

"The << job >> has failed << num >> times"

@s1monw
Copy link
Contributor

s1monw commented Nov 13, 2017

I haven't thought through the syntax all the way (although << and >> bug me a little because {{ and }} or ${ and } is common in other languages), but as far as the "wrapper" part, I think this needs to actually be a separate scripting language (eg painless_template), since it will have a different grammar.

@rjernst I think what @clintongormley is saying is that we would simply parse JSON and if we see a string sequence << (.*) >> we would interpret the group as a painless script. So each template might have N scripts attached to it. I am not 100% sure how that would scale but it sounds reasonable to me from a 10000ft perspective. I don't think it needs a new syntax?

@rjernst
Copy link
Member

rjernst commented Nov 15, 2017

what @clintongormley is saying is that we would simply parse JSON and if we see a string sequence << (.*) >> we would interpret the group as a painless script.

I realized that much later after my initial comment. But I agree with @tvernum's sentiment that this magic value handling will be error prone. I think instead we should do 2 things:

  • Use a single identifier at the beginning of a string which means "this string will be substituted by the resulting evaluation of the string as painless code". I suggest # as it is not a valid operator or identifier in painless.
  • Extend painless to allow variable substitution in strings as done in other languages (eg ${ }). This is to avoid confusion with an operator like << from painless. By using a different syntax for variable substitution, we avoid the possibility of "accidentally" making the entire script a substitution in a string.

So, with my proposal, the example between @clintongormley and @tvernum above could be done a few ways:

  • "#num" = 4
  • "#num.toString()" = "4"
  • "# num" = 4
  • "# \" ${num}\"" = " 4"
  • "# \" ${num + 1}\"" = " 5"
  • "# num + 1" = 5

Alternatively, if we don't want to add variable substitution in painless strings, we could just use two different characters to mean "replace the node" vs "replace the contents of the string". But for both, I would just use a single character at the beginning of the string (and in both cases, the rest of the string is all painless code).

I am not 100% sure how that would scale

I think we can make this work, by painless knowing handling the entire template. @jdconrad and I thought through this when working on scripting contexts, and I think this can still be achieved by painless taking in the entire json object, and compiling that into a script which makes statements for each substituted element, and then joins them together into a serialized json string at the end.

@clintongormley
Copy link
Author

Use a single identifier at the beginning of a string which means "this string will be substituted by the resulting evaluation of the string as painless code". I suggest # as it is not a valid operator or identifier in painless.

But it is a common character in strings, eg ref numbers. Btw, I'm not tied to << >>, I'd be happy to use {{ }} as well, or even %% %%. I just preferred the angle brackets because they stand out from json more.

Extend painless to allow variable substitution in strings as done in other languages (eg ${ }).

I think this makes the simple case really ugly, eg compare:

"The << job >> has failed << num >> times"

with

"# \"The ${job} has failed ${num} times\""

@rjernst
Copy link
Member

rjernst commented Nov 16, 2017

But it is a common character in strings

No matter which character(s) is/are used, they are possible in strings, and so escaping will be necessary to use it at the beginning of a string.

I think this makes the simple case really ugly

Ugly is better than trappy. But I think if we have string substitution, we don't need to use the node replacement character at the beginning:

"The ${job} has failed ${num} times"

@mbarretta
Copy link
Contributor

++

I ran into exactly this surrounding-quote-when-rendering-lists problem when testing x-pack security role template queries. The resulting stringified JSON query was ugly and hard to maintain.

@clintongormley clintongormley added :Data Management/Indices APIs APIs to create and manage indices and templates and removed :Templates labels Feb 13, 2018
@jasontedor jasontedor mentioned this issue May 31, 2018
23 tasks
@rjernst rjernst added the Team:Data Management Meta label for data/management team label May 4, 2020
@danhermann danhermann added the :Core/Infra/Scripting Scripting abstractions, Painless, and Mustache label Nov 5, 2020
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-core-infra (:Core/Infra/Scripting)

@elasticmachine elasticmachine added the Team:Core/Infra Meta label for core/infra team label Nov 5, 2020
@rjernst rjernst added the needs:triage Requires assignment of a team area label label Dec 3, 2020
@jdconrad jdconrad added team-discuss and removed needs:triage Requires assignment of a team area label labels Dec 9, 2020
@jdconrad
Copy link
Contributor

jdconrad commented Dec 9, 2020

Since this is part of the indices api project, I'd like to discuss in a broader group to see if this is still useful.

@Bernhard-Fluehmann
Copy link

I have found this issue via #50923 and I really hope to see search templating improved in the future. Just to make sure the following use case will be covered as well, since this is a widely known mustache problem of current implementation, what I was trying to build is someting like this.

aggs:
{{#stats-fields}}
  {{.}}: {"stats": { "field": {{.}} }
{{/stats-fields}}

The problem is that this will not work since the entries are not comma separated and if y comma is inserted in the template then it is added to the last entry as well. The only solution was to turn stats-fields in a list of maps and use a has-next key to add the comma conditionally, inspired by this. This works for the moment but is IMHO more a workaround than a solution.

@Bernhard-Fluehmann
Copy link

As a side note, one of my approaches to solve the problem was to change the content/type and build the whole template in yaml, since yaml does not need comma separation. This approach was not working either. Mustache was partially working by creating escaped output strings instead of hash object and I finally gave up on it.

But again, please consider proper yaml support as well, since yaml is much easier to work with than json, ist human friendlier and supports comments.

@rjernst
Copy link
Member

rjernst commented Mar 22, 2021

This issue has sat open for many years, and while there are several good ideas in it, there are two main problems with moving forward.

The first is that this issue is loosely designed to work with Painless, but in reality it lacks semantic integration with Painless. A performant implementation of this would need to collapse many disparate Painless snippets into a runtime that fills in the holes in the template. Additionally, this would need to join these snippets together after running them, and then adhere to the template script context. And finally, the script context itself really needs to change. Right now it is a String to String method, but to be efficient we really want to have the json parsed (partially?), and passed in. There is quite a bit of technical scoping that would still need to happen here, so it is a very high hanging fruit.

The second reason is a direct result of the above: the benefit of this work compared to the cost is low. While Mustache has its deficiencies, it has worked well enough to get by for many years. This issue would bring improvements, but Mustache would also likely need to change to conform to the template script context changes, thereby further increasing the cost.

Ultimately, this issue has sat relatively untouched for almost 4 years. After discussing recently among the scripting folks, we agreed this is not worth keeping open, since any change in priorities to work on it would likely want to start fresh, and the scope is such that the help wanted tag is grossly inappropriate.

@rjernst rjernst closed this as completed Mar 22, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Core/Infra/Scripting Scripting abstractions, Painless, and Mustache :Data Management/Indices APIs APIs to create and manage indices and templates >feature help wanted adoptme Meta Team:Core/Infra Meta label for core/infra team Team:Data Management Meta label for data/management team team-discuss
Projects
None yet
Development

No branches or pull requests

10 participants