Scratchpad: Open Search Integration

WindBeneathYourWings edited this page Jan 20, 2022 · 28 revisions

This is the original verti-go elastic search go template for panelpagelistitems.

{{ define "panelpages" }}
{
  "query": {
    "bool": {
      "filter": [
        {
            "bool": {
                "must": [
                    {
                        "bool": {
                            "should": [
                                {
                                    "term": {
                                        "entityPermissions.readUserIds.keyword": {
                                            "value": "*"
                                        }
                                    }
                                },
                                {
                                    "term": {
                                        "entityPermissions.readUserIds.keyword": {
                                            "value": "{{ userId .Req }}"
                                        }
                                    }
                                },
                                {
                                    "term": {
                                        "entityPermissions.writeUserIds.keyword": {
                                            "value": "{{ userId .Req }}"
                                        }
                                    }
                                },
                                {
                                    "term": {
                                        "entityPermissions.deleteUserIds.keyword": {
                                            "value": "{{ userId .Req }}"
                                        }
                                    }
                                }
                            ]
                        }
                    }
                    {{ if $.Req.MultiValueQueryStringParameters.path }},
                    {
                        "bool": {
                            "should": [
                                {{ range $index, $value := $.Req.MultiValueQueryStringParameters.path }}{{ if ne $index 0 }},{{ end }}
                                    {
                                        "term": {
                                            "path.keyword": {
                                                "value": "{{ $value }}"
                                            }
                                        }
                                    }
                                {{ end }}
                            ]
                        }
                    }
                    {{ end }}
                ]
            }
        }
      ]
    }
  },
  "size": 1000
}
{{end}}

These are the two request variations currently used with the open search template. One fetches all panel page list items using a wildcard match. The other fetches items by matching terms.

POST classified_panelpages/_search/template
{
 "id": "panelpagelistitems",
 "params": {
   "path": {
     "term": {
       "path.keyword": {
         "value": "/formly/marvel/char/select/v1"
       }
     }
   }
 }
}
POST classified_panelpages/_search/template
{
 "id": "panelpagelistitems",
 "params": {
   "path": {
     "wildcard": {
       "path.keyword": {
         "value": "*"
       }
     }
   }
 }
}

This is the template / script.

POST _scripts/panelpagelistitems
{
  "script": {
    "lang": "mustache",
    "source": "{ \"from\": \"0\", \"size\": \"1000\", \"query\": { \"bool\": { \"filter\": [ { \"bool\": { \"must\": [ { \"bool\": { \"should\": [ {{#toJson}}path{{/toJson}} ] } } ] } } ] } } }"
  }
}

There are a few things here I'm not really a fan of. The first is that the template has to be stripped and escaped to be saved. This significantly reduces readability. A tool could be created to do this, but that kind of deviates from the purpose of druid. The second is sending JSON in the query string. This is required at the moment because the crud query mechanism doesn't support array values. The path variable being used is broken up into an array of segments to match against existing panel page list items. Sending JSON this way tightly couples the query to open search. Keyval can no longer be used unless a workaround or conditional is implemented to change the behavior. That is a bit of a problem considering I would still like the demo to support discovery of panel page list items using idb / keyval.
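As a rough idea of how small the strip-and-escape tool mentioned above could be, here is a minimal sketch. The names `toStoredScript` and `StoredScriptBody` are made up for illustration, and the sketch assumes no string literal inside the template depends on repeated whitespace.

```typescript
// Hypothetical helper: turns a readable, multi-line mustache template into
// the body expected by POST _scripts/<id>.
interface StoredScriptBody {
  script: { lang: 'mustache'; source: string };
}

function toStoredScript(template: string): StoredScriptBody {
  // Collapse every run of whitespace (including newlines) into one space.
  // Assumption: no string literal in the template relies on repeated whitespace.
  const source = template.replace(/\s+/g, ' ').trim();
  return { script: { lang: 'mustache', source } };
}

const readable = `
{
  "from": 0,
  "size": 1000,
  "query": {
    "bool": {
      "filter": [ {{#toJson}}path{{/toJson}} ]
    }
  }
}`;

// JSON.stringify escapes the embedded double quotes, so nothing is escaped by hand.
const body = JSON.stringify(toStoredScript(readable), null, 2);
console.log(body);
```

The readable template could then live in source control, with only the generated body being posted.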

Trailing Comma Limitation

Mustache does not provide any elegant means of removing the trailing comma when looping through a list of strings and building the should portion of the query by matching each path as a term.

POST _scripts/panelpagelistitems2
{
  "script": {
    "lang": "mustache",
    "source": "{ \"from\": 0, \"size\": 1000, \"query\": { \"bool\": { \"filter\": [ { \"bool\": { \"must\": [ { \"bool\": { \"should\": [ {{#path}} { \"term\": { \"path.keyword\": { \"value\": \"{{.}}\" } } } {{/path}}, ] } } ] } } ] } } }"
  }
}
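One possible way around the comma problem is to avoid looping altogether and let `{{#toJson}}` render the whole list as a JSON array inside a `terms` query. The sketch below is untested against a real domain. The `renderToJson` function is only a local stand-in for the server-side rendering, the second path value is made up, and I'm assuming `terms` matches the same documents as the should list of term clauses.

```typescript
// Hypothetical script body; only {{#toJson}} is taken from the templates above.
const scriptBody = {
  script: {
    lang: 'mustache',
    source:
      '{ "from": 0, "size": 1000, "query": { "bool": { "filter": [ ' +
      '{ "terms": { "path.keyword": {{#toJson}}path{{/toJson}} } } ] } } }'
  }
};

// Local stand-in for the server-side rendering of {{#toJson}}name{{/toJson}}.
// It only exists to check that the rendered output is valid JSON.
function renderToJson(source: string, params: Record<string, unknown>): string {
  return source.replace(
    /\{\{#toJson\}\}(\w+)\{\{\/toJson\}\}/g,
    (_match: string, name: string) => JSON.stringify(params[name])
  );
}

const rendered = renderToJson(scriptBody.script.source, {
  path: ['/formly/marvel/char/select/v1', '/formly/kitchensink/v1']
});

// JSON.parse throws if a stray comma slipped into the rendered query.
const renderedQuery = JSON.parse(rendered);
console.log(JSON.stringify(renderedQuery.query.bool.filter[0]));
```

With this shape the params stay a plain array of strings, which is also easier to map onto something other than open search.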

I think I may be going about this in the wrong way. Instead of making the open search query compatible with the normalized crud query parsing, perhaps it would be better to think about it from another perspective. The crud query for open search currently works. What has been broken is the idb query, since the param values now carry JSON compatible with the open search template. Crud adaptors have queryMappings. The queryMappings are intended to change the default behavior used to map a query param to a json rules engine rule. A new feature can be added to query mappings that would transform path into an engine rule compatible with the keyval adaptor for crud.

I think this will actually be fairly straightforward. Instead of using the value directly, an optional value transformer can be used. The value transformer accepts the value. The transformer can then transform the value any way it wants.

This function inside the crud data service helper converts a query to a json rules engine rule.

  buildQueryRule({ params, config }: { params: QueryParams | string, config: CrudEntityConfigurationPlugin }): Observable<{ rule: Rule }> {
    return new Observable<{ rule: Rule }>(obs => {
      // const metadata = this.entityDefinitionService.getDefinition(this.entityName).metadata as CrudEntityMetadata<any, {}>;
      const conditions: Array<NestedCondition> = [];
      // KISS for now - use qs later - move to reusable function probably inside durl. First lets proof it out with one level.
      if (typeof(params) === 'string') {
        // Group repeated params by name, e.g. "a=1&a=2&b=3" gives a => ['1', '2'], b => ['3'].
        const pieces = params.split('&').map(p => p.split('=', 2)).reduce(
          (p, [name, value]) => new Map<string, Array<any>>([
            ...Array.from(p).filter(([k, _]) => k !== name),
            [ name, [ ...(p.has(name) ? p.get(name) : []), value ] ]
          ]),
          new Map<string, Array<any>>()
        );
        // Values sharing a name are OR'd ("any"); the per-name conditions are AND'd ("all") below.
        pieces.forEach((values, name) =>
          conditions.push({
            any: values.map(value => ({
              fact: name === 'identity' ? 'identity' : 'entity',
              // A query mapping may override the default 'equal' operator.
              operator: config.queryMappings && config.queryMappings.has(name) && config.queryMappings.get(name).defaultOperator ? config.queryMappings.get(name).defaultOperator : 'equal',
              value,
              ...(name === 'identity' ? {} : { path: `$.${name}` })
            }))
          })
        );
      }
      // No query params means no rule at all.
      const rule = conditions.length > 0 ? new Rule({ conditions: { all: conditions }, event: { type: 'visible' } }) : undefined;
      obs.next({ rule });
      obs.complete();
    });
  }

I think that at the moment just implementing a valueQuery as part of the QueryMapping class to optionally select a value is enough. The valueQuery will be a JSON path selector implemented using JSONPath Plus, like in other parts of the platform.
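To make the idea concrete, here is a minimal sketch of how an optional value transformer could plug into the condition building. It uses a plain function instead of a JSON path selector to stay dependency-free. `QueryMappingSketch`, `ConditionSketch`, `toCondition` and `termValue` are hypothetical names; only `defaultOperator` comes from the helper above.

```typescript
// Hypothetical shapes: only defaultOperator appears in the helper above;
// valueTransformer is the proposed addition.
interface QueryMappingSketch {
  defaultOperator?: string;
  valueTransformer?: (value: string) => unknown;
}

interface ConditionSketch {
  fact: string;
  operator: string;
  value: unknown;
  path?: string;
}

function toCondition(name: string, value: string, mapping?: QueryMappingSketch): ConditionSketch {
  return {
    fact: name === 'identity' ? 'identity' : 'entity',
    operator: mapping && mapping.defaultOperator ? mapping.defaultOperator : 'equal',
    // Fall back to the raw value when no transformer is configured.
    value: mapping && mapping.valueTransformer ? mapping.valueTransformer(value) : value,
    ...(name === 'identity' ? {} : { path: `$.${name}` })
  };
}

// Example transformer: pull the plain path out of an open search term clause.
const termValue = (raw: string): unknown => {
  const clause = JSON.parse(raw);
  return clause.term['path.keyword'].value;
};

const condition = toCondition(
  'path',
  '{"term":{"path.keyword":{"value":"/formly/marvel/char/select/v1"}}}',
  { valueTransformer: termValue }
);
console.log(condition);
```

The keyval adaptor would then see a plain path string in the rule, while the open search adaptor keeps receiving the JSON clause.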

The crud adaptor open search template query also needs to support arrays. Currently it just uses the last value when multiple keys with the same name exist. Case in point: the request below, with the optional arguments, results in a 404 because the individual segments are no longer taken into account.

http://localhost:4000/opensearch/classified_panelpages/_search/template

{"id":"panelpagelistitems","params":{"path":"{\"term\":{\"path.keyword\":{\"value\":\"/formly/kitchensink/v1/blah\"}}}","site":"ipe"}}

The query string is also being included which isn't right either.

{"id":"panelpagelistitems","params":{"path":"{\"term\":{\"path.keyword\":{\"value\":\"/formly/kitchensink/v1/blah?hello","site":"ipe"}}
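Both problems above could probably be handled before the payload is built. The sketch below keeps every value for a repeated key and drops anything after a `?` from each value. `stripQueryString` and `groupParams` are hypothetical names, the segment values are made up, and the input is assumed to be plain name / value pairs rather than pre-built JSON clauses.

```typescript
// Hypothetical helper: drops everything from the first "?" onwards.
function stripQueryString(value: string): string {
  const index = value.indexOf('?');
  return index === -1 ? value : value.slice(0, index);
}

// Hypothetical helper: groups repeated keys into arrays instead of keeping
// only the last value.
function groupParams(pairs: Array<[string, string]>): Record<string, string | string[]> {
  const grouped: Record<string, string[]> = {};
  for (const [name, raw] of pairs) {
    (grouped[name] = grouped[name] || []).push(stripQueryString(raw));
  }
  const params: Record<string, string | string[]> = {};
  for (const name of Object.keys(grouped)) {
    // Single values stay scalar so existing templates keep working.
    params[name] = grouped[name].length === 1 ? grouped[name][0] : grouped[name];
  }
  return params;
}

const templateParams = groupParams([
  ['path', '/formly'],
  ['path', '/formly/kitchensink/v1/blah?hello'],
  ['site', 'ipe']
]);
console.log(JSON.stringify({ id: 'panelpagelistitems', params: templateParams }));
```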

The express proxies for s3 and open search need to include the bucket name (s3) and domain (open search) as part of the path. Those are currently hard coded in express, so the proxies can't be used for other environments like prod at the moment.

Another issue is that idb is not supported on the server.

For the demo I'm able to circumvent that issue by only using the idb_keyval adaptor in the browser and excluding it from the server. For the purposes of the demo this works because everything is meant to be contained in the browser anyway.

There is a simpler way to do this. Just find the string or int in json using a recursive discover process. I like that. Keep it kiss. Everyone loves kisses.
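A minimal sketch of that recursive discovery, assuming the first string or number found in a depth-first walk is the one that matters. `findScalar` and the `Json` type are hypothetical names.

```typescript
type Json = string | number | boolean | null | Json[] | { [key: string]: Json };

// Depth-first search for the first string or number leaf.
function findScalar(node: Json): string | number | undefined {
  if (typeof node === 'string' || typeof node === 'number') {
    return node;
  }
  if (node === null || typeof node !== 'object') {
    return undefined; // booleans and null are skipped
  }
  // Collect the children of either an array or an object.
  const children: Json[] = [];
  if (Array.isArray(node)) {
    children.push(...node);
  } else {
    for (const key of Object.keys(node)) {
      children.push(node[key]);
    }
  }
  for (const child of children) {
    const found = findScalar(child);
    if (found !== undefined) {
      return found;
    }
  }
  return undefined;
}

const fromTerm = findScalar(JSON.parse('{"term":{"path.keyword":{"value":"/formly/marvel/char/select/v1"}}}'));
const fromWildcard = findScalar(JSON.parse('{"wildcard":{"path.keyword":{"value":"*"}}}'));
console.log(fromTerm, fromWildcard);
```

The catch is that a wildcard clause yields `*`, which is not a literal path, so the keyval side would still need to know how to treat it.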

Another idea which I actually really like is to create a custom operator that knows how to match an elastic search keyword term query and wildcard query. There could also be a combined operator that attempts to match against multiple operators using an any or all condition. I think this would be an elegant way to easily support idb matching, and query strings in general, with elastic search queries used in templates.
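Here is a rough sketch of the matching logic such operators might use, written as plain functions so it runs on its own. The clause shapes are modelled on the term and wildcard params shown earlier, and all names are hypothetical. I'm assuming one field per clause and that only `*` and `?` are special in wildcard patterns.

```typescript
// Hypothetical clause shapes, modelled on the term and wildcard params above.
type FieldValue = { [field: string]: { value: string } };
interface TermClause { term: FieldValue; }
interface WildcardClause { wildcard: FieldValue; }
type Clause = TermClause | WildcardClause;

function clauseValue(fields: FieldValue): string {
  // Only the first field is considered; one field per clause is assumed.
  return fields[Object.keys(fields)[0]].value;
}

function wildcardToRegExp(pattern: string): RegExp {
  // Escape regex metacharacters, then map "*" to ".*" and "?" to ".".
  const escaped = pattern.replace(/[.+^${}()|[\]\\]/g, '\\$&');
  return new RegExp('^' + escaped.replace(/\*/g, '.*').replace(/\?/g, '.') + '$');
}

// Matches a fact value against a single term or wildcard clause.
function matchesClause(factValue: unknown, clause: Clause): boolean {
  if (typeof factValue !== 'string') {
    return false;
  }
  if ('term' in clause) {
    return factValue === clauseValue(clause.term);
  }
  return wildcardToRegExp(clauseValue(clause.wildcard)).test(factValue);
}

// Combined operator: true when any of the clauses matches.
function matchesAnyClause(factValue: unknown, clauses: Clause[]): boolean {
  return clauses.some(clause => matchesClause(factValue, clause));
}

const entityPath = '/formly/marvel/char/select/v1';
console.log(matchesClause(entityPath, { wildcard: { 'path.keyword': { value: '*' } } }));
```

If I remember the json-rules-engine API correctly, functions like these could be registered with `engine.addOperator(name, (factValue, jsonValue) => boolean)`, but I have not wired that up here.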

Cost Optimization

Create a cloud formation template from the existing dev stack to easily tear down and bring up the service when needed, to avoid unnecessary cost. This will also make it possible to spin up parallel stacks for different purposes. For example, creating a domain to create a report, publishing the report, and then killing the stack.

https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-resource-opensearchservice-domain.html
