Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

api: implement fuzzy search API #10184

Merged
merged 3 commits into from
Apr 20, 2021
Merged

api: implement fuzzy search API #10184

merged 3 commits into from
Apr 20, 2021

Conversation

shoenig
Copy link
Member

@shoenig shoenig commented Mar 16, 2021

This PR introduces the /v1/search/fuzzy API endpoint, used for fuzzy
searching objects in Nomad. The fuzzy search endpoint routes requests
to the Nomad Server leader, which implements the Search.FuzzySearch RPC
method.

Requests to the fuzzy search API are based on the api.FuzzySearchRequest
object, e.g.

{
  "Text": "ed",
  "Context": "all"
}

Responses from the fuzzy search API are based on the api.FuzzySearchResponse
object, e.g.

{
  "Index": 27,
  "KnownLeader": true,
  "LastContact": 0,
  "Matches": {
    "tasks": [
      {
        "ID": "redis",
        "Scope": [
          "default",
          "example",
          "cache"
        ]
      }
    ],
    "evals": [],
    "deployment": [],
    "volumes": [],
    "scaling_policy": [],
    "images": [
      {
        "ID": "redis:3.2",
        "Scope": [
          "default",
          "example",
          "cache",
          "redis"
        ]
      }
    ]
  },
  "Truncations": {
    "volumes": false,
    "scaling_policy": false,
    "evals": false,
    "deployment": false
  }
}

The API is configurable using the server.search stanza, e.g.

server {
  search {
    fuzzy_enabled   = true
    limit_query     = 200
    limit_results   = 1000
    min_term_length = 5
  }
}

These values can be increased or decreased, so as to provide more
search results or to reduce load on the Nomad Server. The fuzzy search
API can be disabled entirely by setting fuzzy_enabled to false.

@shoenig
Copy link
Member Author

shoenig commented Mar 16, 2021

I'll cover API docs and e2e tests in a followup PR. First I want to make sure @DingoEatingFuzz / @backspace are happy with the response format, of which a larger example is below.

edit: see API docs examples

@shoenig shoenig marked this pull request as ready for review March 16, 2021 15:27
@shoenig shoenig requested a review from tgross March 16, 2021 15:27
Copy link
Member

@tgross tgross left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great @shoenig. The search/sorting algorithm is really clear and understandable. I've left some usability remarks and a few other odds and ends.


- `fuzzy_enabled` `(bool: true)` - Specifies whether the fuzzy search API is
enabled. If not enabled, requests to the fuzzy search API endpoint will return
an error response.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should have a link here to the API docs for this new endpoint as well.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@@ -0,0 +1,129 @@
package structs
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

❤️ for splitting anything out of structs.go

return s.srv.blockingRPC(&opts)
}

func expandContext(context structs.Context) []structs.Context {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the plan to factor this out of nomad/search_endpoint_ent.go as well?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

allContexts is defined per the _oss and _ent files; this wrapper Just Works with either one


prefix := alloc.ID[:len(alloc.ID)-2]
data := structs.SearchRequest{Prefix: prefix, Context: structs.Allocs}
req, err := http.NewRequest("POST", "/v1/search", encodeReq(data))
assert.Nil(err)
require.NoError(t, err)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for taking the time to tidy all these existing tests up, btw.

return fmt.Errorf("fuzzy search is not enabled")
}

// check the query term meets minimum length
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The ACL check should be before argument validation so that we return ACL errors first. Probably before config validation too?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup, I agree. One thing though - to make namespace=* work, this up front ACL check is basically bypassed, instead defering to iterators wrapped with filters, doing the same ACL checks removing objects not allowed by the given token. So this is really more of a "at least try to give them an error if they specify a namespace they can't access" optimization.

accumulateSet := func(limited bool, set map[structs.Context][]fuzzyMatch) {
for ctx, matches := range set {
for _, match := range matches {
if len(unsorted[ctx]) < limitResults {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is going to sound nitpicky, but the results limit config talks about a maximum number of results, whereas this is getting us the maximum number of results per context. Should we update this so that we're tracking a count of how many results overall, or just change the docs? (Either seems reasonable.)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good catch; clarified in docs

if b.Search.LimitQuery > 0 {
result.Search.LimitQuery = b.Search.LimitQuery
}
if b.Search.LimitResults > 0 {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we validate a relationship between LimitQuery and LimitResults, even if just to warn in the logs that it's unexpected to have LimitResults > LimitQuery?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It could actually be the case where the number of results exceeds LimitQuery - because of the way the jobs type is expanded into its subtypes (groups, tasks, services, images, commands, classes). There could be only 1 job registered, but contains more of any of those subtypes than LimitResults.

@DingoEatingFuzz
Copy link
Contributor

Hey @shoenig, I annotated the mock response with my thoughts in this gist: https://gist.github.com/DingoEatingFuzz/27ac49ea722c4df5011f58b6207fa3ab

The tl;dr is we need to make sure that all the responses have all the requisite info to build links from.

@DingoEatingFuzz
Copy link
Contributor

I poked at the code a bit but had trouble finding answers to my two questions so I'll just ask them:

  1. Will this support cross-namespace searches (i.e., namespace=*)?
  2. Will results include multi-part matches (e.g., bisl still matches command bin/sleep)?

@backspace
Copy link
Contributor

re cross-namespace searches, it’s known to be desired 🤞

@shoenig
Copy link
Member Author

shoenig commented Apr 12, 2021

Hey @DingoEatingFuzz and @backspace I tweaked the response objects to better identify the matching object. I've got API docs in there now with more query examples.

Will this support cross-namespace searches (i.e., namespace=*)?

It should, yes. With ACLs enabled, if namespace=*, a token with limited <kind>:read on one or more specified namespaces will only return results for the <kind> and namespaces in which the token is valid.

Will results include multi-part matches (e.g., bisl still matches command bin/sleep)?

Nope, at least not yet. The substring matches we're doing now are already going to use a questionable amount of resources, so this change includes server config to disable and/or limit queries. Once we get a better understanding of the performance impact maybe we can open up the matching patterns.

@tgross
Copy link
Member

tgross commented Apr 12, 2021

@shoenig I pulled this branch down locally and I'm not getting the results I'd expect to see with this endpoint. But maybe I'm missing something in the docs:

$ nomad job status
ID       Type     Priority  Status   Submit Date
example  service  50        running  2021-04-12T10:50:29-04:00

$ curl -vs -XPOST -d '{"Text": "ex", "Context": "jobs"}' "localhost:4646/v1/search/fuzzy" | jq .
...
> POST /v1/search/fuzzy HTTP/1.1
> Host: localhost:4646
> User-Agent: curl/7.54.0
> Accept: */*
> Content-Length: 33
> Content-Type: application/x-www-form-urlencoded
>
} [33 bytes data]
* upload completely sent off: 33 out of 33 bytes
< HTTP/1.1 200 OK
< Content-Type: application/json
< Vary: Accept-Encoding
< X-Nomad-Index: 17
< X-Nomad-Knownleader: true
< X-Nomad-Lastcontact: 0
< Date: Mon, 12 Apr 2021 14:53:48 GMT
< Content-Length: 89
<
{ [89 bytes data]
* Connection #0 to host localhost left intact
{
  "Index": 17,
  "KnownLeader": true,
  "LastContact": 0,
  "Matches": {},
  "Truncations": {
    "jobs": false
  }
}

$ curl -vs -XPOST -d '{"Text": "ex", "Context": "all"}' "localhost:4646/v1/search/fuzzy" | jq .
...
> POST /v1/search/fuzzy HTTP/1.1
> Host: localhost:4646
> User-Agent: curl/7.54.0
> Accept: */*
> Content-Length: 32
> Content-Type: application/x-www-form-urlencoded
>
} [32 bytes data]
* upload completely sent off: 32 out of 32 bytes
< HTTP/1.1 200 OK
< Content-Type: application/json
< Vary: Accept-Encoding
< X-Nomad-Index: 18
< X-Nomad-Knownleader: true
< X-Nomad-Lastcontact: 0
< Date: Mon, 12 Apr 2021 14:55:44 GMT
< Content-Length: 284
<
{ [284 bytes data]
* Connection #0 to host localhost left intact
{
  "Index": 18,
  "KnownLeader": true,
  "LastContact": 0,
  "Matches": {
    "scaling_policy": [],
    "evals": [],
    "deployment": [],
    "volumes": []
  },
  "Truncations": {
    "nodes": false,
    "plugins": false,
    "namespaces": false,
    "deployment": false,
    "volumes": false,
    "allocs": false,
    "jobs": false,
    "evals": false,
    "scaling_policy": false
  }
}


@backspace
Copy link
Contributor

hmm is it possible your queries returned empty results, @tgross? I tried it and got some:

$ curl -XPOST -d '{"Text": "pi", "Context": "all"}' "localhost:4646/v1/search/fuzzy" | jq .

# …

{
  "Index": 30,
  "KnownLeader": true,
  "LastContact": 0,
  "Matches": {
    "evals": [],
    "deployment": [],
    "allocs": [
      {
        "ID": "ping🥳.webs🥳[0]",
        "Scope": [
          "default",
          "4ad14bd5-119b-5eaf-f36e-01d5a63515e0"
        ]
      },
      {
        "ID": "ping🥳.group with short-run task[0]",
        "Scope": [
          "default",
          "c62bff59-8583-391f-1f93-bc4037e0c403"
        ]
      },
      {
        "ID": "ping🥳.group with short-run task[0]",
        "Scope": [
          "default",
          "f0153309-6b01-38d4-6f78-0e769bbfc76e"
        ]
      }
    ],
    "commands": [
      {
        "ID": "ping",
        "Scope": [
          "default",
          "ping🥳",
          "webs🥳",
          "frontend🥳"
        ]
      },
      {
        "ID": "xxping",
        "Scope": [
          "default",
          "ping🥳",
          "group with short-run task",
          "short-run task"
        ]
      }
    ],
    "volumes": [],
    "scaling_policy": []
  },
  "Truncations": {
    "volumes": false,
    "allocs": false,
    "jobs": false,
    "plugins": false,
    "namespaces": false,
    "scaling_policy": false,
    "evals": false,
    "deployment": false,
    "nodes": false
  }
}

@tgross
Copy link
Member

tgross commented Apr 15, 2021

hmm is it possible your queries returned empty results, @tgross? I tried it and got some:

I'd run the example job from nomad job init first. But I just tried it again with the exact same job and got good results (see below). Doesn't seem to be flappy in any way either. So... I dunno, it's possible I tested against the wrong build or something the first time? That's embarrassing, sorry @shoenig 😊

$ curl -vs -XPOST -d '{"Text": "ex", "Context": "all"}' "localhost:4646/v1/search/fuzzy" | jq .
*   Trying ::1...
* TCP_NODELAY set
* Connection failed
* connect to ::1 port 4646 failed: Connection refused
*   Trying 127.0.0.1...
* TCP_NODELAY set
* Connected to localhost (127.0.0.1) port 4646 (#0)
> POST /v1/search/fuzzy HTTP/1.1
> Host: localhost:4646
> User-Agent: curl/7.54.0
> Accept: */*
> Content-Length: 32
> Content-Type: application/x-www-form-urlencoded
>
} [32 bytes data]
* upload completely sent off: 32 out of 32 bytes
< HTTP/1.1 200 OK
< Content-Type: application/json
< Vary: Accept-Encoding
< X-Nomad-Index: 18
< X-Nomad-Knownleader: true
< X-Nomad-Lastcontact: 0
< Date: Thu, 15 Apr 2021 12:32:04 GMT
< Content-Length: 426
<
{ [426 bytes data]
* Connection #0 to host localhost left intact
{
  "Index": 18,
  "KnownLeader": true,
  "LastContact": 0,
  "Matches": {
    "volumes": [],
    "scaling_policy": [],
    "allocs": [
      {
        "ID": "example.cache[0]",
        "Scope": [
          "default",
          "8b8794e6-d2ae-870c-e017-a7b1de642ff7"
        ]
      }
    ],
    "jobs": [
      {
        "ID": "example",
        "Scope": [
          "default"
        ]
      }
    ],
    "evals": [],
    "deployment": []
  },
  "Truncations": {
    "volumes": false,
    "allocs": false,
    "nodes": false,
    "plugins": false,
    "namespaces": false,
    "evals": false,
    "scaling_policy": false,
    "jobs": false,
    "deployment": false
  }
}

@backspace
Copy link
Contributor

backspace commented Apr 15, 2021

Hey, I was doing some preliminary experimentation with using this in the UI and it looks like CORS isn’t available on /v1/search/fuzzy, that’ll be something we need 🤓💞

ETA specifically the pre-flight OPTIONS request 405ed:

image

@backspace
Copy link
Contributor

Maybe I’m wrong? Maybe I’m missing something in local development 🤔 I’ll look more into this and let you know, sorry.

@backspace
Copy link
Contributor

yes, I was wrong, disregard 😳

@backspace
Copy link
Contributor

I don’t know that this needs to be blocking, but now that I’m trying this, case-insensitivity would be nice to have. Compy is named Tethys and searching for te doesn’t return a node, but Te does.

Probably not possible to include now but a future improvement could be returning the substring indices matched for bolding in the UI.

@backspace
Copy link
Contributor

Something for jobs that’s required for the UI is knowing the name alongside the ID, would it be possible to add that in the scopes list?

image

Let me know if there’s a better place for these messages 🤔

@shoenig
Copy link
Member Author

shoenig commented Apr 16, 2021

case-insensitivity would be nice to have

I think we can slip this in without much of a performance impact; I'll add it.

would it be possible to add that in the scopes list

Yup can definitely add Name. Currently the search is performed on Name, and then ID is used in the ID part of the return struct; which is confusing 🙂 . I'm thinking we should change it so that the search is performed on Name, the Name is used in the ID field, and the job ID is appended to the scope. That would be most consistent with how all the other context types are structured.

Let me know if there’s a better place for these messages

Here is great!

This PR introduces the /v1/search/fuzzy API endpoint, used for fuzzy
searching objects in Nomad. The fuzzy search endpoint routes requests
to the Nomad Server leader, which implements the Search.FuzzySearch RPC
method.

Requests to the fuzzy search API are based on the api.FuzzySearchRequest
object, e.g.

{
  "Text": "ed",
  "Context": "all"
}

Responses from the fuzzy search API are based on the api.FuzzySearchResponse
object, e.g.

{
  "Index": 27,
  "KnownLeader": true,
  "LastContact": 0,
  "Matches": {
    "tasks": [
      {
        "ID": "redis",
        "Scope": [
          "default",
          "example",
          "cache"
        ]
      }
    ],
    "evals": [],
    "deployment": [],
    "volumes": [],
    "scaling_policy": [],
    "images": [
      {
        "ID": "redis:3.2",
        "Scope": [
          "default",
          "example",
          "cache",
          "redis"
        ]
      }
    ]
  },
  "Truncations": {
    "volumes": false,
    "scaling_policy": false,
    "evals": false,
    "deployment": false
  }
}

The API is tunable using the new server.search stanza, e.g.

server {
  search {
    fuzzy_enabled   = true
    limit_query     = 200
    limit_results   = 1000
    min_term_length = 5
  }
}

These values can be increased or decreased, so as to provide more
search results or to reduce load on the Nomad Server. The fuzzy search
API can be disabled entirely by setting `fuzzy_enabled` to `false`.
@shoenig
Copy link
Member Author

shoenig commented Apr 16, 2021

Those 2 additions should work now @backspace

$ curl -s -XPOST localhost:4646/v1/search/fuzzy -d '{"Context":"jobs", "Text": "yex"}' | jq .
{
  "Index": 17,
  "KnownLeader": true,
  "LastContact": 0,
  "Matches": {
    "jobs": [
      {
        "ID": "MyExampleJob",
        "Scope": [
          "default",
          "MyExampleJob"
        ]
      }
    ]
  },
  "Truncations": {
    "jobs": false
  }
}

In this case the JobID and JobName are the same but it should be correct

@backspace
Copy link
Contributor

Total success, thanks!

image

Copy link
Member

@tgross tgross left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Let's ship it!

@github-actions
Copy link

I'm going to lock this pull request because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active contributions.
If you have found a problem that seems related to this change, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Nov 24, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants