Website documentation recommending GET with body to do search is invalid according to http 1.1 spec #16024
This has been raised many times.
The RFC for HTTP 1.1 (RFC 2616) says this:
So yes, the recommendation (note: "SHOULD") is that a GET body shouldn't have any effect on server processing. But GET is a better semantic fit for search than POST:
Also mentioned in the spec:
Users often allow just GET requests to their servers in order to allow a range of non-destructive actions. The problem is that a search request (with the query DSL, aggs, etc) can easily overflow the max URL length, so we allow sending it in the body instead. Of course, some clients don't allow GET requests with a body, so we support two other ways of sending a search request:
The Query DSL has so many options, restricting yourself to just the basic query string parameters would be a waste of time. For instance, you wouldn't be able to use aggregations. If you don't like using |
The "source" parameter needs to be documented. We only discovered it through a Google search that led to a forum post where someone mentioned it, far down the thread, as a suggestion. I can't even find that link now; it was that obscure. Once we discovered that the source parameter exists and accepts our JSON payload, this was much easier to resolve. Regardless of the limitations of query string parameters, how to send all options as query string parameters should be documented, not just a select few. Using a POST to do a search "feels" wrong and would be unacceptable anyway, since we'd then have to open our Elasticsearch endpoint (Amazon ES) to public POST requests and would need something extra to prevent the destructive operations you mentioned earlier, such as additional authentication (not sure we can do this in Amazon ES) or a proxy server. TL;DR: more documentation is never a bad thing. Thanks! |
It is documented under API conventions > Common options, along with other request parameters that apply across all or most APIs: https://www.elastic.co/guide/en/elasticsearch/reference/current/common-options.html#_request_body_in_query_string
You are very likely to run into problems with having your search request truncated - it is too easy to run into server URL limits.
I'm not 100% certain, but I don't believe that highlighting is supported via query string parameters (other than encoding the body in The query DSL (plus the rest of the search options) is just too rich to replicate purely with query string params. |
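The `source` approach discussed above can be sketched roughly as follows. This is a minimal illustration, not the project's own client code: the host `http://localhost:9200`, the index name `my-index`, the helper name `build_search_url`, and the 2000-character limit are all assumptions made for the example. Real URL limits depend on the servers and proxies in the request path.

```python
import json
from urllib.parse import parse_qs, urlencode, urlsplit

# Assumed conservative limit; actual limits vary by server and proxy.
ASSUMED_MAX_URL_LENGTH = 2000


def build_search_url(base_url: str, index: str, body: dict,
                     max_url_length: int = ASSUMED_MAX_URL_LENGTH) -> str:
    """Return a search URL with the JSON body packed into the `source` parameter."""
    # Compact separators keep the encoded URL as short as possible.
    compact_json = json.dumps(body, separators=(",", ":"))
    # urlencode percent-encodes the JSON so it is safe inside a query string.
    query_string = urlencode({"source": compact_json})
    url = f"{base_url}/{index}/_search?{query_string}"
    if len(url) > max_url_length:
        # Fail loudly instead of letting an intermediary truncate the request.
        raise ValueError(
            f"search URL is {len(url)} characters, "
            f"over the assumed limit of {max_url_length}"
        )
    return url


if __name__ == "__main__":
    # Hypothetical search body with a query and highlighting.
    search_body = {
        "query": {"match": {"title": "search"}},
        "highlight": {"fields": {"title": {}}},
    }
    url = build_search_url("http://localhost:9200", "my-index", search_body)
    print(url)
    # Round trip: decoding the parameter recovers the original body.
    recovered = json.loads(parse_qs(urlsplit(url).query)["source"][0])
    print(recovered == search_body)
```

The length check reflects the truncation concern raised in this comment: a body with many aggregations or filters will hit the limit quickly, which is the trade-off of sending everything in the URL.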
I understand where you're coming from. Thanks for that link, but it definitely needs to be placed somewhere in the Search API docs. Google can't surface it when one searches for "elasticsearch uri search source" or similar, because the content isn't all on one page. You may or may not care, but most people don't have time to read all the documentation for a piece of software; they rely on Google to index it properly so they can quickly find exactly what they need. At the very least, the source parameter should be listed here (please): Edit: please understand that we've placed our Elasticsearch service behind API Gateway (which may or may not be the best decision) and it does NOT support a GET request with a request body. The request is refused outright and flagged as invalid. That's why our hands are somewhat tied without redesigning or rebuilding our infrastructure. |
@clintongormley, I completely agree with @eric-tucker. So does Roy Fielding:
I think it is a clear violation of the Layered System constraint of REST, as well as section 4.3 of RFC 2616:
On top of that, section 4.3.1 of RFC 7231 states:
So I don't think there's anything to argue; this is not a recommended practice. If I put a Varnish server in front of Elasticsearch, I should expect it to cache the response of all I would actually rather use the drafted |
This is a very good point about caching GETs in Varnish. You've now made a GET behave differently based on its payload, but anything that follows the RFCs is entirely within its rights to ignore that payload and treat two (supposedly) different GETs as the same request. Just because this is the way ES was set up and you don't want to change it doesn't mean it's correct or that it doesn't need to change in the long term. |
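The caching concern above can be shown with a toy model. This is only a sketch of a hypothetical intermediary that keys cached GET responses on method and URL and never looks at the body. It makes no claim about how Varnish or any other real cache is configured, and `UrlKeyedCache` and `fake_search_backend` are names invented for the example.

```python
from typing import Callable, Dict, Tuple

Origin = Callable[[str, str, bytes], str]


class UrlKeyedCache:
    """Hypothetical cache that keys GET responses on (method, URL) only."""

    def __init__(self, origin: Origin) -> None:
        self._origin = origin
        self._store: Dict[Tuple[str, str], str] = {}

    def request(self, method: str, url: str, body: bytes = b"") -> str:
        if method != "GET":
            # Non-GET requests are passed straight through, uncached.
            return self._origin(method, url, body)
        key = (method, url)  # the body is deliberately not part of the key
        if key not in self._store:
            self._store[key] = self._origin(method, url, body)
        return self._store[key]


def fake_search_backend(method: str, url: str, body: bytes) -> str:
    """Stand-in for a search server whose result depends on the request body."""
    return f"results for {body.decode()}"


if __name__ == "__main__":
    cache = UrlKeyedCache(fake_search_backend)
    first = cache.request("GET", "/my-index/_search", b'{"query":"cats"}')
    second = cache.request("GET", "/my-index/_search", b'{"query":"dogs"}')
    # The second search is answered from the entry cached for the first one.
    print(first)
    print(second)
```

Because both requests share the same method and URL, the second one receives the cached response for the first, even though its body asked for something different.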
@clintongormley If @jasnell finishes his Internet Draft of the |
I will look at getting this moved forward. I had left it in the hands of a couple others but it looks like the interest / momentum faded out. If there's interest in implementing this, I can definitely work on getting it advanced. |
+1 |
@asbjornu sure - makes sense |
What's the status? |
I'm not sure where else to report this...
All throughout the website, you give examples of how to perform various operations related to search but the examples are GET requests with a data body.
This is wrong according to the HTTP 1.1 spec, which says that the body of a GET request should not change the response contents. In other words, a server should be able to ignore the request body and return the same result.
As an alternative, you should document the query string parameters better. For example, it's not documented how to do highlighting with query string parameters, only a GET payload.
Ultimately, calling GET with a payload should be deprecated and removed from future releases of ES. This is not recommended by anyone and I can find no support for doing things this way.