This repository has been archived by the owner on Mar 17, 2021. It is now read-only.

Wsagent ping is getting HTTP 200 even if the route is not returning consistent HTTP 200 responses. #479

Closed
l0rd opened this issue Dec 11, 2017 · 15 comments

Comments

@l0rd
Contributor

l0rd commented Dec 11, 2017

That is the long term solution for openshiftio/openshift.io#1599 (comment)

@l0rd
Contributor Author

l0rd commented Dec 11, 2017

More details on the problem:

When a route is created on osio it goes through 3 different states:

Not ready state
When trying to ping the route the response status code is 503 and it's consistent:

HTTP response: 503
HTTP response: 503
HTTP response: 503
HTTP response: 503
HTTP response: 503
HTTP response: 503

Flapping state
After a few seconds of pinging the route we start getting some 200 responses, but they are not yet steady and alternate with 503s:

HTTP response: 200
HTTP response: 503
HTTP response: 503
HTTP response: 200
HTTP response: 503
HTTP response: 200

Ready state
Only status code 200 is returned for every ping:

HTTP response: 200
HTTP response: 200
HTTP response: 200
HTTP response: 200
HTTP response: 200
HTTP response: 200

To verify that we can use hello-openshift.yaml (oc create -f hello-openshift.yaml) to create a route and test-curl.sh to verify the different response status codes returned by the route.

The problem is that Che is not able to detect the flapping state of the route and thinks that the route is ready even when it is not yet. Che only gets 200 responses while curl sees 200s and 503s alternating. This can be verified by running this simple Java class, HttpURLConnectionTest.
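
The three states above can be expressed as a small classifier over a window of observed status codes. This is only an illustrative sketch: the class name `RouteState`, the `State` enum, and the window-based rule are assumptions made for this example and are not taken from Che or from the linked test files.

```java
import java.util.List;

final class RouteState {
    /** The three route states described in this comment. */
    enum State { NOT_READY, FLAPPING, READY }

    /**
     * Classifies a window of observed HTTP status codes.
     * Assumption: any code other than 200 counts as "not OK".
     */
    static State classify(List<Integer> codes) {
        if (codes.isEmpty()) {
            return State.NOT_READY; // nothing observed yet, so do not claim readiness
        }
        int ok = 0;
        for (int code : codes) {
            if (code == 200) {
                ok++;
            }
        }
        if (ok == 0) {
            return State.NOT_READY; // consistent failures
        }
        if (ok == codes.size()) {
            return State.READY;     // consistent 200s
        }
        return State.FLAPPING;      // a mix of 200s and other codes
    }
}
```

A client that only ever observes 200s would classify the route as READY here, which is exactly the misdetection described above; the classifier is only as good as the responses the client actually sees.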

@ibuziuk ibuziuk self-assigned this Dec 12, 2017
@ibuziuk
Member

ibuziuk commented Dec 12, 2017

I have not tested it yet, but we could probably have something like this [1] in place (use okhttp for verification); the tests are currently failing. Still, it is pretty weird that Che is currently not able to detect route flapping.
Connection details are implemented at quite a low level in DefaultHttpJsonRequest [2] via HttpURLConnection, but I have not spotted anything suspicious so far.

image - ibuziuk/che-server:wsagent
[1] ibuziuk/che@52c4a49
[2] https://github.com/eclipse/che/blob/master/core/che-core-api-core/src/main/java/org/eclipse/che/api/core/rest/DefaultHttpJsonRequest.java

@ibuziuk
Member

ibuziuk commented Dec 12, 2017

It also does not look like a caching problem, since https requests are never cached AFAIK

@l0rd
Contributor Author

l0rd commented Dec 13, 2017

@ibuziuk the sample I've provided uses okhttp too. And I can confirm that this doesn't solve the problem: using HttpURLConnection or okhttp gives exactly the same result.

@ibuziuk
Member

ibuziuk commented Dec 13, 2017

Headers from curl:

[image: curl request headers]

Headers from the Java test running from the IDE are not caught by Fiddler for some reason, so I plan to package the okhttp test into a jar and execute it from the terminal.

As you can see from the request headers, the only thing that seems to be added by curl is the Connection: keep-alive header, which is used mainly for performance improvement. AFAIK, okhttp also uses keep-alive requests by default [1], but I need to verify this

[1] square/okhttp#2031

@ibuziuk
Member

ibuziuk commented Dec 13, 2017

Adding the Connection: keep-alive header does not change anything - OkHttp requests still do not detect route flapping

@ibuziuk
Member

ibuziuk commented Dec 13, 2017

hmmm... this is all becoming even weirder - after adding the Connection: close header, okhttp detects flapping correctly:

Unexpected code Response{protocol=http/1.0, code=503, message=Service Unavailable, url=https://hello-openshift-route-ibuziuk-che.8a09.starter-us-east-2.openshiftapps.com/}
503
Unexpected code Response{protocol=http/1.0, code=503, message=Service Unavailable, url=https://hello-openshift-route-ibuziuk-che.8a09.starter-us-east-2.openshiftapps.com/}
503
200
200
200
Unexpected code Response{protocol=http/1.0, code=503, message=Service Unavailable, url=https://hello-openshift-route-ibuziuk-che.8a09.starter-us-east-2.openshiftapps.com/}
503
Unexpected code Response{protocol=http/1.0, code=503, message=Service Unavailable, url=https://hello-openshift-route-ibuziuk-che.8a09.starter-us-east-2.openshiftapps.com/}
503
Unexpected code Response{protocol=http/1.0, code=503, message=Service Unavailable, url=https://hello-openshift-route-ibuziuk-che.8a09.starter-us-east-2.openshiftapps.com/}
503
Unexpected code Response{protocol=http/1.0, code=503, message=Service Unavailable, url=https://hello-openshift-route-ibuziuk-che.8a09.starter-us-east-2.openshiftapps.com/}
503
Unexpected code Response{protocol=http/1.0, code=503, message=Service Unavailable, url=https://hello-openshift-route-ibuziuk-che.8a09.starter-us-east-2.openshiftapps.com/}
503
200
200
200
Unexpected code Response{protocol=http/1.0, code=503, message=Service Unavailable, url=https://hello-openshift-route-ibuziuk-che.8a09.starter-us-east-2.openshiftapps.com/}
503
Unexpected code Response{protocol=http/1.0, code=503, message=Service Unavailable, url=https://hello-openshift-route-ibuziuk-che.8a09.starter-us-east-2.openshiftapps.com/}
503
Unexpected code Response{protocol=http/1.0, code=503, message=Service Unavailable, url=https://hello-openshift-route-ibuziuk-che.8a09.starter-us-east-2.openshiftapps.com/}
503
Unexpected code Response{protocol=http/1.0, code=503, message=Service Unavailable, url=https://hello-openshift-route-ibuziuk-che.8a09.starter-us-east-2.openshiftapps.com/}
503
Unexpected code Response{protocol=http/1.0, code=503, message=Service Unavailable, url=https://hello-openshift-route-ibuziuk-che.8a09.starter-us-east-2.openshiftapps.com/}
503
Unexpected code Response{protocol=http/1.0, code=503, message=Service Unavailable, url=https://hello-openshift-route-ibuziuk-che.8a09.starter-us-east-2.openshiftapps.com/}
503
200
200
Unexpected code Response{protocol=http/1.0, code=503, message=Service Unavailable, url=https://hello-openshift-route-ibuziuk-che.8a09.starter-us-east-2.openshiftapps.com/}
503
Unexpected code Response{protocol=http/1.0, code=503, message=Service Unavailable, url=https://hello-openshift-route-ibuziuk-che.8a09.starter-us-east-2.openshiftapps.com/}
503
Unexpected code Response{protocol=http/1.0, code=503, message=Service Unavailable, url=https://hello-openshift-route-ibuziuk-che.8a09.starter-us-east-2.openshiftapps.com/}
503
200
200
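
For reference, the same idea can be sketched with the JDK's `HttpURLConnection` instead of okhttp. This is a hypothetical minimal example, not the code that produced the output above: the class and method names are made up for illustration and the 5 second timeouts are arbitrary.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;

final class ConnectionClosePing {
    /**
     * Prepares (but does not yet send) a GET request that asks for the
     * connection to be closed after the response instead of being kept alive.
     */
    static HttpURLConnection open(URL url) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestMethod("GET");
        conn.setRequestProperty("Connection", "close");
        conn.setConnectTimeout(5000); // arbitrary values for this sketch
        conn.setReadTimeout(5000);
        return conn;
    }

    /** Sends the request and returns the HTTP status code (e.g. 200 or 503). */
    static int ping(URL url) throws IOException {
        HttpURLConnection conn = open(url);
        try {
            return conn.getResponseCode();
        } finally {
            conn.disconnect(); // release the underlying socket
        }
    }
}
```

Calling `ConnectionClosePing.ping(...)` in a loop against the route URL should then behave like the okhttp run above, assuming the header has the same effect for both clients.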

@ibuziuk
Member

ibuziuk commented Dec 13, 2017

it is also quite interesting to take a look at response headers:

HTTP/1.1 200 OK
Date: Wed, 13 Dec 2017 14:23:53 GMT
Content-Length: 17
Content-Type: text/plain; charset=utf-8
Set-Cookie: 453440e08ff71c1921ff8fac010f08bd=add8477f387bc83859ab9c17a1f632d0; path=/; HttpOnly; Secure
Cache-control: private


HTTP/1.0 503 Service Unavailable
Pragma: no-cache
Cache-Control: private, max-age=0, no-cache, no-store
Connection: close
Content-Type: text/html

only the 503 response contains the Connection: close header, whereas the 200 response has Cache-control: private (do not use a shared cache)

@ibuziuk
Member

ibuziuk commented Dec 13, 2017

So it is now becoming clear that the Java clients add some header (most likely Connection: keep-alive), because curl's request headers do not contain it by default:

> GET / HTTP/1.1
> Host: hello-openshift-route-ibuziuk-che.8a09.starter-us-east-2.openshiftapps.com
> User-Agent: curl/7.47.0
> Accept: */*
> 
* HTTP 1.0, assume close after body
< HTTP/1.0 503 Service Unavailable
< Pragma: no-cache
< Cache-Control: private, max-age=0, no-cache, no-store
< Connection: close

> GET / HTTP/1.1
> Host: hello-openshift-route-ibuziuk-che.8a09.starter-us-east-2.openshiftapps.com
> User-Agent: curl/7.47.0
> Accept: */*
> 
< HTTP/1.1 200 OK
< Date: Wed, 13 Dec 2017 14:31:57 GMT
< Content-Length: 17
< Content-Type: text/plain; charset=utf-8
< Set-Cookie: 453440e08ff71c1921ff8fac010f08bd=ce2d79a2000c3a3afcc5272ca957a950; path=/; HttpOnly; Secure
< Cache-control: private

@l0rd l0rd added this to the Sprint #142 milestone Dec 13, 2017
@ibuziuk
Member

ibuziuk commented Dec 13, 2017

After some investigation I think the problem is related to the Connection: keep-alive header. Here are demos of running the sample [1], which uses HttpURLConnection to ping a route on api.starter-us-east-2.openshift.com. By default Connection: keep-alive is added to the transport headers and route flapping is not detected:
[image: connection-keep-alive]
Basically, Java does not immediately close the underlying TCP connection when the input stream is closed. Instead it keeps it open and tries to reuse it for the next HTTP request to the same server (more details in [2]).
To disable this behavior, System.setProperty("http.keepAlive","false"); can be used. After disabling keep-alive, flapping was successfully detected:
[image: connection-close]

@l0rd I guess the easiest way to fix it would be adding -Dhttp.keepAlive=false during che-server start-up on osio. WDYT ?

[1] https://github.com/ibuziuk/route-flapping
[2] https://stackoverflow.com/questions/4767553/safe-use-of-httpurlconnection
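
A minimal sketch of the workaround described in this comment, assuming plain `HttpURLConnection`. The class and method names are hypothetical and this is not the code from the linked sample [1]. As far as I know the JDK reads `http.keepAlive` once, when its HTTP client is first used, so the sketch assumes the property is set before the first request in the JVM.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;
import java.util.ArrayList;
import java.util.List;

final class KeepAliveDisabledPing {
    /**
     * Programmatic equivalent of starting the JVM with -Dhttp.keepAlive=false.
     * Assumption: this is called before any HTTP request has been made.
     */
    static void disableKeepAlive() {
        System.setProperty("http.keepAlive", "false");
    }

    /** Sends one GET request and returns the HTTP status code. */
    static int ping(String url) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
        conn.setConnectTimeout(5000); // arbitrary values for this sketch
        conn.setReadTimeout(5000);
        try {
            return conn.getResponseCode();
        } finally {
            conn.disconnect();
        }
    }

    /** Pings the URL the given number of times and collects the status codes. */
    static List<Integer> pingTimes(String url, int count) throws IOException {
        List<Integer> codes = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            codes.add(ping(url));
        }
        return codes;
    }
}
```

Usage would be to call `KeepAliveDisabledPing.disableKeepAlive()` once at start-up and then `pingTimes(routeUrl, 20)`, where `routeUrl` stands for the route under test; a mix of 200s and 503s in the result indicates flapping.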

@l0rd
Contributor Author

l0rd commented Dec 13, 2017 via email

@ibuziuk
Member

ibuziuk commented Dec 15, 2017

@l0rd after the prod cluster update to 3.7.9, route flapping is not reproducible anymore

@ibuziuk
Member

ibuziuk commented Dec 15, 2017

PR has been sent: eclipse-che/che#7898

@ibuziuk
Member

ibuziuk commented Dec 18, 2017

@l0rd going to close this one once PR [1] to the che6 branch is merged

[1] eclipse-che/che#7949

@ibuziuk
Member

ibuziuk commented Dec 19, 2017

PR to che6 is merged - closing
