download/download.go: try to fix long polling 304 and error #3926

Merged

Conversation

floriangasc
Contributor

Signed-off-by: Gasc Florian florian.gasc@gmail.com
Fixes: #3923

@floriangasc
Contributor Author

floriangasc commented Oct 25, 2021

Hi @ashutosh-narkar

"Thanks for adding the fix @floriangasc! What do you think about adding a new field on the Downloader object to keep track of the long poll status? We would set it in the oneShot method based on the response from the download method. Then on a 304 we would return longPoll: d.longPoll instead of longPoll: isLongPollSupported(resp.Header)."

Thanks for the quick reply.

  • Your solution looks good, if you think the Downloader object is the right place to store this.
  • Moreover, the oneShot method seems a better place to handle this problem than the download method, which seems more focused on network/infrastructure.
  • I don't know if you have seen it in the issue, but I found another wrong behavior in the error case. oneShot should handle this nicely: in case of error, return d.longPoll, err. What do you think?

PS:

  • We have been using the disk store in our development environment for some weeks now and it gives good results. Thanks a lot :)
  • I have renamed the branch from little-work-on-bunle-downloader to little-work-on-bundle-downloader and created another PR. Please excuse me for this; it's only the second time I have made a PR.

m := metrics.New()
resp, err := d.download(ctx, m)
resp, err := d.download(ctx, m, longPoll)
Member

The idea is we keep the long poll status on the downloader and then we can update oneShot to return only an error. We would then update the long poll status on the downloader in the oneShot method like we do for etag. In the download method on a 304, we can simply return d.longPoll instead of isLongPollSupported(resp.Header). Let me know if you have more questions.
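
A minimal sketch of the idea above, not the actual OPA code: the long-poll status is stored next to the ETag and left untouched on a 304, since a 304 response carries no Content-Type to inspect. The type sketchDownloader, the method observe, and the constant vendorMediaType are hypothetical names introduced only for illustration.

```go
package main

import (
	"fmt"
	"net/http"
)

// vendorMediaType is the content type that, per this thread, signals
// long-poll support. The constant name is an assumption of this sketch.
const vendorMediaType = "application/vnd.openpolicyagent.bundles"

// sketchDownloader is a hypothetical stand-in for the real Downloader.
// It remembers the last ETag and the last known long-poll status.
type sketchDownloader struct {
	etag     string
	longPoll bool
}

// observe records what one response tells us. On a 304 nothing is
// updated, so the previously stored long-poll status is kept.
func (d *sketchDownloader) observe(status int, contentType, etag string) {
	if status == http.StatusNotModified {
		return
	}
	d.etag = etag
	d.longPoll = contentType == vendorMediaType
}

func main() {
	d := &sketchDownloader{}
	d.observe(http.StatusOK, vendorMediaType, "v1")
	d.observe(http.StatusNotModified, "", "")
	fmt.Println("long poll after 304:", d.longPoll)
}
```

With this shape, a caller only needs an error from a one-shot download; the polling mode is read from the stored field.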

Contributor Author

@floriangasc floriangasc Oct 25, 2021

Looks good.

  • I need to go out now, but I will do that tomorrow morning.
  • Do you have any advice for testing this? Do I need to run an embedded web server for integration testing, or something like that?

Member

Check out the test cases here for reference.
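
For readers without the linked test file at hand, here is a rough sketch of how such a scenario can be exercised with the standard library's net/http/httptest package. It is not taken from download_test.go; bundleHandler, fetch, pollTwice, and the ETag value are hypothetical. The fake server answers 304 without a Content-Type when the client already has the current ETag.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// sketchETag is an arbitrary ETag used only by this sketch.
const sketchETag = `"bundle-v1"`

// bundleHandler is a hypothetical bundle endpoint: 304 when the ETag
// matches, otherwise 200 with the long-polling media type.
func bundleHandler(w http.ResponseWriter, r *http.Request) {
	if r.Header.Get("If-None-Match") == sketchETag {
		w.WriteHeader(http.StatusNotModified)
		return
	}
	w.Header().Set("Etag", sketchETag)
	w.Header().Set("Content-Type", "application/vnd.openpolicyagent.bundles")
	w.WriteHeader(http.StatusOK)
}

// fetch issues one GET and returns the status code and Content-Type.
func fetch(url, etag string) (int, string, error) {
	req, err := http.NewRequest(http.MethodGet, url, nil)
	if err != nil {
		return 0, "", err
	}
	if etag != "" {
		req.Header.Set("If-None-Match", etag)
	}
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return 0, "", err
	}
	defer resp.Body.Close()
	return resp.StatusCode, resp.Header.Get("Content-Type"), nil
}

// pollTwice starts the fake server, downloads once without an ETag and
// once with it, and reports both status codes and the second Content-Type.
func pollTwice() (int, int, string, error) {
	srv := httptest.NewServer(http.HandlerFunc(bundleHandler))
	defer srv.Close()

	first, _, err := fetch(srv.URL, "")
	if err != nil {
		return 0, 0, "", err
	}
	second, secondType, err := fetch(srv.URL, sketchETag)
	if err != nil {
		return first, 0, "", err
	}
	return first, second, secondType, nil
}

func main() {
	first, second, secondType, err := pollTwice()
	fmt.Println(first, second, secondType, err)
}
```

The empty Content-Type on the second response is the situation this pull request deals with: the header cannot be used to decide the polling mode on a 304.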

@floriangasc
Contributor Author

floriangasc commented Oct 26, 2021

Before I write tests, I prefer to be sure I have the right behavior.

I have slightly modified the code:

  • In the loop method, I removed the code that erases config.Polling.LongPollingTimeoutSeconds; otherwise we lose the information that long polling is configured.
  • In oneShot, I added:
	if d.config.Polling.LongPollingTimeoutSeconds != nil {
		d.longPollingEnabled = resp.longPoll
	}

This if enables long polling only when long polling is enabled in the configuration (and not merely when the server supports it).

I ran some manual tests:

  • bundle server stopped, OPA stopped -> start OPA, start bundle server -> all seems good
  • bundle server stopped, OPA stopped -> start OPA, start bundle server -> stop bundle server, wait, then start bundle server -> all seems good
  • and so on

I have not detected any wrong behavior when falling back between regular and long polling.

If it's OK with you, I will start writing a test.

@floriangasc floriangasc force-pushed the little-work-on-bundle-downloader branch 2 times, most recently from 431db7a to 70c357b Compare October 26, 2021 08:48
Member

@ashutosh-narkar ashutosh-narkar left a comment

You seem to be on the right track with these changes. Added some comments.

@@ -206,16 +208,11 @@ func (d *Downloader) loop(ctx context.Context) {
if err != nil {
delay = util.DefaultBackoff(float64(minRetryDelay), float64(*d.config.Polling.MaxDelaySeconds), retry)
} else {
if !longPoll {
if d.config.Polling.LongPollingTimeoutSeconds != nil {
Member

We probably need to keep this check.

Comment on lines 251 to 253
if d.config.Polling.LongPollingTimeoutSeconds != nil {
d.longPollingEnabled = resp.longPoll
if d.longPollingEnabled != resp.longPoll {
d.logger.Warn(fmt.Sprintf("Long polling mode switch from %t to %t", d.longPollingEnabled, resp.longPoll))
}
}
Member

I think we can just set it here. The line below should be enough.

d.longPollingEnabled = resp.longPoll

Contributor Author

If you are talking about the logger:

  • The goal of this log message: because OPA can switch at runtime between regular and long polling, it gives the user feedback about which mode OPA is currently in.
  • But if you think it's not good, I can remove it. No problem.

If you are talking about if d.config.Polling.LongPollingTimeoutSeconds != nil {, see my next global comment.

}

func (d *Downloader) download(ctx context.Context, m metrics.Metrics) (*downloaderResponse, error) {
d.logger.Debug("Download starting.")

d.client = d.client.WithHeader("If-None-Match", d.etag)

if d.config.Polling.LongPollingTimeoutSeconds != nil {
if d.longPollingEnabled {
Member

Not sure why we need to change this loop.

Contributor Author

see my next global comment

@floriangasc
Contributor Author

floriangasc commented Oct 27, 2021

Hi @ashutosh-narkar

Thanks for your comments. I will try to justify my choices according to what I have understood (don't hesitate to tell me if I have missed something).

Facts

  • The user can enable long polling from the config file.
  • If long polling is enabled, OPA will try to long poll the bundle server. If the bundle server doesn't support it, OPA will fall back to regular polling if configured, or to no polling at all.

To handle the fallback in both directions (as I understand the initial code), two pieces of state are needed:

  • «Has the end user configured long polling?» If the config object is not rewritten, we can check whether the config contains a long polling value.
  • «Is the current mode regular or long polling?» This is handled by Downloader.longPollingEnabled, which can be switched at runtime.

There are some points to highlight:

  1. The fact that the bundle server can support long polling (cf. func isLongPollSupported(header http.Header)) does not mean that long polling should be enabled.
  2. We should not rewrite the config object, otherwise we lose the user's initial intention.
  3. In the old code, I think there is a mismatch between what should be checked according to the current mode (long vs. regular) and whether long polling is enabled in the config.

Code explanation

For the code inside the download function (line ~268):

  • The if d.longPollingEnabled check should follow the effective poll mode, otherwise the Prefer header will be sent even after a fallback to regular mode has occurred.
  • Because d.client is re-assigned (d.client = d.client.WithHeader("Prefer",...), simply not adding this header is not enough. (WithHeader may also shallow-copy the client; I don't know whether arrays are copied in Go.) In other words, the client is saved with this header in the Downloader object.
  • So when a fallback has occurred, d.longPollingEnabled is false and we must remove the Prefer header.

For the code inside the oneShot function (line ~251):

  • If long polling is supported by the bundle server but not configured, we don't enable long polling.
  • This is what the if d.config.Polling.LongPollingTimeoutSeconds != nil says.
  • Inside this if is where the fallback happens. (That's why I put the logger there.)

For the code inside the loop function (line ~211):

  • The only purpose I have found for this if
if d.config.Polling.LongPollingTimeoutSeconds != nil {
     d.config.Polling.LongPollingTimeoutSeconds = nil
}

is to control how the download function should work according to the effective long polling mode after a fallback.

  • Now that I have the two states (what is configured, i.e. the user's intention, and what is enabled at runtime), this code is no longer useful.
  • Moreover, this code loses the initial intention the user put in the config file, so I don't think it is very good.

Once again, if I have missed something, or if something is wrong, don't hesitate to tell me. Thanks for your response.
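
The two-state model described in this comment can be sketched as follows. This illustrates the argument only and is not the merged code: requestHeaders is a hypothetical helper, the *int64 type for the configured timeout is an assumption, and the wait= form of the Prefer value follows RFC 7240 rather than anything quoted in this thread. Building the headers fresh for every request avoids the stale Prefer header problem mentioned above.

```go
package main

import "fmt"

// requestHeaders builds the headers for one download attempt.
// configuredTimeout reflects the user's intent (nil means long polling
// was never configured); longPollActive reflects the effective mode
// decided at runtime. Both must hold for the Prefer header to be sent.
func requestHeaders(etag string, configuredTimeout *int64, longPollActive bool) map[string]string {
	headers := map[string]string{"If-None-Match": etag}
	if configuredTimeout != nil && longPollActive {
		// Assumed header value; the real client may format it differently.
		headers["Prefer"] = fmt.Sprintf("wait=%d", *configuredTimeout)
	}
	return headers
}

func main() {
	timeout := int64(30)
	fmt.Println(requestHeaders("v1", &timeout, true))  // configured and active
	fmt.Println(requestHeaders("v1", &timeout, false)) // configured, fell back
	fmt.Println(requestHeaders("v1", nil, true))       // never configured
}
```

Because the configuration is never overwritten, the downloader can return to long polling later if the server starts advertising support again.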

@floriangasc floriangasc force-pushed the little-work-on-bundle-downloader branch 2 times, most recently from 7404b5a to e83bad2 Compare October 28, 2021 08:13
@floriangasc
Contributor Author

floriangasc commented Oct 28, 2021

Hi @ashutosh-narkar

I have finished reading download_test.go and was able to write tests. Based on my understanding of how the tests in download_test.go are organized, I added 3 new functions instead of modifying existing ones. These functions are as small as possible and target, as closely as possible, the ifs that I added or changed.

I am available for any comments.

@@ -2,6 +2,7 @@
// Use of this source code is governed by an Apache2
// license that can be found in the LICENSE file.

//go:build slow
Member

Nit: remove this

Contributor Author

Done

Contributor

Hmm I'd be OK with keeping it. 🤷 It'll come in whenever someone's touching the code with go 1.17, and is thus unavoidable... and the right way forward.

@floriangasc floriangasc force-pushed the little-work-on-bundle-downloader branch 3 times, most recently from 8d10f56 to d7062d9 Compare November 6, 2021 09:01
Member

@ashutosh-narkar ashutosh-narkar left a comment

This is getting close. Few small comments. Thanks for the changes @floriangasc!

Comment on lines 251 to 253
if d.config.Polling.LongPollingTimeoutSeconds != nil {
d.longPollingEnabled = resp.longPoll
}
Member

This can simply be

d.longPollingEnabled = resp.longPoll

Contributor Author

@floriangasc floriangasc Nov 11, 2021

Before I remove the if, I prefer to be sure of something.

Without the if d.config.Polling.LongPollingTimeoutSeconds != nil check, d.longPollingEnabled will simply mirror whether the server supports long polling. This if says: only if the user has configured long polling in OPA. If the user doesn't want (or hasn't configured) long polling, there is no reason to change d.longPollingEnabled, whether or not the server supports it. In other words, the server's long polling support should be considered only when d.config.Polling.LongPollingTimeoutSeconds is defined.

The if can be removed only if these two values (the server supports long polling, and OPA is configured for long polling) should always be the same. I don't think mixing these two notions is good. But if you are sure, no problem: let me know and I will remove the if (and change the tests I have added).

Member

I think you're using the value of d.longPollingEnabled in other parts of the code. If user switches from regular to long to regular polling again you'll need to update d.longPollingEnabled accordingly.

Contributor Author

@floriangasc floriangasc Nov 13, 2021

OK, I have changed the code. It seems to give good results. Can you just confirm one thing:

  1. Is the media type application/vnd.openpolicyagent.bundles only used for long polling?
  2. Can the media type application/gzip not be used for long polling?

Member

Only if the server responds with Content-Type: application/vnd.openpolicyagent.bundles, OPA will use long polling.
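
As a sketch of that rule, and not the actual isLongPollSupported implementation: the check comes down to inspecting the response's Content-Type. The function name longPollSupported and the constant bundlesMediaType are hypothetical, and the use of mime.ParseMediaType to tolerate parameters such as a charset is a choice of this sketch, not something confirmed by the thread.

```go
package main

import (
	"fmt"
	"mime"
	"net/http"
)

// bundlesMediaType is the media type named in the comment above.
const bundlesMediaType = "application/vnd.openpolicyagent.bundles"

// longPollSupported reports whether the response headers advertise the
// long-polling media type. A missing or malformed Content-Type, as on
// a 304 response, yields false.
func longPollSupported(header http.Header) bool {
	mediaType, _, err := mime.ParseMediaType(header.Get("Content-Type"))
	if err != nil {
		return false
	}
	return mediaType == bundlesMediaType
}

func main() {
	h := http.Header{}
	h.Set("Content-Type", bundlesMediaType)
	fmt.Println(longPollSupported(h))

	h.Set("Content-Type", "application/gzip")
	fmt.Println(longPollSupported(h))

	fmt.Println(longPollSupported(http.Header{}))
}
```

The last case shows why a 304 is a problem for a header-only check: with no Content-Type present, the answer is always false, whatever the server supports.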

Contributor Author

@floriangasc floriangasc Nov 21, 2021

OK, thanks. (I had understood correctly.)

@floriangasc floriangasc force-pushed the little-work-on-bundle-downloader branch from d7062d9 to b48a797 Compare November 11, 2021 10:51
@floriangasc floriangasc force-pushed the little-work-on-bundle-downloader branch 4 times, most recently from 277efe0 to 81ac1a3 Compare November 13, 2021 17:30
Member

@ashutosh-narkar ashutosh-narkar left a comment

@floriangasc thanks for working through this. Few comments. Can you also please squash/rebase your changes and update the commit message.

server *httptest.Server
etagInResponse bool
longPoll bool
opaVendorMediaTypeEnabled bool
Member

Why do we need opaVendorMediaTypeEnabled ? Can we do without it ?

Contributor Author

@floriangasc floriangasc Nov 21, 2021

longPoll is enough. I think opaVendorMediaTypeEnabled is a leftover from some tests I tried. Done.

@@ -663,6 +766,13 @@ func (t *testServer) handle(w http.ResponseWriter, r *http.Request) {
}
}

if (t.opaVendorMediaTypeEnabled) || (t.longPoll && !notModified) {
Member

Could we simplify this ? Also !notModified seems a bit weird so we can change that too.

if !notModified {
    if t.longPoll {
        w.Header().Add("Content-Type", "application/vnd.openpolicyagent.bundles")
    } else {
        w.Header().Add("Content-Type", "application/gzip")
    }
}

Contributor Author

I have introduced contentTypeShouldBeSend, which is a better name, and I have removed opaVendorMediaTypeEnabled.
Let me know if it's OK for you.

…ling

download/download_test.go: testing about oneShot and download method

When 304 content-type should not be send. Fix the download code for switch to regular polling and keep the previous value.

Fixes: open-policy-agent#3923
Signed-off-by: Gasc Florian <florian.gasc@gmail.com>
Contributor

@srenatus srenatus left a comment

I think there's just one last thing (see inline in download_test.go) and then we could merge this.

Thanks for your contribution 👏

@floriangasc
Contributor Author

@srenatus
"I think there's just one last thing (see inline in download_test.go) and then we could merge this."

I don't understand what the last thing to be done is. Can you give me more details?

@srenatus
Contributor

srenatus commented Dec 9, 2021

Sure thing, sorry: #3926 (comment) from what I understood, the field is no longer needed, is it?

@srenatus srenatus merged commit bbe8afd into open-policy-agent:main Dec 10, 2021
@srenatus
Contributor

@floriangasc I hope you don't mind that I've edited the commit message a bit while squashing. Thanks again for this contribution, and thanks for being so persistent, bearing with us for more than a month on this 🚀 👍

@floriangasc
Contributor Author

floriangasc commented Jan 2, 2022

@srenatus I was very busy during the last 2-3 weeks, sorry for the delay.

« Thanks again for this contribution, and thanks for being so persistent, bearing with us for more than a month on this »: No problem. You also have a lot of work. I appreciate all the comments: they make sure we end up with the right decision and the right code :). Feel free to modify the commit/code in any way you want. I just have had no response about the nil pointer in OPA when there is a misconfiguration between the bundle server and OPA. I think it would be better to have a more self-explanatory error than a segfault.

Thank you for all the work you have done. The OPA code is very easy to read and learn.

OPA solves a lot of problems for us in a very clean and easy way. In some days or weeks I will probably come back to you. I am waiting for some feedback from our production environment first.

Thanks again.

Development

Successfully merging this pull request may close these issues.

information request: long polling seem's bugged on 304 http status code.
3 participants