Retry configuration is ignored after call to get_available_regions() #1698

Closed
nielslaukens opened this issue Mar 12, 2019 · 4 comments · Fixed by #1719
Labels
bug This issue is a confirmed bug.

Comments

@nielslaukens
Contributor

I'm having trouble using a non-default retry configuration. After some debugging, I've run into some unexpected behaviour that I consider a bug.

Code to reproduce the issue:

import botocore.session
from botocore.config import Config

# Any valid Elastic Beanstalk Environment ARN to query
arn = "TODO: FILL IN"


botocore_session = botocore.session.get_session()

# When this call is made, retry-config is ignored...
_ = botocore_session.get_available_regions('elasticbeanstalk')

client = botocore_session.create_client(
    'elasticbeanstalk',
    region_name="eu-west-1",
    config=Config(retries={'max_attempts': 2}),
)

while True:
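    # Hammer the API; with max_attempts set to 2, a successful response should
    # never report more than 2 retry attempts.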
    res = client.list_tags_for_resource(
        ResourceArn=arn,
    )
    if res['ResponseMetadata']['RetryAttempts'] > 2:
        print(res)
        break

The code needs to run with valid credentials configured (I used environment variables for my tests, but I don't think it matters). You also need to fill in a valid ARN of an Elastic Beanstalk environment to query. (I don't think the bug is specific to Beanstalk, but that's the service I used while debugging the issue.)

The expected behaviour is for this code to hammer the API and hit the throttle limit. The list_tags_for_resource() call would then raise a ClientError explaining the issue.

The actual behaviour is that the code enters the final if and prints a successful response that was retried more than 2 times, even though the config sets max_attempts to 2.

Commenting out the get_available_regions() call yields the expected behaviour. So it seems that this call changes some internal state of the session that affects clients created afterwards.
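A possible workaround until this is fixed (untested here, and assuming the later of two handler registrations sharing a unique id is the one that gets dropped) is to create the client before calling get_available_regions(), so the client's config-aware retry handler is the one that sticks:

import botocore.session
from botocore.config import Config

botocore_session = botocore.session.get_session()

# Create the client (and let it register its config-aware retry handler) first...
client = botocore_session.create_client(
    'elasticbeanstalk',
    region_name="eu-west-1",
    config=Config(retries={'max_attempts': 2}),
)

# ...and only then load the service data; the retry handler registered as a
# side effect of this call should no longer override the client's config.
regions = botocore_session.get_available_regions('elasticbeanstalk')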

@JordonPhillips
Contributor

Interesting. I can reproduce. Looking into it.

@JordonPhillips added the investigating and bug labels and removed the investigating label on Mar 26, 2019
@JordonPhillips
Contributor

Looks like the problem is that get_available_regions calls get_service_data, which ends up emitting an event that results in retry handlers being registered that don't respect the client config. We'll need to take a closer look to see what the impact of removing that would be.
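For anyone following along, here's a quick way to see that trigger fire. This is just a sketch; it assumes the built-in handler is hooked to the 'service-data-loaded' event that get_service_data() emits (per the registration in handlers.py):

import botocore.session

def on_service_data_loaded(**kwargs):
    print("service-data-loaded fired for", kwargs.get('service_name'))

session = botocore.session.get_session()
# Registering on the event prefix also catches the service-specific emission.
session.register('service-data-loaded', on_service_data_loaded)

# Loads the service model via get_service_data(), which emits the event and
# (before the fix) registers the retry handlers that ignore the client Config.
session.get_available_regions('elasticbeanstalk')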

@jamesls
Member

jamesls commented Apr 16, 2019

Just some additional info on the comment in the get_service_data handler:

  • There's one operation-specific configuration in our _retry.json file (kinesis DescribeStream, https://github.com/boto/botocore/blob/develop/botocore/data/_retry.json). We could just make this a __default__ for kinesis, as LimitExceededException doesn't seem specific to DescribeStream. (See the sketch after this list for a quick way to inspect that file.)
  • The CLI uses clients to make API calls, so we'd run into the same problem if we ever decide to expose something like max_attempts in ~/.aws/config. get_service_data() is used in the CLI only to power the command tables and CLI parser.
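For reference, a quick way to look at that shipped retry data is via the session's data loader. This is a sketch; the 'data_loader' component name, the '_retry' data name, and the top-level 'retry' key are assumptions based on how client.py loads the file:

import botocore.session

session = botocore.session.get_session()
loader = session.get_component('data_loader')

# _retry.json has a '__default__' entry plus per-service overrides; kinesis is
# the one service with an operation-specific (DescribeStream) entry.
retry_data = loader.load_data('_retry')
print(retry_data['retry'].get('kinesis'))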

A couple of other options we have if we don't want to remove the retry handler registration in handlers.py:

  • Both handlers.py and client.py use the same unique id, so we could call unregister() in the client before we register() our handler (sketched below).
  • We could use a different unique_id in client.py and use register_first().
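A rough sketch of that first option, using a hypothetical event name and unique_id purely to illustrate the unregister-then-register pattern (the real values live in handlers.py and client.py):

import botocore.session

def stale_retry_handler(**kwargs):
    pass  # stand-in for the handler registered via handlers.py

def config_aware_retry_handler(**kwargs):
    pass  # stand-in for the handler client.py builds from the client Config

session = botocore.session.get_session()
UNIQUE_ID = 'retry-config-elasticbeanstalk'  # hypothetical, but shared by both sides

# Earlier registration, e.g. triggered by a get_service_data() call.
session.register('needs-retry.elasticbeanstalk', stale_retry_handler,
                 unique_id=UNIQUE_ID)

# Because the unique_id is shared, the second register() would normally be
# ignored; unregistering first lets the config-aware handler take its place.
session.unregister('needs-retry.elasticbeanstalk', unique_id=UNIQUE_ID)
session.register('needs-retry.elasticbeanstalk', config_aware_retry_handler,
                 unique_id=UNIQUE_ID)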

However, my preference would be to remove the retry code in handlers.py. Operation-level retries already don't work on clients, so to be affected you'd have to have a custom _retry.json with operation-specific retries and ensure it always gets loaded first via some call (even an implicit one) to get_service_data(). Seems like we'd be fixing a consistency issue here.

What do you all think?

@jamesls
Member

jamesls commented Apr 17, 2019

The fix to remove the retry handler registration in handlers.py was pretty simple, so I went ahead and put a PR together (#1719). It probably needs a bit more investigation, but the tests are passing and it seems to be working OK.
