Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add ability to ignore some requests from httplib #263

Merged
merged 8 commits into from
Jan 4, 2021

Conversation

jonathangreen
Copy link
Contributor

Description of changes:
This generalizes the test that the httplib patch was doing to ignore requests to xray being made by botocore. I am using cloudwatch logging in my application, and these calls occur outside of the normal application flow, so I don't have a segment open for them.

This adds a new function add_ignored that allows requests to be ignored based on the hostname, url or subclass. It defaults to ignoring the same things as before, but now additional requests can be ignored.

I also added some tests for the new functionality.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@codecov-io
Copy link

codecov-io commented Dec 17, 2020

Codecov Report

Merging #263 (ff94da3) into master (a16ba30) will increase coverage by 0.19%.
The diff coverage is 96.42%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #263      +/-   ##
==========================================
+ Coverage   79.18%   79.37%   +0.19%     
==========================================
  Files          82       82              
  Lines        3223     3248      +25     
==========================================
+ Hits         2552     2578      +26     
+ Misses        671      670       -1     
Impacted Files Coverage Δ
aws_xray_sdk/ext/httplib/patch.py 81.66% <96.15%> (+5.87%) ⬆️
aws_xray_sdk/ext/httplib/__init__.py 100.00% <100.00%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update a16ba30...ff94da3. Read the comment docs.

@srprash
Copy link
Contributor

srprash commented Dec 30, 2020

@jonathangreen Thanks for this contribution! I'll give it a quick try. Could you also add a section in the README with an example for using the add_ignored method for ignoring requests?

@jonathangreen
Copy link
Contributor Author

@srprash Thanks for reviewing! I've added some documentation to the README with an example of using the add_ignored method.

Copy link
Contributor

@willarmiros willarmiros left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall looks good, thanks for the contribution! Couple small comments.

README.md Outdated

### Ignoring httplib requests

If you want to ignore certain httplib requests you can do so based on the hostname or URL that is being requsted.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you explicitly call out that this supports Unix-style regex or fnmatch regex (as opposed to re)? Also, can we add an example using subclass matching?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@willarmiros updated to add both of those items.

Comment on lines +45 to +47
def _ignored_add_default():
# skip httplib tracing for SDK built-in centralized sampling pollers
add_ignored(subclass='botocore.awsrequest.AWSHTTPConnection', urls=['/GetSamplingRules', '/SamplingTargets'])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm conflicted here - I know that using Python's built-in string matching as we are now will be faster than using fnmatch, but it would be awkward to not use this ignore mechanism and special-case /GetSamplingRules and /SamplingTargets.

I guess the added latency is kinda peanuts compared to the actual network request, so it's probably ok, but what do you think @srprash?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel using fnmatch for ignoring /GetSamplingRules and /SamplingTarget is okay since it fits well in the overall mechanism to ignore any URL. I'm not really aware of the latency of fnmatch but my guess is that special casing the sampling urls and then matching the user urls would be roughly equivalent and won't cause much of a difference here.

@srprash
Copy link
Contributor

srprash commented Jan 4, 2021

@jonathangreen httplib/http.client is a low-level module and most of the people would use requests or urllib for making HTTP calls. I think the urllib module uses the low-level http.client underneath but I'm not sure about requests. Do you think this ignore mechanism would work for someone using requests or urllib as well?

@jonathangreen
Copy link
Contributor Author

@srprash I don't believe requests calles httplib under the hood, so this method likely won't work there.

It would be a nice future improvement though to add a similar method for ignoring calls that come from requests.

Copy link
Contributor

@willarmiros willarmiros left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. We should expand this mechanism to other libs like requests but doesn't need to be in this PR probably.

@srprash
Copy link
Contributor

srprash commented Jan 4, 2021

@jonathangreen
Yeah, requests patch should have its own handling for this mechanism.
The urllib module is automatically patched if the httplib is patched since urllib builds upon httplib/http.client. I did a quick test with the following piece of code and it works! :)

from aws_xray_sdk.core import patch
from aws_xray_sdk.ext.httplib import add_ignored

libraries = (['httplib'])
patch(libraries)

import urllib.request
add_ignored(hostname="www.amazon.com")
with urllib.request.urlopen('http://www.amazon.com') as response:
    html = response.read()

Copy link
Contributor

@srprash srprash left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. Thanks!

@srprash srprash merged commit e121217 into aws:master Jan 4, 2021
Hargrav3s pushed a commit to Gavant/aws-xray-sdk-python that referenced this pull request Mar 22, 2022
* Expand ability to ignore some httplib calls.

* Add tests.

* Add glob match to httplib ignore hostname.

* Clean up httplib tests.

* Use full module path for subclass.

* Add documentation for ignoring httplib requests

* Code review feedback
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants