SegmentNotFoundException patching boto3 #4
A segment represents a full request/response cycle, and all the local work done is recorded as subsegments attached to that segment. SegmentNotFound means the SDK cannot correlate the S3 upload to a request for your Django app.

The django-storages library you are using relies on boto3's S3 multi-part upload to increase throughput, and that multi-part upload functionality uses a thread pool. You can see the code here: https://github.com/boto/s3transfer/blob/develop/s3transfer/manager.py. This is a high-level API which internally uses multiple threads to upload to S3. When it creates a new thread and makes the call to S3 from that thread, the SDK still tries to create a subsegment to record useful information about the downstream call, but it has lost the context because the new thread's local storage is empty.

In order to work with the S3 transfer manager, the manager itself also has to be patched so that whenever it creates a new thread it injects the thread-local storage into that thread. More generally, the SDK today is incompatible with any library that manages its own thread pool. Sorry for the confusion; we will update the docs for these special cases and look into supporting them later.
Feel free to re-open if further assistance is needed.
From an EC2/Container-style queue-reading backend process using:

```python
from aws_xray_sdk.core import xray_recorder
from aws_xray_sdk.core import patch_all

xray_recorder.configure(context_missing='LOG_ERROR', plugins=('EC2Plugin', 'ECSPlugin', 'ElasticBeanstalkPlugin'))
patch_all()
...
xray_recorder.begin_segment(name="Test")
...
s3 = boto3.resource('s3', region_name='us-east-1')
bucket = s3.Bucket('bucket_name_here')
...
bucket.download_file(obj.key, filename)
...
xray_recorder.end_segment()
```

fails with:
However, using:

```python
# Ensure that no threads are used.
xfer_config = TransferConfig(use_threads=False)
bucket.download_file(obj.key, filename, Config=xfer_config)
```

corrects the exception, but is not an acceptable workaround for performance reasons. Any insight on how to work around this issue with boto3 and downloading files using its default threaded transfer manager? I see @haotianw465 mentions "this manager also has to be patched so that whenever it creates a new thread it injects thread local storage to the new thread". How do we go about this?
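One answer to the "how do we go about this" question is to capture the parent context in the submitting thread and re-install it inside each worker task. The sketch below uses a plain `threading.local` stand-in rather than the real X-Ray context; with aws-xray-sdk the analogous calls would be capturing `xray_recorder.get_trace_entity()` in the parent thread and restoring it in the worker — treat those names as an assumption about the SDK's threading API:

```python
# Hedged sketch: propagate a parent "segment" into pool workers by capturing
# it in the submitting thread and re-installing it in a per-task wrapper.
import threading
from concurrent.futures import ThreadPoolExecutor

_local = threading.local()

def with_parent_context(fn):
    """Capture the submitting thread's context, re-install it in the worker."""
    parent = getattr(_local, "segment", None)  # read in the submitting thread
    def wrapper(*args, **kwargs):
        _local.segment = parent                # write in the worker thread
        return fn(*args, **kwargs)
    return wrapper

def task():
    # Without the wrapper this would return None inside a pool worker.
    return getattr(_local, "segment", None)

_local.segment = "request-123"
with ThreadPoolExecutor(max_workers=2) as pool:
    result = pool.submit(with_parent_context(task)).result()

print(result)  # request-123
```

Applying this to s3transfer would mean wrapping the callables it submits to its own pool, which is what "patching the transfer manager" amounts to.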
Experiencing this problem at the moment... Has anyone found a workaround for this?
From reading the S3 resource API, we will find some time to look deeper into this issue, but we don't have the capacity right now to provide a fully working solution. I suggest starting from monkey patching https://github.com/boto/boto3/blob/develop/boto3/s3/transfer.py#L142. By default this library transfers with a thread pool; if you monkey patch the default transfer config so that threads are not used, the S3 calls stay on the thread that owns the segment.

Ideally we could have provided a single-line opt-in patch where everything automatically works, so you don't have to care about the underlying details. But the challenge of context binding for the SDK is that every single library that utilizes a ThreadPool needs to be configured separately (or, in worse cases like this one, requires monkey patching).
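The monkey-patching suggestion above can be sketched on a stand-in class. boto3 is intentionally not imported here; the same `__init__`-wrapping pattern would apply to the real `boto3.s3.transfer.TransferConfig`, at the cost of single-threaded transfer throughput:

```python
# Stand-in class mimicking TransferConfig's use_threads flag (boto3 not
# required here); the same __init__-wrapping pattern applies to the real
# boto3.s3.transfer.TransferConfig.
class TransferConfig:
    def __init__(self, use_threads=True):
        self.use_threads = use_threads

_original_init = TransferConfig.__init__

def _patched_init(self, *args, **kwargs):
    kwargs["use_threads"] = False   # force single-threaded transfers globally
    _original_init(self, *args, **kwargs)

TransferConfig.__init__ = _patched_init

print(TransferConfig().use_threads)                  # False
print(TransferConfig(use_threads=True).use_threads)  # False, even if requested
```

This trades upload parallelism for trace correctness, which is why it is a stopgap rather than the single-line opt-in patch described above.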
Another bug revealed by @ShibaBandit's stack trace is here: https://github.com/aws/aws-xray-sdk-python/blob/master/aws_xray_sdk/core/recorder.py#L312.
You may want to follow what's done in ddtrace; I haven't heard of issues yet: https://github.com/DataDog/dd-trace-py/blob/master/ddtrace/contrib/botocore/patch.py and https://github.com/DataDog/dd-trace-py/blob/master/ddtrace/contrib/boto/patch.py. There's also the OpenTracing movement; I haven't looked too deeply into it yet.
The X-Ray SDK has stronger semantic requirements on data format. It assumes an AWS call is a subtask of a node in the call chain and thus always requires a parent. By X-Ray's definition, subsegments recording outbound HTTP requests, SQL queries, and AWS calls cannot exist independently; they must be part of a logical task (processing an event, serving an incoming request, etc.). This topic has some overlap with the OpenTracing discussion regarding backend data format. Please feel free to share your thoughts here: https://forums.aws.amazon.com/thread.jspa?threadID=272498&tstart=0. This is a more general discussion which will have an impact on all of our SDKs. You are also welcome to open a new issue here about OpenTracing to discuss more specific questions.
The bug is fixed as part of the 0.96 release, so the exception no longer pops up when your context_missing setting is LOG_ERROR.
Thank you @haotianw465 . |
Closing in favor of #52.
I don't believe this issue is resolved. I'm having the same issue. |
I have the following requirements:

```
Django==1.11.6
boto3==1.4.7
django-storages==1.6.5
aws-xray-sdk==0.93
```
I want to store media files in an S3 bucket. To trace that I do:
After that, when I try to upload a file to my site, a SegmentNotFoundException is raised.
If I do:
then all is good, but, obviously, there are no traces for boto3.
The xray middleware is in the list of middleware classes, and the xray app is in the INSTALLED_APPS list.
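For reference, the Django wiring mentioned above typically looks like the fragment below in `settings.py`. The module paths follow the aws-xray-sdk Django extension's documentation; treat them as an assumption if your SDK version differs:

```python
# settings.py (fragment) -- X-Ray Django integration, assuming the
# aws-xray-sdk Django extension's documented module paths.
INSTALLED_APPS = [
    # ... your other apps ...
    'aws_xray_sdk.ext.django',
]

MIDDLEWARE = [
    # The X-Ray middleware should come first so the segment wraps the
    # entire request/response cycle.
    'aws_xray_sdk.ext.django.middleware.XRayMiddleware',
    # ... your other middleware ...
]
```

Note that even with this wiring in place, the thread-pool issue described in this thread still applies to uploads that go through s3transfer's worker threads.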